


Conversion code from the question (fragments):

    String ext = System.IO.Path.GetExtension(fileNames);
    CSVExtractor extractor = new CSVExtractor();
    Excel.Application app = new Excel.Application();
    string outfilePDF1 = fileNames.Replace((System.IO.Path.GetFileName(fileNames)), (System.IO.Path.GetFileName(fileNames).Replace(".pdf", "") + "_rez" + ".csv"));
    String outfilePDFExcel1 = fileNames.Replace((System.IO.Path.GetFileName(fileNames)),

You absolutely do not have to convert the PDF to Excel.

First of all, determine whether your PDF contains textual data or is a scanned image. If it contains textual data, then you are right about using "some free dll". I recommend iTextSharp, as it is popular and easy to use. If you don't need a rock-solid solution, the easiest approach is to read the whole PDF into a string and then retrieve the emails with a regular expression.

Here is an example (not perfect) of reading a PDF with iTextSharp and extracting emails (fragments):

    public string PdfToString(string fileName)
    for (int page = 1 page.
    public IEnumerable ExtractEmails(string content)
    var matches = emailRegex.Matches(content)
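The approach described above (read the whole PDF into a string, then apply a regular expression) could be sketched roughly as follows. This is a minimal sketch, not the original example: the class name `PdfEmailExtractor`, the email pattern, and the `USE_ITEXTSHARP` compilation symbol are assumptions made for illustration. The PDF-reading part assumes the iTextSharp 5.x package is referenced and is only compiled when `USE_ITEXTSHARP` is defined; the email part depends only on the .NET base libraries. The pattern is deliberately simple and will miss some valid addresses.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;
using System.Text.RegularExpressions;
#if USE_ITEXTSHARP
using iTextSharp.text.pdf;
using iTextSharp.text.pdf.parser;
#endif

// Hypothetical helper class; the name is not from the original answer.
public static class PdfEmailExtractor
{
    // Assumed, intentionally simple pattern: local part, "@", domain, dot, TLD.
    private static readonly Regex EmailRegex = new Regex(
        @"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}",
        RegexOptions.Compiled);

#if USE_ITEXTSHARP
    // Reads the text layer of every page into one string (iTextSharp 5.x).
    // Returns little or nothing for scanned-image PDFs, which have no text layer.
    public static string PdfToString(string fileName)
    {
        var text = new StringBuilder();
        var reader = new PdfReader(fileName);
        try
        {
            // iTextSharp page numbers start at 1.
            for (int page = 1; page <= reader.NumberOfPages; page++)
            {
                text.AppendLine(PdfTextExtractor.GetTextFromPage(reader, page));
            }
        }
        finally
        {
            reader.Close();
        }
        return text.ToString();
    }
#endif

    // Returns the distinct addresses found in the text (case-insensitive comparison).
    public static IEnumerable<string> ExtractEmails(string content)
    {
        return EmailRegex.Matches(content)
            .Cast<Match>()
            .Select(m => m.Value)
            .Distinct(StringComparer.OrdinalIgnoreCase)
            .ToList();
    }
}
```

With iTextSharp referenced and `USE_ITEXTSHARP` defined, the two steps chain as `PdfEmailExtractor.ExtractEmails(PdfEmailExtractor.PdfToString(path))`, where `path` is whatever PDF file you are processing. No CSV or Excel intermediate file is involved.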
