I want to read some content from PDF files. I've only just started, and before going further I'd like to know what the right approach is. iTextSharp's PdfReader looks like it could help here, so I converted the PDF to text using:
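    // Requires: using System.Text; using iTextSharp.text.pdf; using iTextSharp.text.pdf.parser;
    public static string pdfText(string path)
    {
        PdfReader reader = new PdfReader(path);
        try
        {
            // StringBuilder avoids copying the whole string on every page.
            StringBuilder text = new StringBuilder();
            for (int page = 1; page <= reader.NumberOfPages; page++)
            {
                text.Append(PdfTextExtractor.GetTextFromPage(reader, page));
            }
            return text.ToString();
        }
        finally
        {
            // Release the file even if extraction throws.
            reader.Close();
        }
    }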
I'm still wondering whether this approach is OK, or whether I should instead convert the PDF to Excel and read the content I want from there.
Thoughts from professionals would be appreciated.