With libraries like iTextSharp or iText you can extract metadata from PDF documents via a PdfReader:
using (var reader = new PdfReader(pdfBytes))
{
    return reader.Metadata == null ? null : Encoding.UTF8.GetString(reader.Metadata);
}
These kinds of libraries parse the entire PDF document before the metadata can be extracted. In my case this leads to high usage of system resources, since we receive many requests per second with large PDFs.
Is there a way to extract the metadata from the PDF without completely loading it in memory first?