I'm trying to parse a PDF file and extract its metadata and text, but I still don't get the expected results. I'm sure it's a silly mistake, but I can't see it. The file d.pdf exists and is located in the project's root folder. The imports are also correct.
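    import java.io.File;
    import java.io.IOException;
    import java.io.InputStream;

    import org.apache.tika.exception.TikaException;
    import org.apache.tika.io.TikaInputStream;
    import org.apache.tika.metadata.Metadata;
    import org.apache.tika.parser.AutoDetectParser;
    import org.apache.tika.parser.ParseContext;
    import org.apache.tika.parser.Parser;
    import org.apache.tika.sax.BodyContentHandler;
    import org.xml.sax.SAXException;

    public class MultiParse {

        public static void main(final String[] args)
                throws IOException, SAXException, TikaException {
            Parser parser = new AutoDetectParser();
            File f = new File("d.pdf");
            System.out.println("------------ Parsing a PDF:");
            extractFromFile(parser, f);
        }

        private static void extractFromFile(final Parser parser, final File f)
                throws IOException, SAXException, TikaException {
            // Collects the body text, up to 10,000,000 characters
            BodyContentHandler handler = new BodyContentHandler(10000000);
            Metadata metadata = new Metadata();
            // try-with-resources closes the stream once parsing is done
            try (InputStream is = TikaInputStream.get(f)) {
                parser.parse(is, handler, metadata, new ParseContext());
            }
            // Print every metadata field the parser filled in
            for (String name : metadata.names()) {
                System.out.println(name + ":\t" + metadata.get(name));
            }
        }
    }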
Output: no errors, but not much either :(
------------ Parsing a PDF:
Content-Type: application/pdf
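
One thing that may be worth ruling out, as a guess rather than a confirmed diagnosis: output that stops at Content-Type is what you would expect if the type was detected but no PDF parser was available at runtime, for example when only tika-core is on the classpath and the parsers module is not. The sketch below uses only the JDK to check whether a class can be loaded. The class name ClasspathCheck and the helper isOnClasspath are made up for illustration; org.apache.tika.parser.pdf.PDFParser is the class I am assuming the parsers module provides.

```java
public class ClasspathCheck {

    // Returns true if the named class can be loaded by this class's loader.
    // The class is not initialized, so no static initializers are run.
    static boolean isOnClasspath(String className) {
        try {
            Class.forName(className, false, ClasspathCheck.class.getClassLoader());
            return true;
        } catch (ClassNotFoundException | LinkageError e) {
            return false;
        }
    }

    public static void main(String[] args) {
        // Assumption: this is the PDF parser class shipped in the Tika parsers module.
        String pdfParser = "org.apache.tika.parser.pdf.PDFParser";
        System.out.println(pdfParser + " on classpath: " + isOnClasspath(pdfParser));
    }
}
```

If this prints false when run with the same classpath as MultiParse, the missing parsers dependency would be a likely explanation; if it prints true, the cause is somewhere else.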