I'm currently writing a Java program to extract metadata from multiple document types. At the moment I'm trying to extract metadata from .vsd files using Apache Tika. I previously tried using Apache POI directly, but it is very hard to find any documentation on this rarely used part of the library, so I decided to go with Tika.
OK, here is the code sample that crashes (at line 7 of the snippet, the officeParser.parse(...) call):
    ParseContext context = new ParseContext();
    Metadata metadata = new Metadata();
    WriteOutContentHandler handler = new WriteOutContentHandler(10 * 1024 * 1024);
    try {
        FileInputStream fis = new FileInputStream(fileName);
        OfficeParser officeParser = new OfficeParser();
        officeParser.parse(fis, handler, metadata, context); // the RuntimeException is thrown here
        String[] metadataNames = metadata.names();
        // Display all metadata
        for (String name : metadataNames) {
            System.out.println(name + ": " + metadata.get(name));
        }
    } catch (FileNotFoundException e) {
        System.out.println("No such file: " + fileName);
    }
And here is the stack trace:
    Exception in thread "main" java.lang.RuntimeException: TODO
        at org.apache.poi.hdgf.pointers.PointerFactory.createPointer(PointerFactory.java:45)
        at org.apache.poi.hdgf.HDGFDiagram.<init>(HDGFDiagram.java:99)
        at org.apache.poi.hdgf.extractor.VisioTextExtractor.<init>(VisioTextExtractor.java:55)
        at org.apache.tika.parser.microsoft.OfficeParser.parse(OfficeParser.java:200)
        at org.apache.tika.parser.microsoft.OfficeParser.parse(OfficeParser.java:161)
        at VsdFile.displayMetadata(VsdFile.java:43)
        at main.main(main.java:26)
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
        at java.lang.reflect.Method.invoke(Method.java:601)
        at com.intellij.rt.execution.application.AppMain.main(AppMain.java:120)
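Note that the exception in the trace is an unchecked RuntimeException, so the catch block for FileNotFoundException in the snippet never sees it and the whole program stops. The following is only a minimal sketch of how a batch run over several files could record such a failure and move on to the next file. It does not address the cause of the crash. The names BatchExtractor and Extractor are hypothetical and not part of Tika or POI; Extractor merely stands in for the Tika call so that the sketch compiles with the standard library alone.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper, not part of Tika or POI.
class BatchExtractor {

    // Hypothetical stand-in for the code that calls officeParser.parse(...)
    // and returns the metadata as name/value pairs.
    interface Extractor {
        Map<String, String> extract(String fileName) throws Exception;
    }

    // Files whose extraction threw, in the order they were processed.
    final List<String> failed = new ArrayList<String>();

    // Metadata of the files that were parsed successfully, keyed by file name.
    final Map<String, Map<String, String>> results =
            new LinkedHashMap<String, Map<String, String>>();

    void run(List<String> fileNames, Extractor extractor) {
        for (String fileName : fileNames) {
            try {
                results.put(fileName, extractor.extract(fileName));
            } catch (Exception e) {
                // Exception covers checked exceptions as well as
                // RuntimeException, such as the "TODO" one in the trace.
                failed.add(fileName);
                System.err.println("Could not parse " + fileName + ": " + e);
            }
        }
    }
}
```

With this shape, the body of the try block from the snippet would move into an Extractor implementation, and the list in failed shows afterwards which documents could not be read.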
I'm pretty rusty in Java, so I hope my question is not too obvious.
Thank you.
Regards,
Bdloul