
I'm using Tika to extract metadata from many file types (images, video, etc.) with the AutoDetectParser. It works quite well and returns some metadata under fully qualified, XMP-style names such as "tiff:XResolution".

But when I compare Tika's output with an RDF extraction from exiftool, for instance, I can see that Tika returns fewer metadata fields.

Is there a programmatic way to retrieve all XMP metadata with Tika, as exiftool does?
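To illustrate what "all XMP metadata" means at the file level: XMP is usually embedded as an XML packet wrapped in `<?xpacket begin=...?>` and `<?xpacket end=...?>` processing instructions. The sketch below is not a Tika API; it is a plain-Java scan for those markers, and the class name `XmpPacketScan` and method `findXmpPackets` are made up for this example. It assumes UTF-8 encoded packets and a file small enough to read into memory, and it will not reassemble XMP that a container splits across several segments.

```java
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper, not part of Tika: scans raw bytes for XMP packets.
public class XmpPacketScan {

    private static final String BEGIN = "<?xpacket begin=";
    private static final String END = "<?xpacket end=";

    /** Returns every raw XMP packet found in the given bytes, in file order. */
    public static List<String> findXmpPackets(byte[] data) {
        // ISO-8859-1 maps each byte to exactly one char, so string indices
        // are also byte offsets into the original array.
        String text = new String(data, StandardCharsets.ISO_8859_1);
        List<String> packets = new ArrayList<>();
        int from = 0;
        while (true) {
            int start = text.indexOf(BEGIN, from);
            if (start < 0) break;
            int endMarker = text.indexOf(END, start);
            if (endMarker < 0) break;
            int close = text.indexOf("?>", endMarker);
            if (close < 0) break;
            // Decode the packet itself as UTF-8 (assumption: UTF-8 packets only).
            byte[] slice = Arrays.copyOfRange(data, start, close + 2);
            packets.add(new String(slice, StandardCharsets.UTF_8));
            from = close + 2;
        }
        return packets;
    }

    public static void main(String[] args) throws Exception {
        // Usage: java XmpPacketScan <path-to-file>
        byte[] data = Files.readAllBytes(Paths.get(args[0]));
        for (String packet : findXmpPackets(data)) {
            System.out.println(packet);
        }
    }
}
```

Printing the raw packet this way only shows what XMP the file actually contains. Some tags exiftool reports may be stored outside XMP and would not appear in it.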

Edit: With AutoDetectParser, I get this:

"date" : "2019-10-11T23:19:50Z",
"X-Parsed-By" : "org.apache.tika.parser.DefaultParser",
"created" : "2019-10-11T23:19:50Z",
"ImageLength" : "540",
"Last-Modified" : "2019-10-11T23:19:50Z",
"Last-Save-Date" : "2019-10-11T23:19:50Z",
"audioSampleRate" : "25",
"save-date" : "2019-10-11T23:19:50Z",
"duration" : "6.84",
"ImageWidth" : "960",
"creation-date" : "2019-10-11T23:19:50Z",
"modified" : "2019-10-11T23:19:50Z",
"Content-Type" : "video/mp4"

But I need to get these tags:

XPAuthor
Artist
Copyright
Keywords
XPKeywords
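The gap can be stated concretely as a set difference between the keys Tika returned and the tags I need. This is a minimal sketch using only the names listed above; the class `MissingTags` is made up for the example, and it assumes exact name matching, so a tag that Tika exposes under a different or namespaced name would still be reported as missing.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical helper: lists required tags absent from the returned key set.
public class MissingTags {

    /** Returns the required tags that are not in the returned keys, in order. */
    public static List<String> missing(Set<String> returned, List<String> required) {
        List<String> out = new ArrayList<>();
        for (String tag : required) {
            if (!returned.contains(tag)) {
                out.add(tag);
            }
        }
        return out;
    }

    public static void main(String[] args) {
        // Keys copied from the AutoDetectParser output shown in this question.
        Set<String> returned = new HashSet<>(Arrays.asList(
                "date", "X-Parsed-By", "created", "ImageLength", "Last-Modified",
                "Last-Save-Date", "audioSampleRate", "save-date", "duration",
                "ImageWidth", "creation-date", "modified", "Content-Type"));
        List<String> required = Arrays.asList(
                "XPAuthor", "Artist", "Copyright", "Keywords", "XPKeywords");
        // Prints: [XPAuthor, Artist, Copyright, Keywords, XPKeywords]
        System.out.println(missing(returned, required));
    }
}
```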

When I test with ExifTool, I get:

[screenshot of the ExifTool output]


Edit: The version of Apache Tika I'm using is v1.24.
