
The Apache Tika REST server returns status code 422 (Unprocessable Entity) for a password-protected PDF document. If the file format is unsupported, it returns 422 as well.

Unfortunately, it is not possible to tell whether the metadata of a file could not be determined because of encryption or because of the format.

When I process the same file with the Tika app, I get either the message "encrypted file" or "format not valid" in the console.

Unfortunately, the response headers contain no additional information either.

Example:

HTTP/1.1 422 Unprocessable Entity
Date: Fri, 11 May 2018 12:21:28 GMT
Content-Length: 0
Server: Jetty (8.y.z-SNAPSHOT)

Is there a way to get an additional description of error 422 after a REST call? Preferably via an extension of the header data.

Many thanks and greetings, Oliver

Oliver
  • Run the Tika Server with the `--includeStack` option and check the first line of the 422 response body? – Gagravarr May 11 '18 at 13:25
  • @Gagravarr: Thank you very much. That helped a lot! I wonder why there is no hint about it in the Apache Tika documentation. Once I knew the parameter, I found it on GitHub as well. Bye! – Oliver May 14 '18 at 14:38
  • The `--help` option will tell you about it! I'm wondering if, since password protected is a *well known and expected* exception, it might be good to give a different response code. Probably one to post on the Tika Dev list and see what everyone thinks - https://lists.apache.org/list.html?dev@tika.apache.org – Gagravarr May 14 '18 at 14:55
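
For anyone landing here later, this is a rough client-side sketch of what the first comment suggests: send the file to the server and, on a 422, look at the first line of the response body. It assumes the server was started with `--includeStack` and listens on `http://localhost:9998` with the `/meta` endpoint; adjust both to your setup. The helper names (`first_line`, `classify_422`, `put_for_metadata`) are made up for this example, and the substring checks for "encrypted" and "password" are only a guess at what the stack trace line contains, so check them against what your server actually returns.

```python
import urllib.error
import urllib.request

# Assumed host, port and endpoint; change to match your Tika server.
TIKA_META_URL = "http://localhost:9998/meta"


def first_line(body: str) -> str:
    """Return the first non-empty line of a response body, or ''."""
    for line in body.splitlines():
        if line.strip():
            return line.strip()
    return ""


def classify_422(body: str) -> str:
    """Guess the cause of a 422 from the first line of the body.

    The substring checks are assumptions about the exception text,
    not something the server guarantees.
    """
    head = first_line(body)
    if not head:
        return "unknown (empty body; was --includeStack set?)"
    lowered = head.lower()
    if "encrypted" in lowered or "password" in lowered:
        return "encrypted"
    return "other: " + head


def put_for_metadata(path: str, url: str = TIKA_META_URL) -> str:
    """PUT a file to the server and return the metadata as JSON text."""
    with open(path, "rb") as fh:
        req = urllib.request.Request(
            url,
            data=fh.read(),
            method="PUT",
            headers={"Accept": "application/json"},
        )
    try:
        with urllib.request.urlopen(req) as resp:
            return resp.read().decode("utf-8", "replace")
    except urllib.error.HTTPError as err:
        if err.code != 422:
            raise
        # With --includeStack the body should carry the stack trace.
        body = err.read().decode("utf-8", "replace")
        raise RuntimeError("Tika returned 422: " + classify_422(body)) from err
```

Calling `put_for_metadata("some.pdf")` either returns the metadata or raises a `RuntimeError` whose message carries the guessed cause. Without `--includeStack` the body is empty (see `Content-Length: 0` in the question), so the result is "unknown".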
