Tika returning empty string when deploying on wildfly

Question

I am using tika-parsers as part of a web application:

<dependency>
    <groupId>org.apache.tika</groupId>
    <artifactId>tika-parsers</artifactId>
    <version>1.11</version>
</dependency>

I had problems deploying it on WildFly (8.2.1 and 10.0.0.RC4). This was resolved by adding a jboss-all.xml containing:

<jboss xmlns="urn:jboss:1.0">
    <weld xmlns="urn:jboss:weld:1.0" require-bean-descriptor="true"/>
</jboss>

But now Tika returns empty strings for PDF and MS Office files, for example. I assume it is falling back to the EmptyParser. Plain text files seem to work.

This is my simple test code, which works correctly when run as a JUnit test:

AutoDetectParser parser = new AutoDetectParser();
BodyContentHandler handler = new BodyContentHandler(9000000);
Metadata metadata = new Metadata();
parser.parse(entry.getValue(), handler, metadata);
String s = handler.toString();

Did you try following the [Apache Tika Troubleshooting guide for "No Content"](http://wiki.apache.org/tika/Troubleshooting%20Tika#No_Content_Extracted)? How far did you get through that? — Gagravarr, Dec 23 '15 at 13:53
It is showing the correct version ("Apache Tika 1.11") and detects the MIME type of my files correctly, but still uses the org.apache.tika.parser.EmptyParser for e.g. PDF and DOC files. — Philipp, Dec 23 '15 at 15:13
What about the parser checks - did they show you as having all the parsers you'd expect as available? — Gagravarr, Dec 23 '15 at 20:01
Multiple parsers show up, including those I expected, but the EmptyParser is still chosen when testing. The only strange thing I found is that every parser is listed twice while iterating... — Philipp, Jan 11 '16 at 19:00
I got it. It seems the way I iterated through my streams did not work correctly. Thanks for your help though! — Philipp, Jan 14 '16 at 08:33
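
For later readers: the thread does not say what exactly was wrong with the stream iteration. One common way to end up with empty output is to pass on an InputStream that has already been read to the end. The sketch below only illustrates that general pitfall with plain JDK classes. The class name StreamReuseDemo, the consume helper (a stand-in for parser.parse), and the map of stream suppliers are all hypothetical and not taken from the question.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.function.Supplier;

public class StreamReuseDemo {

    // Hypothetical stand-in for parser.parse(...): drains the stream and returns its text.
    public static String consume(InputStream in) {
        try {
            return new String(in.readAllBytes(), StandardCharsets.UTF_8); // Java 9+
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        byte[] data = "hello".getBytes(StandardCharsets.UTF_8);

        // Pitfall: the same stream instance is read twice.
        InputStream shared = new ByteArrayInputStream(data);
        String first = consume(shared);  // full content
        String second = consume(shared); // empty, the stream is already at end of input
        System.out.println("first=[" + first + "] second=[" + second + "]");

        // One way around it (an assumption, not the asker's fix):
        // keep a supplier per entry and open a fresh stream for every read.
        Map<String, Supplier<InputStream>> entries = new LinkedHashMap<>();
        entries.put("a.txt", () -> new ByteArrayInputStream(data));
        for (Map.Entry<String, Supplier<InputStream>> entry : entries.entrySet()) {
            try (InputStream in = entry.getValue().get()) {
                System.out.println(entry.getKey() + " -> " + consume(in));
            } catch (IOException e) {
                throw new UncheckedIOException(e);
            }
        }
    }
}
```

If something else in the application (logging, type detection, a previous loop pass) has already consumed the stream handed to parser.parse, the handler would see no content, which would look the same as an empty parse result.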
