I am using Solr 3.6 with Tika 1.2, but I can't upload PDF files.
First I installed Solr and uploaded some *.xml files from the exampledocs directory.
I could search these files with this URL: http://localhost:8983/solr/select/?q=solr
In the next step I installed Tika to upload PDF and DOC files, but it doesn't work.
The following content is in the "example/solr/conf/solrconfig.xml" file:
<requestHandler name="/update/extract" startup="lazy" class="solr.extraction.ExtractingRequestHandler">
  <lst name="defaults">
    <str name="fmap.content">text</str>
    <str name="lowernames">true</str>
    <str name="uprefix">ignored_</str>
    <str name="tika.config">tika-data-config.xml</str>
    <str name="captureAttr">true</str>
    <str name="fmap.a">links</str>
    <str name="fmap.div">ignored_</str>
  </lst>
</requestHandler>
And in the file "example/solr/conf/tika-data-config.xml" I have this content:
<dataConfig>
  <dataSource name="bin" type="BinFileDataSource" />
  <document>
    <entity name="f" dataSource="null" rootEntity="false" processor="FileListEntityProcessor" transformer="TemplateTransformer" baseDir="/home/ubuntu-user/Documents" fileName=".*\.(DOC)|(PDF)|(pdf)|(doc)|(docx)|(ppt)" onError="skip" recursive="true">
      <field column="fileAbsolutePath" name="path" />
      <field column="fileSize" name="size" />
      <field column="fileLastModified" name="lastmodified" />
      <entity name="tika-test" dataSource="bin" processor="TikaEntityProcessor" url="${f.fileAbsolutePath}" format="text" onError="skip">
        <field column="Author" name="author" meta="true"/>
        <field column="title" name="title" meta="true"/>
      </entity>
    </entity>
  </document>
</dataConfig>
If I run this command in the console:
curl "http://localhost:8983/solr/update/extract?literal.id=doc2&uprefix=attr_&fmap.content=attr_content&commit=true" -F "myfile=@test.pdf"
I get this output:
<?xml version="1.0" encoding="UTF-8"?>
<response>
<lst name="responseHeader">
<int name="status">0</int>
<int name="QTime">183</int>
</lst>
</response>
But I can't search the content with Solr. If I browse to http://localhost:8983/solr/browse, I see a new entry but no content.
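For reference, the upload request above and a follow-up lookup of the same document can be sketched as below. This is only an illustration that builds the two URLs and sends nothing to Solr. `build_url` is a hypothetical helper, and the `q=id:doc2` lookup with `fl=*` is an assumption about how one might inspect which fields were stored for the new entry.

```python
from urllib.parse import urlencode

SOLR_BASE = "http://localhost:8983/solr"  # base URL taken from the post


def build_url(handler, params):
    """Hypothetical helper: join a Solr handler path with URL-encoded parameters."""
    return f"{SOLR_BASE}/{handler}?{urlencode(params)}"


# Same parameters as the curl command in the post.
extract_url = build_url("update/extract", {
    "literal.id": "doc2",
    "uprefix": "attr_",
    "fmap.content": "attr_content",
    "commit": "true",
})

# Assumed follow-up query: fetch the document by id and request all stored fields.
lookup_url = build_url("select", {"q": "id:doc2", "fl": "*"})

print(extract_url)
print(lookup_url)
```

Building the query string with `urlencode` also avoids the shell quoting problems that `&` causes when the URL is typed unquoted on the command line.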
I also started the Solr and Tika servers:
java -jar start.jar
java -jar tika-server-1.2.jar
Can anyone help me?