
I've been working with Solr for a while. I recently tried the solr-cell component to index some PDFs, but I'm having the exact same problem presented in this thread.

When I search for *:* in the admin console, the PDFs are listed. However when I search for content within the PDF I get no results.

I already tried the command from the answer given there with no luck; I'm still having the same problem. I've tried different Solr versions (I'm using 3.5, by the way) and different PDFs, changed the fields in schema.xml, and modified the RequestHandlers in solrconfig.xml, but nothing seems to work. Any help would be appreciated.

jag
    please post your schema, the command or code you're using to index, and the query. – Mauricio Scheffer Feb 07 '12 at 01:48
    "i've changed the fields in the schema.xml" The schema Solr ships with includes the correct fields for Solr Cell. As for the `q=*:*`, can you search inside the fields returned by the output? – Jesvin Jose Feb 07 '12 at 09:55

1 Answer


I finally got it working. It turns out the problem was with the fmap.content input parameter. I hadn't declared it directly on the RequestHandler in the solrconfig.xml file; instead I was passing it in the curl command I used to index the PDF file:

curl 'http://localhost:8080/solr/solrcell/update/extract?map.content=text&map.stream_name=id&commit=true' -F "file=@mccm.pdf"

I know this way should work too, but as you can see I had written 'map' instead of 'fmap' (I was following a book example written for a previous version of Solr).
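For comparison, here is a rough sketch of the same request with the prefix corrected. The host, port, and core path are copied from my command above. It assumes the stream_name mapping needs the 'fmap' prefix as well, which I haven't verified separately. The index_pdf helper is just a hypothetical wrapper for illustration.

```shell
# Endpoint taken from the command above; adjust for your own setup.
SOLR_URL='http://localhost:8080/solr/solrcell/update/extract'

# 'fmap.' instead of 'map.': send the extracted content to the 'text' field.
# Assumption: the stream_name -> id mapping takes the same prefix.
PARAMS='fmap.content=text&fmap.stream_name=id&commit=true'

EXTRACT_URL="${SOLR_URL}?${PARAMS}"

# Hypothetical helper: posts one file to the extract handler.
index_pdf() {
  curl "$EXTRACT_URL" -F "file=@$1"
}

echo "$EXTRACT_URL"
```

With Solr running, calling `index_pdf mccm.pdf` would send the same file as before.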

I opted to leave the fmap input parameter explicitly declared in the solrconfig.xml file to save myself any further problems:

<str name="fmap.content">text</str>


Thanks for your help.

logancautrell
jag