I am indexing posts in Solr with "name", "title", and "description" fields. Later, I'd like to be able to attach a file (such as a Word doc or a PDF) to an existing post using Tika / the ExtractingRequestHandler.

I know I can add documents like so (or through other interfaces):

curl 'http://localhost:8983/solr/update/extract?literal.id=post1&commit=true' -F "myfile=@tutorial.html"

But this replaces the existing post (post1 above) instead of adding to it. Is there a parameter I can pass so that it only adds to the record?

javanna
Matt Hampel

1 Answer

In Solr versions before 4.0 you can't modify individual fields of an existing document; you can only delete documents or add/replace them whole. So to "append" a file to a Solr document you have to rebuild the document from its current values, passing each one as a `literal.<field>` parameter. That is, query for the document first and then send:

http://localhost:8983/solr/update/extract?literal.id=post1&literal.name=myName&literal.title=myTitle&literal.description=myDescription&commit=true
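With more than a few fields, assembling that URL by hand gets error-prone, so here is a minimal Python sketch of the rebuild step. It assumes you have already queried Solr and hold the document's current values in a dict; `build_extract_url` and its parameters are hypothetical names for illustration, not part of Solr or any client library.

```python
from urllib.parse import urlencode


def build_extract_url(base_url, doc_id, stored_fields, commit=True):
    """Build an /update/extract URL that re-sends existing values as literal.* params.

    stored_fields is assumed to be a dict of field name -> value (or list of
    values) taken from a prior query for the same document.
    """
    params = [("literal.id", doc_id)]
    for name, value in stored_fields.items():
        if name == "id":
            continue  # the id is already passed as literal.id
        values = value if isinstance(value, (list, tuple)) else [value]
        for v in values:
            # Repeating a parameter sends several values for the same field.
            params.append((f"literal.{name}", v))
    if commit:
        params.append(("commit", "true"))
    # urlencode percent-encodes values, so spaces and '&' in field text are safe.
    return f"{base_url}/update/extract?{urlencode(params)}"


url = build_extract_url(
    "http://localhost:8983/solr",
    "post1",
    {"name": "myName", "title": "myTitle", "description": "my description"},
)
print(url)
```

The resulting URL can be used in place of the one in the question's curl command, keeping the `-F "myfile=@..."` part as it is. Note that only values Solr can return from a query (stored fields) can be carried over this way.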
Brad
Mauricio Scheffer
  • This curl request may be too long (there may be many field values that I want to append along with the file contents). Is there a way to get the contents of the file, add them to the Solr document, and then commit the whole document? – xan Jul 12 '13 at 07:37
  • @ptokya that's a question about `curl` rather than Solr. You should create a new, specific question about that. – Mauricio Scheffer Jul 12 '13 at 14:11
  • @MauricioScheffer: Here's my full specific question: http://stackoverflow.com/questions/17609690/index-pdf-file-content-using-apache-solr – xan Jul 12 '13 at 16:29