I would like to index files in Solr. I have already written an "output script" in PHP, but my project leader has now given me the task of also displaying the page number on which the found text occurs.
So:
- I search for the word "Foo".
- Solr returns the results and also the highlighted text.
- Now I would like to know which page this highlighted text is on, so that it can be found in the file.
The files are *.pdf files.
One solution I have thought of is to import the text of each PDF page into a separate field, or maybe into a single multivalued field named "content", with one value per page.
Maybe like this:
JSON:
content:
1: "page one text",
2: "page two text"
and so on?
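
To make the idea more concrete, here is a minimal sketch in Python (not my actual PHP script) of the structure I have in mind. It assumes the text of each page has already been extracted from the PDF. The helper names `build_solr_doc` and `pages_containing` and the `id` field are hypothetical, and the page lookup is done on the client side only to illustrate the mapping between list position and page number. It does not show how Solr itself would report the page.

```python
import json


def build_solr_doc(doc_id, page_texts):
    """Build one document whose multivalued "content" field
    holds one value per PDF page, in page order."""
    return {
        "id": doc_id,  # hypothetical unique key field
        # One entry per page; list position 0 corresponds to page 1.
        "content": list(page_texts),
    }


def pages_containing(doc, needle):
    """Return the 1-based page numbers whose text contains `needle`
    (case-insensitive substring match, for illustration only)."""
    needle = needle.lower()
    return [
        page_no
        for page_no, text in enumerate(doc["content"], start=1)
        if needle in text.lower()
    ]


doc = build_solr_doc("example.pdf", ["page one text", "page two text with Foo"])

# JSON body in the shape of a list of documents.
print(json.dumps([doc], indent=2))

# Pages on which the search word occurs.
print(pages_containing(doc, "Foo"))
```

With this layout the page number is simply the position of the value in "content" plus one, which is why I am wondering whether Solr can tell me which value of the multivalued field a highlight came from.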
Is this possible? Or is there a better way to find this information out? Thanks for your help! :-)