I am new to Solr and I am trying to understand its behavior during a re-index. I have a batch process running that selects data from a relational table and adds it to a Solr index.

From what I understand from reading about Solr, there are two cases in which you need to re-index:

Case 1: When new rows get inserted into your table (source data).

Case 2: When any column type changes and you have to change the schema accordingly.

Does the old data remain available in Case 1 for users to search against while the re-index is happening?

What happens during a schema change, given that the old data will no longer match the new schema? What kind of behavior will users experience when they perform a search?

I could not find any clear answers to these questions online. Any clarification is appreciated.

vish

1 Answer

Case 1. Solr marks the old document as deleted, but it stays in the index, and Solr adds a new document with the same document id. So yes, the old data remains available until the new document is committed.
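
The commit behavior described for Case 1 can be sketched with a toy model. This is only an illustration in plain Python, not Solr's implementation; `ToyIndex` and its methods are hypothetical names invented for the sketch.

```python
class ToyIndex:
    """Toy model of commit visibility. Hypothetical; not a Solr API."""

    def __init__(self):
        self.committed = {}  # documents that searches can see
        self.pending = {}    # updates received since the last commit

    def add(self, doc_id, doc):
        # Re-adding an existing id queues a replacement; the old
        # version stays searchable until commit() is called.
        self.pending[doc_id] = doc

    def commit(self):
        # Publish pending updates, replacing old versions by id.
        self.committed.update(self.pending)
        self.pending.clear()

    def get(self, doc_id):
        # Searches only ever look at the committed view.
        return self.committed.get(doc_id)


index = ToyIndex()
index.add("1", {"title": "old title"})
index.commit()

index.add("1", {"title": "new title"})  # re-index of the same id
before_commit = index.get("1")          # still the old document
index.commit()
after_commit = index.get("1")           # now the new document
```

In this sketch a search issued between the second `add` and the second `commit` still returns the old document, which is the point made above.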

Case 2. If you update the schema, the documents from the old data will still be available, but any deleted fields will not be visible and any new fields will be missing. If you think about it, an indexed field is just a series of tokens, so these fields will still be searchable. However, there may be an inconsistency between the new query analysis and the tokens already in the index, which can give surprising results and may also affect scoring. Basically, your results could be inconsistent.

To give an example: say you apply a phonetic filter to the word Fox and it produces the tokens fux | foks in your index.

You then remove the phonetic filter and search for fox: nothing in your index will match.

Say you have another field with a Porter stemmer: the term indexed gets stemmed to index.

You remove the Porter stemmer: a query for index will still match, but a query for indexed won't.
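
The stemmer mismatch can be reproduced with a small stand-in. This is a rough sketch in plain Python: `toy_stem` just strips a trailing "ed" and is not the Porter algorithm, and `analyze` and `matches` are hypothetical helpers, not Solr calls.

```python
def toy_stem(token):
    # Crude stand-in for a stemmer: strips a trailing "ed".
    # This is NOT the Porter algorithm.
    return token[:-2] if token.endswith("ed") else token


def analyze(text, stem):
    # Lowercase, split on whitespace, optionally stem each token.
    tokens = text.lower().split()
    return [toy_stem(t) for t in tokens] if stem else tokens


# The index was built while the stemmer was still in the analysis chain,
# so the stored term is "index", not "indexed".
indexed_terms = set(analyze("indexed", stem=True))


def matches(query, stem):
    # A query matches if any of its analyzed tokens is in the index.
    return any(t in indexed_terms for t in analyze(query, stem))


with_stemmer = matches("indexed", stem=True)       # True
without_stemmer = matches("indexed", stem=False)   # False
plain_index = matches("index", stem=False)         # True
```

The stored tokens never changed; only the query-side analysis did, which is why results become inconsistent until the data is re-indexed.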

David George
  • For Case 2, if that particular field has its type changed due to a schema change and that field is set up for display in the search results, will the results return null for that field or not return the field at all? – vish Mar 20 '17 at 17:29
  • It depends on the field types. Changing the fundamental type from, say, a string to an integer will give you an ERROR:SCHEMA-INDEX-MISMATCH. Changing from a string analyzed one way to a string analyzed a different way will still pull the string value back, because the basic type is the same. – David George Mar 22 '17 at 09:13