Could someone give me step-by-step instructions for re-indexing the data in Solr after I changed the schema field types, without deleting the data?

E.g. is there a way of copying the data of a Solr core and reindexing the same data after changing the schema?

I have around 60k documents in Solr.

blah
  • https://lucene.apache.org/solr/guide/8_7/reindexing.html - _There is no process in Solr for programmatically reindexing data. When we say "reindex", we mean, literally, "index it again". However you got the data into the index the first time, you will run that process again. It is strongly recommended that Solr users index their data in a repeatable, consistent way, so that the process can be easily repeated when the need for reindexing arises._ – MatsLindh Feb 10 '21 at 13:45
  • Also see https://lucene.apache.org/solr/guide/8_7/reindexing.html#reindexing-strategies - usually you'll either index to a new collection and swap them over afterwards, or delete everything and index into the previous core. – MatsLindh Feb 10 '21 at 13:46
  • Yes, I read the documentation. The issue here is that I cannot delete what has already been indexed: the work was passed on to me by someone else, and the only copy of the data they kept is the one stored in Solr. Would exporting a core's data into a JSON or XML file and indexing that file back into an empty core work? – blah Feb 10 '21 at 13:51
  • Yes, that's what you'll have to do, as long as every field necessary has been set as `stored` in the index. Otherwise you might not have any way to get the original data back (if it's only used for searching). Export it and store it somewhere proper - either in an RDBMS (if it's going to be changed) or as JSON documents (if it's more static). You'll probably want to look at the `/export` endpoint. – MatsLindh Feb 10 '21 at 13:52
  • @MatsLindh your comments summarise the possibilities. I have been through this process (reindexing): we use an RDBMS as the main source and have been able to reindex the entire Solr core, containing millions of docs, from that source without losing the ability to query Solr. The reindexing took a while though, and required us to write a custom implementation. If you can write up a proper answer with your points, I can upvote. – vvs Feb 11 '21 at 19:38
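
The export-then-reindex route discussed in the comments could be sketched roughly as below. This is not an official Solr tool, only a minimal illustration under several assumptions: Solr is reachable at `http://localhost:8983/solr`, the uniqueKey field is called `id`, every field you need is `stored`, and the core names in the usage note are made up. Instead of the `/export` endpoint mentioned above (which needs docValues on the exported fields), the sketch pages through `/select` with `cursorMark`, writes one JSON document per line, and later posts the documents in batches to `/update` on the core that has the new schema. The `_version_` field is dropped before re-sending; any stored copyField destinations would need to be added to `drop_fields` as well.

```python
import json
import urllib.parse
import urllib.request

SOLR = "http://localhost:8983/solr"  # assumed base URL, adjust to your setup


def clean_doc(doc, drop_fields=("_version_",)):
    """Remove fields that should not be sent back when re-indexing."""
    return {k: v for k, v in doc.items() if k not in drop_fields}


def fetch_page(core, cursor, rows=1000, sort="id asc"):
    """Fetch one page of documents; the sort must include the uniqueKey."""
    params = urllib.parse.urlencode({
        "q": "*:*", "fl": "*", "rows": rows,
        "sort": sort, "cursorMark": cursor, "wt": "json",
    })
    with urllib.request.urlopen(f"{SOLR}/{core}/select?{params}") as resp:
        return json.load(resp)


def export_core(core, path, fetch=fetch_page):
    """Write every document of `core` to `path`, one JSON object per line."""
    cursor, total = "*", 0
    with open(path, "w", encoding="utf-8") as out:
        while True:
            page = fetch(core, cursor)
            for doc in page["response"]["docs"]:
                out.write(json.dumps(clean_doc(doc)) + "\n")
                total += 1
            next_cursor = page["nextCursorMark"]
            if next_cursor == cursor:  # unchanged cursor means no more results
                break
            cursor = next_cursor
    return total


def batches(path, size=500):
    """Yield lists of at most `size` documents read from the dump file."""
    batch = []
    with open(path, encoding="utf-8") as f:
        for line in f:
            batch.append(json.loads(line))
            if len(batch) == size:
                yield batch
                batch = []
    if batch:
        yield batch


def post_json(url, payload):
    """POST a JSON payload to Solr."""
    req = urllib.request.Request(
        url, data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"})
    urllib.request.urlopen(req).close()


def reindex(core, path):
    """Send the dumped documents to `core`, then commit once at the end."""
    url = f"{SOLR}/{core}/update"
    for batch in batches(path):
        post_json(url, batch)
    post_json(url, {"commit": {}})
```

With hypothetical core names, usage would look like `export_core("old_core", "dump.jsonl")` followed, once the new schema is in place, by `reindex("new_core", "dump.jsonl")`. Keeping the dump file around also gives you the repeatable source the documentation recommends, and it is worth comparing the returned document count with the roughly 60k you expect before touching the original core.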

0 Answers