I have a local SolrCloud cluster running on three separate nodes: 33.33.33.3[3-5]:8080. The cluster is managed by a local three-node ZooKeeper ensemble that lives at 33.33.33.3[0-2]:2181.

I am trying to experiment with schema modifications. However, I'm having trouble getting SOLR to pick up the new changes. Here is what I'm doing:

First I upload one config set to zookeeper:

/opt/src/solr/scripts/cloud-scripts/zkcli.sh -zkhost 33.33.33.30:2181,33.33.33.31:2181,33.33.33.32:2181 -cmd upconfig -confdir /opt/src/solr/solr/conf/ -confname test_conf

Then I create a collection in SOLR:

http://33.33.33.33:8080/solr/admin/collections?action=CREATE&name=test_collection&numShards=1&replicationFactor=3

This all works fine. Since there is only one config in zookeeper, this is automatically mapped to the collection on creation. Pretty cool.

But now I want to modify the schema for test_collection. So I ssh into one of my SOLR boxes, browse to /opt/src/solr/solr/conf/, open schema.xml in vim, and remove a field. Then I upload the config again (using the same name so it overwrites the old config):

/opt/src/solr/scripts/cloud-scripts/zkcli.sh -zkhost 33.33.33.30:2181,33.33.33.31:2181,33.33.33.32:2181 -cmd upconfig -confdir /opt/src/solr/solr/conf/ -confname test_conf

Now I reload the collection:

http://33.33.33.33:8080/solr/admin/collections?action=RELOAD&name=test_collection

And zookeeper picks up the changes. I can download the file from zookeeper and the changes are there. I can browse the config in SOLR admin (cloud>tree>configs>schema.xml AND test_collection>files>schema.xml) and the changes are reflected there too. However, if I hit this route: http://33.33.33.33:8080/solr/test_collection/schema/fields the field is still there. The field is also still listed under test_collection>schema browser in the SOLR admin.

What's going on here?

EDIT:

If I look at the logs in SOLR admin I see the following which must be related...

2/23/2015, 3:06:46 PM  WARN  OverseerCollectionProcessor  OverseerCollectionProcessor.processMessage : reloadcollection , {
2/23/2015, 3:06:46 PM  WARN  ManagedIndexSchemaFactory  The schema has been upgraded to managed, but the non-managed schema schema.xml is still loadable. PLEASE REMOVE THIS FILE.
2/23/2015, 3:06:46 PM  WARN  RequestHandlers  Multiple requestHandler registered to the same name: /update/json ignoring: org.apache.solr.handler.UpdateRequestHandler
2/23/2015, 3:06:46 PM  WARN  RequestHandlers  Multiple requestHandler registered to the same name: /update ignoring: org.apache.solr.handler.UpdateRequestHandler
2/23/2015, 3:06:46 PM  WARN  RequestHandlers  Multiple requestHandler registered to the same name: /replication ignoring: org.apache.solr.handler.ReplicationHandler
tknickman

3 Answers

3

I eventually figured this out after spending a lot of time with SOLR over the past few months.

Let's break down the problem that I was seeing.

I was uploading a config to zookeeper, creating a collection in solr, and linking the two together. Then I would change the schema - upload it again, reload the solr core - and nothing would happen!

This was, at its core - user error and a misunderstanding of one main feature.

I was using a managed schema within SOLR, which is what lets you take advantage of the Schema API in the newer versions of SOLR. For anyone who is interested: when you use a managed schema, SOLR makes its own copy of your schema (the managed-schema file), and from then on it reads and edits THAT copy, not your original schema.xml. So the schema.xml I kept re-uploading was being ignored, and http://33.33.33.33:8080/solr/test_collection/schema/fields kept reporting the fields from the managed copy.
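With a managed schema, field changes go through the Schema API rather than through edits to schema.xml. Here is a minimal sketch of removing a field that way. It assumes Solr 5.x or later (as far as I know the `delete-field` command does not exist in the 4.x Schema API), and the field name `my_old_field` is purely hypothetical.

```shell
# Sketch only: assumes Solr 5.x+ with a managed (mutable) schema.
# SOLR_URL defaults to the node used in the question.
SOLR_URL="${SOLR_URL:-http://33.33.33.33:8080/solr}"

# Build the JSON body for a Schema API delete-field command.
delete_field_payload() {
  printf '{"delete-field":{"name":"%s"}}' "$1"
}

# POST the command to the collection's /schema endpoint.
# $1 = collection name, $2 = field name
delete_field() {
  curl -sS -X POST -H 'Content-type: application/json' \
    --data-binary "$(delete_field_payload "$2")" \
    "$SOLR_URL/$1/schema"
}

# Usage (hypothetical field name):
#   delete_field test_collection my_old_field
```

After a call like this, the /schema/fields route should be the place to confirm the field is gone, since it reports the schema Solr actually loaded.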

If you want to see whether your most recent changes are taking effect, take a look at the managed-schema file within your config folder in zookeeper.
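One way to do that check from the command line is to download the config set and look for the field in both files. This is a rough sketch: it assumes the default file name `managed-schema`, reuses the zkcli.sh path and ZooKeeper hosts from the question, and the helper names are made up for illustration.

```shell
# Sketch only: paths and hosts are taken from the question.
ZKCLI="${ZKCLI:-/opt/src/solr/scripts/cloud-scripts/zkcli.sh}"
ZKHOST="${ZKHOST:-33.33.33.30:2181,33.33.33.31:2181,33.33.33.32:2181}"

# Print "present" if file $1 contains a <field ... name="$2"> definition,
# otherwise "absent" (also when the file does not exist).
# The whitespace after "<field" keeps <fieldType> entries from matching.
field_in_file() {
  if grep -q "<field[[:space:]][^>]*name=\"$2\"" "$1" 2>/dev/null; then
    echo present
  else
    echo absent
  fi
}

# Download config set $1 from ZooKeeper and report where field $2 is defined.
check_field() {
  dir="$(mktemp -d)"
  "$ZKCLI" -zkhost "$ZKHOST" -cmd downconfig -confdir "$dir" -confname "$1"
  echo "schema.xml:     $(field_in_file "$dir/schema.xml" "$2")"
  echo "managed-schema: $(field_in_file "$dir/managed-schema" "$2")"
}

# Usage (hypothetical field name):
#   check_field test_conf my_old_field
```

If the field is absent from schema.xml but present in managed-schema, that matches the behavior described in this answer.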

Thanks everyone for your help.

tknickman
  • Can you please indicate specifically what it means to 'take advantage of the schema API ' to "reload" the schema? Thanks! – kellyfj Nov 03 '15 at 14:22
  • Actually I figured it out myself from here https://cwiki.apache.org/confluence/display/solr/Schema+API#SchemaAPI-APIEntryPoints Just basically need to use "addFields" REST API endpoint – kellyfj Nov 03 '15 at 15:31
2

I think you are missing the linkconfig step, which links the config set to the collection.

So at the beginning, after upconfig and before creating the collection, you need to run linkconfig as follows:

/opt/src/solr/scripts/cloud-scripts/zkcli.sh -zkhost 33.33.33.30:2181,33.33.33.31:2181,33.33.33.32:2181 -cmd linkconfig -collection test_collection -confname test_conf

After that, you don't have to run linkconfig again to update the config. It is enough to run upconfig and then reload the collection, as you already do. The only step missing is the linkconfig at the beginning, before creating the collection.
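Put together, the order I am describing looks roughly like the sketch below. It reuses the paths, hosts, and names from the question. The helper functions are just for illustration, not part of Solr.

```shell
# Sketch only: values are taken from the question.
ZKCLI="${ZKCLI:-/opt/src/solr/scripts/cloud-scripts/zkcli.sh}"
ZKHOST="${ZKHOST:-33.33.33.30:2181,33.33.33.31:2181,33.33.33.32:2181}"
CONFDIR="${CONFDIR:-/opt/src/solr/solr/conf/}"
SOLR_URL="${SOLR_URL:-http://33.33.33.33:8080/solr}"

# Run zkcli.sh against the ZooKeeper ensemble.
zk() { "$ZKCLI" -zkhost "$ZKHOST" "$@"; }

# Build a Collections API URL: $1 = action, remaining args = key=value pairs.
collections_url() {
  url="$SOLR_URL/admin/collections?action=$1"
  shift
  for param in "$@"; do
    url="$url&$param"
  done
  echo "$url"
}

# One-time setup: upload the config, link it, then create the collection.
create_linked_collection() {
  zk -cmd upconfig -confdir "$CONFDIR" -confname test_conf
  zk -cmd linkconfig -collection test_collection -confname test_conf
  curl -sS "$(collections_url CREATE name=test_collection numShards=1 replicationFactor=3)"
}

# Every later change: upload again, then reload. No linkconfig needed.
push_config_change() {
  zk -cmd upconfig -confdir "$CONFDIR" -confname test_conf
  curl -sS "$(collections_url RELOAD name=test_collection)"
}
```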

For a complete reference of collection API, you can look here: https://cwiki.apache.org/confluence/display/solr/Collections+API

Emad
  • And for command utility reference, check here: https://cwiki.apache.org/confluence/display/solr/Command+Line+Utilities – Emad Feb 25 '15 at 06:35
  • The config gets linked fine since I only upload one. If you look in the documentation it defaults to existing config IF there is only one. If there is more than one it defaults to the name of the collection. That being said, I have still tried that multiple times, both after uploading and on creation of the collection. It should be noted that the command you posted is for linking an EXISTING collection to the uploaded config – tknickman Feb 25 '15 at 06:43
  • It doesn't link existing collection, if you look at the quick guide here: http://wiki.apache.org/solr/SolrCloudTomcat it says in step 8 before creating the collection that you should linkconfigs. And I always do linkconfig before creating the collection, and updates works fine for me. I still believe that this is the missing step – Emad Feb 25 '15 at 22:55
  • I will give that another try. But here: https://wiki.apache.org/solr/SolrCloud it says that "if only one 'conf set' exists, a collection will auto link to it." – tknickman Feb 26 '15 at 18:58
0

You probably have data in your SOLR 'test_collection' that uses the field you deleted.

Try clearing test_collection.

user3245803
  • Unfortunately no, I can create a brand new collection and index no data at all but the problem still occurs. – tknickman Feb 25 '15 at 17:51