I am new to Solr. I have created a new core and copied the default schema.xml to the conf/ folder. The only change I made is trivial:

<field name="id" type="string" indexed="true" stored="false" required="true" multiValued="false" /> 

As you can see, I set the id field to stored=false. As I understand it, the id field should no longer be displayed when I run a search query, but that is not what happens. I have tried restarting the Solr instance and ran the following command to index the file again.

curl 'http://localhost:8983/solr/TwitterCore/update/json?commit=true' \
  --data-binary @$(echo TwitterData_Core_Conf/TwitterText_en_demo.json) \
  -H 'Content-type:application

According to the Solr Wiki, this should have re-indexed my file. However, when I run my query again, I still see the id.

An example of a returned document (this is not the complete JSON node; I copied only some parts):

"text": [
      "RT @FollowTrainTV: Moonseternity just joined #FollowTrainTV - Watch them stream on http://t.co/oMcOGA51kT"
    ],
    "lang": [
      "en"
    ],
    "id": "0a8edfea-68f7-4b05-b370-27b5aba640b7", // I dont want to see this
    "_version_": 1512067627994841000

Maybe someone can give me detailed steps on re-indexing.

YoungHobbit
CyprUS

2 Answers

When you change the schema.xml file and restart the Solr server, the changes apply only to documents indexed afterwards. This means you have to clear the index and re-index all documents. (The exception is query-time analysis, such as the query tokenizer: those changes take effect immediately after a server restart, but that is not the case here.) After re-indexing, the id field should no longer be visible.
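The clear-and-re-index cycle described above could be sketched as the script below. It is only a sketch: the core name and the JSON file path are taken from the question, while `SOLR_URL`, `CORE`, `DATA_FILE`, `DRY_RUN` and the function names are illustrative assumptions, not Solr settings. The endpoint paths assume a Solr setup like the one in the question.

```shell
#!/bin/sh
# Sketch: wipe a core and index its data again after a schema.xml change.
# SOLR_URL, CORE, DATA_FILE and DRY_RUN are illustrative names (assumptions).
SOLR_URL="${SOLR_URL:-http://localhost:8983/solr}"
CORE="${CORE:-TwitterCore}"
DATA_FILE="${DATA_FILE:-TwitterData_Core_Conf/TwitterText_en_demo.json}"

# Build the update URL for the core. $1 is an optional path suffix such as /json.
# commit=true asks Solr to commit as part of the request.
update_url() {
  printf '%s/%s/update%s?commit=true' "$SOLR_URL" "$CORE" "$1"
}

# Run a command, or only print it when DRY_RUN=1 (printed form is not re-quoted).
run() {
  if [ "${DRY_RUN:-0}" = "1" ]; then
    printf '%s\n' "$*"
  else
    "$@"
  fi
}

# Step 1: delete every document in the core with a match-all delete query.
clear_index() {
  run curl "$(update_url '')" -H 'Content-Type: text/xml' \
    --data-binary '<delete><query>*:*</query></delete>'
}

# Step 2: post the JSON file again so it is indexed under the new schema.
reindex() {
  run curl "$(update_url /json)" -H 'Content-Type: application/json' \
    --data-binary "@$DATA_FILE"
}
```

After restarting Solr with the edited schema.xml, calling `clear_index && reindex` would perform both steps; setting `DRY_RUN=1` first only prints the curl commands so they can be checked before anything is deleted.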

Another remark: you don't have to test your queries with curl. If you open http://localhost:8983/solr in your web browser, you should find an admin interface where you can select a core and test your queries.

phylib
  • I tried to do this `curl http://:/solr/update?stream.body=%3Cdelete%3E%3Cquery%3E*:*%3C/query%3E%3C/delete%3E` but I keep getting a 404 error. The exact error is `Problem accessing /solr/update`. I don't know what is wrong now. – CyprUS Sep 12 '15 at 15:22
  • The easiest thing would be to delete the `data/` folder from the index on the `solr-server`. After deleting this folder and restarting the server you can reindex your documents. Then you shouldn't see the ID any more. – phylib Sep 12 '15 at 20:20
  • @CyprUS You can use this command to clear the Solr index: `curl 'http://localhost:8983/solr/core_name/update?commit=true' -H 'Content-Type:text/xml' --data-binary '<delete><query>*:*</query></delete>'` – YoungHobbit Sep 13 '15 at 08:18
  • My mistake was not adding the core name. Thanks @abhishekbafna – CyprUS Sep 13 '15 at 21:02

Refer to the docValues documentation: https://lucene.apache.org/solr/guide/6_6/docvalues.html

Non-stored docValues fields will be also returned along with other stored fields when all fields are specified to be returned (e.g. “fl=*”) for search queries depending on the effective value of the useDocValuesAsStored parameter for each field. For schema versions >= 1.6, the implicit default is useDocValuesAsStored="true".

The string field type has docValues="true", which is why the id field still appears in the search response.

You can either add the useDocValuesAsStored="false" parameter to the field or you can use a different fieldType, say text_general.
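One way to check whether either change took effect (for example after adding `useDocValuesAsStored="false"` to the `id` field, restarting and re-indexing) is to look for an `id` key in a query response. A rough sketch, assuming the JSON response shape shown in the question; the function name `has_id_field` and the grep-based matching are illustrative, and a real JSON parser would be more robust.

```shell
# Exit status 0 when the JSON read from stdin contains an "id" key, 1 otherwise.
# This is a plain text match, so it would also match "id" keys nested anywhere.
has_id_field() {
  grep -q '"id"[[:space:]]*:'
}

# Possible use against a live core (core name taken from the question):
#   curl -s 'http://localhost:8983/solr/TwitterCore/select?q=*:*&fl=*&wt=json' \
#     | has_id_field && echo 'id is still returned' || echo 'id is no longer returned'
```

Querying with `fl=*` matters here, because that is the case in which the quoted documentation says non-stored docValues fields are returned.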