
I wanted to use the 'Boost Function' (bf) feature on my search (which works just fine in other cases and on other fields) to boost documents with a higher timestamp value, i.e. to improve the relevancy score of those documents without absolutely sorting by timestamp. Unfortunately, I encountered this error while trying to do so:

"error":{
    "msg":"Type mismatch: timestamp was indexed as SORTED_NUMERIC",
    "trace":"java.lang.IllegalStateException: Type mismatch: timestamp was indexed as SORTED_NUMERIC

The 'bf' query was:

sum(div(timestamp,100000000))

(Kind of silly, but I just wanted to see if it even works.)
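
For reference, once the schema issue discussed in the comments below is fixed, the single-argument sum wrapper is unnecessary and the function can be used directly. A minimal sketch (the div form assumes timestamp holds epoch milliseconds in a numeric field; the recip form is the documented recency boost from the Solr function query docs and would apply only if timestamp were a date-typed field):

    bf=div(timestamp,100000000)
    bf=recip(ms(NOW,timestamp),3.16e-11,1,1)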

Thanks for the help!

YotamL
  • Did you change the field type after indexing any documents without deleting the index content first? (for example changing multiValued to `false`). I'd also expect the `sum` function to require at least two arguments. – MatsLindh Apr 26 '20 at 18:05
  • Sorry for the mistake with the sum function :) Anyway, yes, I added a multiValued="false" attribute to this field in my managed-schema file, since my initial error was: "msg":"can not use FieldCache on multivalued field: timestamp". I've removed that attribute and it's back to this message now. Any idea what I should do to fix this? Thanks! – YotamL Apr 30 '20 at 07:31
  • You can do what you did – but making that change requires you to delete everything in your index and then re-index your content (sketched after this thread). That way the expected type (from your schema) matches what's in the index, and the function will work. Most changes to existing types in the schema require you to reindex your content (except for the `query` and `multiterm` sections in analysis chains). – MatsLindh Apr 30 '20 at 08:07
  • OK, that makes sense. Thanks! What is the best and safest approach for re-indexing that data? – YotamL Apr 30 '20 at 08:23
  • That depends on how you got the data into the index in the first place. Solr isn't well suited as a primary data store, so reindexing should always be done from the primary source as necessary. – MatsLindh Apr 30 '20 at 08:43
  • Your answers are always helpful, thanks! Would you say that creating a backup, then deleting the entire data, then restoring that backup would do the trick? (as detailed here: https://lucene.apache.org/solr/guide/8_4/making-and-restoring-backups.html) – YotamL Apr 30 '20 at 08:50
  • No, I don't think that would change anything. That would restore the index files and configuration as they were; it would not run the original data through the analysis chain again. There are many field configurations where the original data is not available at all (i.e. where `stored="false"`), and in those cases (for text, in particular) it might not be possible to know what the original data was before processing. Tokenization and filtering are often lossy processes. – MatsLindh Apr 30 '20 at 09:02
  • The only solution that worked for me was definitely more of a workaround than anything else, but for whoever this may be useful: it's easier to simply create a completely new field (e.g. "final_timestamp"), define it in the managed schema with multiValued="false", and, for every document, give that field the value of the original "timestamp" field (sketched below). – YotamL May 13 '20 at 05:03
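
A minimal sketch of the full re-index path recommended in the comments, assuming a numeric epoch-millis field, a core named mycollection, and a plong field type (all assumptions): declare the field as single-valued in managed-schema, wipe the index, then re-feed every document from the primary data source.

    <!-- managed-schema: single-valued numeric field -->
    <field name="timestamp" type="plong" indexed="true" stored="true"
           multiValued="false" docValues="true"/>

    # wipe the index so the indexed type matches the new schema
    curl "http://localhost:8983/solr/mycollection/update?commit=true" \
         -H "Content-Type: text/xml" \
         --data-binary "<delete><query>*:*</query></delete>"

    # then re-post all documents from the primary source, e.g.:
    curl "http://localhost:8983/solr/mycollection/update?commit=true" \
         -H "Content-Type: application/json" \
         --data-binary '[{"id":"doc1","timestamp":1588233600000}]'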
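And a sketch of the new-field workaround from the last comment, with one caveat: Solr atomic updates only work reliably when all other fields are stored or have docValues, otherwise the values must be copied in from the primary source instead. The field name final_timestamp comes from the comment; the type, core name, and document values are assumptions.

    <!-- managed-schema: the new single-valued field -->
    <field name="final_timestamp" type="plong" indexed="true" stored="true"
           multiValued="false" docValues="true"/>

    # per document: copy the original timestamp value into the new field
    curl "http://localhost:8983/solr/mycollection/update?commit=true" \
         -H "Content-Type: application/json" \
         --data-binary '[{"id":"doc1","final_timestamp":{"set":1588233600000}}]'

The boost function can then reference final_timestamp instead of timestamp.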
