I'm moving from one server to another, and I've installed a fresh copy of Solr 6.6.0. I have it all working, apart from the synonyms. This is an example of what I have in my synonyms.txt file:

cartoon, comic, cartoons, funny, drawing, sketch, draw, drawings, draw

I restarted Solr and then tested with:

((keywords:"cartoon") OR (description:"cartoon"))

However, it gives no results. If I search for:

((keywords:"cartoons") OR (description:"cartoons"))

...then I get results. Do I need to do something else to enable the synonyms?

Here is the schema contents: https://pastebin.com/eV3emAjv

Here is my synonyms.txt file: https://pastebin.com/TjYxEfbi

Interestingly, it DOES seem to work on a much smaller scale. If I just put this in the file:

cartoon, comic, cartoons, funny, drawing, sketch, draw, drawings, draw

...and restart Solr, then voila, it works (31,000 results). However, as soon as I put the rest of the contents back in, I get nothing. There must be something in my synonyms.txt file that stops it from being parsed correctly (or something like that). Is there no way to debug that file? I have over 1000 rules that would otherwise need checking one by one - not something I'm too keen on!

UPDATE: I have tracked it down to one line. If I comment this out, it works fine (took a lot of removing, reloading, testing, etc etc):

clipart, clip-art, image, art, graphics, clip, images, picture, pictures, vemultimedia, cartoon, royalty+free, royalty-free

Any ideas why it wouldn't like that one?

UPDATE 2: I have found the problem - but now I'm not too sure what the solution is. Basically, we had 2 lines that have the word "cartoon" in:

cartoon, comic, cartoons, funny, drawing, sketch, draw, drawings, draw

clipart, clip-art, image, art, graphics, clip, images, picture, pictures, vemultimedia, royalty+free, royalty-free, cartoon
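With over 1000 rules, overlaps like this are hard to spot by eye. The following is a minimal sketch, not a Solr tool: it assumes the file uses comma-separated terms, optional `=>` mappings, and `#` comment lines. The helper name `find_shared_terms` is made up for illustration. It only reports which terms occur in more than one rule; whether an overlap is actually a problem depends on the analyzer configuration.

```python
from collections import defaultdict


def find_shared_terms(lines):
    """Return {term: [rule line numbers]} for terms used in more than one rule.

    Hypothetical helper: assumes comma-separated terms, optional "=>"
    mappings, and "#" comment lines.
    """
    seen = defaultdict(list)
    for number, raw in enumerate(lines, start=1):
        line = raw.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        # Treat both sides of an "a, b => c" mapping as plain term lists.
        for side in line.split("=>"):
            for term in side.split(","):
                term = term.strip().lower()
                # Record each line only once per term, even if the term
                # is repeated within the same rule.
                if term and number not in seen[term]:
                    seen[term].append(number)
    return {term: nums for term, nums in seen.items() if len(nums) > 1}


if __name__ == "__main__":
    # Assumes synonyms.txt is in the current directory.
    with open("synonyms.txt", encoding="utf-8") as handle:
        for term, numbers in sorted(find_shared_terms(handle).items()):
            print(f"{term}: lines {numbers}")
```

Run against the two rules above, it would report "cartoon" as appearing in both.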

After a bit more debugging - I've found that it doesn't seem to like + or - in the words:

royalty+free
royalty-free

Surely this must be possible? Can we use dashes and spaces between words? :/
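To find every other rule that might be affected in the same way, a small scan can list terms containing anything other than letters, digits, underscores, or whitespace. This is only a sketch under the same file-format assumptions (comma-separated terms, optional `=>`, `#` comments); `find_suspect_terms` is a hypothetical name, and which characters really matter depends on the tokenizer and filters in the schema.

```python
import re

# Anything that is not a word character (letter, digit, underscore)
# or whitespace is flagged, e.g. "+" and "-".
SUSPECT = re.compile(r"[^\w\s]")


def find_suspect_terms(lines):
    """Return (line number, term) pairs for terms containing punctuation.

    Hypothetical helper: assumes comma-separated terms, optional "=>"
    mappings, and "#" comment lines.
    """
    hits = []
    for number, raw in enumerate(lines, start=1):
        line = raw.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        for side in line.split("=>"):
            for term in side.split(","):
                term = term.strip()
                if term and SUSPECT.search(term):
                    hits.append((number, term))
    return hits


if __name__ == "__main__":
    # Assumes synonyms.txt is in the current directory.
    with open("synonyms.txt", encoding="utf-8") as handle:
        for number, term in find_suspect_terms(handle):
            print(f"line {number}: {term}")
```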

Andrew Newby
  • Yes, please show us your schema.xml – Oyeme Aug 16 '17 at 10:45
  • @Oyeme thanks - I've updated my post with a pastebin ( https://pastebin.com/eV3emAjv ) – Andrew Newby Aug 16 '17 at 12:42
  • Both of the queries in your post look the same, but you said that one gives results and the other doesn't. – Jeeppp Aug 16 '17 at 16:08
  • @Jeyaprakash - sorry, the first one should be "cartoon" (no results), and the 2nd one is correct with "cartoons". I have updated the question :) – Andrew Newby Aug 16 '17 at 16:21
  • @Oyeme - I have updated my question with my schema and synonyms content. Any ideas? It seems to work when I just put the cartoon test line in, but then stops when I put all the rest back in :/ – Andrew Newby Aug 17 '17 at 06:02

1 Answer

Do the old server and the new server run the same Solr version? If not, you might have to reindex the data, depending on which version of Solr the old server was running.

jsp
  • Thanks. The versions are quite different (the other one was set up a good 3-4 years ago, and the new version is one of the latest and much nicer to use!). However, all the data was imported fresh. Do you have to re-index it to make it pick up the changes? If so, that's not a problem - but I just assumed it would pick that up even after the data has been indexed? – Andrew Newby Aug 16 '17 at 18:42
  • If it was reindexed in the new solr then it should work. Can you verify that the configs are correct & your collection is using the configs that have synonyms. – jsp Aug 16 '17 at 22:36
  • That's my problem, I don't know how to verify it :) Is there a tool to validate your configs / XML / synonyms.txt files? – Andrew Newby Aug 17 '17 at 05:47
  • I have updated my question with my schema and synonyms content. Any ideas? It seems to work when I just put the cartoon test line in, but then stops when I put all the rest back in :/ – Andrew Newby Aug 17 '17 at 06:02
  • I have found the problem. Basically we have 2 rules that have "cartoon" in, and the 2nd one seems to overwrite it. Is there any way around this? (I have updated my question) – Andrew Newby Aug 17 '17 at 09:10