Solr, better to store IDs for facets, or text?

Question

I'm storing a relational database as an index in solr, and wondering how to store the categories, or facets of my database?

Currently it works like this in mysql:

Category1 (ID:3) -> Category2 (ID:53) -> Product

I'm wondering would it be more efficient to store the ID, or store the actual text of the Category1 name in the Solr index?

When SOLR gives me back the facets I would have a list of IDs instead of text, so I could easily work out from my mysql store which Category related to which ID and build a navigation menu that way?

My feeling is that it is better to use IDs because these might be quicker in some way (storing less information), but I'm not sure, how do other people do it?

Check this answer http://stackoverflow.com/a/43328483/2320144 — Ashraful Islam, Apr 11 '17 at 09:43
But my point is that if solr is just acting as a giant cache for mysql, why would I bother caching the actual text in SOLR? Surely it would be better to cache the ID's in SOLR and then retrieve the text (Category1, from MYSQL) the ID relates to when the page is loaded? I mean Solr has to count all the occurances, so surely it will be quicker with numbers than text? SOLR seems to be useful here only for retrieving the list of product data, not for storing the text contained in all the relational tables. — Ke., Apr 11 '17 at 09:58
If you want to search category you need to store it, otherwise you can store only the category id, faceting with int will surely be faster than string — Ashraful Islam, Apr 11 '17 at 10:11
Ahh I see, so it's a trade-off? I guess I could use the ID for navigation, and then add another text field which contains the string (which is only used for full text search)? I guess will be slower to add both on input, but quicker for user requests to the interface? — Ke., Apr 11 '17 at 10:17
Is there any reason you want to store Ids there ? Are you using any kind of internationalization ? Which faceting algorithm are you using ? I am not that sure that applying a mapping after Solr returned the faceting results, will be faster than just rely on internal solr string faceting. — Alessandro Benedetti, Apr 11 '17 at 12:20
Internally in Solr, faceting on `int`and `String` are handled differently, so conceivable there could exist a setup where it would be fastest to facet on `int`, then look up the `String` values in MySQL. But I doubt it. The `String`s are de-duplicated when indexed in Solr, so it will take about the same space as an `int`-solution and it will probably be way faster overall to let Solr return the right `String`s directly, instead of using the convoluted secondary lookup. — Toke Eskildsen, Apr 11 '17 at 13:29

Solr, better to store IDs for facets, or text?

0 Answers0