solr facets vs mysql relation schema

Question

I am interested to find out what is the best/fastest (most efficient) way to query solr in a solr/mysql/app set up. I have a mysql DB that has one large main table and several smaller tables in a relational schema. I am also building an app which uses the hierarchy and builds menus based upon the data in the related tables.

I started by doing this only in mysql, but quickly found (with gigs of data) that mysql can be quite slow calculating counts on this related data when using joins with the main table etc (even when using mysql indexes). At the moment the approach i am taking is to index my main table with solr and keep the smaller related tables in mysql. For each menu item i will query solr for the count at runtime, which feels like it will be slow..

Is it faster/better to:

1.) Keep the related tables in mysql, also set up facets for each row in the related tables. Link them together somehow when i query the maintable? This sounds like the fastest option, but might be tricky (will have to match 2 different arrays) in my app.

2) Keep the related tables in mysql and call/count the data in the main indexed table for each of the related items at runtime. e.g. for a menu of brands I will need to get a count of each brand, requiring me to send each menu item as a query to solr (to get the counts). I realise each query is quite quick, but there could be a few hundred or thousand brands.

3) Just put all the data into solr and use facets? - but how do I identify each facet and determine the related information in mysql tables show for each facet? Each entry in the related mysql tables has a title, description, formatted url, meta data, should the related info be stored in solr as well? in a different index? in this case, should i get rid of mysql alltogether?

Any ideas on the best (practice) option will be much appreciated, or also any suggestions that i havent thought of would be great.

Cheers Ke

score 5 · Accepted Answer · edited Oct 10 '12 at 11:54

Solr is built for searching - it is optimized for facets and the like, using it's own query language and caching mechanisms. It would seriously surprise me if you would be able to code something (MySQL based) that can beat Solr in terms of the performance of search queries/results.

Have a look at how Solr indexes her data (or actually what you need to feed it in order to get it indexed properly. There is good starter documentation on this. Perhaps you'll end up with 60% Solr indexed and 40% MySQL - either way, from what I read on your plans Solr is very much so worth exploring.

Regular approach for webbased apps:

1) create an index scheme that matches your data and search-desires for Solr

2) create an import of the MySQL data to Solr (there are tools too)

3) create a frontend application that translates user-queries to Solr queries

4) send these queries to Solr

5) parse the result and adapt the queries to create new facets

6) cache whatever you can

HTH, Matt

Hi Matt, yep, ive done all of this. My main question relates to no.5 creating the facets. e.g. Related mysql tables vs Facets. I just cant see much point in using facets if I already have this info stored in related mysql tables. The way im doing it is to get the counts of each related table data from solr. Is there a better way of doing this using facets? — Ke., Mar 15 '10 at 13:58
It's rather hard to say without actually seeing the application and data. Perhaps looking at other MySQL/Solr applications can give you some pointers. Have a look at http://drupal.org/project/apachesolr (the Drupal Solr module) - they work with local (MySQL) data that builds Solr facet queries. — MattW, Mar 16 '10 at 09:57

solr facets vs mysql relation schema

1 Answers1