3

I am new user to SOLR. I am working on an E-commerce web application which have SQL database. I want to implement SOLR for my "category page" in application where we will show products of that category with specific information like available stock , price and few more details. Also we want to restrict product display on basis of stock availability, if there is no stock then we wont display those products.

I am trying to implement SOLR with Delta import queries to make my category pages faster. And my concern is about performance of page while getting data from SOLR and accuracy of real time data like Stock and Price.

actually my database queries to get product data are little bit complicated and have several joins, so i have to make several related entities in SOLR database. So due to this data upload to SOLR is slow. that makes difficult to upload data (even with delta import query) frequently, so my application lacking real time data like stock of products.

basically i want to know best practice approach to implement SOLR. I am confused with 1. Should i export my all data to SOLR and then get all details from SOLR? (I am worried about performance and real time data) 2. should i get only index data (say id of products) from SOLR and then get othere details from my SQL database? (not sure about this approach perfomance).

So please help me and suggest me that how can i implement SOLR for my app in best way.

all help is appretiated!!

K24
  • 53
  • 2
  • 9

2 Answers2

1

You can set up a database trigger or data access layer event to send data to Solr whenever there's a change, and configure autoCommit to control freshness.

See also:

Mauricio Scheffer
  • 98,863
  • 23
  • 192
  • 275
  • thanks Mauricio, i will check those links. But i can't do any changes in my database schema or data. So have to deal with whatever it is. Also product details are divided in several tables, so i would have to create many triggers on different tables. and i think it will again a performance factor for database. – K24 Aug 31 '11 at 04:32
  • @K24: Sorry, I meant the **Solr** schema, not the database schema. Even if you have joins in your relational database, you have to flatten the data when populating the Solr index. – Mauricio Scheffer Aug 31 '11 at 05:00
1

One good case practice I encountered while developing e-commerce solution and SOLR as the search provider is to retrieve from SOLR only the IDs and get the data from SQL server.

I created a single schema that was updating every time some new products were added in the database.(In my case products were added by a user in an admin console, in your case you can use the @Mauricio Scheffer comment to have latest updates)

One field in the schema was the ID - representing the ID of the product in the database. After querying SOLR I was receiving N documents suiting the query and with the ID field I was getting all the information from my database and display it to the user.

So the good part is that the user will always get the data from your database ( real time data), and will have his search results displayed very fast because of SOLR.

You can add to the schema different fields that you can use to filter your results such as

  • category
  • date
  • price
  • size
  • etc...

And also differnet fields that you need to query upon:

  • headline
  • description
  • name
  • etc...

And also add the product ID.

So after making the query to SOLR you have a list of product IDs, now the only thing you need to do is to implement a function that will get the products from database based on the ID and display it on the search results page.

This approach performance is pretty good because selecting from database elements based on primary key is the fastest way to retrieve data from sql.

I worked on a website with 1.000.000 products and searching time was always under 0.2 seconds and page loading time in the browser was under 0.6 seconds on single user queries. ( the server where SOLR and SQL was running was 8Gb ram and quad core 1.8 gb as I remember)

Hope this type of implementation is useful for you.

Dorin
  • 2,482
  • 2
  • 22
  • 38
  • This solution **does not (by itself) yield real time data** as you claim, you still run queries against Solr which may be outdated w.r.t. the data in the relational database. – Mauricio Scheffer Aug 29 '11 at 17:07
  • Yes it does, read again the second paragraph. Also when commiting data to solr I was updating all the data that was modified since the last commit ( Updated, deleted, added ) – Dorin Aug 30 '11 at 06:58
  • "real time" in this context means data freshness in Solr, and I don't see where you address that issue. – Mauricio Scheffer Aug 30 '11 at 13:15
  • _I created a single schema that was **updating every time** some new products were added in the database_... Why to you think the data in SOLR is not fresh if I update it every time products are added to SQL database ??? (not the schema I update but the index of SOLR core using that schema ... maybe this is the misunderstanding) – Dorin Aug 30 '11 at 15:19
  • I missed that part, my bad! I apologize, bad day. – Mauricio Scheffer Aug 30 '11 at 15:23
  • Thanks Dorin for your detailed answer. Seems a good approach to use SOLR and database both. will check it. – K24 Aug 31 '11 at 04:36