I'm trying to implement search for an online store. The requirements are as follows:

  1. If the user only searches a category name, return the category's page
  2. If the user searches both a category and brand, return a search page with the category and brand filter applied
  3. If the user searches for a value that matches a product exactly, return the product's page
  4. If we matched multiple products across multiple categories and brands, return the results.

My question is: is it possible to accomplish this using a single Lucene index, or should I use multiple indexes and search all of them? As far as I understood, Lucene has no relationships so I can't represent something like category -> brand -> model.

Thank you!

Zephy
  • These should be two separate questions as they are not related except underlying technology. Just edit this one to be one of the questions and move the other content into a new question. That will increase the likelihood that people will answer them :-) – RonC Feb 25 '21 at 15:02
  • @RonC Thank you for your suggestion, I have split it into two different questions. – Zephy Feb 26 '21 at 11:03

1 Answer

My question is: is it possible to accomplish this using a single Lucene index, or should I use multiple indexes and search all of them?

You can definitely accomplish this in a single Lucene.NET index. Be aware that what is typically referred to as a "Lucene index" is really a collection of indexes, because each indexed field is indexed separately.

Another thing to know is that Lucene indexes "documents", and it imposes no common structure on those documents. One document may have 2 fields (let's say categoryId, categoryName) and another document may have 4 fields (let's say productId, productName, productCategory, allProductFields). Lucene is totally fine with that. If categoryName is an indexed field, you can search by that field and you will only get back documents that contain that field and match the query. The same goes for querying allProductFields.

Developers may think of these as two types of documents, a category document and a product document. To Lucene they are all just documents. But it's sometimes useful to add to every document a field that indicates its "document type" as you see it. For example, you could add a docType field to every document, setting that field's value to "product" when creating a document from a product and to "category" when creating a document from a category.

Having such a field makes it possible to query only product documents or only category documents. If the two kinds of documents share no field names, such a field is not strictly necessary. But let's say both category documents and product documents have a field named name. Then a search on the name field could pull up either type of document, and a docType field could prove useful to distinguish the types of results returned, or it could be used as part of the search criteria to search only one type of document.
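To make the docType idea concrete, here is a minimal sketch in plain Java. It is not the Lucene or Lucene.NET API: each document is modeled as a simple field-name to value map, matching is exact string equality, and the class name, method names, and sample data are all hypothetical. It only illustrates how a shared name field can match both kinds of documents and how a docType field narrows the result.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

// Plain-Java model of one index holding two kinds of documents.
// This is NOT the Lucene API: a document is just a field-name -> value map.
class DocTypeSketch {

    // Hypothetical sample data: one category document and two product documents.
    static List<Map<String, String>> sampleIndex() {
        List<Map<String, String>> index = new ArrayList<>();
        index.add(Map.of("docType", "category", "categoryId", "c1", "name", "kindle"));
        index.add(Map.of("docType", "product", "productId", "p1",
                         "name", "kindle", "productCategory", "c1"));
        index.add(Map.of("docType", "product", "productId", "p2",
                         "name", "kindle cover", "productCategory", "c1"));
        return index;
    }

    // Exact-match search on one field; docType == null means "any document type".
    // Documents that lack the field never match.
    static List<Map<String, String>> search(List<Map<String, String>> index,
                                            String field, String value, String docType) {
        List<Map<String, String>> hits = new ArrayList<>();
        for (Map<String, String> doc : index) {
            boolean fieldMatches = value.equals(doc.get(field));
            boolean typeMatches = docType == null || docType.equals(doc.get("docType"));
            if (fieldMatches && typeMatches) {
                hits.add(doc);
            }
        }
        return hits;
    }

    public static void main(String[] args) {
        List<Map<String, String>> index = sampleIndex();
        // The shared "name" field matches a category and a product.
        System.out.println(search(index, "name", "kindle", null));
        // Adding the docType criterion keeps only the product document.
        System.out.println(search(index, "name", "kindle", "product"));
    }
}
```

In a real index the docType restriction would be one more clause in the query rather than a loop over documents; the loop here just stands in for that.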

Hopefully this will give you some ideas of how a single "Lucene Index" can be used to accomplish the various tasks you desire.

As far as I understood, Lucene has no relationships so I can't represent something like category -> brand -> model.

Well, it's true that Lucene documents do not inherently have relationships with other documents. But you can certainly choose to put unique keys and a docType on your documents and then create your own relationships. For example, you could put the categoryId on the product document and later use it to pull back the related category document for each product returned in a search. So it's kind of a roll-your-own sort of thing.
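A rough sketch of that roll-your-own join, again in plain Java rather than the Lucene API. Documents are field-name to value maps, the class and method names are hypothetical, and the in-memory map lookup stands in for what would be a second query on the categoryId key field in a real index.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Plain-Java model of a "roll your own" relationship between documents.
// This is NOT the Lucene API: a document is just a field-name -> value map.
class RelationshipSketch {

    // Index the category documents by their unique key (categoryId).
    static Map<String, Map<String, String>> categoriesById(List<Map<String, String>> index) {
        Map<String, Map<String, String>> byId = new HashMap<>();
        for (Map<String, String> doc : index) {
            if ("category".equals(doc.get("docType"))) {
                byId.put(doc.get("categoryId"), doc);
            }
        }
        return byId;
    }

    // For each product document, follow productCategory to the related category document.
    static List<String> withCategoryNames(List<Map<String, String>> index) {
        Map<String, Map<String, String>> byId = categoriesById(index);
        List<String> lines = new ArrayList<>();
        for (Map<String, String> doc : index) {
            if (!"product".equals(doc.get("docType"))) {
                continue;
            }
            Map<String, String> category = byId.get(doc.get("productCategory"));
            String categoryName = category == null
                    ? "(unknown category)"
                    : category.get("categoryName");
            lines.add(doc.get("productName") + " -> " + categoryName);
        }
        return lines;
    }

    public static void main(String[] args) {
        // Hypothetical sample data: one category and one product that points at it.
        List<Map<String, String>> index = new ArrayList<>();
        index.add(Map.of("docType", "category", "categoryId", "c1", "categoryName", "E-readers"));
        index.add(Map.of("docType", "product", "productId", "p1",
                         "productName", "Kindle Paperwhite", "productCategory", "c1"));
        System.out.println(withCategoryNames(index));
    }
}
```

The trade-off is that the application, not the index, is responsible for keeping the keys consistent, which is why a product pointing at a missing category has to be handled explicitly.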

There is also a thing called BlockJoinQuery, which is a bit more complicated and has some limitations. You can learn a bit about it from this SO Question and Answers, and by searching online for more.

And finally, Lucene has faceting support. Actually, it has two implementations of faceting support. One of them uses a "side-car" index (i.e. a sister index), and this implementation supports hierarchical facets. Hierarchical facets would be a much more advanced way to represent your category -> brand -> model hierarchy. If that's of interest to you, the conference session Faceted Search with Lucene is something you will probably want to watch.
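To give a feel for what hierarchical facets compute, here is a small plain-Java sketch. It does not use Lucene's facet module: each matching product is reduced to a hypothetical category -> brand -> model path, and the code counts how many paths fall under each child of a chosen prefix, which is the kind of drill-down count a faceted search page displays. All names and data are assumptions for illustration.

```java
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Plain-Java model of hierarchical facet counting over category -> brand -> model paths.
// This is NOT Lucene's facet API; it only illustrates the counts such a feature returns.
class FacetPathSketch {

    // Counts how many paths fall under each direct child of the given prefix.
    // An empty prefix yields counts for the top level (categories).
    static Map<String, Integer> childCounts(List<List<String>> paths, List<String> prefix) {
        Map<String, Integer> counts = new TreeMap<>();
        for (List<String> path : paths) {
            boolean deeperThanPrefix = path.size() > prefix.size();
            if (deeperThanPrefix && path.subList(0, prefix.size()).equals(prefix)) {
                counts.merge(path.get(prefix.size()), 1, Integer::sum);
            }
        }
        return counts;
    }

    public static void main(String[] args) {
        // Hypothetical paths, one per matching product.
        List<List<String>> paths = List.of(
                List.of("laptops", "acme", "a1"),
                List.of("laptops", "acme", "a2"),
                List.of("laptops", "globex", "g1"),
                List.of("phones", "acme", "p1"));
        System.out.println(childCounts(paths, List.of()));           // counts per category
        System.out.println(childCounts(paths, List.of("laptops")));  // counts per brand within laptops
    }
}
```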

RonC