3

I am creating price comparison website in php and mysql. so i need to find similar products and map it to create price comparison table. In my database 10,000+ products are available, so to find similar products i have create mysql fulltext index in product name and description columns and compare product with other products, but i can't get desire result, for example if i Search

Search for "Samsung Galaxy A7 (2017) 32gb", get results:

  1. Samsung Galaxy A7 32gb (2017)
  2. Samsung Galaxy A7 smartphone
  3. Samsung Galaxy J7 (2017) 32gb Black
  4. Samsung Galaxy A7 64gb (NEW)
  5. Samsung Galaxy J7 Prime 16gb
  6. Samsung Galaxy J7 (16gb, Black)
  7. Samsung Galaxy J2 Pro (16gb)
  8. Samsung Galaxy A5 (32gb)
  9. Samsung Galaxy S7 Edge 32gb Blue Coral

Note: In result set only first two and fourth product is similar, not other are

Now, i have question:

How to find important keyword from search query, so that keyword are required at least in other product to know similarity? for example in above search query "Samsung Galaxy A7" is important keyword, and that should be required in other product in result. We Human can know, "Samsung Galaxy A7" is important keyword from "Samsung Galaxy A7 (2017) 32gb" to compare with other products, but how can know computer program?

There is any algorithm to compare similar product by important keyword as like sift algorithm in image comparison.

If you have any good method to find similar products, please suggest me

jay padaliya
  • 624
  • 6
  • 12

3 Answers3

1

Take a look around Elasticsearch which is a search engine.

It can do keyword search and return result by score.

You can boost or minus score in some ways, but it is not smart enough to determine the Keywords like Human.

However, it can determine common terms in your documents (such as 'the', 'a'...or '32gb' if there are many models using this term.) and adjust the score of those common terms to lower score. (ES common terms query)

Elasticsearch is powerful (so as its complication), it could have many better solutions to this subject.

Edit: A good article about "common terms query" and "stopwords list". stopwords can exclude the specs permanently if that's the case you want. Sometimes, "common terms query" won't work if the record data is not broad enough. (ex: Samsung Galaxy is more frequent than 32gb)

Goodcake
  • 323
  • 3
  • 10
0

Sounds like a simple LIKE:

SELECT name FROM products WHERE name LIKE 'Samsung Galaxy A7%';

This will select all products which start with "Samsung Galaxy A7". If you want to select also those, who don't start with but just contain that name, change it to

WHERE name LIKE '%Samsung Galaxy A7%'
Zyrim
  • 1
  • 1
  • This is not simple as like you think. how to computer program know to use "Samsung Galaxy A7" keyword for search from query "Samsung Galaxy A7 (2017) 32gb"? – jay padaliya Jul 29 '17 at 11:31
0

i think i would split them by spaces, remove the spaces, and wrap all the words around LIKE %str% AND - something like this:

$search='Samsung Galaxy A7 (2017) 32gb';
$split=preg_split('/\s+/u',$search);
$query='SELECT * FROM tbl WHERE ';
foreach($split as $str){
    $query.='`name` LIKE \'%'.$db->real_escape_string($str).'%\' AND ';
}
$query=substr($query,0,-strlen(' AND '));
var_dump($query);
  • which outputs something like SELECT * FROM tbl WHERE `name` LIKE '%Samsung%' AND `name` LIKE '%Galaxy%' AND `name` LIKE '%A7%' AND `name` LIKE '%(2017)%' AND `name` LIKE '%32gb%' (test code: https://3v4l.org/30llU )

but i bet there's a dedicated search library out there somehere for php, where these problems are already solved... probably

hanshenrik
  • 19,904
  • 4
  • 43
  • 89
  • No, this is not work by any simple sql query by AND or OR. because you are also using "(2017) 32gb" keyword in your sql condition. so product "Samsung Galaxy A7 smartphone" will not get in result, but that product is similar. and you can not use OR in condition because product "Samsung Galaxy J7 Prime 16gb" is not similar. – jay padaliya Jul 30 '17 at 02:36