1

I am building a website with articles section, I am looking for a search functionality that will search the article title and the article content for results that contain all search term, and then will return article result that contains the search term words such as the following example: Search Term: "Facebook tweaks News Feed to clamp down on 'spammy' and misleading advertising".

It will be ideal to return all result that matches the exact term, then search by more relevant result such as "Facebook tweaks News Feed" etc.

The following is the code I did so far:

"p.Title,p.ID,p.Publish_Date, p.Sponsored, p.Featured, p.Seo_Link, p.Content, 
                (SELECT GROUP_CONCAT(t.Tag_ID)
                FROM Tag_Post_Relationship t
                WHERE p.ID  =  t.Post_ID) AS Tags, 
                MATCH (Title, Content) AGAINST ('".$search_term."'IN BOOLEAN MODE) AS Relevance FROM Posts p WHERE NOT Post_Type = 'p'   AND Publish_Date < '{$dateNow}'   AND Visibility = 'p' AND
                (MATCH (p.Title, p.Content) AGAINST ('".$search_term."' IN BOOLEAN MODE))

                ORDER BY Relevance DESC"
Belal Almassri
  • 119
  • 1
  • 8

1 Answers1

0

To do this effectively in an SQL database you need to implement a fuzzy text-comparison algorithm such as Levenshtein distance as a function or proc you can call. This will allow you to rank your articles based on how closely they match your search term. An example of Levenshtein distance in SQL is here: http://www.sqlteam.com/forums/topic.asp?TOPIC_ID=66781.

You use the algorithm in a stored procedure which is executed when a search is requested. The stored proc would compare the search term to each title and article and order the results by the closest match, which would be 1 for an exact match.

Outside of SQL there are also numerous packages for advanced text analysis that can be used at the API level. I've used Python's Jellyfish and NLTK libraries with good results.

Another option is to use something like Apache Solr (http://lucene.apache.org/solr/features.html) or Elastic Search which provides a ready-to-go API for full text search across documents and data. This will add complexity to your architecture though.

dataHead
  • 61
  • 4
  • Thank you, no big budget has been assigned to this project, therefore, I am looking for a simple and quick SQL solution – Belal Almassri Jun 06 '17 at 12:09
  • Then it will need to be the fuzzy text comparison function. The link I posted is a decent example. Just be aware the performance will likely be poor with a large data set because the functions are pretty complex and SQL engines are generally not optimized for high-volume partial text searching. – dataHead Jun 06 '17 at 12:41