0

Problem: I need to do a full text search on a field that has multiple rows (one to many).


I have these schema:

Company (15 MILLION+++)

-id

-companyname


Category

-companyid

-categoryname (W/ FULLTEXT INDEX)


Sample Data:

Company

id | companyname

1 | CompanyA

Category

companyid | categoryname

1 | banking 1 | finance


Goal: I need to get all companies that offers "banking finance" but by using IN BOOLEAN MODE and EXACT MATCH.

The query below will not have a result since "banking" and "finance" are on separate rows:

SELECT c.companyname FROM Company c
INNER JOIN Categoy cat ON c.id = cat.companyid
WHERE MATCH (categoryname) AGAINST ('+Banking +Finance' IN BOOLEAN MODE)
LIMIT 100


My Solution: I restructured the schema, I added a new table named "Categories" which has all the concatenated category from "Category" (with fulltext index) and off course a companyid. This does the job and this is the best / efficient way I can think of.

I have already tried different approaches like: - group_concat + like - union + full text

I am still wondering if there is a better way than my solution?

john
  • 93
  • 9

1 Answers1

0

This may work. You can use a having clause to count how many times each word has a match. It will show only ones that have at least one match for each word.

SELECT c.companyname 

FROM   Company c
       INNER JOIN Categoy cat ON c.id = cat.companyid

GROUP BY c.id

HAVING SUM(CASE WHEN MATCH (categoryname) AGAINST ('+Banking' IN BOOLEAN MODE) THEN 1 ELSE 0 END) > 1
       AND
       SUM(CASE WHEN MATCH (categoryname) AGAINST ('+Finance' IN BOOLEAN MODE) THEN 1 ELSE 0 END) > 1

LIMIT 100
Tom
  • 6,593
  • 3
  • 21
  • 42
  • Hi thank you for your response. However, I have 15 million+ records/companies and this will be very slow. My solution to create a new table w/ csv categories is very much faster than this. – john Jun 21 '13 at 07:33