0

Using PHP and MySQL I need to find all "two word" combinations ordered by count from a table that has over a million rows.

The search needs to find how many instances of each two word combination, like "hardware store", "to the", "second chance", "for those", "senior citizens", etc.

Examples of text to search:

id | content
1  | The senior citizens went to the hardware store from the community center.

2  | The hardware store is offering a second chance drawing for those senior citizens who did not win.

3  | Many senior citizens go to the town's community center to play bingo.

Sample results:

senior citizens - 3
to the - 2
hardware store - 2
community center - 2
second chance - 1
The senior - 1
center to - 1
the town's - 1
etc ...and so on.

Results need to include all "two word" combinations. "The senior", went to", "the hardware", citizens went", etc., and the count of how many times found.

I'm guessing this might be a multiple query solution with sub queries, but my query building expertise is minimal. I have tried some basic queries, but am thinking the solution is going to be a bit more complex than my skill set.

Similar question with different data source.

Community
  • 1
  • 1
Mark
  • 647
  • 2
  • 9
  • 27
  • Splitting up strings like this is really hard in MySQL. You're better off just retrieving everything into PHP, using `explode()` to split it into words, putting all the two-word combinations into an array, and using `array_count_values()`. – Barmar Jul 02 '15 at 23:21
  • @Barmar - Yes I was thinking the same, but did not know about array_count_values() ...that makes it much simpler. I have a rough coding of it working now. If you post your comment as an answer then I will accept it. – Mark Jul 03 '15 at 21:06

1 Answers1

0

Try a Union All join:

SELECT count(*) FROM your_table WHERE content LIKE '%senior citizens%'
UNION ALL
SELECT count(*) FROM your_table WHERE content LIKE '%to the%'
UNION ALL
SELECT count(*) FROM your_table WHERE content LIKE '%hardware store%'
Daniel Li
  • 287
  • 2
  • 17
  • Thanks, but with over a million rows, that's a lot of UNIONs. I got a working solution now partly based on Barmar's comment. – Mark Jul 03 '15 at 21:07