Group by based on specific column contains value from list of values

Question

I've one table myTable

ID	Content
1	Hello, this is the test content
2	Hi, test content.

I have one list having different values = ["Hello","Hi","Yes","content"]
Now I have to find occurrence of value in myTable-> content column & resultant table have value & count of that value in myTable-> content column (one row of myTable table can have more than one values & use case-insensitive search). Output be like:

Value	Count
Hello	1
Hi	1
Yes	0
content	2

I want to make optimal SQL server query.

@Het Shah There were like 2 questions about word split in sql server tag YESTERDAY. So why not first try browsing the existing answers. Hints: https://stackoverflow.com/questions/75315453/extract-words-from-a-column-and-count-frequency , split, words, group by. — siggemannen, Feb 03 '23 at 11:47
While there is a solution using STRING_SPLIT and a wildcard search, I would seriously consider using the full text search feature of SQL Server for this. Especially as you asked about the "optimal" query. https://learn.microsoft.com/en-us/sql/relational-databases/search/full-text-search?view=sql-server-ver16 — Daniel Bürckner, Feb 03 '23 at 12:00

score 0 · Answer 1 · answered Feb 03 '23 at 14:03

Assuming you are using SQL Server 2016 or above, you could try converting your list to a table like structure, and perform a left join and count on your table.

For instance :

CREATE TABLE MyTable (
  ID INT CONSTRAINT PK_MyTable PRIMARY KEY,
  Content NVARCHAR(MAX)
  );

INSERT INTO MyTable (ID,CONTENT) VALUES
  (1,'Hello, this is the test content'),
  (2,'Hi, test content.');



DECLARE @MyList NVARCHAR(MAX)

SET @MyList='["Hello","Hi","Yes","content"]';

SELECT 
  List.Value,
  COUNT(MyTable.Content) Count 
FROM OPENJSON(@MyList) List --Convert the list to a json
LEFT JOIN MyTable ON '.' + UPPER(MyTable.Content) + '.' LIKE '%[^a-z]' + UPPER(List.Value) +'[^a-z]%'
GROUP BY List.Value;

You can try it on this fiddle.

Please do note that there is margin for improvement, such as full text index instead of this ugly regular expression clause.

See also : Search for “whole word match” with SQL Server LIKE pattern

Group by based on specific column contains value from list of values

1 Answers1