0

I've one table myTable

ID Content
1 Hello, this is the test content
2 Hi, test content.

I have one list having different values = ["Hello","Hi","Yes","content"]
Now I have to find occurrence of value in myTable-> content column & resultant table have value & count of that value in myTable-> content column (one row of myTable table can have more than one values & use case-insensitive search). Output be like:

Value Count
Hello 1
Hi 1
Yes 0
content 2

I want to make optimal SQL server query.

Thom A
  • 88,727
  • 11
  • 45
  • 75
  • 1
    Honestly, T-SQL isn't the right tool for this. – Thom A Feb 03 '23 at 11:37
  • @Het Shah There were like 2 questions about word split in sql server tag YESTERDAY. So why not first try browsing the existing answers. Hints: https://stackoverflow.com/questions/75315453/extract-words-from-a-column-and-count-frequency , split, words, group by. – siggemannen Feb 03 '23 at 11:47
  • While there is a solution using STRING_SPLIT and a wildcard search, I would seriously consider using the full text search feature of SQL Server for this. Especially as you asked about the "optimal" query. https://learn.microsoft.com/en-us/sql/relational-databases/search/full-text-search?view=sql-server-ver16 – Daniel Bürckner Feb 03 '23 at 12:00

1 Answers1

0

Assuming you are using SQL Server 2016 or above, you could try converting your list to a table like structure, and perform a left join and count on your table.

For instance :

CREATE TABLE MyTable (
  ID INT CONSTRAINT PK_MyTable PRIMARY KEY,
  Content NVARCHAR(MAX)
  );

INSERT INTO MyTable (ID,CONTENT) VALUES
  (1,'Hello, this is the test content'),
  (2,'Hi, test content.');



DECLARE @MyList NVARCHAR(MAX)

SET @MyList='["Hello","Hi","Yes","content"]';

SELECT 
  List.Value,
  COUNT(MyTable.Content) Count 
FROM OPENJSON(@MyList) List --Convert the list to a json
LEFT JOIN MyTable ON '.' + UPPER(MyTable.Content) + '.' LIKE '%[^a-z]' + UPPER(List.Value) +'[^a-z]%'
GROUP BY List.Value;

You can try it on this fiddle.

Please do note that there is margin for improvement, such as full text index instead of this ugly regular expression clause.

See also : Search for “whole word match” with SQL Server LIKE pattern

Random User
  • 341
  • 3
  • 6