4

I'm writing an application in C# using Full Text Search queries in SQL Server 2014. The user needs to be able to search the database using a very large query containing up to 100 000 parameters. For instance:

FTS Query:

SELECT * FROM Client WHERE contains(Name, '"John" OR "Sarah"') 

LIKE query:

SELECT * FROM Client WHERE Name in ('John', 'Sarah')

Except instead of 2 names, we have 100 000 names. Obviously, this exceeds the maximum query length supported by SQL. The database contains a bit over 1 million records. In some cases I will need to find client names starting with a specified string, which is why FTS is preferable over "in" or "like" clauses. Is there some clever way of bypassing that other than batching the requests into many chunks? Are there any better alternative to using an SQL database?

I've also tried implementing the search using Lucene.NET, but generating an index for over 1 million records takes many hours, which is not acceptable, as the data will be synchronized daily.

  • 1
    Why not extract all the data inside a datatable and searching the data polling in the datatable rows? A 1KK records select shouldn't take to mutch to be execute. After that search your keys in Rows.Itemarray – Legion May 07 '19 at 14:45
  • It's generally a bad idea to have so many parameters inserted into a query, doing this will harm your disks, tempdb, memory and cause millions of unnecessary reads and writes and possibly cause fragmented databases on the server. Why cant you use JOINS to connect the Client table to a Name table? I may have misunderstood? – Richard May 07 '19 at 14:52
  • @Richard There is no Name table. The client names I'm searching for come from user input. I could create a temporary table to store them, but FTS cannot take a table column as a search parameter. – Andrzej Wieśliwski May 07 '19 at 16:23
  • @Legion It might be a bit better than batching the requests, I guess I will try it out and see how much memory overhead it will produce. Thanks! – Andrzej Wieśliwski May 07 '19 at 16:27
  • @Andrzej OK how about: Create a permanent table for names as (Name [nvarchar],[SearchInstance UNIQUEIDENTIFIER]) on executing the search, generate a Guid, insert all the names into the Names table with your Guid. Then use the Guid to join your Client table to the Names table WHERE Names.SearchInstance = @ Guid. Effectively adding all the parameters into a separate table first, then joining to it to produce the result? Happy to explain this in a full answer if your think it would solve – Richard May 08 '19 at 10:12

1 Answers1

1

When using plain SQL, I suggest to use sp_executesql. This allows you to pass in up to about 2100 parameters, one for each name. SQL Server will be able to re-use the query plan for that query - no recompile needed - so it's fast.

By creating batches of 2100 names you need 48 batches; the latency for that should be reasonably low. Plus you can start them in parallel if you use multiple connections.

https://learn.microsoft.com/en-us/sql/relational-databases/system-stored-procedures/sp-executesql-transact-sql?view=sql-server-2017

Consider snapshot isolation or 'WITH NOLOCK' to prevent delays during data-reload.

https://learn.microsoft.com/en-us/dotnet/framework/data/adonet/sql/snapshot-isolation-in-sql-server

https://www.mssqltips.com/sqlservertip/2470/understanding-the-sql-server-nolock-hint/

Finally make sure you set the proper indexes and use SQL Profiler to inspect the actual query plans to verify that is the case.

Ries Vriend
  • 977
  • 7
  • 9
  • Wouldn't that require to execute multiple FTS search queries? Unlike IN query, the contains query takes in all search parameters as a single string that cannot exceed 8000 characters. – Andrzej Wieśliwski May 07 '19 at 16:33
  • I’m presuming your statement would be structured like this: SELECT * FROM Client WHERE Name in (@p1, @p2) – Ries Vriend May 07 '19 at 18:45
  • The statement can be 2GB long, see https://learn.microsoft.com/en-us/sql/relational-databases/system-stored-procedures/sp-executesql-transact-sql?view=sql-server-2017 – Ries Vriend May 07 '19 at 19:04