We have a SQL Server database table with about 60million records. These are records of names and addresses of specific entities. Table contains following columns:
[Name] [nvarchar](425) NOT NULL,
[Street] [nvarchar](900) NULL,
[City] [nvarchar](900) NULL,
[State] [nvarchar](900) NULL,
[PostalCode] [nvarchar](100) NULL
What we have to achive is to be able to perform specific select statement under 1 second.
We should be able to select records based on whether '[Name]' contains one or several entered words (not 'exact match' and not 'starts with') and then apply next prioritization logic:
- Display on top records that are located in given [State] and [City]
- Display items that are in given [State] but another city
- Display items that are located in other states
Here is what we have tried:
- We tried to rebuild table in variety ways, extracting different column in different table, different sets of indexes, extracting every word as a token in separate folder
- SQL Server full text search. (to match records with 'Contains' function)
- Azure Cosmos DB. We migrated data there to evaluate if we could perform selects efficient enough
The problem is always to prioritize records based on state+city
The question is how we could achieve ability to perform selects under 1 second for this case using SQL Server or any other Data Source (preferably available on Azure)