I need to find similar records in SQL Server 2016, and it would be great if I could order result by how much they match. Match criteria for similar are 2 detail tables.
My table structures are:
//Basic:
Id Title
1 Title 1
2 Title 2
3 Title 3
4 Title 4
5 Title 5
//Parameters:
Id Title
1 Param 1
2 Param 2
3 Param 3
4 Param 4
5 Param 5
//Values:
Id Value
1 Val 1
2 Val 2
3 Val 3
4 Val 4
5 Val 5
And connection tables:
BasicId ParameterId ValueId
1 1 2
1 3 1
1 4 5
2 1 1
2 2 4
2 3 2
3 1 2
3 3 1
3 4 4
4 1 2
4 4 2
4 5 2
5 1 1
5 2 5
5 3 3
At some point in my program I create new Basic row (in memory) with Parameters and values. Now I need to know:
- Is there any record in database which match exactly the same parameters and values.
- Similar records ordered by how many parameters and values matches.
For first point I do it like this: 2 rows are equal if all parameters and there values matches. For now I calculate string with Ids of all parameters and values and add it to Basic table.
Basic becomes:
Id Title EqualString (this comes from connection table)
1 Title 1 1:2;3:1;4:5
2 Title 2 1:1;2:4;3:2
3 Title 3 1:2;3:1;4:4
4 Title 4 1:2;4:2;5:2
5 Title 5 1:1;2:5;3:3
This works great, since order is defined by ParameterId ASC and I have index on EqualString column. At this point I should say, that my basic table have arround 5 million records and growing.
But for second problem, I don't have a clue, how this kind of problem could be solved.
For example:
Input Title X1 1:1;2:4;3:2 100% Title 2
33% Title 5
Input Title X2 1:1;2:4;3:1 66% Title 2
33% Title 5
Input Title X3 1:2;3:1;5:5 66% Title 1
66% Title 3
Input Title X4 1:2;3:1;5:3 66% Title 3
33% Title 5
I use SQL Server 2016 and have complete control of computer on which database is installed.
There are around 100 parameters-values to check if they match (in upper example only 3 - they are separated with ;)
I think full text search is not the answer (or maybe it is), because I need exact match between 2 commas (2:5 for example).
Or I should go with completely different approach without calculating EqualString (on insert witch is rarely).
Use case:
I am building web page where product that sell is dynamic. That means, that user configure it before it buys.
For example: set height, width, colour, choose material ...
Now I need to know if this product already exists (someone else choose exact same parameters (width, height, colour) and values (100, 120, green) or this is completely new product.
Parameters which was configured are in parameters table and values are in values table.