TLDR
How to match a set of sets against a single set and bind that to the corresponding rows?
Given a row that has a linked summary table with key/value pairs describing properties of that row, and a bunch of search-descriptions (target) that describes how to sum up stuff from that row, how can I find which search-descriptions match a given row, based on matching the properties-table against the key/value pairs in the search-description?
Simplified example:
CREATE TABLE TargetKeyValue(TargetId INT, TargetKey NVARCHAR(50), TargetValue NVARCHAR(50))
CREATE TABLE OriginalRows(Id INT, Cost DECIMAL, BunchOfOtherCols NVARCHAR(500),
CONSTRAINT [PK_Id] PRIMARY KEY CLUSTERED ([Id] ASC))
CREATE TABLE RowKeyValue(RowId INT, KeyPart NVARCHAR(50), ValuePart NVARCHAR(50),
CONSTRAINT [FK_RowId_Id] FOREIGN KEY (RowId) REFERENCES OriginalRows(Id))
INSERT INTO OriginalRows VALUES
(1, 55.5, 'Some cool red coat'),
(2, 80.0, 'Some cool green coat XL'),
(3, 250.00, 'Some cool green coat L'),
(4, 100.0, 'Some whiskey'),
(5, 42.0, 'This is not a match')
INSERT INTO RowKeyValue VALUES
(1, 'Color', 'Red'),
(1, 'Size', 'XL'),
(1, 'Kind', 'Coat'),
(2, 'Color', 'Green'),
(2, 'Size', 'XL'),
(2, 'Kind', 'Coat'),
(3, 'Color', 'Green'),
(3, 'Size', 'L'),
(3, 'Kind', 'Coat'),
(4, 'Color', 'Green'),
(4, 'Size', 'Medium'),
(4, 'Kind', 'Whiskey')
INSERT INTO TargetKeyValue VALUES
(55, 'Color', 'Red'),
(56, 'Color', 'Green'),
(56, 'Size', 'XL'),
(57, 'Kind', 'Coat'),
(58, 'Color', 'Green'),
(58, 'Size', 'Medium'),
(58, 'Kind', 'Whiskey')
This gives the following tables:
-- table OriginalRows
Id Cost BunchOfOtherCols
1 56 Some cool red coat
2 80 Some cool green coat XL
3 250 Some cool green coat L
4 100 Some whiskey
5 42 This is not a match
-- table RowKeyValue
RowId KeyPart ValuePart
1 Color Red
1 Size XL
1 Kind Coat
2 Color Green
2 Size XL
2 Kind Coat
3 Color Green
3 Size L
3 Kind Coat
4 Color Green
4 Size Medium
4 Kind Whiskey
-- table TargetKeyValue
TargetId TargetKey TargetValue
55 Color Red
56 Color Green
56 Size XL
57 Kind Coat
58 Color Green
58 Size Medium
58 Kind Whiskey
Expected outcome
The function below will give this correct outcome:
Id Cost BunchOfOtherCols IsTargetMatch TargetKeyId
1 56 Some cool red coat 1 55
2 80 Some cool green coat XL 1 56
3 250 Some cool green coat L 1 57
4 100 Some whiskey 1 58
5 42 This is not a match 0 NULL
In other words:
- Bind the original row id to the target-id that it first matches (I'm ok with a join returning multiple times if that's easier)
- Show the original row whenever it does not match
- A match is true if a group belonging to one target-ids matches the same value of a given original row
Current approach with cursors... alas
The code below uses cursors, but this proves slow (understandably so since it's basically just a non-indexed table scan over and over).
Another approach I tried was using XML PATH queries, but that turned out to be a non-starter (it was easy, but too slow as well).
I'm aware this is a non-trivial task in relation databases, but I hope there's still a rather straightforward solution. What I have below kinda works, and I might just use a batch process to store the results or something, unless there's a better way to do this using SET
operations or, idunno, FULL JOIN
?
Any solution that can be used in a view (i.e., not involving dynamic SQL or calling SP's) is fine. We used to have an SP-based solution but since data needs to be analyzed in PowerBI and other systems, SQL Views and determinism are the way to go.
Here's a fully working minimal example of what I'm after. The function is the part that I'm looking at to replace with a less procedural and more functional, i.e. set-based approach:
CREATE TABLE TargetKeyValue(TargetId INT, TargetKey NVARCHAR(50), TargetValue NVARCHAR(50))
CREATE TABLE OriginalRows(Id INT, Cost DECIMAL, BunchOfOtherCols NVARCHAR(500),
CONSTRAINT [PK_Id] PRIMARY KEY CLUSTERED ([Id] ASC))
CREATE TABLE RowKeyValue(RowId INT, KeyPart NVARCHAR(50), ValuePart NVARCHAR(50),
CONSTRAINT [FK_RowId_Id] FOREIGN KEY (RowId) REFERENCES OriginalRows(Id))
INSERT INTO OriginalRows VALUES
(1, 55.5, 'Some cool red coat'),
(2, 80.0, 'Some cool green coat XL'),
(3, 250.00, 'Some cool green coat L'),
(4, 100.0, 'Some whiskey'),
(5, 42.0, 'This is not a match')
INSERT INTO RowKeyValue VALUES
(1, 'Color', 'Red'),
(1, 'Size', 'XL'),
(1, 'Kind', 'Coat'),
(2, 'Color', 'Green'),
(2, 'Size', 'XL'),
(2, 'Kind', 'Coat'),
(3, 'Color', 'Green'),
(3, 'Size', 'L'),
(3, 'Kind', 'Coat'),
(4, 'Color', 'Green'),
(4, 'Size', 'Medium'),
(4, 'Kind', 'Whiskey')
INSERT INTO TargetKeyValue VALUES
(55, 'Color', 'Red'),
(56, 'Color', 'Green'),
(56, 'Size', 'XL'),
(57, 'Kind', 'Coat'),
(58, 'Color', 'Green'),
(58, 'Size', 'Medium'),
(58, 'Kind', 'Whiskey')
GO
CREATE FUNCTION dbo.MatchTargetAgainstKeysFromRow
(
@rowid INT
)
RETURNS @MatchResults TABLE(
IsTargetMatch BIT,
TargetKeyId INT)
AS
BEGIN
--
-- METHOD (1) (faster, by materializing the xml field into a cross-over lookup table)
--
-- single row from activities as key/value pairs multi-row
DECLARE @rowAsKeyValue AS TABLE(KeyPart NVARCHAR(1000), ValuePart NVARCHAR(MAX))
INSERT INTO @rowAsKeyValue (KeyPart, ValuePart)
SELECT KeyPart, ValuePart FROM RowKeyValue WHERE RowId = @rowid
DECLARE @LookupColumn NVARCHAR(100)
DECLARE @LookupValue NVARCHAR(max)
DECLARE @TargetId INT
DECLARE @CurrentTargetId INT
DECLARE @IsMatch INT
DECLARE key_Cursor CURSOR
LOCAL STATIC FORWARD_ONLY READ_ONLY
FOR SELECT TargetKey, TargetValue, TargetId FROM TargetKeyValue ORDER BY TargetId
OPEN key_Cursor
FETCH NEXT FROM key_Cursor INTO @LookupColumn, @LookupValue, @TargetId
WHILE @@FETCH_STATUS = 0
BEGIN
SET @IsMatch = (SELECT COUNT(*) FROM @rowAsKeyValue WHERE KeyPart = @LookupColumn AND ValuePart = @LookupValue)
IF(@IsMatch = 0)
BEGIN
-- move to next key that isn't the current key
SET @CurrentTargetId = @TargetId
WHILE @@FETCH_STATUS = 0 AND @CurrentTargetId = @TargetId
BEGIN
FETCH NEXT FROM key_Cursor INTO @LookupColumn, @LookupValue, @TargetId
END
END
ELSE
BEGIN
SET @CurrentTargetId = @TargetId
WHILE @@FETCH_STATUS = 0 AND @IsMatch > 0 AND @CurrentTargetId = @TargetId
BEGIN
FETCH NEXT FROM key_Cursor INTO @LookupColumn, @LookupValue, @TargetId
IF @CurrentTargetId = @TargetId
SET @IsMatch = (SELECT COUNT(*) FROM @rowAsKeyValue WHERE KeyPart = @LookupColumn AND ValuePart = @LookupValue)
END
IF @IsMatch > 0
BEGIN
-- we found a positive matching key, nothing more to do
BREAK
END
END
END
DEALLOCATE key_Cursor -- deallocating a cursor also closes it
INSERT @MatchResults
SELECT
(CASE WHEN (SELECT COUNT(*) FROM @rowAsKeyValue) > 0 THEN 1 ELSE 0 END),
(CASE WHEN @IsMatch > 0 THEN @CurrentTargetId ELSE NULL END)
RETURN
END
GO
-- function in action
select * from OriginalRows
cross apply dbo.MatchTargetAgainstKeysFromRow(Id) fn
-- cleanup
drop function dbo.MatchTargetAgainstKeysFromRow
drop table TargetKeyValue
drop table RowKeyValue
drop table OriginalRows