0

Say I have an Image table with a meta column in JSON:

Id Meta
1 { "size": 80, "effect": "blur" }
2 { "size": 200, "optimize": true }
3 { "color": "#abcdef", "ext": ".jpg" }

And I have a dynamic param of table type like so

Key Value
size 200
optimize true

How should I write my query to filter the rows in which the Meta column's key-value pairs matched all the values in the param table?

SELECT Id
FROM Image
WHERE (
  --?? all keys and values matched the param table
)
Mark Rotteveel
  • 100,966
  • 191
  • 140
  • 197
icube
  • 2,578
  • 1
  • 30
  • 63
  • Have you looked at these links: https://learn.microsoft.com/en-us/sql/t-sql/functions/json-value-transact-sql?view=sql-server-ver15 and https://learn.microsoft.com/en-us/sql/relational-databases/json/json-path-expressions-sql-server?view=sql-server-ver15 – wosi Jan 27 '22 at 10:31
  • Do all the key/values have to match the input KVs? Or is it only that all KVs in the input have to be matched but there may also be other keys in the JSON which do not? – Charlieface Jan 27 '22 at 10:47
  • @Charlieface yeah for now, my use-case is to match all KVs but I do see a potential use-case where we also want to partially match all the KVs in the input table param. thanks for your answer! – icube Jan 27 '22 at 12:03
  • Sorry wasn't making clear: option 1: match all rows which have `size=200` and `optimize=true`, ignoring any other KVs. Option 2: match all rows which *only* have exactly those KVs and no others – Charlieface Jan 27 '22 at 12:18
  • @Charlieface yeah sorry I was also confused myself, we are going for Option 1, how do we handle Option 2 then? – icube Jan 27 '22 at 13:04
  • For option 2, you can change my first query in the `HAVING` clause to `HAVING COUNT(*) = (SELECT COUNT(*) FROM @tmp)` and change the `LEFT JOIN` to `INNER JOIN` – Charlieface Jan 27 '22 at 14:15

1 Answers1

0

This is a type of relational division (with remainder) question, with the extra twist of shredding JSON at the same time.

There are a number of solutions to this type of question. One common solution is to LEFT JOIN the divisor to the dividend, group it and check for any non-matches:

DECLARE @tmp TABLE (
  "Key" NVARCHAR(8) COLLATE Latin1_General_BIN2,
  "Value" NVARCHAR(4) COLLATE Latin1_General_BIN2
);

INSERT INTO @tmp
  ("Key", "Value")
VALUES
  ('size', '200'),
  ('optimize', 'true');
  
SELECT *
FROM Image i
WHERE EXISTS (SELECT 1
    FROM @tmp t
    LEFT JOIN OPENJSON(i.Meta) j ON t.[Key] = j.[key] AND t.Value = j.value
    HAVING COUNT(j.value) = COUNT(*)  -- all match
);

Another solution is to use a double NOT EXISTS: there are no key/value input pairs which do not have a match

DECLARE @tmp TABLE (
  "Key" NVARCHAR(8) COLLATE Latin1_General_BIN2,
  "Value" NVARCHAR(4) COLLATE Latin1_General_BIN2
);

INSERT INTO @tmp
  ("Key", "Value")
VALUES
  ('size', '200'),
  ('optimize', 'true');
  
SELECT *
FROM Image i
WHERE NOT EXISTS (SELECT 1
    FROM @tmp t
    WHERE NOT EXISTS (SELECT 1
        FROM OPENJSON(i.Meta) j
        WHERE t.[Key] = j.[key] AND t.Value = j.value
    )
);

db<>fiddle

YMMV as to which solution is faster.

Charlieface
  • 52,284
  • 6
  • 19
  • 43
  • for the enforcing of COLLATE Latin1_General_BIN2 in the tmp table, is it because of the deserialization of OPENJSON function? – icube Jan 27 '22 at 13:26
  • Yes correct, because otherwise you get a collation error. You can force the collation within the query, but that is less efficient. It's better to keep the table type in the right collation to start iwth – Charlieface Jan 27 '22 at 14:14