1

I'm trying to write an sql query that would find the rows in a table that match any value of the provided json array.

To put it more concretely, I have the following db table:

CREATE TABLE mytable (
    name text,
    id SERIAL PRIMARY KEY,
    config json,
    matching boolean
);

INSERT INTO "mytable"(
  "name", "id", "config", "matching"
) 
VALUES 
  (
    E 'Name 1', 50,
    E '{"employees":[1,7],"industries":["1","3","4","13","14","16"],"levels":["1110","1111","1112","1113","1114"],"revenue":[0,5],"states":["AK","Al","AR","AZ","CA","CO","CT","DC","DE","FL","GA","HI","IA","ID","IL"]}', 
    TRUE
  ), 
  (
    E 'Name 2', 63,  
    E '{"employees":[3,5],"industries":["1"],"levels":["1110"],"revenue":[2,5],"states":["AK","AZ","CA","CO","HI","ID","MT","NM","NV","OR","UT","WA","WY"]}', 
    TRUE,
  ), 
  (
    E 'Name 3', 56,
    E '{"employees":[0,0],"industries":["14"],"levels":["1111"],"revenue":[7,7],"states":["AK","AZ","CA","CO","HI","ID","MT","NM","NV","OR","UT","WA","WY"]}', 
    TRUE,
  ),  
  (
    E 'Name 4', 61, 
    E '{"employees":[3,8],"industries":["1"],"levels":["1110"],"revenue":[0,5],"states":["AK","AZ","CA","CO","HI","ID","WA","WY"]}', 
    FALSE
  );

I need to perform search queries on this table with the given filtering params. The filtering params basically correspond to the json keys in config field. They come from the client side and can look something like this:

{"employees": [1, 8], "industries": ["12", "5"]}
{"states": ["LA", "WA", "CA"], "levels": ["1100", "1100"], "employees": [3]}

And given such filters, I need to find the rows in my table that include any of the array elements from the corresponding filter key for every filter key provided.

So given the filter {"employees": [1, 8], "industries": ["12", "5"]} the query would have to return all the rows where (employees key in config field contains either 1 or 8 AND where industries key in config field contains either 12 or 5);

I need to generate such a query dynamically from the javascript code so that I could include/exclude filtering by a certain parameter bu adding/removing the AND operator.

What I have so far is a super long-running query that generates all possible combinations of array elements in config field which feels very wrong:

select * from mytable
    cross join lateral json_array_elements(config->'employees') as e1
    cross join lateral json_array_elements(config->'states') as e2
    cross join lateral json_array_elements(config->'levels') as e3
    cross join lateral json_array_elements(config->'revenue') as e4;

I've also tried to do something like this:

select * from mytable
    where
        matching = TRUE
        and (config->'employees')::jsonb @> ANY(ARRAY ['[1, 7, 8]']::jsonb[])
        and (config->'states')::jsonb @> ANY(ARRAY ['["AK", "AZ"]']::jsonb[])
        and ........;

however this didn't work, although looked promising.

Also, I've tried playing with ?| operator but to no avail.


Basically, what I need is: given an array key in a json field check if this field contains any of the provided values in another array (which is my filtering parameter); and I have to do this for multiple filtering parameters dynamically.

So the logic is the following:

select all rows from the table
   *where*
   matching = TRUE
   *and* config->key1 includes any of the keys from [5,6,8,7]
   *and* config->key2 includes any of the keys from [8,6,2]
   *and* so forth;

Could you help me with implementing such an sql query?

Or maybe such sql queries will always be extremely slow and its best to do such filtering outside of the database level?

Denis Yakovenko
  • 3,241
  • 6
  • 48
  • 82

1 Answers1

2

I'd try with something like that. I guess there are certain side effects (e.g. What if the comparison data is empty?) and I didn't test it on larger data sets... It was just the first which came to my mind... :

demo:db<>fiddle

SELECT 
    *
FROM
    mytable t
JOIN (SELECT '{"states": ["LA", "WA", "CA"], "levels": ["1100", "1100"], "employees": [3]}'::json as data) c 
ON 
  CASE WHEN c.data -> 'employees' IS NOT NULL THEN
     ARRAY(SELECT json_array_elements_text(t.config -> 'employees')) && ARRAY(SELECT json_array_elements_text(c.data -> 'employees'))
  ELSE TRUE END
  
  AND
  
  CASE WHEN c.data -> 'industries' IS NOT NULL THEN
     ARRAY(SELECT json_array_elements_text(t.config -> 'industries')) && ARRAY(SELECT json_array_elements_text(c.data -> 'industries'))
  ELSE TRUE END
  
  AND
  
  CASE WHEN c.data -> 'states' IS NOT NULL THEN
     ARRAY(SELECT json_array_elements_text(t.config -> 'states')) && ARRAY(SELECT json_array_elements_text(c.data -> 'states'))
  ELSE TRUE END
  
  AND
  
  CASE WHEN c.data -> 'revenue' IS NOT NULL THEN
     ARRAY(SELECT json_array_elements_text(t.config -> 'revenue')) && ARRAY(SELECT json_array_elements_text(c.data -> 'revenue'))
  ELSE TRUE END
  
  AND
  
  CASE WHEN c.data -> 'levels' IS NOT NULL THEN
     ARRAY(SELECT json_array_elements_text(t.config -> 'levels')) && ARRAY(SELECT json_array_elements_text(c.data -> 'levels'))
  ELSE TRUE END

Explanation of the join condition:

      CASE WHEN c.data -> 'levels' IS NOT NULL THEN
         ARRAY(SELECT json_array_elements_text(t.config -> 'levels')) && ARRAY(SELECT json_array_elements_text(c.data -> 'levels'))
      ELSE TRUE END

If your comparision data does not contain the specific attribute, the condition is true and therefore will be ignored. If it contains an attribute, compare the table and comparision arrays for this attribute by transforming both JSON arrays into simple Postgres arrays

S-Man
  • 22,521
  • 7
  • 40
  • 63