2

A table says products have a JSONB column called identifiers that stores an array of JSON objects.

Sample data in products

 id  |     name    |      identifiers
-----|-------------|---------------------------------------------------------------------------------------------------------------
  1  | umbrella    | [{"id": "productID-umbrella-123", "domain": "ecommerce.com"}, {"id": "amzn-123", "domain": "amzn.com"}]
  2  | ball        | [{"id": "amzn-234", "domain": "amzn.com"}]
  3  | bat         | [{"id": "productID-bat-234", "domain": "ecommerce.com"}]

Now, I have to write a query that sorts the elements in the table based on the "id" value for the domain "amzn.com"

Expected result

 id   |     name     |      identifiers
----- |--------------|---------------------------------------------------------------------------------------------------------------
  3   | bat          |  [{"id": "productID-bat-234", "domain": "ecommerce.com"}]
  1   | umbrella     |  [{"id": "productID-umbrella-123", "domain": "ecommerce.com"}, {"id": "amzn-123", "domain": "amzn.com"}]
  2   | ball         |  [{"id": "amzn-234", "domain": "amzn.com"}]

ids of amzn.com are "amzn-123" and "amzn-234". When sorted by ids of amzn.com "amzn-123" appears first, followed by "amzn-234"

Ordering the table by values of "id" for the domain "amzn.com", record with id 3 appears first since the id for amzn.com is NULL, followed by a record with id 1 and 2, which has a valid id that is sorted.

I am genuinely clueless as to how I could write a query for this use case. If it were a JSONB and not an array of JSON I would have tried.

Is it possible to write a query for such a use case in PostgreSQL? If yes, please at least give me a pseudo code or the rough query.

Surya
  • 2,429
  • 1
  • 21
  • 42
  • Will there ever be more than one "amazon ID"? –  Aug 26 '20 at 12:47
  • No, there will not. But we don't have any schema restrictions. Even if there are multiple ids, I don't mind taking in one and ignoring the others. – Surya Aug 26 '20 at 12:50

2 Answers2

1

As you don't know the position in the array, you will need to iterate over all array elements to find the amazon ID.

Once you have the ID, you can use it with an order by. Using nulls first puts those products at the top that don't have an amazon ID.

select p.*, a.amazon_id
from products p
   left join lateral (
      select item ->> 'id' as amazon_id
      from jsonb_array_elements(p.identifiers) as x(item)
      where x.item ->> 'domain' = 'amzn.com'
      limit 1 --<< safe guard in case there is more than one amazon id
   ) a on true --<< we don't really need a join condition
order by a.amazon_id nulls first;

Online example


With Postgres 12 this would be a bit shorter:

select p.*
from products p
order by jsonb_path_query_first(identifiers, '$[*] ? (@.domain == "amzn.com").id') nulls first
1

After few tweaks, this is the query that finally made it,

select p.*, amzn -> 'id' AS amzn_id
from products p left join lateral JSONB_ARRAY_ELEMENTS(p.identifiers) amzn ON amzn->>'domain' = 'amzn.com' 
order by amzn_id nulls first
Surya
  • 2,429
  • 1
  • 21
  • 42