1

I don't know why, but probably PHP persisted some of my data as object and some of them as array. My table looks something like:

seller_info_address Table:

 ID (INT)  |  address (JSONB)                                  |
------------+--------------------------------------------------|
     1      | {"addressLines":{"0":"Technology Park",...},...} |
     2      | {"addressLines":["Technology Park",...],...}     |

Some addressLines are objects:

{
  "addressLines": {
    "0": "Technology Park",
    "1": "Blanchard Road",
    "3": "Dublin",
    "4": "2"
  },
  "companyName": "...",
  "emailAddress": [],
  "...": "..."
}

Some addressLines are arrays:

{
  "addressLines": [
    "Technology Park",
    "Blanchard Road",
    "Dublin",
    "2"
  ],
  "companyName": "...",
  "emailAddress": [],
  "...": "..."
}

I would like to equalize the data with a SQL query, but I'm not sure how to do it. All addressLines persisted as object should be updated to array form.

I am grateful for help, thanks!

crazyyou
  • 581
  • 1
  • 4
  • 15
  • 1
    I just had this exact same problem, and came here looking to clean up the data. The reason your data is sometimes an array, and sometimes an object is because the php array must be 0 based to be treated as an array. see https://stackoverflow.com/a/66644255/3790921 – Chad Mar 15 '21 at 19:43

2 Answers2

1

You can convert the objects to an array using this:

select id, (select jsonb_agg(e.val order by e.key::int) 
            from jsonb_each(sia.address -> 'addressLines') as e(key,val))
from seller_info_address sia
where jsonb_typeof(address -> 'addressLines') = 'object'

The where condition makes sure we only do this for addresslines that are not an array.

The aggregation used can also be used inside an UPDATE statement:

update seller_info_address
  set address = jsonb_set(address, '{addressLines}', 
                          (select jsonb_agg(e.val order by e.key::int) 
                           from jsonb_each(address -> 'addressLines') as e(key,val))
                          )
where jsonb_typeof(address -> 'addressLines') = 'object';
  • Fascinating, almost simultaneously the same solution. I guess we're soul mates. Thank you for your trouble! – crazyyou Aug 20 '20 at 11:37
0

Ok, I have now found a solution myself. Definitely not the most eloquent solution. I'm sure there's a nicer and more efficient one, but it works...

DROP FUNCTION update_address_object_to_array(id INTEGER);
CREATE OR REPLACE FUNCTION
    update_address_object_to_array(id INTEGER)
    RETURNS VOID AS
$UPDATE_ADDRESS_OBJECT_TO_ARRAY$
BEGIN
    UPDATE seller_info_address
    SET address = jsonb_set(address, '{addressLines}', (
        SELECT CASE
                   WHEN jsonb_agg(addressLines) IS NOT NULL THEN jsonb_agg(addressLines)
                   ELSE '[]'
               END
        FROM seller_info_address sia,
             jsonb_each(address #> '{addressLines}') as t(key, addressLines)
        WHERE jsonb_typeof(sia.address -> 'addressLines') = 'object'
          AND seller_info_id = update_address_object_to_array.id
    ), true)
    WHERE seller_info_id = update_address_object_to_array.id
      AND jsonb_typeof(address -> 'addressLines') = 'object';
END
$UPDATE_ADDRESS_OBJECT_TO_ARRAY$
    LANGUAGE 'plpgsql';
SELECT update_address_object_to_array(sia.seller_info_id)
  FROM seller_info_address sia
 WHERE jsonb_typeof(address -> 'addressLines') = 'object';

The inner SELECT fetches all lines in the addressLines object using jsonb_each and then aggregates them into an array using jsonb_agg. The conditional expressions is to prevented null cases.

The result is then stored in the UPDATE via jsonb_set to the required position in the json. The WHERES should be self-explanatory.

crazyyou
  • 581
  • 1
  • 4
  • 15