5

I'm trying to update the data stored in a json column in MariaDB (libmysql version - 5.6.43 , Server: 10.3.34-MariaDB-cll-lve - MariaDB Server).

My data is structured like this:

ID json_data
1 {....}
2 {....}

where json_data is structured as follows:

{
    "company": {
        "id": "",
        "name": "",
        "address": ""
    },
    "info_company": {
          "diff_v": "1",
          "grav_v": "",
          "diff_s": "2",
          "grav_s": "",
          "diff_g": "3",
          "grav_g": "",
          "diff_ri": "4",
          "grav_ri": "2"
    }
}

I'm trying to update data inside info_company replacing:

  • "1" with "<50%"
  • "2" with "<50%"
  • "3" with ">50%"
  • "4" with ">50%"

so the result should be:

{
    "company": {
        "id": "",
        "name": "",
        "address": ""
    },
    "info_company": {
          "diff_v": "<50%",
          "grav_v": "",
          "diff_s": "<50%",
          "grav_s": "",
          "diff_g": ">50%",
          "grav_g": "",
          "diff_ri": ">50%",
          "grav_ri": "<50%"
    }
}

By writing this query, I can retrieve the info_company data, but then for each key contained I cannot update the data following the new value.

SELECT new_t.id, JSON_EXTRACT(new_t.json_data, “$.info_company“) FROM (SELECT * FROM `my_table` WHERE json_data LIKE “%info_company%”) new_t

Output:

ID json_data
1 {"diff_v": "1","grav_v": "","diff_s": "2","grav_s": "","diff_g": "3","grav_g": "","diff_ri": "4","grav_ri": "2"}

Thank you for your help.

Bill Karwin
  • 538,548
  • 86
  • 673
  • 828
Jean
  • 453
  • 4
  • 16
  • Thanks, I was wrong in writing. I corrected. I have the libmysql version - 5.6.43 – Jean Mar 28 '22 at 11:12
  • 1
    The version of the server is more relevant than the version of the client. Can you check with `SELECT VERSION();`? That will check the version of the server. – Bill Karwin Mar 28 '22 at 13:42
  • Here: 10.3.34-MariaDB-cll-lve - MariaDB Server – Jean Mar 28 '22 at 14:59
  • 1
    Okay, I've edited the question tags and references to MySQL, since you are not using MySQL Server. You should not think of MariaDB as being compatible with MySQL. They have a lot of code in common, but MariaDB has been diverging since it forked from MySQL in 2010. In particular, its support for JSON is implemented independently, and is not compatible with MySQL. I don't use MariaDB, so I can't help answer this question. – Bill Karwin Mar 29 '22 at 14:39

2 Answers2

2

You can solve this problem by using a CTE to generate a regex to match the keys (and desired matching values) inside info_company and then using REGEXP_REPLACE to replace a 1 or 2 with <50% and a 3 or 4 with >50%:

UPDATE my_table
JOIN (
  WITH jkeys_table AS (
    SELECT id, JSON_KEYS(json_data, '$.info_company') AS jkeys
    FROM my_table
  )
  SELECT id,
         CONCAT('((?:',
                REPLACE(SUBSTRING(jkeys, 2, CHAR_LENGTH(jkeys)-2), ', ', '|'),
                ')\\s*:\\s*)"([12])"'
         ) AS regex12,
         CONCAT('((?:',
                REPLACE(SUBSTRING(jkeys, 2, CHAR_LENGTH(jkeys)-2), ', ', '|'),
                ')\\s*:\\s*)"([34])"'
         ) AS regex34
  FROM jkeys_table
) rt ON my_table.id = rt.id
SET json_data = REGEXP_REPLACE(REGEXP_REPLACE(json_data, regex12, '\\1"<50%"'), regex34, '\\1">50%"')

Output (for your sample JSON):

id  json_data
1   {
        "company": 
        {
            "id": "",
            "name": "",
            "address": ""
        },
        "info_company": 
        {
            "diff_v": "<50%",
            "grav_v": "",
            "diff_s": "<50%",
            "grav_s": "",
            "diff_g": ">50%",
            "grav_g": "",
            "diff_ri": ">50%",
            "grav_ri": "<50%"
        }
    }

Demo on dbfiddle

If it's possible the keys in info_company might exist elsewhere inside json_data, you need to localise the changes to the info_company element. You can do this by changing the SET clause of the UPDATE to:

SET json_data = JSON_REPLACE(json_data, '$.info_company',
                JSON_MERGE_PATCH(JSON_QUERY(json_data, '$.info_company'),
                                 REGEXP_REPLACE(REGEXP_REPLACE(JSON_QUERY(json_data, '$.info_company'), regex12, '\\1"<50%"'), regex34, '\\1">50%"')
                                )
                )

Demo on dbfiddle

If the keys in info_company are the same for every row, you can optimise the query by only computing the regex12 and regex34 values once, and then applying those values to all rows in my_table using a CROSS JOIN:

UPDATE my_table
CROSS JOIN (
  WITH jkeys_table AS (
    SELECT JSON_KEYS(json_data, '$.info_company') AS jkeys
    FROM my_table
    LIMIT 1
  )
  SELECT CONCAT('((?:',
                REPLACE(SUBSTRING(jkeys, 2, CHAR_LENGTH(jkeys)-2), ', ', '|'),
                ')\\s*:\\s*)"([12])"'
         ) AS regex12,
         CONCAT('((?:',
                REPLACE(SUBSTRING(jkeys, 2, CHAR_LENGTH(jkeys)-2), ', ', '|'),
                ')\\s*:\\s*)"([34])"'
         ) AS regex34
  FROM jkeys_table
) rt
SET json_data = REGEXP_REPLACE(REGEXP_REPLACE(json_data, regex12, '\\1"<50%"'), regex34, '\\1">50%"')

Demo on dbfiddle

Nick
  • 138,499
  • 22
  • 57
  • 95
  • the use of `CROSS JOIN` is justified only if it cannot be dispensed with. Have you tried running your query on a table with 10k records? I'm sure not. It will take ages! ...if you don't fix the first `SELECT` to reduce the number of possible combinations in the result of `CROSS JOIN`. – seeker Apr 10 '22 at 13:16
  • @seeker Oops... I had originally written the query (before posting) assuming all the keys inside `info_company` were the same for each row so the CTE only produced one row, hence a `CROSS JOIN` was fine. When I removed that assumption I forgot to correct to a `JOIN`. Thanks for pointing that out. – Nick Apr 10 '22 at 22:19
  • Hi, @Nick, Thank you for your help. I understand that in the JOIN you get the keys, but I don't understand the operations you perform in the select (id, concat()) – Jean Apr 11 '22 at 09:12
  • 2
    @Jean basically the query is using a CTE, the first step (with `JSON_KEYS`) gets the keys of the `info_company` field, the second step uses that key string to generate a regex that will match a string of the form `"key": "value"`, where `key` is one of the keys from the first step and value is either `1` or `2` for the first regex (`regex12`) and `3` or `4` for the second (`regex34`). The final step (the `SET`) then performs the replacements, with strings matching `regex12` having the `1` or `2` replaced with `<50%` and those matching `regex34` having the `3` or `4` replaced with `>50%` – Nick Apr 11 '22 at 10:20
  • 2
    [This demo](https://dbfiddle.uk/?rdbms=mariadb_10.4&fiddle=29b1e264b2008918555e99cafc590df8) shows what is generated by the CTE. I hope that helps your understanding. – Nick Apr 11 '22 at 10:23
1

Tested on MariaDB 10.3.34 database server with your json_data:

DELIMITER //
CREATE PROCEDURE percentage()
BEGIN
SELECT @info_keys:=JSON_KEYS(json_data, "$.info_company") FROM my_table;
SELECT @info_keys_num:=JSON_LENGTH(@info_keys);
WHILE @info_keys_num >= 0 DO
   SET @info_keys_num = @info_keys_num - 1;
   SELECT @info_attr:=JSON_EXTRACT(@info_keys, CONCAT("$[", @info_keys_num, "]"));
   UPDATE my_table SET json_data = JSON_REPLACE(json_data, CONCAT("$.info_company.", @info_attr), "<50%") 
      WHERE CHAR_LENGTH(JSON_VALUE(json_data, CONCAT("$.info_company.", @info_attr))) = 1 AND
            JSON_VALUE(json_data, CONCAT("$.info_company.", @info_attr)) < 3;
   UPDATE my_table SET json_data = JSON_REPLACE(json_data, CONCAT("$.info_company.", @info_attr), ">50%") 
      WHERE CHAR_LENGTH(JSON_VALUE(json_data, CONCAT("$.info_company.", @info_attr))) = 1 AND
            JSON_VALUE(json_data, CONCAT("$.info_company.", @info_attr)) > 2;
END WHILE;
END;
//
DELIMITER ;

call percentage();

Example of output:

MariaDB [test]> call percentage();
+------------------------------------------------------------------------------------+
| @info_keys:=JSON_KEYS(json_data, "$.info_company")                                 |
+------------------------------------------------------------------------------------+
| ["diff_v", "grav_v", "diff_s", "grav_s", "diff_g", "grav_g", "diff_ri", "grav_ri"] |
+------------------------------------------------------------------------------------+
1 row in set (0.001 sec)

... [cut here] ...

Query OK, 5 rows affected (0.011 sec)
seeker
  • 806
  • 4
  • 7
  • You can clean up the output of the procedure by using `SELECT ... INTO @var` instead of `SELECT @var:=...` – Nick Apr 10 '22 at 04:00
  • Hi, @seeker. I'm testing your query in my DB, and it give me the following error: SQL query: Edit Edit SET FOREIGN_KEY_CHECKS = ON; #2014 - Commands out of sync; you can't run this command now – Jean Apr 10 '22 at 19:42
  • It's not a query, it's a procedure. You should create it: copypaste code snippet including `DELIMITER //` and `DELIMITER ;`. Then run procedure: `CALL percentage();` – seeker Apr 10 '22 at 19:53
  • Yes, It 's a procedure and I copy and paste in the tab "SQL" including also DELIMITER. The server return the error. Maybe I have to set something else? – Jean Apr 11 '22 at 07:27
  • It's a bug in phpmyadmin. It [fixed](https://github.com/phpmyadmin/phpmyadmin/issues/14614) for use in php, but apparently not in phpmyadmin itself. Anyway, if you are not using the terminal, you can run the procedure in the phpmyadmin GUI (in the tab `Routines`) – seeker Apr 11 '22 at 09:08
  • You should make a note that this procedure relies on the keys in `info_company` being the same in all rows. Otherwise it's possible that not all values will be updated. See for example [this demo](https://dbfiddle.uk/?rdbms=mariadb_10.4&fiddle=f35c5aad50442cda70590f4352ee5bc8) – Nick Apr 11 '22 at 22:58
  • Sure, this is obvious at first glance at the code. Iteration over a fixed number of keys was done intentionally, since the problem statement did not provide for the variability of the `info_company` structure. The solution with `JOIN` is more flexible. – seeker Apr 12 '22 at 09:12