0

I have a Mysql8 DB which contains JSON data. Unfortunately, the content is not always the same. To make it simple, the hierarchy is always the same, but sometimes part of the "tree" is missing or slightly different. For instance: $.bilan.victimes.*.preview.TAGSAU (I use a star, since sometimes, it's '1', '2', etc... and sometimes it is only '$.bilan.victimes' (without further subkeys)

Now, I am using queries to lookup information in the JSON like:

SELECT 
    COUNT(fiche_id) AS USAGE_DSA,
    JSON_VALUE(content, '$.bilan.victimes.*.preview.DSA') AS DSA
FROM bilan_json
WHERE STR_TO_DATE(JSON_VALUE(content, '$.bilan.victimes.*.preview.TAGSAU'),'%e/%c/%Y %H%@%i') >= '2021-01-01'
GROUP BY DSA;

This is working fine, but since there is a lot of records, and JSON could be very long, it takes an awful bunch of time to display the result. In this example, this is only key... I am supposed to retrieve multiples values from the JSON, sometimes in a single query.

I've read about virtual columns (https://stackoverflow.com/questions/68118107/how-to-create-a-virtual-column-to-index-json-column-in-mysql#:~:text=if%20table%20is%20already%20created%20and%20you%20want,%60jval%60%3B%20Dont%20forget%20to%20index%20the%20Generated%20Columns) and also improving performance for JSON object (https://blogit.create.pt/goncalomelo/2018/12/20/query-performance-for-json-objects-inside-sql-server/) but I can't really figure out if I should create a virtual column per key ? And, how can I create a virtual column with a transform ? In above case, I would create something like :

ALTER TABLE bilan_json
    ADD COLUMN tagsau DATETIME 
    GENERATED ALWAYS AS STR_TO_DATE(JSON_VALUE(content, '$.bilan.victimes.*.preview.TAGSAU'),'%e/%c/%Y %H%@%i')
AFTER content;

What would be your advice ?

Bill Karwin
  • 538,548
  • 86
  • 673
  • 828
ChezYom
  • 11
  • 1
  • 5

1 Answers1

1

Simply put, If you expect to need a field in JSON for a WHERE or ORDER BY clause, that field should be in its own column.

3 approaches:

  • Redundantly store it in a column as you INSERT the rows.
  • Use a Virtual ("Generated") column (as you suggest).
  • Remove it from JSON as you put it in its own column.

Once it is in a column, it can be indexed. (It is unclear how useful an index would be for the SELECT you show.)

Did you try that ALTER? Did it work? We need SHOW CREATE TABLE in order to advise further.

Rick James
  • 135,179
  • 13
  • 127
  • 222
  • I am not able to change how the json is created and store in the table, so I guess I am left with option 1 and 2. – ChezYom Dec 03 '21 at 07:52
  • I am not able to change how the json is created and stored in the table, so I guess I am left with option 1 and 2. ALTER query is not working as is, I have to find out why. I guess, I have to remove the 'transform' and do it in the SELECT afterwards. According to this site (https://www.percona.com/blog/storing-json-in-your-databases-tips-and-tricks-for-mysql-part-two/) , the syntax looks like generated always as (`comment` ->> '$.bilan.victimes.*.preview.TAGSAU') – ChezYom Dec 03 '21 at 08:03