
I have a query that is very fast in finding the records it wants, but it takes too long to "extract" the data from the JSON field if I have a few hundred results. Is there a way to speed up the extraction process with this query and data structure? At about 500 results, the query takes around 3 seconds. If I remove the extract, it's basically instant.

The query returns exactly the data that I want, it's just too slow:

        SELECT
            unit_uid,
            JSON_EXTRACT(rateplans, '$."2023-03-14"','$."2023-03-15"') AS rateplans
        FROM
            property_listings_rateplans
        WHERE
            unit_uid IN (1527,1639,5,9,17,20,24,27,1873,4279)
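Note that when JSON_EXTRACT is given multiple paths, MySQL wraps the matched values in a JSON array, so for the sample data below each returned rateplans value looks roughly like:

        [{"date": "2022-12-02", "price": 17500, "online": 1}, {"date": "2022-12-03", "price": 17500, "online": 1}]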

The structure of the data is like this but with a lot more values and more dates in the rateplans data:

        unit_uid (INT)  rateplans (JSON)
        1527            {"2023-03-14": {"date": "2022-12-02", "price": 17500, "online": 1}, "2023-03-15": {"date": "2022-12-03", "price": 17500, "online": 1}}
        1639            {"2023-03-14": {"date": "2022-12-02", "price": 17500, "online": 1}, "2023-03-15": {"date": "2022-12-03", "price": 17500, "online": 1}}
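The question doesn't include the table definition; a minimal sketch consistent with the description above (the key choice and engine are assumptions) would be:

        CREATE TABLE property_listings_rateplans (
            unit_uid  INT NOT NULL PRIMARY KEY,  -- lookups in the query above are by this key
            rateplans JSON NOT NULL              -- one object per date, as shown above
        ) ENGINE=InnoDB;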

Any ideas on how I should tackle this?

Edit: I should also add that there will eventually be millions of unit_uids, and approximately two years' worth of dates in each JSON field, as well as more key:value pairs within each date. This is just a sample of the live data.

shiftyllama
  • I would probably just remove the JSON dependency completely by separating each of the JSON values into its own record. – GrumpyCrouton Jan 24 '23 at 17:05
  • Yes, design a proper relational database. – RiggsFolly Jan 24 '23 at 17:05
  • That isn't possible. The dataset I'm displaying here is much smaller than the live data. There are ~2 years' worth of dates in the JSON, as well as more fields within each date. – shiftyllama Jan 24 '23 at 17:47
  • Is this real MySQL or is it only MariaDB? Run `SELECT VERSION();` and report the result. – Bill Karwin Jan 24 '23 at 18:06
  • Second question: Is the column defined as the `JSON` data type, or did you define it as `TEXT` and simply store JSON content in the `TEXT` column? – Bill Karwin Jan 24 '23 at 18:07
  • @BillKarwin It is MySQL 8.0.26. I did set it as JSON, but when created, it was automatically converted to LONGTEXT. – shiftyllama Jan 24 '23 at 18:39
  • I don't think MySQL does that. If you specify JSON, the column type is JSON. MariaDB does substitute LONGTEXT when you request JSON. So I'm wondering if you are really using MariaDB. Did you verify that with `SELECT VERSION();`? – Bill Karwin Jan 24 '23 at 18:43
  • You are right, it is MySQL, and I can set it as JSON. – shiftyllama Jan 24 '23 at 18:50
  • Okay I was asking that because MySQL has code to make JSON extraction more efficient. Since you have confirmed you are already using that, the only other thing I can suggest is to get a database server with a more powerful CPU. – Bill Karwin Jan 24 '23 at 20:06
  • Okay, thank you for that information. I'm going to convert the column to JSON and see if that helps. I just didn't know if there was anything I could improve upon in my query or any way to index the dates in the JSON to make the extraction faster. – shiftyllama Jan 24 '23 at 20:08
  • Two years' data? Fine, spend a night converting. After that, rewrite all the queries to use RDBMS principles. – Rick James Jan 24 '23 at 22:47
  • How many rows in the table? How many GB in the table? What is the setting of `innodb_buffer_pool_size`? How much RAM? – Rick James Jan 24 '23 at 22:48
  • The best CPU vs the worst _might_ give you a 2x speedup. Yawn. – Rick James Jan 24 '23 at 22:49
  • The table right now is only 13 gigs with ~200k rows. This will eventually grow to be ~5-10 million rows. I'm not concerned about the ability to upgrade hardware to speed things up but am looking for ways to optimize the actual query or table structure. – shiftyllama Jan 25 '23 at 02:27
  • The idea behind the current structure is that my query will never really be slower than it is right now, as I will never pull more than ~500 results by a primary key. The extract takes a little under 3 seconds, which is actually acceptable for our current application; it's just not ideal. The comment about the best vs. worst CPU being a 2x speedup would actually be fantastic. If converting to a proper JSON field speeds it up a little bit, that would be fantastic as well. – shiftyllama Jan 25 '23 at 02:28
  • Thanks @BillKarwin - switching to a real JSON field was a drastic improvement. Brought the query from 3 seconds to 0.5 seconds and there are further optimizations that I can do to reduce that even further. – shiftyllama Feb 27 '23 at 18:40
  • A further optimization would be not using JSON, but storing multi-valued attributes as rows in a second table (i.e. normalized). Using JSON usually means poor optimization in both performance and storage. – Bill Karwin Feb 27 '23 at 18:57
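For reference, the column conversion discussed in the comments above (from JSON text stored in a LONGTEXT column to the native JSON type) is a one-statement change; a sketch, assuming the table definition above:

        -- The native JSON type stores a binary representation, so
        -- JSON_EXTRACT no longer has to reparse the text on every row.
        ALTER TABLE property_listings_rateplans
            MODIFY COLUMN rateplans JSON NOT NULL;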
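And a sketch of the normalized design suggested in the comments, with one row per unit per date (table and column names here are hypothetical):

        CREATE TABLE unit_rateplans (
            unit_uid  INT NOT NULL,
            stay_date DATE NOT NULL,   -- the "2023-03-14"-style key
            rate_date DATE,            -- the inner "date" value
            price     INT,
            online    TINYINT,
            PRIMARY KEY (unit_uid, stay_date)
        ) ENGINE=InnoDB;

        -- Equivalent of the original query, with no JSON parsing at all:
        SELECT unit_uid, stay_date, rate_date, price, online
        FROM unit_rateplans
        WHERE unit_uid IN (1527,1639,5,9,17,20,24,27,1873,4279)
          AND stay_date IN ('2023-03-14', '2023-03-15');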

0 Answers