1

I am currently trying to optimise some DB queries that get run a lot, the queries are run by using a SELECT query against a view, this view does a lot of joins. I thought I might be able to speed things up by caching the results of the view into a table and selecting from the table instead of the view.

Let's say I have 2 tables

People:

PersonId Name
1 Anne
2 Brian
3 Charlie
4 Doug

CustomerPeople:

CustomerId PersonId
1 1
1 2
1 3
1 4
2 1
2 2

and I have a view that joins the two tables to give a list of people, by name, belonging to the customer:

CustomerId PersonName
1 Anne
1 Brian
1 Charlie
1 Doug
2 Anne
2 Brian

When I query the view, I look at the Duration/Fetch and it is 0.10 sec/4.00 sec

I decide to cache the view data into a table and create a new table:

CustomerNamedPeople

CustomerId PersonName
1 Anne
1 Brian
1 Charlie
1 Doug
2 Anne
2 Brian

Which contains the exact same data, however now when I query the table, I look at the Duration/Fetch and it is 0.05 sec/6.00 sec

My understanding is the Duration is the time it takes MySQL engine to run the query, and Fetch is the time it takes the data to be returned to the client (over the network). Unsurprisingly the Duration was faster, and took only 50% of the time, which makes sense, there is no longer a join occurring, however the Fetch took 150% of the time, and is slower.

My question here is: Does MySQL do some sort of response stream compression, since it knows that Anne and Brian are repeated, it can send them only once and have the client "decompress" the data?

The reason I ask is because I am doing something similar but with 1,000,000 rows returned, the data in the two responses is identical, but the view Fetch takes 20 seconds, and the table Fetch is 60 seconds, most of the PersonNames are repeated more than once, so I am wondering if perhaps there is some sort of compression occurring in the response, should I not expect MySQL to take the same time to Fetch two sets of identical data?

Progman
  • 16,827
  • 6
  • 33
  • 48
Campbell
  • 532
  • 3
  • 12
  • I also realise that these example queries only have a duration of 0.10 seconds, so the performance is unlikely to improve by using a table. – Campbell Dec 10 '21 at 22:04
  • MySQL support a compressed protocol. See this : https://stackoverflow.com/questions/2506460/when-should-i-use-mysql-compressed-protocol – Sofyane MAKERRI Dec 10 '21 at 22:09
  • @SofyaneMAKERRI In this case are we seeing the compression protocol being used for one type of request, but not the other? Is it being used for both, but it only compresses 1 result because it knows that `1:Anne` and `1:Anne` are the same and can be compressed, but it doesn't know that `Anne` and `Anne` are the same? – Campbell Dec 10 '21 at 22:27
  • When reading other sources about this the Fetch time is always mentioned as the transfer time from the server to the client but your measurement would indicate it as time the server itself needs to read the data from the disc, unfortunately I did not find any definition of the Fetch time. – Christian Dec 10 '21 at 22:58
  • My understanding is that `Duration` is the time elapsed before the first row is received by the client, not the execution time. And `Fetch` is the remaining time taken to retrieve the rest of the result set. This is not just related to network transfer time as the full result set is not necessarily available when the first row is sent. – user1191247 Dec 11 '21 at 00:30
  • I don't think that the actions across the network have any clue of whether a VIEW was involved or a SELECT was involved. I suspect there is something going on that is not in evidence yet. Please provide the `SELECT`, `VIEW`, `CREATE TABLE`. And explain what the heck you need 1M rows for. – Rick James Dec 11 '21 at 06:28
  • @RickJames The data is used for reporting (think Crystal Reports), we collect data for our users and allow them to build reports based off of it. You do make an interesting point though, and I have raised the question with my team, as I agree it's unlikely the user is actually going to need all 1M rows. As for what is involved, the view is a bit of a mess and joins about 10 different tables, the table is created at the start of the day, and simply provides a snapshot of the view at the time of creation, the expectation was that the table would be faster, as it's not computing any joins. – Campbell Dec 13 '21 at 14:36
  • @nnichols @christan, you both make an interesting point, I might need to look more into `Duration` vs `Fetch`, as my understanding might be incorrect. – Campbell Dec 13 '21 at 14:37
  • @Campbell - We could discuss the performance of the snapshot query. Start a new Question with the query, `CREATE TABLE`, `EXPLAIN SELECT`, and approximate table sizes. The Joins may or may not be a serious problem; it depends. – Rick James Dec 13 '21 at 16:42

0 Answers0