Amazon RDS - Slow queries at every minute change

Question

We recently noticed that we have a lot more records in our slow query log at every minute change (at around second 0 of every minute). This didn't seem to be related to any of our cronjobs so I decided to do a quick test on a new RDS t3.micro instance with MySQL 8.0.23.

Here are the steps to replicate:

Create a table:

 CREATE TABLE `_test` (
   `date` datetime NOT NULL,
   `val` int NOT NULL DEFAULT '1'
 ) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_unicode_ci;
 COMMIT;

Make 100'000 inserts (important: use a single query for each row, do not insert multiple rows with one query), like:

 INSERT INTO `_test` (`date`, `val`) VALUES (NOW(), '1');
 INSERT INTO `_test` (`date`, `val`) VALUES (NOW(), '1');
 INSERT INTO `_test` (`date`, `val`) VALUES (NOW(), '1');
 ...

Run the following query:

 SELECT  SUBSTRING(`date`, -2, 2) AS `second`, AVG(`count`) AS `avg`
   FROM  
     ( SELECT  `date`, COUNT(1) AS `count`
         FROM  `_test`
         GROUP BY  `date`
     ) AS `temp`
   GROUP BY  `second`
   ORDER BY  `avg` ASC

My results were:

second	avg
00	20.1935
34	50.9667
01	51.1613
26	52.6667
02	52.9032
50	53.5333
20	53.6667
59	53.7188
30	53.8333
06	54.1613
36	54.1667
27	54.3333
05	54.5161
52	54.7742
04	55.0968
35	55.7000
15	55.9000
07	56.0000
38	56.0000
19	56.1000
49	56.1333
11	56.1333
29	56.1667
45	56.2000
03	56.2258

The inserts usually took about 0.02 seconds (50-56 inserts per second on average), except for second 0, where the inserts took 0.05 seconds (only 20 inserts per second on average).

And the lag seems to get bigger with bigger and more complex tables and/or queries with joins. Any idea what may be causing this lag at second 0?

EDIT:

Additional information for @Rick James.

    CREATE TABLE `rds_history` (
     `action_timestamp` timestamp NOT NULL DEFAULT CURRENT_TIMESTAMP ON UPDATE CURRENT_TIMESTAMP,
     `called_by_user` varchar(50) NOT NULL,
     `action` varchar(20) NOT NULL,
     `mysql_version` varchar(50) NOT NULL,
     `master_host` varchar(255) DEFAULT NULL,
     `master_port` int DEFAULT NULL,
     `master_user` varchar(16) DEFAULT NULL,
     `master_log_file` varchar(50) DEFAULT NULL,
     `master_log_pos` mediumtext,
     `master_ssl` tinyint(1) DEFAULT NULL,
     `master_delay` int DEFAULT NULL,
     `auto_position` tinyint(1) DEFAULT NULL,
     `master_gtid` text
    ) ENGINE=InnoDB DEFAULT CHARSET=latin1

    CREATE TABLE `rds_replication_status` (
     `action_timestamp` timestamp NOT NULL DEFAULT CURRENT_TIMESTAMP ON UPDATE CURRENT_TIMESTAMP,
     `called_by_user` varchar(50) NOT NULL,
     `action` varchar(20) NOT NULL,
     `mysql_version` varchar(50) NOT NULL,
     `master_host` varchar(255) DEFAULT NULL,
     `master_port` int DEFAULT NULL,
     `replication_log_file` text,
     `replication_stop_point` bigint DEFAULT NULL,
     `replication_gtid` text
    ) ENGINE=InnoDB DEFAULT CHARSET=latin1

Name	Engine	Version	Row_format	Rows	Avg_row_length	Data_length	Auto_increment	Create_time	Update_time	Check_time	Collation	Checksum
rds_configuration	InnoDB	10	Compact	3	5461	16384	NULL	08.03.21 11:19	NULL	NULL	latin1_swedish_ci	NULL
rds_global_status_history	InnoDB	10	Dynamic	0	0	16384	NULL	08.03.21 11:19	NULL	NULL	latin1_swedish_ci	NULL
rds_global_status_history_old	InnoDB	10	Dynamic	0	0	16384	NULL	08.03.21 11:19	NULL	NULL	latin1_swedish_ci	NULL
rds_heartbeat2	InnoDB	10	Compact	0	0	16384	NULL	08.03.21 11:19	26.10.21 20:03	NULL	latin1_swedish_ci	NULL
rds_history	InnoDB	10	Dynamic	1	16384	16384	NULL	08.03.21 11:19	NULL	NULL	latin1_swedish_ci	NULL
rds_replication_status	InnoDB	10	Dynamic	1	16384	16384	NULL	08.03.21 11:19	NULL	NULL	latin1_swedish_ci	NULL
rds_sysinfo	InnoDB	10	Compact	0	0	16384	NULL	08.03.21 11:19	NULL	NULL	utf8_general_ci	NULL

UPDATE:

Just did the same test on a brand new RDS t3.micro instance running MySQL 5.7.34 (default:mysql-5-7 option and parameter group). Similar results:

second	avg
00	25.8286
01	44.8857
30	45.0000
22	45.8235
55	46.2857
20	46.8286
15	46.9143
51	47.1143
02	47.3714
45	47.5294
54	47.6286
25	47.6471
50	47.6571
24	47.6765
21	47.7647
10	47.7714
56	47.9714
16	48.1143
43	48.1176
47	48.1429
23	48.2353
07	48.2571
35	48.2647
57	48.3714
53	48.4571
06	48.4857
59	48.5429
05	48.8571
36	49.0588
03	49.1429
46	49.1471
52	49.1714
48	49.1714
26	49.1765
11	49.2000
31	49.2059
40	49.2059
58	49.2286
49	49.2857
39	49.2941
12	49.3429
04	49.3714
19	49.4000
34	49.4118
08	49.4286
38	49.4412
33	49.4706
17	49.5429
09	49.5714
18	49.6286
32	49.6765
27	49.8235
13	49.8286
41	49.8529
29	49.8529
37	49.8824
14	49.8857
44	50.0000
28	50.0588
42	50.0882

Since I wasn't able to replicate the results on my local machine and on a dedicated server (both running MySQL 5.7), this must be an Amazon RDS related problem rather than a MySQL version problem.

I asked Amazon to look into this and do the same test, no response. Their support hasn't been very helpful in general so far. Is anyone else here with an AWS account able to replicate and confirm the results above? I can not believe we are the only ones affected by this.

Do you see the same issue with a "non-burstable" instance type? — Parsifal, Oct 26 '21 at 16:36
Change `long_query_time` to `0`; run for a few minutes before changing that setting back. (Need to run your script.) Then the slowlog may show the culprit. It could be a monitoring task like `SHOW GLOBAL STATUS`. Please report back; this could be an interesting thing to note. Also, what is the version number of RDS-MySQL that is running? — Rick James, Oct 26 '21 at 16:58
@Parsifal, yes, we first noticed this on our production m3.medium instance. — Simon, Oct 26 '21 at 19:12
@Rick James, we enabled general_log to record all queries and Amazon (rdsadmin) is indeed running some queries at or close to second 0. We opened a support ticket a few days ago. Their answer was "Please note that being that RDS is a managed service, the rdsadmin user queries the database to check the health of the database. However, it is very unusual for these queries to have an impact on the performance of your queries as the user only would query system tables and processes as observed in the queries provided". MySQL version is 8.0.23. — Simon, Oct 26 '21 at 19:18
@Simon - For my understanding, please provide the queries that they are running every minute. _Any_ query has _some_ impact on _all_ other queries running. After all, there is a single server (`mysqld`) doing all the work. (When I get the list, I can go into more details.) — Rick James, Oct 26 '21 at 19:27
@Rick James, out of the 230 minutes general_log was on, the following query ran 159 times at exactly second 0: ```SELECT count(*) FROM mysql.rds_history WHERE action = 'disable set master' GROUP BY action_timestamp, called_by_user, action, mysql_version, master_host, master_port, master_user, master_log_file, master_log_pos, master_ssl ORDER BY action_timestamp LIMIT 1``` — Simon, Oct 26 '21 at 19:54
@Rick James, an this one ran 133 times ```SELECT count(*) FROM mysql.rds_replication_status WHERE master_host IS NOT NULL AND master_port IS NOT NULL GROUP BY action_timestamp, called_by_user, action, mysql_version, master_host, master_port ORDER BY action_timestamp LIMIT 1``` — Simon, Oct 26 '21 at 19:57
Can you provide me with `SHOW CREATE TABLE mysql.rds_history` and the other table? And `USE mysql; SHOW TABLE STATUS LIKE 'rds%';` — Rick James, Oct 26 '21 at 19:59

score 0 · Answer 1 · answered Oct 26 '21 at 20:03

0

(Not yet a complete answer.)

These are strange queries:

SELECT  count(*)
    FROM  mysql.rds_history
    WHERE  action = 'disable set master'
    GROUP BY  action_timestamp, called_by_user, action, mysql_version,
        master_host, master_port, master_user, master_log_file,
        master_log_pos, master_ssl
    ORDER BY  action_timestamp
    LIMIT  1

SELECT  count(*)
    FROM  mysql.rds_replication_status
    WHERE  master_host IS NOT NULL
      AND  master_port IS NOT NULL
    GROUP BY  action_timestamp, called_by_user, action, mysql_version,
        master_host, master_port
    ORDER BY  action_timestamp
    LIMIT  1

The combination of COUNT(*) without any other columns, plus GROUP BY, ORDER BY and LIMIT is, in my opinion, nonsense.

answered Oct 26 '21 at 20:03

Rick James

135,179
13
127
222

Agreed. Those queries do not make much sense at all. Furthermore, both tables only have one row. But I also agree with Amazon: I don't see how they would stall other queries. – Simon Oct 26 '21 at 20:21
@Simon - And those tables do not have an explicit `PRIMARY KEY`. Ugh! Suggest you file some kind of bug report with RDS. (I don't have an account, so I would probably be ignored.) – Rick James Oct 26 '21 at 21:07
Absolutely. Although I don't think those queries are the reason for the lag. I have absolutely no idea what other processes Amazon is running on their RDS instances, but this must be related to Amazon, not to the MySQL version (see my latest update to my post). – Simon Oct 27 '21 at 12:03

score 0 · Accepted Answer · answered Nov 04 '21 at 15:40

Since RDS is using block storage (EBS), changes to an RDS database are limited by the IOPS of the storage you choose. General purpose SSD storage comes with 3 IOPS per GB, so you have a limit of 60 IOPS with 20 GB, for example. Now if you use more then the provisioned 60 IOPS, your experiencing a delay at the end of every minute (or the beginning of the new minute).

Solution: Either increase storage (attention, general purpose SSD [gp2] is limited to a throughput of max 250 MiB/s and, thus, capable of max 1'000 IOPS, even if you increase storage to more than 333 GB) or choose IOPS-SSD storage, which supports higher numbers of IOPS.

Amazon RDS - Slow queries at every minute change

2 Answers2