0

We recently noticed that we have a lot more records in our slow query log at every minute change (at around second 0 of every minute). This didn't seem to be related to any of our cronjobs so I decided to do a quick test on a new RDS t3.micro instance with MySQL 8.0.23.

Here are the steps to replicate:

  1. Create a table:

     CREATE TABLE `_test` (
       `date` datetime NOT NULL,
       `val` int NOT NULL DEFAULT '1'
     ) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_unicode_ci;
     COMMIT;
    
  2. Make 100'000 inserts (important: use a single query for each row, do not insert multiple rows with one query), like:

     INSERT INTO `_test` (`date`, `val`) VALUES (NOW(), '1');
     INSERT INTO `_test` (`date`, `val`) VALUES (NOW(), '1');
     INSERT INTO `_test` (`date`, `val`) VALUES (NOW(), '1');
     ...
    
  3. Run the following query:

     SELECT  SUBSTRING(`date`, -2, 2) AS `second`, AVG(`count`) AS `avg`
       FROM  
         ( SELECT  `date`, COUNT(1) AS `count`
             FROM  `_test`
             GROUP BY  `date`
         ) AS `temp`
       GROUP BY  `second`
       ORDER BY  `avg` ASC
    

My results were:

second avg
00 20.1935
34 50.9667
01 51.1613
26 52.6667
02 52.9032
50 53.5333
20 53.6667
59 53.7188
30 53.8333
06 54.1613
36 54.1667
27 54.3333
05 54.5161
52 54.7742
04 55.0968
35 55.7000
15 55.9000
07 56.0000
38 56.0000
19 56.1000
49 56.1333
11 56.1333
29 56.1667
45 56.2000
03 56.2258

The inserts usually took about 0.02 seconds (50-56 inserts per second on average), except for second 0, where the inserts took 0.05 seconds (only 20 inserts per second on average).

And the lag seems to get bigger with bigger and more complex tables and/or queries with joins. Any idea what may be causing this lag at second 0?

EDIT:

Additional information for @Rick James.

    CREATE TABLE `rds_history` (
     `action_timestamp` timestamp NOT NULL DEFAULT CURRENT_TIMESTAMP ON UPDATE CURRENT_TIMESTAMP,
     `called_by_user` varchar(50) NOT NULL,
     `action` varchar(20) NOT NULL,
     `mysql_version` varchar(50) NOT NULL,
     `master_host` varchar(255) DEFAULT NULL,
     `master_port` int DEFAULT NULL,
     `master_user` varchar(16) DEFAULT NULL,
     `master_log_file` varchar(50) DEFAULT NULL,
     `master_log_pos` mediumtext,
     `master_ssl` tinyint(1) DEFAULT NULL,
     `master_delay` int DEFAULT NULL,
     `auto_position` tinyint(1) DEFAULT NULL,
     `master_gtid` text
    ) ENGINE=InnoDB DEFAULT CHARSET=latin1

    CREATE TABLE `rds_replication_status` (
     `action_timestamp` timestamp NOT NULL DEFAULT CURRENT_TIMESTAMP ON UPDATE CURRENT_TIMESTAMP,
     `called_by_user` varchar(50) NOT NULL,
     `action` varchar(20) NOT NULL,
     `mysql_version` varchar(50) NOT NULL,
     `master_host` varchar(255) DEFAULT NULL,
     `master_port` int DEFAULT NULL,
     `replication_log_file` text,
     `replication_stop_point` bigint DEFAULT NULL,
     `replication_gtid` text
    ) ENGINE=InnoDB DEFAULT CHARSET=latin1
Name Engine Version Row_format Rows Avg_row_length Data_length Max_data_length Index_length Data_free Auto_increment Create_time Update_time Check_time Collation Checksum Create_options Comment
rds_configuration InnoDB 10 Compact 3 5461 16384 0 0 0 NULL 08.03.21 11:19 NULL NULL latin1_swedish_ci NULL
rds_global_status_history InnoDB 10 Dynamic 0 0 16384 0 0 0 NULL 08.03.21 11:19 NULL NULL latin1_swedish_ci NULL
rds_global_status_history_old InnoDB 10 Dynamic 0 0 16384 0 0 0 NULL 08.03.21 11:19 NULL NULL latin1_swedish_ci NULL
rds_heartbeat2 InnoDB 10 Compact 0 0 16384 0 0 0 NULL 08.03.21 11:19 26.10.21 20:03 NULL latin1_swedish_ci NULL
rds_history InnoDB 10 Dynamic 1 16384 16384 0 0 0 NULL 08.03.21 11:19 NULL NULL latin1_swedish_ci NULL
rds_replication_status InnoDB 10 Dynamic 1 16384 16384 0 0 0 NULL 08.03.21 11:19 NULL NULL latin1_swedish_ci NULL
rds_sysinfo InnoDB 10 Compact 0 0 16384 0 0 0 NULL 08.03.21 11:19 NULL NULL utf8_general_ci NULL

UPDATE:

Just did the same test on a brand new RDS t3.micro instance running MySQL 5.7.34 (default:mysql-5-7 option and parameter group). Similar results:

second avg
00 25.8286
01 44.8857
30 45.0000
22 45.8235
55 46.2857
20 46.8286
15 46.9143
51 47.1143
02 47.3714
45 47.5294
54 47.6286
25 47.6471
50 47.6571
24 47.6765
21 47.7647
10 47.7714
56 47.9714
16 48.1143
43 48.1176
47 48.1429
23 48.2353
07 48.2571
35 48.2647
57 48.3714
53 48.4571
06 48.4857
59 48.5429
05 48.8571
36 49.0588
03 49.1429
46 49.1471
52 49.1714
48 49.1714
26 49.1765
11 49.2000
31 49.2059
40 49.2059
58 49.2286
49 49.2857
39 49.2941
12 49.3429
04 49.3714
19 49.4000
34 49.4118
08 49.4286
38 49.4412
33 49.4706
17 49.5429
09 49.5714
18 49.6286
32 49.6765
27 49.8235
13 49.8286
41 49.8529
29 49.8529
37 49.8824
14 49.8857
44 50.0000
28 50.0588
42 50.0882

Since I wasn't able to replicate the results on my local machine and on a dedicated server (both running MySQL 5.7), this must be an Amazon RDS related problem rather than a MySQL version problem.

I asked Amazon to look into this and do the same test, no response. Their support hasn't been very helpful in general so far. Is anyone else here with an AWS account able to replicate and confirm the results above? I can not believe we are the only ones affected by this.

Simon
  • 326
  • 1
  • 12
  • Do you see the same issue with a "non-burstable" instance type? – Parsifal Oct 26 '21 at 16:36
  • Change `long_query_time` to `0`; run for a few minutes before changing that setting back. (Need to run your script.) Then the slowlog may show the culprit. It could be a monitoring task like `SHOW GLOBAL STATUS`. Please report back; this could be an interesting thing to note. Also, what is the version number of RDS-MySQL that is running? – Rick James Oct 26 '21 at 16:58
  • @Parsifal, yes, we first noticed this on our production m3.medium instance. – Simon Oct 26 '21 at 19:12
  • @Rick James, we enabled general_log to record all queries and Amazon (rdsadmin) is indeed running some queries at or close to second 0. We opened a support ticket a few days ago. Their answer was "Please note that being that RDS is a managed service, the rdsadmin user queries the database to check the health of the database. However, it is very unusual for these queries to have an impact on the performance of your queries as the user only would query system tables and processes as observed in the queries provided". MySQL version is 8.0.23. – Simon Oct 26 '21 at 19:18
  • @Simon - For my understanding, please provide the queries that they are running every minute. _Any_ query has _some_ impact on _all_ other queries running. After all, there is a single server (`mysqld`) doing all the work. (When I get the list, I can go into more details.) – Rick James Oct 26 '21 at 19:27
  • @Rick James, out of the 230 minutes general_log was on, the following query ran 159 times at exactly second 0: ```SELECT count(*) FROM mysql.rds_history WHERE action = 'disable set master' GROUP BY action_timestamp, called_by_user, action, mysql_version, master_host, master_port, master_user, master_log_file, master_log_pos, master_ssl ORDER BY action_timestamp LIMIT 1``` – Simon Oct 26 '21 at 19:54
  • @Rick James, an this one ran 133 times ```SELECT count(*) FROM mysql.rds_replication_status WHERE master_host IS NOT NULL AND master_port IS NOT NULL GROUP BY action_timestamp, called_by_user, action, mysql_version, master_host, master_port ORDER BY action_timestamp LIMIT 1``` – Simon Oct 26 '21 at 19:57
  • Can you provide me with `SHOW CREATE TABLE mysql.rds_history` and the other table? And `USE mysql; SHOW TABLE STATUS LIKE 'rds%';` – Rick James Oct 26 '21 at 19:59
  • I've updated the original post with those information. – Simon Oct 26 '21 at 20:19

2 Answers2

0

(Not yet a complete answer.)

These are strange queries:

SELECT  count(*)
    FROM  mysql.rds_history
    WHERE  action = 'disable set master'
    GROUP BY  action_timestamp, called_by_user, action, mysql_version,
        master_host, master_port, master_user, master_log_file,
        master_log_pos, master_ssl
    ORDER BY  action_timestamp
    LIMIT  1

SELECT  count(*)
    FROM  mysql.rds_replication_status
    WHERE  master_host IS NOT NULL
      AND  master_port IS NOT NULL
    GROUP BY  action_timestamp, called_by_user, action, mysql_version,
        master_host, master_port
    ORDER BY  action_timestamp
    LIMIT  1

The combination of COUNT(*) without any other columns, plus GROUP BY, ORDER BY and LIMIT is, in my opinion, nonsense.

Rick James
  • 135,179
  • 13
  • 127
  • 222
  • Agreed. Those queries do not make much sense at all. Furthermore, both tables only have one row. But I also agree with Amazon: I don't see how they would stall other queries. – Simon Oct 26 '21 at 20:21
  • @Simon - And those tables do not have an explicit `PRIMARY KEY`. Ugh! Suggest you file some kind of bug report with RDS. (I don't have an account, so I would probably be ignored.) – Rick James Oct 26 '21 at 21:07
  • Absolutely. Although I don't think those queries are the reason for the lag. I have absolutely no idea what other processes Amazon is running on their RDS instances, but this must be related to Amazon, not to the MySQL version (see my latest update to my post). – Simon Oct 27 '21 at 12:03
0

Since RDS is using block storage (EBS), changes to an RDS database are limited by the IOPS of the storage you choose. General purpose SSD storage comes with 3 IOPS per GB, so you have a limit of 60 IOPS with 20 GB, for example. Now if you use more then the provisioned 60 IOPS, your experiencing a delay at the end of every minute (or the beginning of the new minute).

Solution: Either increase storage (attention, general purpose SSD [gp2] is limited to a throughput of max 250 MiB/s and, thus, capable of max 1'000 IOPS, even if you increase storage to more than 333 GB) or choose IOPS-SSD storage, which supports higher numbers of IOPS.

Simon
  • 326
  • 1
  • 12