Recently in my application which uses MariaDB 10.6, I am facing some weird issues where the same query took more than the expected time and consumes more IO at random times.
Enabled slow query to trace the same where we see a query stuck more than 9min and consumes more IO.
# Time: 230119 15:25:02
# User@Host: user[user] @ [192.*.*.*]
# Thread_id: 125616 Schema: DB QC_hit: No
# Query_time: 567.099806 Lock_time: 0.000500 Rows_sent: 48 Rows_examined: 10859204
# Rows_affected: 0 Bytes_sent: 0
SET timestamp=1674152702;
select column1,column2....columnN where column1=v1 and column2=v2 and column3=v3 and column4=v4 and column5=v5;
On seeing the DB processlist, more no of query are in state "Waiting for table metadata lock" and end up in bigger issues.
| 106804 | userx | IP | DB | Query | 4239 | Sending data | Q1 | 0.000 |
| 106838 | userx | IP | DB | Query | 1980 | Waiting for table metadata lock | Q2 | 0.000 |
| 107066 | userx | IP | DB | Sleep | 0 | | NULL | 0.000 |
| 107196 | userx | IP | DB | Sleep | 1 | | NULL | 0.000 |
| 107223 | userx | IP | DB | Query | 4363 | Sending data | Q3 | 0.000 |
| 107277 | userx | IP | DB | Query | 3221 | Sending data | Q4 | 0.000 |
| 107299 | userx | IP | DB | Sleep | 26 | | NULL | 0.000 |
| 107324 | userx | IP | DB | Sleep | 54 | | NULL | 0.000 |
| 107355 | userx | IP | DB | Sleep | 0 | | NULL | 0.000 |
| 107357 | userx | IP | DB | Sleep | 1 | | NULL | 0.000 |
| 107417 | userx | IP | DB | Query | 1969 | Waiting for table metadata lock | | 0.000 |
| 107462 | userx | IP | DB | Sleep | 55 | | NULL | 0.000 |
| 107489 | userx | IP | DB | Query | 1979 | Waiting for table metadata lock | Q5 | 0.000 |
| 107492 | userx | IP | DB | Sleep | 25 | | NULL | 0.000 |
| 107519 | userx | IP | DB | Query | 1981 | Waiting for table metadata lock | Q6 | 0.000 |
Currently, the manual killing of the suspected query using KILL query cmd unblocks the other query to complete, and via MariaDB property, we can use max_statement_time to terminate the long-running query.
But Is there a way to check what was killed by the max_statement_time? Unable to find any traces in error.log
.
The actual query should return around 1765 records while the slow query reports row_sent as 48.
- Is it a problem with scanning the table or record fetched got stuck after some time?
- Or Am I misinterpreting the slow query output Row_send record count
127.0.0.1:3307> select column1,column2....columnN where column1=v1 and column2=v2 and column3=v3 and column4=v4 and column5=v5;
+----------+
| count(*) |
+----------+
| 1756 |
+----------+
1 row in set (0.006 sec)
---EDITED--- Missed adding column 5 added now in the query. The table is indexed and let me explain the statement.
127.0.0.1:3307> explain extended select..... from Tablename where column1=v1 and column2=v2 and column3=v3 and column4=v4 and column5=v5;
+------+-------------+-------+------+---------------+---------+---------+-------------------------------+------+----------+-------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | filtered | Extra |
+------+-------------+-------+------+---------------+---------+---------+-------------------------------+------+----------+-------+
| 1 | SIMPLE | s | ref | PRIMARY | PRIMARY | 7 | const,const,const,const,const | 73 | 100.00 | |
+------+-------------+-------+------+---------------+---------+---------+-------------------------------+------+----------+-------+
1 row in set, 1 warning (0.007 sec)