I'm working on a project with php/mysql(mariaDB) that generates statistics with some data as input. The system that filled the DB is an electric powerplant and specifically the alarm system.
My DB has three fields: DeviceTime
, VariableName
and alarmState
The alarmState field can have 2 possible states: Normal
when the alarm is going off, and Active
when alarm is active.
I want to make a statistic that counts between the time (DeviceTime) an alarm appears (Active State) and the time (DeviceTime) the alarm switches off (Normal state)
Today I make a request that works fine, but only with a few data. When I test the request with all data (about 48k rows) the request takes too long and mysql crash after some time.
This is my request that works with a small amount of data
select k.deviceTime as stime, k.variableName as svar, k.alarmState as sstate, i.deviceTime, i.variableName, i.alarmState, timediff(i.deviceTime, k.deviceTime) as diff
from imports k
join imports i on i.variableName = k.variableName
and
i.deviceTime = (select t.deviceTime
from imports t
where t.variableName = k.variableName and
t.deviceTime > k.deviceTime and
t.alarmState ='NORMAL'
order by t.deviceTime limit 1
)
where k.alarmState = 'ACTIVE'
Here is my data table:
id deviceTime variableName alarmState
1 2019-07-11T10:05:24.482 B1.d_07QFA11AN001XB08 ACTIVE
2 2019-07-11T10:05:24.282 B1.d_07QFA11AN001XB08 NORMAL
3 2019-07-11T10:05:15.409 G1.PTUR-38-T.228.52 ACTIVE
4 2019-07-11T10:03:51.409 G1.PTUR-38-T.228.52 NORMAL
5 2019-07-11T10:03:37.409 G1.PTUR-38-T.228.52 ACTIVE
6 2019-07-11T10:03:09.409 G1.PTUR-38-T.228.52 NORMAL
7 2019-07-11T10:02:55.409 G1.PTUR-38-T.228.52 ACTIVE
8 2019-07-11T09:52:06.415 B1.d_07QFA11AN001XB08 ACTIVE
9 2019-07-11T09:52:06.214 B1.d_07QFA11AN001XB08 NORMAL
10 2019-07-11T09:51:06.403 B1.d_07QFA11AN001XB08 ACTIVE
Result with small amount of data:
stime svar sstate deviceTime variableName alarmState diff
2019-07-11T09:52:06.415 B1.d_07QFA11AN001XB08 ACTIVE 2019-07-11T10:05:24.282 B1.d_07QFA11AN001XB08 NORMAL 00:13:17
2019-07-11T10:03:37.409 G1.PTUR-38-T.228.52 ACTIVE 2019-07-11T10:03:51.409 G1.PTUR-38-T.228.52 NORMAL 00:00:14
2019-07-11T10:02:55.409 G1.PTUR-38-T.228.52 ACTIVE 2019-07-11T10:03:09.409 G1.PTUR-38-T.228.52 NORMAL 00:00:14
2019-07-11T09:51:06.403 B1.d_07QFA11AN001XB08 ACTIVE 2019-07-11T09:52:06.214 B1.d_07QFA11AN001XB08 NORMAL 00:00:59
It's exactly what I want for the result, but if anyone had an idea to optimize this request, or another method to build a request that can return the time difference between the alarmState
and the corresponding variableName
.
Edit:
My MariaDB Version is 10.4.6-MariaDB
Here is the table structure
create table imports
(
id bigint unsigned auto_increment
primary key,
deviceTime varchar(255) not null,
variableName varchar(255) not null,
alarmState varchar(255) null,
created_at timestamp null,
updated_at timestamp null
);
and the Explain query
id select_type table type possible_key key key_len ref rows Extra
1 PRIMARY k ALL <null> <null> <null> <null> 44679 Using where; Using temporary; Using filesort
1 PRIMARY i ALL <null> <null> <null> <null> 44679 Using where; Using join buffer (flat, BNL join)
2 DEPENDENT SUBQUERY t ALL <null> <null> <null> <null> 44679 Using where; Using filesort
Edit2
I changed type for my deviceTime
column to DATETIME
. I created the index like this
create index imports_alarmstate_index
on imports (alarmState);
create index imports_devicetime_index
on imports (deviceTime);
create index imports_variablename_index
on imports (variableName);
I modified my query to use MIN()
instead of mysql Order BY ... Limit 1
.
Now my query look like
select k.deviceTime as stime,
k.variableName as svar,
k.alarmState as sstate,
i.deviceTime,
i.variableName,
i.alarmState,
timestampdiff (second, i.deviceTime, k.deviceTime) as diff
from imports k
join imports i on i.variableName = k.variableName and
i.deviceTime = (select MIN(t.deviceTime)
from imports t
where t.variableName = k.variableName and
t.deviceTime > k.deviceTime and
t.alarmState ='NORMAL'
)
where k.alarmState <> 'NORMAL'
I use timestampdiff()
instead of datediff()
because the timestamp format is more easy to order.
My where condition k.alarmState <> 'NORMAL'
has changed because sommetime the alarmState
can take another state in particular condition, but this new state is like an Active
state
Here my new EXPLAIN
id select_type table type possible_keys key key_len ref rows filtered Extra
1 PRIMARY k ALL imports_variablename_index,imports_alarmstate_index <null> <null> <null> 45474 50 Using where
1 PRIMARY i ref imports_devicetime_index,imports_variablename_index imports_devicetime_index 5 func 1 100 Using where
2 DEPENDENT SUBQUERY t ref imports_devicetime_index,imports_variablename_index,imports_alarmstate_index imports_variablename_index 1022 Alarms.k.variableName 29 50 Using where
When I execute my query, I get
34567 rows retrieved starting from 1 in 3 m 26 s 135 ms (execution: 158 ms, fetching: 3 m 25 s 977 ms)
I think 3 minutes is a bit long no? any other optimisation or suggestion?
Thank you !