1

I'm working on a project with php/mysql(mariaDB) that generates statistics with some data as input. The system that filled the DB is an electric powerplant and specifically the alarm system.

My DB has three fields: DeviceTime, VariableName and alarmState

The alarmState field can have 2 possible states: Normal when the alarm is going off, and Active when alarm is active.

I want to make a statistic that counts between the time (DeviceTime) an alarm appears (Active State) and the time (DeviceTime) the alarm switches off (Normal state)

Today I make a request that works fine, but only with a few data. When I test the request with all data (about 48k rows) the request takes too long and mysql crash after some time.

This is my request that works with a small amount of data

select k.deviceTime as stime, k.variableName as svar, k.alarmState as sstate, i.deviceTime, i.variableName, i.alarmState, timediff(i.deviceTime, k.deviceTime) as diff
from imports k
join imports i on i.variableName = k.variableName
                and
                  i.deviceTime = (select t.deviceTime
                                  from imports t
                                  where t.variableName = k.variableName and
                                          t.deviceTime > k.deviceTime and
                                          t.alarmState ='NORMAL'
                                  order by t.deviceTime limit 1
                  )

where k.alarmState = 'ACTIVE'

Here is my data table:

id  deviceTime              variableName            alarmState
1   2019-07-11T10:05:24.482 B1.d_07QFA11AN001XB08   ACTIVE
2   2019-07-11T10:05:24.282 B1.d_07QFA11AN001XB08   NORMAL
3   2019-07-11T10:05:15.409 G1.PTUR-38-T.228.52     ACTIVE
4   2019-07-11T10:03:51.409 G1.PTUR-38-T.228.52     NORMAL
5   2019-07-11T10:03:37.409 G1.PTUR-38-T.228.52     ACTIVE
6   2019-07-11T10:03:09.409 G1.PTUR-38-T.228.52     NORMAL
7   2019-07-11T10:02:55.409 G1.PTUR-38-T.228.52     ACTIVE
8   2019-07-11T09:52:06.415 B1.d_07QFA11AN001XB08   ACTIVE
9   2019-07-11T09:52:06.214 B1.d_07QFA11AN001XB08   NORMAL
10  2019-07-11T09:51:06.403 B1.d_07QFA11AN001XB08   ACTIVE

Result with small amount of data:

stime                   svar                    sstate  deviceTime              variableName            alarmState  diff
2019-07-11T09:52:06.415 B1.d_07QFA11AN001XB08   ACTIVE  2019-07-11T10:05:24.282 B1.d_07QFA11AN001XB08   NORMAL      00:13:17
2019-07-11T10:03:37.409 G1.PTUR-38-T.228.52     ACTIVE  2019-07-11T10:03:51.409 G1.PTUR-38-T.228.52     NORMAL      00:00:14
2019-07-11T10:02:55.409 G1.PTUR-38-T.228.52     ACTIVE  2019-07-11T10:03:09.409 G1.PTUR-38-T.228.52     NORMAL      00:00:14
2019-07-11T09:51:06.403 B1.d_07QFA11AN001XB08   ACTIVE  2019-07-11T09:52:06.214 B1.d_07QFA11AN001XB08   NORMAL      00:00:59

It's exactly what I want for the result, but if anyone had an idea to optimize this request, or another method to build a request that can return the time difference between the alarmState and the corresponding variableName.

Edit: My MariaDB Version is 10.4.6-MariaDB

Here is the table structure

create table imports
(
    id           bigint unsigned auto_increment
        primary key,
    deviceTime   varchar(255) not null,
    variableName varchar(255) not null,
    alarmState   varchar(255) null,
    created_at   timestamp    null,
    updated_at   timestamp    null
);

and the Explain query

id  select_type         table       type    possible_key    key     key_len     ref     rows        Extra
1   PRIMARY             k           ALL     <null>          <null>  <null>      <null>  44679       Using where; Using temporary; Using filesort
1   PRIMARY             i           ALL     <null>          <null>  <null>      <null>  44679       Using where; Using join buffer (flat, BNL join)
2   DEPENDENT SUBQUERY  t           ALL     <null>          <null>  <null>      <null>  44679       Using where; Using filesort

Edit2 I changed type for my deviceTime column to DATETIME. I created the index like this

create index imports_alarmstate_index
    on imports (alarmState);

create index imports_devicetime_index
    on imports (deviceTime);

create index imports_variablename_index
    on imports (variableName);

I modified my query to use MIN() instead of mysql Order BY ... Limit 1. Now my query look like

select k.deviceTime     as stime,
       k.variableName   as svar,
       k.alarmState     as sstate,
       i.deviceTime,
       i.variableName,
       i.alarmState,
       timestampdiff (second, i.deviceTime, k.deviceTime) as diff
from imports k
join imports i on   i.variableName = k.variableName and
                    i.deviceTime = (select MIN(t.deviceTime)
                                    from imports t
                                    where   t.variableName = k.variableName and
                                            t.deviceTime > k.deviceTime and
                                            t.alarmState ='NORMAL'
                                    )
where k.alarmState <> 'NORMAL' 

I use timestampdiff() instead of datediff() because the timestamp format is more easy to order. My where condition k.alarmState <> 'NORMAL' has changed because sommetime the alarmState can take another state in particular condition, but this new state is like an Active state

Here my new EXPLAIN

id  select_type         table   type    possible_keys                                                                   key                         key_len     ref                     rows    filtered    Extra
1   PRIMARY             k       ALL     imports_variablename_index,imports_alarmstate_index                             <null>                      <null>      <null>                  45474   50          Using where
1   PRIMARY             i       ref     imports_devicetime_index,imports_variablename_index                             imports_devicetime_index    5           func                    1       100         Using where
2   DEPENDENT SUBQUERY  t       ref     imports_devicetime_index,imports_variablename_index,imports_alarmstate_index    imports_variablename_index  1022        Alarms.k.variableName   29      50          Using where

When I execute my query, I get

34567 rows retrieved starting from 1 in 3 m 26 s 135 ms (execution: 158 ms, fetching: 3 m 25 s 977 ms)

I think 3 minutes is a bit long no? any other optimisation or suggestion?

Thank you !

Rick James
  • 135,179
  • 13
  • 127
  • 222
Cripsii
  • 11
  • 3
  • Which MySQL/MariaDB version? `SELECT VERSION();` – Raymond Nijland Jul 24 '19 at 13:21
  • Also questions about performance should also include table structures (`SHOW CREATE TABLE table`) of every table involved in the query.. Also a `EXPLAIN query` output – Raymond Nijland Jul 24 '19 at 13:22
  • Looks like a "groupwise-max" problem. See the added Tag for optimizations. – Rick James Jul 26 '19 at 00:32
  • As already advised, add a covering index for all 3 columns (ie, 1 index that has all 3 columns). In general MySQL will not combine seperate indexes, so you need 1 index that covers all the coumns. And removing the sub query is likely to help a lot (it is having to perform that sub query for every row that it is trying to join). – Kickstart Jul 30 '19 at 08:19

2 Answers2

2

Indexes make a big difference,

But is probably possible to rewrite your query without a sub query.

If I read your SQL correctly, something like this would work.

SELECT k.deviceTime as stime,
        k.variableName as svar,
        k.alarmState as sstate,
        i.deviceTime,
        i.variableName,
        i.alarmState,
        timediff(i.deviceTime, k.deviceTime) as diff
FROM imports k
INNER JOIN imports i
ON i.variableName = k.variableName
AND i.deviceTime > k.deviceTime
AND t.alarmState ='NORMAL'
LEFT OUTER JOIN imports t
ON t.variableName = k.variableName
AND t.deviceTime > k.deviceTime
AND t.deviceTime < i.deviceTime
AND t.alarmState ='NORMAL'
WHERE k.alarmState = 'ACTIVE'
AND t.id IS NULL

This is doing a join to find future imports for the same variable with a time in the future, and an OUTER JOIN to find any that are between the 'Active' and the 'Normal' alarm times. Then where there are any found in the OUTER JOIN the resulting row is ignored.

Kickstart
  • 21,403
  • 2
  • 21
  • 33
  • Thank you for your query, but the performances aren't fine. The execute time is very long (like 30 minutes +). I don't know why. I also make indexes and my query work better but not yet – Cripsii Jul 25 '19 at 14:16
  • @Cripsii - what indexes have you added? A covering index over the feilds alarmstate, variablename and devicetime (in that order) will likely make a big difference. – Kickstart Jul 25 '19 at 16:07
0

Try creating a compound index on (alarmState, variableName, deviceTime).

Why? You have this WHERE clause

       where t.variableName = k.variableName
         and t.deviceTime > k.deviceTime
         and t.alarmState ='NORMAL'

This does equality matches on variableName and alarmState, and then a range match on deviceTime. The equality matches come first in an index, then the range.

And, try refactoring your dependent subquery to use MIN() instead of ORDER BY ... LIMIT 1 Like this:

                            select MIN(t.deviceTime)
                              from imports t
                              where t.variableName = k.variableName and
                                      t.deviceTime > k.deviceTime and
                                      t.alarmState ='NORMAL'

That can use a so-called loose index scan to find the next time.

Edit From your edit I see your deviceTime column is varchar(255). For what you're trying to do that is grossly inefficient. Can you use a TIMESTAMP or DATETIME data type instead?

O. Jones
  • 103,626
  • 17
  • 118
  • 172