1

I have data in a table that is log data for server activity. Here's what it looks like (the # column isn't part of the database or output, but there to be able to refer to that data in my Notes below):

# | DateStamp           | Server  
- | ------------------- | ---------  
1 | 2016-12-01 03:15:19 | Server 1
2 | 2016-12-01 03:17:19 | Server 2
3 | 2016-12-01 03:17:24 | Server 2
4 | 2016-12-01 03:18:01 | Server 1
5 | 2016-12-01 03:18:07 | Server 3
6 | 2016-12-01 04:01:03 | Server 3
7 | 2016-12-01 07:18:47 | Server 1
8 | 2016-12-01 07:19:23 | Server 1
9 | 2016-12-01 09:19:39 | Server 2
10| 2016-12-01 11:19:54 | Server 3

And I want to write a query that outputs:

# | Server   | Online              | Offline
- | -------- | ------------------- | -------------------
1 | Server 1 | 2016-12-01 03:15:19 | 2016-12-01 03:18:01
2 | Server 2 | 2016-12-01 03:17:19 | 2016-12-01 03:17:24
3 | Server 3 | 2016-12-01 03:18:07 | 2016-12-01 03:18:07
4 | Server 1 | 2016-12-01 07:18:47 | 2016-12-01 07:19:23
5 | Server 2 | 2016-12-01 09:19:39 | 2016-12-01 09:19:39
6 | Server 3 | 2016-12-01 11:19:54 | (still online)

Notes:

  • This is basically a tally of when these servers were "active" and for how long.
  • If the next server's activity is greater than an hour apart, it is considered a new session and gets a new line. (i.e. row 1 and row 4 of output, based on data rows 4 and 7 above)
  • To Clarify: Line 1 of output decides 03:18:01 is "offline" because the next entry for Server 1 (at 07:18:47 on line 7 of Data) is more than an hour later.
  • Line 5 of output shows offline because more than an hour has passed and no new entries for Server 2 have appeared

I would love to know how to query for this, and group my results based on the Output and Notes above. Let me know if you need more information to suggest a solution.

ekad
  • 14,436
  • 26
  • 44
  • 46
D LeG8
  • 11
  • 1

1 Answers1

0

1) First of all you should load your logs into MySQL DB:

# Optionally
#drop table if exists srv_logs;

create table srv_logs (
        `id` INT(10) NOT NULL AUTO_INCREMENT,
        `datetime` DATETIME ,
        `server` VARCHAR(300),
    PRIMARY KEY (`id`)
    ) ENGINE=InnoDB DEFAULT CHARSET=utf8;
LOAD DATA INFILE 'yourfile.log'
    INTO TABLE srv_logs
    CHARSET utf8
    FIELDS TERMINATED BY '|'  
    OPTIONALLY ENCLOSED BY '"'   
    LINES TERMINATED BY '\n'  IGNORE 2 LINES (
        `id`,
        `datetime`,
        `server`
    );

2) Create/fill init data your downtime table:

create table srv_downtime (
        `id` INT(10) NOT NULL AUTO_INCREMENT,
        `server` VARCHAR(300),
        `online` DATETIME ,
        `offline` DATETIME ,
    PRIMARY KEY (`id`)
    ) ENGINE=InnoDB DEFAULT CHARSET=utf8;

insert into  srv_downtime (`server`, `online`, `offline`)
SELECT l.server, MAX(l.datetime), null
FROM srv_logs l
left join srv_logs l2 
   on l.server = l2.server 
   and l.datetime > l2.datetime
   and TIMESTAMPDIFF(MINUTE,l2.datetime,l.datetime) < 60
where l2.id is null 
GROUP BY l.server

3) Repeatedly invoke this insert until no new lines will be added, it will add new lines to the bottom, (previous work period)

    insert into  srv_downtime (`server`, `online`, `offline`)
    (select a.server, min(l2.datetime), offline from
      (SELECT d.server, max(l.datetime) as offline
         FROM srv_downtime d
         left join srv_logs l 
            on l.server = d.server 
            and d.online > l.datetime
         group by l.server
      ) a
      left join srv_logs l2 
        on a.offline > l2.datetime 
        and l2.server = a.server
        and TIMESTAMPDIFF(MINUTE, l2.datetime, a.offline) < 60
      group by a.server
   )

So on example dataset after this 3 steps the result seems correct:

Server 1  | 2016-12-01 03:15:19 | 2016-12-01 03:18:01
Server 1  | 2016-12-01 07:18:47 | NULL
Server 2  | 2016-12-01 03:17:19 | 2016-12-01 03:17:24
Server 2  | 2016-12-01 09:19:39 | NULL
Server 3  | 2016-12-01 03:18:07 | 2016-12-01 04:01:03
Server 3  | 2016-12-01 11:19:54 | NULL
2oppin
  • 1,941
  • 20
  • 33
  • Thanks 2oppin! I tried the inner SELECT query in your Step 2, and the query never completed. Also tried it with the first 'ON' clause as "on l.server = l2.server". Still never completed. The srv_logs table has just over 1 million rows. – D LeG8 Dec 03 '16 at 01:46
  • @DLeG8, yupp, sorry for that, wasn't able to test it, so there were several mistakes in 2nd part (obvious like join on l.datetime = l.datetime), fixed this, and the 3rd part was maybe wrong by sence (it took first log record after downtime as offline), now replaced it with last record before server's on. Please check - should work. – 2oppin Dec 03 '16 at 08:57