MySQL - Get Last Entry by Time (or first if Null)

Question

I have the following SQL Server 2008 query:

SELECT T.*,Data.Value FROM [Table] T OUTER APPLY      

(SELECT TOP 1 E.Value FROM [Table2] E     
ORDER BY CASE WHEN T.TDateTime >= E.EDateTime then 1 else 2 end,
ABS(DateDiff(ss,T.TDateTime,E.EDatetime))) AS Data

This basically gets the Last E value for every record in T, but if the record in T is before the first record in E, then it gets the first record in E.

What is the equivalent in MySQL?

EDIT

Here is my schema and data:

DROP TABLE IF EXISTS `data`;
CREATE TABLE `data` (
  `ID` int(11) NOT NULL AUTO_INCREMENT,
  `DataDateTime` datetime DEFAULT NULL,
  `Value` int(11) DEFAULT NULL,
  PRIMARY KEY (`ID`)
) ENGINE=InnoDB AUTO_INCREMENT=9 DEFAULT CHARSET=utf8;


LOCK TABLES `data` WRITE;
INSERT INTO `data` VALUES (1,'2012-02-01 00:00:00',1),(2,'2012-03-01 01:00:00',2),(3,'2012-04-01 02:00:00',3),(4,'2012-05-01 03:00:00',4),(5,'2012-06-01 04:00:00',5),(6,'2012-07-01 05:00:00',6),(7,'2012-08-01 06:00:00',7),(8,'2012-09-01 07:00:00',8);
UNLOCK TABLES;


DROP TABLE IF EXISTS `t`;
CREATE TABLE `t` (
  `ID` int(11) NOT NULL AUTO_INCREMENT,
  `TDateTime` datetime DEFAULT NULL,
  PRIMARY KEY (`ID`)
) ENGINE=InnoDB AUTO_INCREMENT=9 DEFAULT CHARSET=utf8;

LOCK TABLES `t` WRITE;
INSERT INTO `t` VALUES (1,'2012-01-01 00:00:00'),(2,'2012-02-01 00:00:00'),(3,'2012-02-01 12:00:00'),(4,'2012-03-01 00:00:00'),(5,'2012-04-01 00:00:00'),(6,'2012-05-01 12:00:00'),(7,'2012-06-01 00:00:00'), (8,'2012-07-01 00:00:00');
UNLOCK TABLES;

SQLFiddle:

CREATE TABLE `data` (
  `ID` int(11) NOT NULL AUTO_INCREMENT,
  `DataDateTime` datetime DEFAULT NULL,
  `Value` int(11) DEFAULT NULL,
  PRIMARY KEY (`ID`)
);

INSERT INTO `data` (`DataDateTime`, `Value`) VALUES 
('2012-02-01 00:00:00',1),
('2012-03-01 01:00:00',2),
('2012-04-01 02:00:00',3),
('2012-05-01 03:00:00',4),
('2012-06-01 04:00:00',5),
('2012-07-01 05:00:00',6),
('2012-08-01 06:00:00',7),
('2012-09-01 07:00:00',8);


CREATE TABLE `t` (
  `ID` int(11) NOT NULL AUTO_INCREMENT,
  `TDateTime` datetime DEFAULT NULL,
  PRIMARY KEY (`ID`)
);

INSERT INTO `t` (`TDateTime`) VALUES 
('2012-01-01 00:00:00'),
('2012-02-01 00:00:00'),
('2012-02-01 12:00:00'),
('2012-03-01 00:00:00'),
('2012-04-01 00:00:00'),
('2012-05-01 12:00:00'),
('2012-06-01 00:00:00'),
('2012-07-01 00:00:00');

My Desired Output:

T.ID, T.TDateTime, Data.DataDateTime, Data.Value
1, 2012-01-01 00:00:00, 2012-02-01 00:00:00, 1
2, 2012-02-01 00:00:00, 2012-02-01 00:00:00, 1
3, 2012-02-01 12:00:00, 2012-02-01 00:00:00, 1
4, 2012-03-01 00:00:00, 2012-02-01 00:00:00, 1
5, 2012-04-01 00:00:00, 2012-03-01 01:00:00, 2
6, 2012-05-01 12:00:00, 2012-05-01 03:00:00, 4
7, 2012-06-01 00:00:00, 2012-05-01 03:00:00, 4
8, 2012-07-01 00:00:00, 2012-06-01 04:00:00, 5

OUTER APPLY/CROSS APPLY is not part of the SQL standard as far as I know(correct me if wrong). It is a M$ SQL server specific function. This is a typical symptom of vendor lock in. Hope you find a solution. — Namphibian, Oct 04 '12 at 10:01
Yes, I know! Hopefully there is another way. If this can be achieved using another method (JOIN etc) I am all ears. — Simon, Oct 04 '12 at 10:38
can you include schema for 'T' and 'E' above, or share a link on sqlfiddle to the above example — kasi, Oct 05 '12 at 02:35
Yes your right Andriy, modified - something simple in T-SQL is a bit more difficult in MySQL (but i guess could be vice versa in other situations) — Simon, Oct 05 '12 at 08:07

RichardTheKiwi · Answer 1 · 2012-10-07T05:56:56.943

Here's my submission :)

select *, if(Segment1Time<=ifnull(Segment2Time,Segment1Time),
             Segment1Value,
             Segment2Value) Value
from
(
  select *,
    (select DataDateTime from `data` where DataDateTime<=t.TDateTime order by DataDateTime desc limit 1) Segment1Time,
    (select Value from `data` where DataDateTime<=t.TDateTime order by DataDateTime desc limit 1) Segment1Value,
    (select DataDateTime from `data` where DataDateTime> t.TDateTime order by DataDateTime limit 1) Segment2Time,
    (select Value from `data` where DataDateTime> t.TDateTime order by DataDateTime limit 1) Segment2Value
  from `t` t
) X
order by tdatetime;

And here's the EXPLAIN for the query. The good thing about this query is that with an index on data.datadatetime, it is linear with the 4 subqueries all producing single SEEKs instead of having to go through all records to rank them. In theory, it should work better the larger the data table becomes.

ID  SELECT_TYPE TABLE   TYPE    POSSIBLE_KEYS   KEY KEY_LEN REF ROWS    EXTRA
1   PRIMARY <derived2>  ALL (null)  (null)  (null)  (null)  8   Using filesort
2   DERIVED t   ALL (null)  (null)  (null)  (null)  8   
6   DEPENDENT SUBQUERY  data    ALL DataDateTime    (null)  (null)  (null)  8   Using where; Using filesort
5   DEPENDENT SUBQUERY  data    index   DataDateTime    DataDateTime    9   (null)  1   Using where; Using index
4   DEPENDENT SUBQUERY  data    ALL DataDateTime    (null)  (null)  (null)  8   Using where; Using filesort
3   DEPENDENT SUBQUERY  data    index   DataDateTime    DataDateTime    9   (null)  1   Using where; Using index

+1 Thanks, I changed a few things around (> and < signs) but this has the advantage that its quicker and easily called from c# — Simon, Oct 07 '12 at 05:48
I obviously had trouble reading the t's and d's on the nondescript table names :) Fixed it - and thanks for the correction. — RichardTheKiwi, Oct 07 '12 at 05:57

score 2 · Answer 2 · answered Oct 07 '12 at 21:18

Took me a while to understand the requirements. In the end started analyzing your base query with the OUTER APPLY and changed it into this :

SELECT t.*, data.*
  FROM t
  JOIN (SELECT t_ID = t.ID, 
        data_ID = ISNULL((SELECT TOP 1 ID FROM data WHERE data.DataDateTime <= t.TDateTime ORDER BY DataDateTime DESC),
                         (SELECT TOP 1 ID FROM data WHERE data.DataDateTime > t.TDateTime ORDER BY DataDateTime ASC))
   FROM t) lnk
    ON lnk.t_ID = t.ID
  JOIN data
    ON data.ID = lnk.data_ID
  ORDER BY t.ID

Execution plan shows it as less efficient though, which kind of surprised me. However, adding an index on DataDateTime changed that dramatically something that might come in handy on your MSSQL version ?!

Anyway, starting from here I created this in MySQL :

SELECT t.*, data.*
  FROM t
  JOIN (SELECT t.ID t_ID, 
           COALESCE((SELECT ID FROM data WHERE data.DataDateTime <= t.TDateTime ORDER BY DataDateTime DESC LIMIT 1),
                           (SELECT ID FROM data WHERE data.DataDateTime > t.TDateTime ORDER BY DataDateTime ASC LIMIT 1)) data_ID
   FROM t) lnk
    ON lnk.t_ID = t.ID
  JOIN data
    ON data.ID = lnk.data_ID
  ORDER BY t.ID

Seems to do what it's expected to do...

Just realised that I missed the datediff in mine, but it could easily be added. With the dependency between the 2 segments expressed as coalesce here against the entire subquery, it would be more difficult, would it not? — RichardTheKiwi, Oct 08 '12 at 20:53
To be entirely honest, I'm still not certain I 'get' the /original/ query so I'm not sure if an extra DateDiff() is needed. I simply tried to figure out the needs based on the description and the expected output. — deroby, Oct 10 '12 at 10:29

score 1 · Answer 3 · answered Oct 05 '12 at 08:58

You could cross join t and data, then rank data rows for every t row based on (an equivalent of) the ORDER BY in your SQL Server query.

The DATEDIFF(ss, dt1, dt2) part could be replaced with UNIX_TIMESTAMP(dt2) - UNIX_TIMESTAMP(dt1). Ranking could be implemented using variables. Here's my attempt at a solution:

SELECT
  ID,
  TDateTime,
  DataDateTime,
  Value
FROM (
  SELECT
    ID,
    TDateTime,
    DataDateTime,
    Value,
    @rnk := @rnk * (@lastid = ID) + 1 AS rnk,
    @lastid := ID
  FROM (
    SELECT
      t.ID,
      t.TDateTime,
      data.DataDateTime,
      data.Value
    FROM
      t CROSS JOIN data,
      (SELECT @lastid := 0, @rnk := 0) s
    ORDER BY
      t.ID,
      (t.TDateTime >= data.DataDateTime) DESC,
      ABS(UNIX_TIMESTAMP(t.TDateTime) - UNIX_TIMESTAMP(data.DataDateTime))
  ) s
) s
WHERE
  rnk = 1
;

You can find a working demo at SQL Fiddle.

Thanks, this solution works, wish it was a little faster but correct nevertheless. — Simon, Oct 05 '12 at 10:39

score 1 · Answer 4 · answered Dec 16 '11 at 06:30

1

GROUP_CONCAT() ?

This link might help: http://www.youdidwhatwithtsql.com/comparing-tsql-cross-apply-mysql-groupconcat/280

answered Dec 16 '11 at 06:30

Unfortunately this wont help me to acheive my result. In some circumstances, GROUP_CONCAT would work, however – Simon Dec 16 '11 at 06:34

MySQL - Get Last Entry by Time (or first if Null)

4 Answers4