0

Input

id  lst_redeem_op_time
a  2020-07-22 11:14:37
a  2020-07-28 11:14:37
a  2020-07-30 14:55:04
b  2020-07-27 11:14:37
b  2020-07-28 11:14:37
b  2020-08-15 11:14:37
c  2020-07-27 11:14:37
c  2020-07-29 11:14:37
c  2020-07-31 11:14:37

Expected

id  lst_redeem_op_time
a  2020-07-22 11:14:37
a  2020-07-30 14:55:04
b  2020-07-27 11:14:37
b  2020-07-28 11:14:37
b  2020-08-15 11:14:37
c  2020-07-27 11:14:37
c  2020-07-31 11:14:37

For each id, delete records where the datediff between the date of lst_redeem_op_time and the date of latest time less than 2 days. The latest time is not '2020-08-15 11:14:37' but it's the max time for each id. (a:2020-07-30 14:55:04, b:2020-08-15 11:14:37, c:2020-07-31 11:14:37)

Try

I search these post1, post2 but still cannot get it.

version

  • 5.5
Jack
  • 1,724
  • 4
  • 18
  • 33

2 Answers2

1

With reference to the amended result set, I think you're after something like this:

DROP TABLE IF EXISTS my_table;

CREATE TABLE my_table
(id CHAR(1) NOT NULL
,lst_redeem_op_time DATETIME NOT NULL
,PRIMARY KEY (id,lst_redeem_op_time)
);

INSERT INTO my_table VALUES
('a','2020-07-22 11:14:37'),
('a','2020-07-28 11:14:37'),
('a','2020-07-30 14:55:04'),
('b','2020-07-27 11:14:37'),
('b','2020-07-28 11:14:37'),
('b','2020-08-15 11:14:37'),
('c','2020-07-27 11:14:37'),
('c','2020-07-29 11:14:37'),
('c','2020-07-31 11:14:37');

SELECT a.*  
  FROM my_table a
  LEFT
  JOIN
     ( SELECT id, MAX(lst_redeem_op_time) lst_redeem_op_time FROM my_table GROUP BY id) b
    ON b.id = a.id
   AND b.lst_redeem_op_time > a.lst_redeem_op_time
   AND DATEDIFF(b.lst_redeem_op_time,a.lst_redeem_op_time) <= 2
 WHERE b.id IS NULL;
+----+---------------------+
| id | lst_redeem_op_time  |
+----+---------------------+
| a  | 2020-07-22 11:14:37 |
| a  | 2020-07-30 14:55:04 |
| b  | 2020-07-27 11:14:37 |
| b  | 2020-07-28 11:14:37 |
| b  | 2020-08-15 11:14:37 |
| c  | 2020-07-27 11:14:37 |
| c  | 2020-07-31 11:14:37 |
+----+---------------------+

A DELETE for this might be as follows, but I would question the wisdom of deleting data from your database...

DELETE a.* -- .* is meaningless in this context, but MySQL allows it anyway
  FROM my_table a
  JOIN
     ( SELECT id, MAX(lst_redeem_op_time) lst_redeem_op_time FROM my_table GROUP BY id) b
    ON b.id = a.id
   AND b.lst_redeem_op_time > a.lst_redeem_op_time
   AND DATEDIFF(b.lst_redeem_op_time,a.lst_redeem_op_time) <= 2;
Strawberry
  • 33,750
  • 13
  • 40
  • 57
0

You have to use the DELETE with INNER JOIN Akina pointed out that the date calculation should be made in the subquery

CREATE TABLE table1 (
  `id` VARCHAR(1),
  `lst_redeem_op_time` DATETIME
);

INSERT INTO table1
  (`id`, `lst_redeem_op_time`)
VALUES
  ('a', '2020-07-22 11:14:37'),
  ('a', '2020-07-28 11:14:37'),
  ('a', '2020-07-30 14:55:04'),
  ('b', '2020-07-27 11:14:37'),
  ('b', '2020-07-28 11:14:37'),
  ('b', '2020-08-15 11:14:37'),
  ('c', '2020-07-27 11:14:37'),
  ('c', '2020-07-29 11:14:37'),
  ('c', '2020-07-31 11:14:37');
 SELECT 
 t1.`id`, `lst_redeem_op_time`
 FROM table1 t1 
 INNER JOIN       (SELECT `id`, MAX(DATE_ADD(`lst_redeem_op_time`, INTERVAL - 2 DAY)) max_time FROM table1 GROUP BY id) t2 ON t1.id = t2.id 
WHERE t1.`lst_redeem_op_time` < t2.max_time;
id | lst_redeem_op_time 
:- | :------------------
a  | 2020-07-22 11:14:37
a  | 2020-07-28 11:14:37
b  | 2020-07-27 11:14:37
b  | 2020-07-28 11:14:37
c  | 2020-07-27 11:14:37
DELETE t1
FROM table1 t1
        INNER JOIN
      (SELECT `id`, MAX(DATE_ADD(`lst_redeem_op_time`, INTERVAL - 2 DAY)) max_time FROM table1 GROUP BY id) t2 ON t1.id = t2.id 
WHERE t1.`lst_redeem_op_time` < t2.max_time;
SELECT * FROM table1
id | lst_redeem_op_time 
:- | :------------------
a  | 2020-07-30 14:55:04
b  | 2020-08-15 11:14:37
c  | 2020-07-29 11:14:37
c  | 2020-07-31 11:14:37

db<>fiddle here

even using DATEDIFF

The result is never the wanted result

CREATE TABLE table1 (
  `id` VARCHAR(1),
  `lst_redeem_op_time` DATETIME
);

INSERT INTO table1
  (`id`, `lst_redeem_op_time`)
VALUES
  ('a', '2020-07-22 11:14:37'),
  ('a', '2020-07-28 11:14:37'),
  ('a', '2020-07-30 14:55:04'),
  ('b', '2020-07-27 11:14:37'),
  ('b', '2020-07-28 11:14:37'),
  ('b', '2020-08-15 11:14:37'),
  ('c', '2020-07-27 11:14:37'),
  ('c', '2020-07-29 11:14:37'),
  ('c', '2020-07-31 11:14:37');
 SELECT 
 t1.`id`, `lst_redeem_op_time`
 FROM table1 t1 
 INNER JOIN       (SELECT `id`, MAX(`lst_redeem_op_time`) max_time FROM table1 GROUP BY id) t2 ON t1.id = t2.id 
WHERE DATEDIFF(t2.max_time,t1.`lst_redeem_op_time`)   > 2 ;
id | lst_redeem_op_time 
:- | :------------------
a  | 2020-07-22 11:14:37
b  | 2020-07-27 11:14:37
b  | 2020-07-28 11:14:37
c  | 2020-07-27 11:14:37
DELETE t1
FROM table1 t1
 INNER JOIN       (SELECT `id`, MAX(`lst_redeem_op_time`) max_time FROM table1 GROUP BY id) t2 ON t1.id = t2.id 
WHERE DATEDIFF(t2.max_time,t1.`lst_redeem_op_time`)   > 2 ;
SELECT * FROM table1
id | lst_redeem_op_time 
:- | :------------------
a  | 2020-07-28 11:14:37
a  | 2020-07-30 14:55:04
b  | 2020-08-15 11:14:37
c  | 2020-07-29 11:14:37
c  | 2020-07-31 11:14:37

db<>fiddle here

nbk
  • 45,398
  • 8
  • 30
  • 47
  • 2
    Shouldn't the results align with the question? – Strawberry Aug 17 '20 at 09:11
  • Perform DATEADD in the subquery - this will decrease the calculations amount... – Akina Aug 17 '20 at 09:11
  • You are both right, he probably wanted the last records, but that usn-ä#t hgere the question it stated 2 days, but for that he needs to use ROW:NUMBER by Partition – nbk Aug 17 '20 at 09:24