5

I have the following table structure, for a table named listens with PRIMARYKEY on (uid,timestamp)

     Column     |            Type             |                      Modifiers                       
----------------+-----------------------------+------------------------------------------------------    
 id             | integer                     | not null default nextval('listens_id_seq'::regclass)
 uid            | character varying           | not null
 date           | timestamp without time zone | 
 timestamp      | integer                     | not null
 artist_msid    | uuid                        | 
 album_msid     | uuid                        | 
 recording_msid | uuid                        | 
 json           | character varying           | 

I need to remove all the entries for a particular user (uid) which are older than the max timestamp, say max is 123456789 (in seconds) and delta is 100000, then, all records older than max-100000.

I have managed to create a query when the table contains a single user but i am unable to formulate it to work for every user in the database. This operation needs to be done for every user in the database.

WITH max_table as (
    SELECT max(timestamp) - 10000 as max 
    FROM listens 
    GROUP BY uid) 
DELETE FROM listens 
WHERE timestamp < (SELECT max FROM max_table);

Any solutions?

Pinkesh Badjatiya
  • 346
  • 1
  • 5
  • 15

2 Answers2

13

I think all you need, is to make this a co-related subquery:

WITH max_table as (
    SELECT uid, max(timestamp) - 10000 as mx
    FROM listens 
    GROUP BY uid
) 
DELETE FROM listens 
WHERE timestamp < (SELECT mx
                   FROM max_table 
                   where max_table.uid = listens.uid);

Btw: timestamp is a horrible name for a column, especially one that doesn't contain a timestamp value. One reason is because it's also a keyword but more importantly it doesn't document what that column contains. A registration timestamp? An expiration timestamp? A last active timestamp?

  • `timestamp` contains the listening timestamp (actual one) of the record which is being inserted. The data is the listens data of every user. – Pinkesh Badjatiya May 14 '16 at 10:08
3

Alternatively, you could avoid the MAX() by using an EXISTS()

DELETE FROM listens d
WHERE EXISTS (
        SELECT * FROM listens x
        WHERE x.uid = d.uid
        AND x.timestamp >= d.timestamp + 10000 
        );  

BTW: timestamp is an ugly name for a column, since it is also a typename.

wildplasser
  • 43,142
  • 8
  • 66
  • 109