1

I have the following current tables:

table_1
id | timestamp | origin | info

table_2
id | timestamp | origin | type

My aim is to find, for each line in table 2, the origin event in table 1. I want to keep only the first one. For instance:

table 1
1 | 1000 | "o1" | "i1"
2 | 2000 | "o2" | "i2"
3 | 2010 | "o2" | "i2"

table 2
1 | 1010 | "o1" | "t1"
2 | 2100 | "o2" | "t2"

My expected result is:

table_2.id | table_2.timestamp | table_2.origin | table_2.type | table_1.info | table_1.timestamp
1          | 1010              | "o1"           | "t1"         | "i1"         | 1000
2          | 2100              | "o2"           | "t2"         | "i2"         | 2010

Currently I'm just using a simple join on origin and table_2.timestamp > table_1.timestamp which give me:

table_2.id | table_2.timestamp | table_2.origin | table_2.type | table_1.info | table_1.timestamp
1          | 1010              | "o1"           | "t1"         | "i1"         | 1000
2          | 2100              | "o2"           | "t2"         | "i2"         | 2000
2          | 2100              | "o2"           | "t2"         | "i2"         | 2010

As you can see I don't want second line above because I just want first closest event in table_1.

Any ideas?

GMB
  • 216,147
  • 25
  • 84
  • 135
Ekans
  • 997
  • 1
  • 8
  • 16

1 Answers1

1

A cross-database solution is to join and filter with a correlated subquery:

select 
    t2.*,
    t1.info,
    t1.timestamp t1_timestamp
from 
    table_2 t2
    inner join table_1 t1
        on t1.origin = t2.origin
        and t1.timestamp = (
            select max(t11.timestamp) 
            from table_1 t11
            where t11.origin = t2.origin and t11.timestamp < t2.timestamp
        )
order by t2.id

Since you are using Postgres, you can use handy syntax distinct on; this might actually perform better:

select 
    distinct on(t2.id)
    t2.*,
    t1.info,
    t1.timestamp t1_timestamp
from 
    table_2 t2
    inner join table_1 t1 
        on t1.origin = t2.origin and t1.timestamp < t2.timestamp
order by t2.id, t1.timestamp desc

Demo on DB Fiddle - both queries yield:

id | timestamp | origin | type | info | t1_timestamp
-: | --------: | :----- | :--- | :--- | -----------:
 1 |      1010 | o1     | t1   | i1   |         1000
 2 |      2100 | o2     | t2   | i2   |         2010
GMB
  • 216,147
  • 25
  • 84
  • 135