2

I managed to write my query to get me correct data but to me it looks pretty bad since i had to use query inside query 3 times and even doe query performance is fine now around 700msec i am afraid it will slow down in future when there will be more data to process. Any info on how bad is this and how can i optimize it would be appreciated.

Edit:

I forgot to mention that tables s3 and s14 have multiple rows with same parcelno and i always need newest row from both tables(which is determined by sdate and stime). If newest row from s14 is newer than newest from s3, or newest row from s3 column emadr2 has same value as column parcelshop_id from table d, data is not shown. Keep in mind that these tables where not created by me and i am only reading data from them.

SELECT 
    q1.ddepot, 
    q1.parcelno, 
    q1.sdate, 
    q1.stime, 
    q1.dpostal, 
    q1.service, 
    q1.lorry,
    q1.zc5x3,
    q1.parcelshop_id,
    q1.country,
    q1.dname1
FROM(
    SELECT DISTINCT ON (q.parcelno) q.* FROM(
        SELECT 
            d.ddepot, 
            d.parcelno, 
            s3.sdate, 
            s3.stime, 
            d.dpostal, 
            d.service, 
            s3.lorry,
            s3.zc5x3,
            d.parcelshop_id,
            s3.country,
            d.dname1,
            s3.emadr1,
            s3.emadr2
        FROM dispatcher.detour_avis d
        LEFT JOIN scans.scandata03 s3 ON d.parcelno = s3.parcelno
        LEFT JOIN scans.scandata14 s14 ON d.parcelno = s14.parcelno 
        WHERE   
            d.ddate > (NOW() - interval '5 day') 
            AND d.parcelshop_id IS NOT NULL 
            AND s3.parcelno IS NOT NULL 
            AND (s14.parcelno IS NULL OR (s14.sdate + s14.stime)::timestamp without time zone < (s3.sdate + s3.stime)::timestamp without time zone)
        ORDER BY s3.sdate, s3.stime DESC
    )q 
    ORDER BY q.parcelno
) q1
WHERE q1.parcelshop_id != q1.emadr2

explain (analyze, verbose):

Subquery Scan on q1  (cost=68552.93..68554.90 rows=84 width=68) (actual time=701.318..701.324 rows=4 loops=1)
  Output: q1.ddepot, q1.parcelno, q1.sdate, q1.stime, q1.dpostal, q1.service, q1.lorry, q1.zc5x3, q1.parcelshop_id, q1.country, q1.dname1
  Filter: ((q1.parcelshop_id)::text <> (q1.emadr2)::text)
  Rows Removed by Filter: 2
  ->  Unique  (cost=68552.93..68553.85 rows=84 width=87) (actual time=701.310..701.314 rows=6 loops=1)
        Output: d.ddepot, d.parcelno, s3.sdate, s3.stime, d.dpostal, d.service, s3.lorry, s3.zc5x3, d.parcelshop_id, s3.country, d.dname1, s3.emadr1, s3.emadr2
        ->  Sort  (cost=68552.93..68553.39 rows=184 width=87) (actual time=701.309..701.311 rows=15 loops=1)
              Output: d.ddepot, d.parcelno, s3.sdate, s3.stime, d.dpostal, d.service, s3.lorry, s3.zc5x3, d.parcelshop_id, s3.country, d.dname1, s3.emadr1, s3.emadr2
              Sort Key: d.parcelno
              Sort Method: quicksort  Memory: 27kB
              ->  Sort  (cost=68543.71..68544.17 rows=184 width=87) (actual time=701.269..701.269 rows=15 loops=1)
                    Output: d.ddepot, d.parcelno, s3.sdate, s3.stime, d.dpostal, d.service, s3.lorry, s3.zc5x3, d.parcelshop_id, s3.country, d.dname1, s3.emadr1, s3.emadr2
                    Sort Key: s3.sdate, s3.stime
                    Sort Method: quicksort  Memory: 27kB
                    ->  Nested Loop  (cost=0.00..68536.79 rows=184 width=87) (actual time=689.775..701.238 rows=15 loops=1)
                          Output: d.ddepot, d.parcelno, s3.sdate, s3.stime, d.dpostal, d.service, s3.lorry, s3.zc5x3, d.parcelshop_id, s3.country, d.dname1, s3.emadr1, s3.emadr2
                          Join Filter: ((s14.parcelno IS NULL) OR ((s14.sdate + s14.stime) < (s3.sdate + s3.stime)))
                          Rows Removed by Join Filter: 16
                          ->  Nested Loop Left Join  (cost=0.00..57423.07 rows=455 width=74) (actual time=689.615..700.578 rows=14 loops=1)
                                Output: d.ddepot, d.parcelno, d.dpostal, d.service, d.parcelshop_id, d.dname1, s14.parcelno, s14.sdate, s14.stime
                                ->  Seq Scan on dispatcher.detour_avis d  (cost=0.00..49247.17 rows=455 width=47) (actual time=689.535..700.162 rows=11 loops=1)
                                      Output: d.id, d.parcelno, d.service, d.detour_type, d.ddepot, d.dname1, d.dname2, d.dstreet, d.dhouseno, d.dcountryn, d.dstate, d.dpostal, d.dcity, d.dphone, d.odepot, d.oname1, d.oname2, d.ostreet, d.ohouseno, d.ocoun (...)
                                      Filter: ((d.parcelshop_id IS NOT NULL) AND (d.ddate > (now() - '5 days'::interval)))
                                      Rows Removed by Filter: 985930
                                ->  Append  (cost=0.00..17.92 rows=5 width=33) (actual time=0.036..0.036 rows=1 loops=11)
                                      ->  Seq Scan on scans.scandata14 s14  (cost=0.00..0.00 rows=1 width=58) (actual time=0.000..0.000 rows=0 loops=11)
                                            Output: s14.parcelno, s14.sdate, s14.stime
                                            Filter: ((d.parcelno)::text = (s14.parcelno)::text)
                                      ->  Index Scan using scandata14_2013_pl_indx on scans.scandata14_2013 s14_1  (cost=0.14..0.25 rows=1 width=27) (actual time=0.001..0.001 rows=0 loops=11)
                                            Output: s14_1.parcelno, s14_1.sdate, s14_1.stime
                                            Index Cond: ((d.parcelno)::text = (s14_1.parcelno)::text)
                                      ->  Index Scan using scandata14_2014_pl_indx on scans.scandata14_2014 s14_2  (cost=0.29..4.29 rows=1 width=27) (actual time=0.007..0.007 rows=0 loops=11)
                                            Output: s14_2.parcelno, s14_2.sdate, s14_2.stime
                                            Index Cond: ((d.parcelno)::text = (s14_2.parcelno)::text)
                                      ->  Index Scan using scandata14_2015_pl_indx on scans.scandata14_2015 s14_3  (cost=0.42..6.47 rows=1 width=27) (actual time=0.010..0.010 rows=0 loops=11)
                                            Output: s14_3.parcelno, s14_3.sdate, s14_3.stime
                                            Index Cond: ((d.parcelno)::text = (s14_3.parcelno)::text)
                                      ->  Index Scan using scandata14_2016_pl_indx on scans.scandata14_2016 s14_4  (cost=0.42..6.91 rows=1 width=27) (actual time=0.014..0.015 rows=1 loops=11)
                                            Output: s14_4.parcelno, s14_4.sdate, s14_4.stime
                                            Index Cond: ((d.parcelno)::text = (s14_4.parcelno)::text)
                          ->  Append  (cost=0.00..24.34 rows=5 width=80) (actual time=0.044..0.045 rows=2 loops=14)
                                ->  Seq Scan on scans.scandata03 s3  (cost=0.00..0.00 rows=1 width=186) (actual time=0.000..0.000 rows=0 loops=14)
                                      Output: s3.sdate, s3.stime, s3.lorry, s3.zc5x3, s3.country, s3.emadr1, s3.emadr2, s3.parcelno
                                      Filter: ((s3.parcelno IS NOT NULL) AND ((d.parcelno)::text = (s3.parcelno)::text))
                                ->  Index Scan using scandata03_2013_pl_indx on scans.scandata03_2013 s3_1  (cost=0.14..0.26 rows=1 width=51) (actual time=0.001..0.001 rows=0 loops=14)
                                      Output: s3_1.sdate, s3_1.stime, s3_1.lorry, s3_1.zc5x3, s3_1.country, s3_1.emadr1, s3_1.emadr2, s3_1.parcelno
                                      Index Cond: (((s3_1.parcelno)::text = (d.parcelno)::text) AND (s3_1.parcelno IS NOT NULL))
                                ->  Index Scan using scandata03_2014_pl_indx on scans.scandata03_2014 s3_2  (cost=0.42..7.55 rows=1 width=53) (actual time=0.009..0.009 rows=0 loops=14)
                                      Output: s3_2.sdate, s3_2.stime, s3_2.lorry, s3_2.zc5x3, s3_2.country, s3_2.emadr1, s3_2.emadr2, s3_2.parcelno
                                      Index Cond: (((s3_2.parcelno)::text = (d.parcelno)::text) AND (s3_2.parcelno IS NOT NULL))
                                ->  Index Scan using scandata03_2015_pl_indx on scans.scandata03_2015 s3_3  (cost=0.42..8.21 rows=1 width=54) (actual time=0.013..0.013 rows=0 loops=14)
                                      Output: s3_3.sdate, s3_3.stime, s3_3.lorry, s3_3.zc5x3, s3_3.country, s3_3.emadr1, s3_3.emadr2, s3_3.parcelno
                                      Index Cond: (((s3_3.parcelno)::text = (d.parcelno)::text) AND (s3_3.parcelno IS NOT NULL))
                                ->  Index Scan using scandata03_2016_pl_indx on scans.scandata03_2016 s3_4  (cost=0.43..8.31 rows=1 width=55) (actual time=0.019..0.020 rows=2 loops=14)
                                      Output: s3_4.sdate, s3_4.stime, s3_4.lorry, s3_4.zc5x3, s3_4.country, s3_4.emadr1, s3_4.emadr2, s3_4.parcelno
                                      Index Cond: (((s3_4.parcelno)::text = (d.parcelno)::text) AND (s3_4.parcelno IS NOT NULL))
Planning time: 4.670 ms
Execution time: 701.550 ms
  • An EXPLAIN ANALYZE might be useful. – Jonnix Nov 09 '16 at 11:36
  • 2
    Please [edit] your question add the `create table` statements for the tables in question (including all indexes) and the execution plan generated using **`explain (analyze, verbose)`**. [_Formatted_](http://stackoverflow.com/editing-help#code) **text** please, [no screen shots](http://meta.stackoverflow.com/questions/285551/why-may-i-not-upload-images-of-code-on-so-when-asking-a-question/285557#285557) –  Nov 09 '16 at 11:37
  • Note: `AND s3.parcelno IS NOT NULL` will turn the LEFT JOIN into a plain join. – wildplasser Nov 09 '16 at 12:13
  • Note2: `ORDER BY s3.sdate, s3.stime DESC` looks wrong. Why not combine date+time into a timestamp? And, since you seem to be interested in *the most recent* `s3` record, why don't you select the most recent s3 record, instead of the (wrong) order by + distinct? – wildplasser Nov 09 '16 at 12:36
  • @wildplasser That `order by` is used for nothing and since he says the results are correct I suppose it is irrelevant. – Clodoaldo Neto Nov 09 '16 at 12:48
  • 1
    IMO that `ORDER BY` is used to dictate which detail record from s3 will show up in the `DISTINCT ON ...` My guess is that there is an 1:N relation between d and s3. – wildplasser Nov 09 '16 at 12:53
  • @wildplasser yes that is correct that's why i used ORDER BY. I added explain (analyze, verbose) and more text explanation in my original post. – Karlo Petravić Nov 09 '16 at 13:20
  • I repeat: you are *not* selecting the most recent s3 record. `ORDER BY s3.sdate, s3.stime DESC` will put the oldest date first, and the largest time first. At least you will need: `ORDER BY s3.sdate DESC, s3.stime DESC` – wildplasser Nov 09 '16 at 13:24
  • @wildplasser ok thank you – Karlo Petravić Nov 09 '16 at 13:30
  • What happens if you move `WHERE q1.parcelshop_id != q1.emadr2` inside the deepest `SELECT`? In the end it should look like `d.ddate > (NOW() - interval '5 day') AND ... AND d.parcelshop_id != s3.emadr2`. – pietrop Nov 09 '16 at 23:30
  • @pietrop that eliminates records from s3 and results in query not giving newest record from that table if newest record had d.parcelshop_id == s3.emadr2 so then it would join older record from s3 and that makes results invalid – Karlo Petravić Nov 10 '16 at 07:23

1 Answers1

1

It seems to me there are lots of unnecessary nesting. Check if this is functionally equivalent

select distinct on (d.parcelno) d.*
from
    dispatcher.detour_avis d
    inner join
    scans.scandata03 s3 on d.parcelno = s3.parcelno
    left join
    scans.scandata14 s14 on d.parcelno = s14.parcelno
where
    d.ddate > now() - interval '5 day'
    and d.parcelshop_id is not null and parcelshop_id != emadr2
    and (
        s14.parcelno is null or
        (s14.sdate + s14.stime)::timestamp < (s3.sdate + s3.stime)::timestamp
    )
order by d.parcelno

As you are doing a left join and putting in the where clause the s3.parcelno is not null condition which contains the right table column join condition you are really doing an inner join. So I just eliminated it from the where clause and turned the left into an inner join

Clodoaldo Neto
  • 118,695
  • 26
  • 233
  • 260