0
UPDATE nas_backup
SET fiber_serviceability_class = '0', 
last_updated_ts = CURRENT_TIMESTAMP 
WHERE location_id IN ( 
SELECT location_id 
FROM ( 
WITH distinct_locs AS ( 
    SELECT location_id, boundary_type 
    FROM ( 
        SELECT location_id, boundary_type 
        FROM nc
        WHERE technology_type = 'Fibre' 
    ) 
    GROUP BY location_id, boundary_type 
    HAVING COUNT( * ) = 1
)
SELECT nas.location_id
FROM distinct_locs, nas_backup nas
WHERE distinct_locs.location_id = nas.location_id
AND distinct_locs.boundary_type = 'FSA'
GROUP BY nas.location_id
)
);

Can anyone suggest a way to optimize the query. It takes more than 5 minutes now.

Table nc has 16 million records and table nas_backup has 2 million records.

Peter Lang
  • 54,264
  • 27
  • 148
  • 161
  • You haven't mentioned any indexes, so adding those would be a good place to start. – Anthony Grist Jun 12 '14 at 10:10
  • Indexes are in place for the relevant columns. I just want a more optimized version of the query provided. – Wamglind Carmasaic Jun 12 '14 at 10:11
  • 2
    If you can provide the Actual execution plan we may be able to identify where the bottlenecks are ? – sarin Jun 12 '14 at 10:39
  • Just to be sure I'm understanding your goal: You want tu update all records in `nas_backup`where the location's technology type is 'Fibre' and boundary type is 'FSA'. Is there anything else to check? – Pred Jun 12 '14 at 11:09

2 Answers2

0

The EXISTS could help you a bit. Give it a try:

UPDATE nas_backup
SET fiber_serviceability_class = '0', 
last_updated_ts = CURRENT_TIMESTAMP 
-- Remove the IN condition and check only that at least one row exists
WHERE EXISTS ( 
SELECT location_id 
FROM ( 
    WITH distinct_locs AS ( 
        SELECT location_id, boundary_type 
        FROM ( 
            SELECT location_id, boundary_type 
            FROM nc
            WHERE technology_type = 'Fibre'
        ) 
        GROUP BY location_id, boundary_type 
        HAVING COUNT( * ) = 1
    )
    SELECT
        nas.location_id
    FROM
        distinct_locs
    WHERE
        distinct_locs.boundary_type = 'FSA'
        -- This is the new line
        AND distinct_locs.location_id = nas_backup.location_id
    GROUP BY
        nas.location_id
    )
);

However it would be easier to help if you share us your db's stucture and your goal.

Next time please share us the vendor and version of the DBMS you are using.

Pred
  • 8,789
  • 3
  • 26
  • 46
  • Thanks Pred. But I wonder why have you included the new line at the end. That condition is already included in the original query. – Wamglind Carmasaic Jun 12 '14 at 10:44
  • that line should refer to the table you are updating. This is why I included (and this is why i did not used the `nas` alias, but the table name. That condition links the subquery to the original table. – Pred Jun 12 '14 at 10:47
  • I've just edited the answer. I removed the 'join' from the last select. – Pred Jun 12 '14 at 10:48
0

Most of the things i say here should be done by a good optimizer anyway so this is mainly for reasons of better readability.

All the filters you have apply to the location part, so take it to the subquery to reduce the resultset:

SELECT location_id, boundary_type 
FROM nc 
WHERE technology_type = 'Fibre' AND nc.boundary_type='FSA'
GROUP BY location_id, boundary_type 
HAVING COUNT(*) = 1

Next, use JOIN syntax instead of implicit Join:

UPDATE nas fiber_serviceability_class = '0', last_updated_ts = CURRENT_TIMESTAMP 
FROM nas_backup nas
    JOIN (SELECT location_id, boundary_type 
            FROM nc 
            WHERE technology_type = 'Fibre' AND nc.boundary_type='FSA'
            GROUP BY location_id, boundary_type 
            HAVING COUNT(*) = 1) loc ON loc.location_id=nas.location_id

I don't know whether there is a special reason for the Subquery with the HAVING clause. Without knowing your structure and data it is only a guess but maybe you don't need that. It's a very simple query then which translates to "Update every nas_backup where technology type is Fibre and boundary type is FSA at the same time".

UPDATE nas fiber_serviceability_class = '0', last_updated_ts = CURRENT_TIMESTAMP 
FROM nas_backup nas
    JOIN nc ON nas.location_id=nc.location_id
WHERE nc.technology_type='Fibre'
    AND nc.boundary_type='FSA'
Martin K.
  • 1,050
  • 8
  • 19