We are using the recommended method defined here for performing "upserts": http://docs.aws.amazon.com/redshift/latest/dg/merge-replacing-existing-rows.html
It is taking almost two minutes to load a file of just 150 rows. Almost all of this time is spent on this delete operation:
delete from measurement
using measurement_temp
where measurement.measurement_tag_id = measurement_temp.measurement_tag_id
  and measurement.date_time = measurement_temp.date_time;
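For context, that delete is one step of the documented merge pattern; my full load sequence looks roughly like this (simplified sketch, with the staging-table setup and COPY options elided):

```sql
begin transaction;

-- Stage the incoming file into the temp table (COPY options elided):
-- copy measurement_temp from 's3://<bucket>/<file>' ...;

-- Remove existing rows that the incoming batch replaces.
delete from measurement
using measurement_temp
where measurement.measurement_tag_id = measurement_temp.measurement_tag_id
  and measurement.date_time = measurement_temp.date_time;

-- Insert the replacement rows.
insert into measurement
select * from measurement_temp;

end transaction;
```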
Even if the temp table is empty, this operation still takes nearly two minutes to complete. It is just as slow after running a full VACUUM on all tables.
Both columns in the predicate are part of the compound sort key, and measurement_tag_id is the distribution key, so it's not clear to me why Redshift is taking so long.
The schema for this table looks like this:
create table measurement(
measurement_tag_id integer not null distkey,
date_time timestamp not null,
value_avg decimal(8,3),
value_min decimal(8,3),
value_max decimal(8,3),
value_std_dev decimal(8,3),
failed_qa_rule_id integer,
primary key(measurement_tag_id, date_time),
foreign key (measurement_tag_id) references measurement_tag(measurement_tag_id),
foreign key (failed_qa_rule_id) references qa_rule(qa_rule_id)
)
compound sortkey(measurement_tag_id, date_time);
Here is the query plan for the DELETE:
XN Hash Join DS_DIST_NONE (cost=9.30..3368957513.17 rows=451945 width=6)
Hash Cond: (("outer".date_time = "inner".date_time) AND ("outer".measurement_tag_id = "inner".measurement_tag_id))
-> XN Seq Scan on measurement (cost=0.00..26844282.88 rows=2684428288 width=18)
-> XN Hash (cost=6.20..6.20 rows=620 width=12)
-> XN Seq Scan on measurement_temp (cost=0.00..6.20 rows=620 width=12)
The equivalent SELECT returns almost instantly. Here's its query plan:
explain
select * from measurement as m
join measurement_temp as mt on m.measurement_tag_id = mt.measurement_tag_id and m.date_time = mt.date_time
XN Merge Join DS_DIST_NONE (cost=0.00..40266436.05 rows=451945 width=144)
Merge Cond: (("outer".measurement_tag_id = "inner".measurement_tag_id) AND ("outer".date_time = "inner".date_time))
-> XN Seq Scan on measurement m (cost=0.00..26844282.88 rows=2684428288 width=68)
-> XN Seq Scan on measurement_temp mt (cost=0.00..6.20 rows=620 width=76)
So the DELETE is performing a Hash Join, while the SELECT uses a Merge Join, which is much faster.
Any ideas how to speed this up? Most of the time there is nothing to delete (and the DELETE is still this slow), so I could run a SELECT first and skip the DELETE when it isn't needed, but it will still be slow whenever I re-load over existing data.
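One workaround I'm considering (untested sketch, following the general advice to constrain a merge DELETE with predicates on the sort key) is to limit the scan of measurement to the date_time range actually present in the temp table:

```sql
delete from measurement
using measurement_temp
where measurement.measurement_tag_id = measurement_temp.measurement_tag_id
  and measurement.date_time = measurement_temp.date_time
  -- Restrict the scan to the sort-key range covered by the incoming batch,
  -- so Redshift can skip blocks entirely outside that range.
  and measurement.date_time >= (select min(date_time) from measurement_temp)
  and measurement.date_time <= (select max(date_time) from measurement_temp);
```

As a side effect, when measurement_temp is empty the min/max subqueries return NULL, the range predicates match nothing, and the delete should be a near-instant no-op, which would also cover the common empty-batch case.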