0

I have (relatively) basic access rights to an 11G datawarehouse. Some of the processing I do involves joining big tables to big tables with some calculations or adding additional columns to a base set of data. I generally bring data into my own schema for calculation but warehouse processing power is limited and some queries take a long time as most of the work involves affecting or joining data to all rows of a table (lots of full table scans on the explain plan!).

My question is if I have a reference number as a primary key on two tables, if I also use a hashed version of this reference number, will I see any performance increase?

If not, is there any other way to boost performance in these situations?

Ben
  • 51,770
  • 36
  • 127
  • 149
  • I don't understand the question, can you add an example with some code? – Jon Heller Jun 10 '13 at 23:20
  • No, you firstly have to perform the hash, which'll take time. Secondly, you'll end up a large number of clashes and lastly, you're already joining on a number, why would joining with another number speed anything up? – Ben Jun 10 '13 at 23:43

1 Answers1

1

If you are equijoining then you're probably getting a hash join anyway, and converting the join keys to hash values is implicit in that.

The best method for getting better equijoin performance between large tables is to hash partition both of them on the join key(s).

The partition-wise join gives performance benefits by avoiding the hash tables from spilling to disk if they get over a particular size, so you choose the number of partitions to try to keep the memory allocation on the hash join low.

David Aldridge
  • 51,479
  • 8
  • 68
  • 96