1

I have two tables Table A and Table B. Both table have million rows and a column - column C where data type is "varchar2(3000)".

Following is the schema:

Create Table TableA
ID number(20),
C Varchar2(3000)

Create Table TableB
ID number(20),
C Varchar2(3000),
C_HARSH Varchar2(100)

I only have read access to Table A. However, I am the owner and have full access to table B. Both tables have million rows

I need to make a comparison on the column C from both table.

Would it be faster if I firstly add a new column to Table B to store the hash value of the column C from table and compare the hash value of column C from Table A than simply compare long strings?

IE Would

B.C_Harsh = ora_hash(A.C,99,5)

Faster than

B.C = A.C

In my scenario?

-------------- Added example as requested --------------------

Following is one of the sample data in Column C

Small Changes

Following is one of the sample data in Column C(Small Changes)

Big Changes enter image description here

E. L.
  • 502
  • 3
  • 16
  • 1
    Describe your full scenario - do you compare only one or few rows or the full table? The *intuitive answer* is for few rows you will see no difference. You will spot a difference if *both tables have hash code* and *one of the tables is acces via DB link*. For all cases in between the *basic question* is **why don't you benchmark it?** – Marmite Bomber Dec 16 '19 at 08:47
  • 1
    As MB said, benchmark if with your data and see. One thing to keep in mind is that the 2 strings may hash to the same value without being equal, so you can use the hash comparison to reduce the number of actual string comparisons you do: select * from (select a.*, b.* from a, b where ora_hash(a.X)=ora_hash(b.Y)) where X=Y – gsalem Dec 16 '19 at 08:56
  • @MarmiteBomber, We will have million of rows to compare in a daily base, and have added example as per your request. – E. L. Dec 16 '19 at 22:52
  • However, the lucky thing is we can assume that the ID between two tables is the primary key of the table and we are only comparing rows between the two table with the same ID. – E. L. Dec 16 '19 at 23:01
  • @gsalem, Thanks for your reminder. It is very important. – E. L. Dec 16 '19 at 23:05
  • @gsalem, I just learn from my other post that we can use standard_hash instead of ora_hash to generate a unique hash code. However, bear in mind that the result is in raw format. – E. L. Mar 18 '20 at 23:44

0 Answers0