0

I would like do a checksum against different views implementing different approach supposed to produce the same result (assuming each view contains no duplicates and have the exact same columns).

Questions

  1. What is the easiest way to compute a table checksum or hash ?
  2. Is a checksum the only way to compare in Oracle if 2 tables are identical ?

Note: Googling around I found some old answers such as this one.

Hey StackExchange
  • 2,057
  • 3
  • 19
  • 35
  • [Get the SHA-512 of any SQL query](https://stackoverflow.com/questions/50724104/get-the-sha-512-of-any-sql-query) - if you want to comapre if 2 queries are equivalent to each other then please read comment section – Lukasz Szozda Jun 14 '18 at 12:10
  • Thanks. Which comment section ? If on the link I provided, it is from 2005 – Hey StackExchange Jun 14 '18 at 12:20
  • I meant comments from link that I provided. Especially examples what could go wrong even if two queries return same resultset – Lukasz Szozda Jun 14 '18 at 12:23
  • OK. I have read but sounds heavy (e.g. DBMS_COMPARISON - http://download.oracle.com/docs/cd/B28359_01/appdev.111/b28419/d_comparison.htm). Was wondering if there was others approach such as computing the checksum of each rows, sorting these checksum row and doing a checksum of all them. If so, what is the function in Oracle doing a checksum ? – Hey StackExchange Jun 14 '18 at 12:26
  • I recommend blog https://stewashton.wordpress.com/2018/01/26/techniques-for-comparing-tables/ and search for techniques described there. – Lukasz Szozda Jun 14 '18 at 12:28
  • Thanks. Do you have an idea on how to do an `Group by (HASH)`. I would be interested knowing how this hash is calculated. The blog doesn't provide any example but performance comparison between Technics. – Hey StackExchange Jun 14 '18 at 12:34
  • The optimiser has two possible approaches to computing results of `group by` queries. I don't think think this is relevant to your question, though. – William Robertson Jun 14 '18 at 12:42
  • 1
    If you are looking for a hash function, you can use [`ora_hash`](https://docs.oracle.com/en/database/oracle/oracle-database/12.2/sqlrf/ORA_HASH.html). – William Robertson Jun 14 '18 at 12:56
  • Thanks! Will try this ora_hash function. Up voted – Hey StackExchange Jun 14 '18 at 12:59

1 Answers1

1

Is a checksum the only way to compare in Oracle if 2 tables are identical ?

No, since you have no duplicates, you can use the MINUS operator:

SELECT * FROM Table1
MINUS
SELECT * FROM Table2

Will return all the rows from Table1 that do not exist in Table2. If zero rows are returned then all the rows in Table1 also exist in Table2.

If you want to check the tables in both directions then:

SELECT *
FROM   (
  SELECT t1.*, 'T1' AS "WHERE" FROM Table1 t1
  MINUS
  SELECT t2.*, 'T1' FROM Table2 t2
)
UNION ALL
(
  SELECT t2.*, 'T2' FROM Table2 t2
  MINUS
  SELECT t1.*, 'T2' FROM Table1 t1
)

Again, if no rows are returned then the tables are identical.

If you are comparing tables with duplicates and want the same number of duplicate rows in each table then you can use something like:

SELECT Col1, Col2, Col3, /*...*/ ColN,
       ROW_NUMBER() OVER ( PARTITION BY Col1, Col2, Col3, /*...*/ ColN ORDER BY ROWNUM )
         AS rn
FROM   table1
MINUS
SELECT Col1, Col2, Col3, /*...*/ ColN,
       ROW_NUMBER() OVER ( PARTITION BY Col1, Col2, Col3, /*...*/ ColN ORDER BY ROWNUM )
         AS rn
FROM   table2
MT0
  • 143,790
  • 11
  • 59
  • 117
  • Thanks. Tested both and sounds to work. Just changed both direction with: SELECT * FROM ( SELECT VIEW_B2.*, 'T1' AS t FROM VIEW_B2 MINUS SELECT VIEW_B3.*, 'T1' FROM VIEW_B3 ) UNION ALL ( SELECT VIEW_B3.*, 'T2' as t FROM VIEW_B3 MINUS SELECT VIEW_B2.*, 'T2' FROM VIEW_B2 ); – Hey StackExchange Jun 14 '18 at 12:58
  • Does adding distinct in the select clauses allows these solutions to work in case of duplicates ? – Hey StackExchange Jun 14 '18 at 13:01
  • 1
    @HeyStackExchange Using `DISTINCT` (or even without using `DISTINCT`) would let you check whether there are rows in the first table that are not in the second table but not that you have the same number of duplicates in each table. I've updated with a method of checking that you have the same number of duplicate rows in each table. – MT0 Jun 14 '18 at 13:08
  • Thanks! Sound no needs checksum. Tested. Marked as correct and up voted! – Hey StackExchange Jun 14 '18 at 13:44