-1

While working on DB2, I have 2 tables:

TABLE_A >

2 columns: CLAIM_ID, CODEID_A

TABLE_B >

2 columns: CLAIM_ID, CODEID_B

The goal is to write a DB2 Update statement that would check if CLAIM_ID of TABLE_A is the same as CLAIM_ID of TABLE_B, then replace CODEID_A (of that CLAIM_ID of TABLE_A) with CODEID_B (of that CLAIM_ID of TABLE_B).

TABLE_A and TABLE_B both have around 400 million rows. VSch is the schema name.

Found that correlated subqueries should work:

UPDATE MY_SCHEMA.TABLE_A apc SET CODEID_A = (SELECT TABLE_B.CODEID_B FROM SCHEMA_2.TABLE_B TABLE_B 
  WHERE TABLE_B.CLAIM_ID = apc.CLAIM_ID)
  WHERE EXISTS (SELECT 1 FROM SCHEMA_2.TABLE_B TABLE_B WHERE (TABLE_B.CLAIM_ID = apc.CLAIM_ID));

Here is the DETAILS of the Column structure and indexes in both tables - TableA and TableB: https://codeshare.io/armxAv

My question is, if we use Ingest instead of Update here, would that be faster?

If we go via DB2 Ingest route, we can first DB2 extract the list of values to be updated from TABLE_B into a FLAT file and then do a DB2 INGEST from that FLAT file into TABLE_A.


Would that be an efficient way of doing that in terms of time complexity? If yes, can someone help me with the Ingest query for the same?

P_user526
  • 65
  • 3
  • 11
  • You may find the creation and reading of the flat file will take longer than just doing it with sql. – Steve Jan 04 '21 at 21:21
  • Edit your question to improve its precision. Specifically state whether TableA or TableB have indexes, on which columns, and whether those indexes are unique. If there are no unique indexes, state what should happen then the same ClaimsId exists in both tables, but that ClaimsID has more than one different CodeId in Table-B. In almost __all__ cases, it is much faster to do things inside the database than to extract and ingest. But you have to __fully specify__ the possibilities of the data to get the query correct. Unlogged session tables can be very helpful for data marshalling. – mao Jan 05 '21 at 08:49
  • Hey @mao, can you please explain on "Unlogged session tables" concept in this case? – P_user526 Jan 05 '21 at 16:30
  • Edit your question to fully and clearly answer the clarifications I mention above. – mao Jan 05 '21 at 16:35
  • did the TableA is a row organized tables or a column organized tables ? there is a limitation in the update for column organized table, the update uses index only if it bring a uniq row. https://www.ibm.com/support/knowledgecenter/SSEPGG_11.1.0/com.ibm.db2.luw.admin.perf.doc/doc/r0061828.html (Update and delete operations that use an index scan on a column-organized table are not supported by the FETCH operator. Update and delete operations that affect only a single row are supported by using either index-only access or the nested-loop fetch approach.) – mshabou Jan 07 '21 at 14:08
  • @mao I added details and here are the columns/index details: https://codeshare.io/armxAv – P_user526 Jan 27 '21 at 23:40
  • @mshabou How can I verify this? I'm fairly new to DB2 – P_user526 Jan 27 '21 at 23:44
  • db2look -nofed -d -z -t -e – mshabou Jan 28 '21 at 15:32
  • @mshabou I'm not able to execute the db2look command in the IntelliJ DB2 console. I get the error " expected, got 'db2look'". How can I still get the info? – P_user526 Jan 28 '21 at 20:02
  • @mshabou I have uploaded Tables info in here https://codeshare.io/armxAv If this is helpful. – P_user526 Jan 28 '21 at 22:48
  • you need to run db2look from the cmd (shell) – mshabou Jan 28 '21 at 22:59

1 Answers1

1

You can also consider using the MERGE statement

MERGE INTO tablea a 
   USING (SELECT claimsid_b, codeid_b FROM tableb) b 
   ON (a.claimsid_a = b.claimsid_b) 
   WHEN MATCHED THEN  
     UPDATE SET 
        a.codeid_a = b.codeid_b 

PS. not tested, typed directly here.. but you got the idea...

Samuel Pizarro
  • 267
  • 1
  • 8
  • In terms of effectiveness (time complexity), which one (amongst Merge, Update or Ingest) would be better for this case scenario? – P_user526 Jan 04 '21 at 22:39
  • This is not really an answer to the question. They asked if ingest would be more efficient than using straight sql. – Steve Jan 04 '21 at 22:43