0

I have a table1:

u_a_id  d_c_s   c_nm    c_seq   r_c_p
1       908     Test1   1       20
10      908     Test1   1       21
11      908     Test1   1       12
12759   908     Test1   1       31
12759   908     Test1   1       32
12861   878     Test2   1       43
12861   878     Test2   1       44

I have SQL which finds the min r_c_p and max r_c_p which are repeated:

select u_a_id, d_c_s, c_nm, c_seq, count(*) cnt, min(r_c_p) min_rcp, max(r_c_p) max_rcp
from table1
where c_nm not in ('VOID', 'WRONG')
group by u_a_id, d_c_s, c_nm, c_seq
having count(*) > 1;

I need to update d_c_s and c_nm using output of above SQL as per following:

  • for min(r_c_p) -- d_c_s = d_c_s + 12000 and c_nm = c_nm || '- 1'
  • for max(r_c_p) -- d_c_s = d_c_s + 12000 and c_nm = c_nm || '- 2'

The output would be:

u_a_id  d_c_s   c_nm       c_seq    r_c_p
1       908     Test1      1        20
10      908     Test1      1        21
11      908     Test1      1        12
12759   12908   Test1 - 1  1        31
12759   12908   Test1 - 2  1        32
12861   12878   Test2 - 1  1        43
12861   12878   Test2 - 2  1        44

DB Fiddle - Link

dang
  • 2,342
  • 5
  • 44
  • 91

3 Answers3

1

You must join your query to the table and apply the conditions in CASE statements:

select 
  t.u_a_id,
  case when g.u_a_id is null then t.d_c_s else 12000 + t.d_c_s end d_c_s,  
  case 
    when t.r_c_p = g.min_rcp then t.c_nm || '-1'
    when t.r_c_p = g.max_rcp then t.c_nm || '-2'
    else t.c_nm
  end c_nm,
  t.c_seq, 
  t.r_c_p
from table1 t left join (
  select u_a_id, d_c_s, c_nm, c_seq, count(*) cnt, min(r_c_p) min_rcp, max(r_c_p) max_rcp
  from table1
  where c_nm not in ('VOID', 'WRONG')
  group by u_a_id, d_c_s, c_nm, c_seq
  having count(*) > 1
) g on 
  g.u_a_id = t.u_a_id and g.d_c_s = t.d_c_s and g.c_nm = t.c_nm and g.c_seq = t.c_seq
  and t.r_c_p in (g.min_rcp, g.max_rcp)

See the demo.
Results:

> U_A_ID | D_C_S | C_NM    | C_SEQ | R_C_P
> -----: | ----: | :------ | ----: | ----:
>      1 |   908 | Test1   |     1 |    20
>     10 |   908 | Test1   |     1 |    21
>     11 |   908 | Test1   |     1 |    12
>  12759 | 12908 | Test1-1 |     1 |    31
>  12759 | 12908 | Test1-2 |     1 |    32
>  12861 | 12908 | Test1-1 |     1 |    43
>  12861 | 12908 | Test1-2 |     1 |    44
forpas
  • 160,666
  • 10
  • 38
  • 76
  • It is taking a long time to execute. It's been 10 mins and still not completed. I have about 7.5 million records. Is there a faster query which can be used? – dang Jun 17 '19 at 10:19
  • *7.5 million records*!!! You understand that you need the query that returns min and max. Then join on multiple columns, conditions to check for each row and calculations and concatenations. The requirement is complicated, I can't see a faster way. – forpas Jun 17 '19 at 10:25
  • Is update/merge query possible? Instead of select query? – dang Jun 17 '19 at 10:33
  • Update is possible, but I the performance will be worse I think. – forpas Jun 17 '19 at 10:39
  • I have checked - out of 7.5 million, only 2409 records are to be updated. – dang Jun 17 '19 at 10:42
1

You can use following Merge statement. Not sure about performance. But you can try this and then we can think of any performance tuning if required.

MERGE INTO TABLE1 T1 
USING (
       SELECT
           U_A_ID,
           D_C_S,
           C_NM,
           C_SEQ,
           MIN(R_C_P) MIN_RCP,
           MAX(R_C_P) MAX_RCP
       FROM
           TABLE1
       WHERE
           C_NM NOT IN (
               'VOID',
               'WRONG'
           )
       GROUP BY
           U_A_ID,
           D_C_S,
           C_NM,
           C_SEQ
       HAVING
           COUNT(1) > 1
   )
T2 ON ( T1.U_A_ID = T2.U_A_ID
        AND T1.C_SEQ = T2.C_SEQ )
WHEN MATCHED THEN UPDATE SET T1.D_C_S = T1.D_C_S + 12000,
T1.C_NM = CASE
    WHEN T1.R_C_P = T2.MIN_RCP THEN T1.C_NM || '- 1'
    ELSE T1.C_NM || '- 2'
END
WHERE
    T1.D_C_S = T2.D_C_S
    AND T1.C_NM = T2.C_NM
    AND T1.U_A_ID = T2.U_A_ID

DB Fiddle demo

Cheers!!

Popeye
  • 35,427
  • 4
  • 10
  • 31
0

You can use merge:

merge into table1 t1
    using (select tt1.u_a_id, tt1.d_c_s, t1.c_nm, tt1.c_seq, 
                  min(tt1.r_c_p) as min_r_c_p,
                  max(tt1.r_c_p) as max_r_c_p
           from table1 tt1
           where tt1.c_nm not in ('VOID', 'WRONG')
           group by tt1.u_a_id, tt1.d_c_s, t1.c_nm, tt1.c_seq
           having count(*) >= 2
          ) tt1
          on (tt1.u_a_id = t1.u_a_id and
              tt1.d_c_s = t1.d_c_s and
              tt1.c_nm = t1.c_nm and
              tt1.c_seq = t1.c_seq and
              tt1.r_c_p in (min_r_c_p, max_r_c_p)
             )
    when matched then
        set d_c_s = d_c_s + 12000,
            c_nm = (case when tt1.r_c_p = min_r_c_p then c_nm || '- 1'
                         when tt1.r_c_p = max_r_c_p then 
c_nm || '- 2'
                         else tt1.r_c_p
                    end);

For performance, I would recommend an index on (u_a_id, d_c_s, c_nm, c_seq, r_c_p).

Gordon Linoff
  • 1,242,037
  • 58
  • 646
  • 786