2

I received sales data from a client to integrate it into his datawarehouse. Normally there are two columns to define a sales row - num_transcation and num_line_transaction. But I received a lot of rows with the same num_transcation and num_line_transaction so when I execute this query:

select * 
from 
    (select 
         ROW_NUMBER() over(partition by ll_vente_num_transaction, ll_vente_num_ligne_transaction 
                           order by ll_vente_num_transaction) rn, * 
     from my_table) t
where 
    rn > 2 

I get almost 43000 rows with the same identity (num_transcation and num_line_transaction) and the only thing that differs those rows is the sales date.

I need a way to update "num_line_transaction" to be able to load the data in the datawarhouse, otherwise it will load only one row of every "num_transcation" and "num_line_transaction" combination.

For example:

+-----------------+-----------------------+
| NUM_TRANSACTION | NUM_LIGNE_TRANSACTION |
+-----------------+-----------------------+
|       106969796 |               3148291 |
|       106969796 |               3148291 |
|       106969796 |               3148291 |
|       106969796 |               3148291 |
|       106969796 |               3148291 |
|       106969796 |               3148291 |
|       106969796 |               3148291 |
+-----------------+-----------------------+

I need to update those rows to become:

+-----------------+-------------------------+
| NUM_TRANSACTION |   NUM_LIGNE_TRANSACTION |
+-----------------+-------------------------+
|       106969796 |                 3148291 |
|       106969796 |                 3148292 |
|       106969796 |                 3148293 |
|       106969796 |                 3148294 |
|       106969796 |                 3148295 |
|       106969796 |                 3148296 |
|       106969796 |                 3148297 |
+-----------------+-------------------------+

Please, any idea will be appreciated. I'm stuck here.

Barbaros Özhan
  • 59,113
  • 10
  • 31
  • 55

1 Answers1

0

If you have a distinguishing (unique column such as sales_date as you mentioned ), the following statement will help :

with t2 as
(
select t.*,
       row_number() over (partition by num_transaction order by num_ligne_transaction)+
       num_ligne_transaction - 1
        as rn 
 from tab t
)
update t1
   set t1.num_ligne_transaction = t2.rn
  from tab t1
  join t2
    on t1.num_transaction = t1.num_transaction
   and t1.sales_date = t2.sales_date;

Demo

Barbaros Özhan
  • 59,113
  • 10
  • 31
  • 55