0

I have a table t1 on which I run analytic functions. Please consider Netezza database. This table is intermediate table so it has no keys. It is used for ETL/ELT processing before loading data to final table t2.

Now , I want to assign row_number() to each row of t1. Table t1 has structure similar to following.

    group_id  varchar(50)

    file_id   varchar(50)

    rec_num   varchar(50)

    field_4   varchar(50)

    field_5   varchar(50)

    field_6   varchar(50)

    field_7   varchar(50)

    field_8   varchar(50)

unfortunately none of the fields listed above are unique. Their combination as a whole row is unique but individually none of them are.

I am running analytic function on table t1 repeatedly 7 times. If I do following then I don't get expected results.

  create table t3 as select group_id, file_id,rec_num. field_4, ,dense_rank() over ( order by field_4) r1, row_number() over (order by group_id) from t1 ;

  create table t4 as select group_id, file_id,rec_num. field_5, ,dense_rank() over ( order by field_5) r2, row_number() over (order by group_id) from t1 ;

In above queries there is no guarantee that row_number() assigned in first query (t3) will be exact same row_number() assigned when creating t4.

So my question is " What is best way to ensure that row get assigned exact same row_number no matter how many times you run query ( with changing of analytic function output) ? "

Hope I was able to express what I wanted to mention, if not please comment below and I will clarify.

Thank you in advance for taking the time to read, understand and answer.

Cheers

Vishal
  • 279
  • 3
  • 8
  • 18

1 Answers1

1

If you want the row_number to be deterministic (assuming that the underlying data does not change, of course), you'd need to specify an order by that produces a unique order. If you need every column in the table in order to produce a unique order, you'd need to use every column in the table. So something like

row_number() over (order by group_id,
                            file_id,
                            rec_num,
                            field_4,
                            field_5,
                            field_6,
                            field_7) rn       
Justin Cave
  • 227,342
  • 24
  • 367
  • 384
  • As a note: if the combinations are unique then `row_number()` is the same as `dense_rank()` (which is what the OP uses). – Gordon Linoff Aug 27 '14 at 21:38
  • Thank you both for your comment. After playing around quite a bit I have decided that I will have to create rank on original table itself before I start using any analytic function since no other easy way seem to provide result I want. Thank you again. – Vishal Sep 01 '14 at 04:59