Add unique id to groups of ordered transactions

Question

I currently have a table with transactions that are sequentially ordered for each group like so:

| transaction_no | value |
|----------------|-------|
|              1 |     8 |
|              2 |   343 |
|              3 |    28 |
|              4 |   102 |
|              1 |    30 |
|              2 |     5 |
|              3 |   100 |
|              1 |    12 |
|              2 |    16 |
|              3 |    28 |
|              4 |   157 |
|              5 |   125 |

However I'm interested in add another column that assigns a unique ID to each grouping (set of transactions where the transaction_no starts with 1 and ends with x where the transaction_no immediately after x is 1). So the goal is a table like this:

| transaction_no | value | stmt_id |
|----------------|-------|---------|
|              1 |     8 |    1001 |
|              2 |   343 |    1001 |
|              3 |    28 |    1001 |
|              4 |   102 |    1001 |
|              1 |    30 |    1002 |
|              2 |     5 |    1002 |
|              3 |   100 |    1002 |
|              1 |    12 |    1003 |
|              2 |    16 |    1003 |
|              3 |    28 |    1003 |
|              4 |   157 |    1003 |
|              5 |   125 |    1003 |

How would I do this?

SQL tables represent *unordered* sets. Unless you have a column that specifies the ordering, you cannot do what you want. — Gordon Linoff, Oct 23 '19 at 22:54

GMB · Accepted Answer · 2019-10-23T23:12:42.003

This is a variation of the gaps-and-island problem. For it to be solvable, as commented by Gordon Linoff, you need a column that can be used to order the rows. I assume that such a column exists and is called id.

The typical solution involves ranking the records and performing a window sum. When the difference between the overal rank and the window sum changes, a new group starts.

Consider the following query:

select 
    id,
    transaction,
    value, 
    1000 
        + rn 
        - sum(case when transaction_no = lag_transaction_no + 1 then 1 else 0 end) 
            over(order by id) grp
from (
    select 
        t.*,
        row_number() over(order by id) rn,
        lag(transaction_no) over(order by id) lag_transaction_no
    from mytable t
) t

With this sample data:

id | transaction_no | value
-: | -------------: | ----:
 1 |              1 |     8
 2 |              2 |   343
 3 |              3 |    28
 4 |              4 |   102
 5 |              1 |    30
 6 |              2 |     5
 7 |              3 |   100
 8 |              1 |    12
 9 |              2 |    16
10 |              3 |    28
11 |              4 |   157
12 |              5 |   125

The query returns:

id | transaction_no | value |  grp
-: | -------------: | ----: | ---:
 1 |              1 |     8 | 1001
 2 |              2 |   343 | 1001
 3 |              3 |    28 | 1001
 4 |              4 |   102 | 1001
 5 |              1 |    30 | 1002
 6 |              2 |     5 | 1002
 7 |              3 |   100 | 1002
 8 |              1 |    12 | 1003
 9 |              2 |    16 | 1003
10 |              3 |    28 | 1003
11 |              4 |   157 | 1003
12 |              5 |   125 | 1003

Demo on SQL Server 2012 DB Fiddle

Add unique id to groups of ordered transactions

1 Answers1