1

I'm doing a query on oracle to update a token in a table to do some work on the marked rows.

My problem is that I want to limit the number of rows updated each time to a specific batch size and also to a group by result.

It is kinda hard for me to explain in abstract terms so I'll materialize it.

Say that I have a table called 'staging' with the following fields: (Id, IdFile Grouping, Name, Address1, Address2, Address3, Country, Token)

What I'm trying to do is create a query to do batches with this info.

For each batch, I update the token so that those records are marked to work later on. I want a batch by file, and each batch has a size limit, so i'm using the rownum to limit it. Another limitation I have is grouping by batch.

Let's say that my batch size is 5, and If I have 3 and only 3 rows with grouping column filled I have to create a batch with only that 3 rows.

This is my biggest pet peeve as when I'm doing the group by, I get as a result in just one batch results with and without grouping. (If I had a grouping, I had to see if the destiny address is the same for all rows and create a batch with it.)

Can you help me understand how I have to create this query?

What I'm using right now is:

update schema.staging set
token = 'token4'
where id in ( select t.id
               from (
    select stage.id, stage.grouping 
    from (select idFile 
            from (select a.idFile
                  from schema.staging a
                  inner join schema.config c on c.id = a.idcfgpriority
                  where nvl2(a.token,0,1) = 1
                  group by a.idFile, c.order
                  order by c.order desc) files 
            where rownum = 1 ) priorityFile
          inner join schema.staging stage on stage.idFile = priorityFile.idFile
        where nvl2(stage.token,0,1) = 1 
        group by stage.idFile, stage.token, stage.id, grouping sets ( (stage.grouping),
                                                                          (stage.name, stage.Address1,stage.Address2,stage.Address3,stage.Country))
        order by stage.grouping
    ) t
where rownum <= 5 )

For the following data:

 Id IdFile  Grouping    Name    Address1    Address2    Address3    Country     Token
 ====================================================================================
 1  1       null        Name1   Address1    Address1    Address1    Country     null
 2  1       1           Name1   Address1    Address1    Address1    Country     null
 3  1       1           Name1   Address1    Address1    Address1    Country     null
 4  1       1           Name1   Address1    Address1    Address1    Country     null
 5  1       2           Name1   Address1    Address1    Address1    Country     null
 6  1       2           LALAL   XPTO        Address1    Address1    Country     null
 7  1       null        Name1   Address1    Address1    Address1    Country     null
 8  1       null        Name1   Address1    Address1    Address1    Country     null
 9  1       null        Name1   Address1    Address1    Address1    Country     null
 10 2       null        Name1   Address1    Address1    Address1    Country     null
 11 2       null        Name1   Address1    Address1    Address1    Country     null
 12 2       null        Name1   Address1    Address1    Address1    Country     null
 13 2       null        Name1   Address1    Address1    Address1    Country     null
 14 2       null        Name1   Address1    Address1    Address1    Country     null
 15 2       null        Name1   Address1    Address1    Address1    Country     null

I would be expecting 6 batches.

batch#1: 1, 7, 8, 9
batch#2: 2, 3, 4
batch#3: 5
batch#4: 6
batch#5: 10, 11, 12, 13, 14
batch#6: 15

Sooo... Does anyone have any ideas?

Cheers

Peter Lang
  • 54,264
  • 27
  • 148
  • 161
Silva
  • 270
  • 4
  • 16

1 Answers1

1

I'm not sure that fully understand all requreiments in details, but based on your sample data and desired output following query will work:

SELECT dense_rank() over(ORDER BY "IdFile",
                                  "Grouping" nulls FIRST,
                                  "Name" desc,
                                  "Address1",
                                  "Address2",
                                  "Address3",
                                  "Country",
                                  "Token",
                                  row_num) batch_num,
       q.*
  FROM (SELECT trunc((row_number()
                      over(partition BY "IdFile",
                           "Grouping",
                           "Name",
                           "Address1",
                           "Address2",
                           "Address3",
                           "Country",
                           "Token" ORDER BY "Id") - 1) / 5) row_num,
               s.*
          FROM sample s) q
 ORDER BY "Id";

Result:

| BATCH_NUM | ROW_NUM | ID | IDFILE | GROUPING |  NAME | ADDRESS1 | ADDRESS2 | ADDRESS3 | COUNTRY |  TOKEN |
|-----------|---------|----|--------|----------|-------|----------|----------|----------|---------|--------|
|         1 |       0 |  1 |      1 |   (null) | Name1 | Address1 | Address1 | Address1 | Country | (null) |
|         2 |       0 |  2 |      1 |        1 | Name1 | Address1 | Address1 | Address1 | Country | (null) |
|         2 |       0 |  3 |      1 |        1 | Name1 | Address1 | Address1 | Address1 | Country | (null) |
|         2 |       0 |  4 |      1 |        1 | Name1 | Address1 | Address1 | Address1 | Country | (null) |
|         3 |       0 |  5 |      1 |        2 | Name1 | Address1 | Address1 | Address1 | Country | (null) |
|         4 |       0 |  6 |      1 |        2 | LALAL |     XPTO | Address1 | Address1 | Country | (null) |
|         1 |       0 |  7 |      1 |   (null) | Name1 | Address1 | Address1 | Address1 | Country | (null) |
|         1 |       0 |  8 |      1 |   (null) | Name1 | Address1 | Address1 | Address1 | Country | (null) |
|         1 |       0 |  9 |      1 |   (null) | Name1 | Address1 | Address1 | Address1 | Country | (null) |
|         5 |       0 | 10 |      2 |   (null) | Name1 | Address1 | Address1 | Address1 | Country | (null) |
|         5 |       0 | 11 |      2 |   (null) | Name1 | Address1 | Address1 | Address1 | Country | (null) |
|         5 |       0 | 12 |      2 |   (null) | Name1 | Address1 | Address1 | Address1 | Country | (null) |
|         5 |       0 | 13 |      2 |   (null) | Name1 | Address1 | Address1 | Address1 | Country | (null) |
|         5 |       0 | 14 |      2 |   (null) | Name1 | Address1 | Address1 | Address1 | Country | (null) |
|         6 |       1 | 15 |      2 |   (null) | Name1 | Address1 | Address1 | Address1 | Country | (null) |

SQL Fiddle.

I have played a little with ordering to fully simulate batch numbering you have provided. You can safely discard desc and nulls FIRST parts. Let me know if the query solves your problem.

Yaroslav Shabalin
  • 1,645
  • 3
  • 17
  • 29
  • I had to make some adjustments but your query has helped me to overcome my problem! Thanks! – Silva Mar 11 '14 at 17:12