-4

I'm trying to renumber a column named Count, in a SQL Server 2019 table.

The layout is like this:

date Location Store Count
2023-02-01 AU Store1 1
2023-02-01 AU Store1 3
2023-02-01 AU Store1 4
2023-02-01 UK Store1 2
2023-02-01 UK Store1 4
2023-02-01 UK Store1 5

I am trying to renumber the Count column, so, that for each distinct Date, Location and Store, the numbers for the Count column are contiguous, for that given Date, Location and Store.

For example, for the 2023-02-01|AU|Store1, the count should be changed from 1,3,4 to 1,2,3.

For the UK store for the same date should be changed from 2,4,5 to 1,2,3.

The count should always begin with one, for the first row for a given Date, Location and Store.

I thought it would be relatively easy, but my head is hurting.

Any suggestions greatly appreciated.

marc_s
  • 732,580
  • 175
  • 1,330
  • 1,459
joshAU
  • 39
  • 8
  • 5
    [`ROW_NUMBER`](https://learn.microsoft.com/en-us/sql/t-sql/functions/row-number-transact-sql?view=sql-server-ver16) – Thom A Aug 14 '23 at 13:23
  • Do you want to have to update this column any time a row is changed or deleted? Why have the column at all? You can always calculate this _only when you need it_. – Stuck at 1337 Aug 14 '23 at 13:34
  • Thanks at @ThomA, I'm having a look into row_number now... hmm, looks good, just trying to figure out how to update Count, rather than adding a new column. – joshAU Aug 14 '23 at 13:40
  • @Stuckat1337 This table is a temporary table I run once. Once the table is generated, I then delete rows that match certain criteria, hence why the numbering is no longer contiguous. I'm just trying to remake it contiguous. Once it is contiguous, I then process it and discard it, as I no longer need it. – joshAU Aug 14 '23 at 13:43
  • 1
    As @Stuckat1337 suggests, JoshAU, you would be better off calculating this as/when you need it. Otherwise when you `UPDATE`/`DELETE`/`INSERT` rows you will might need to `UPDATE` *all* rows for the affected `Store`s and `Location`s. If you want to be able to reference it easily then you can `CREATE` a `VIEW` where you define a column using `ROW_NUMBER` for the expression. – Thom A Aug 14 '23 at 13:43
  • Same arguments apply for a temp table, though. Why _store_ a value that anything reading from the temp table can calculate _when_ it reads the temp table? Updating a column after every operation is wasteful work. – Stuck at 1337 Aug 14 '23 at 13:46
  • If you are inserting it into a temporary table, why are you not generating the value you need at `INSERT`? Why `INSERT` and then `UPDATE` the value of `Count`, when you could just `INSERT` the correct value to *start* with. – Thom A Aug 14 '23 at 13:47
  • @ThomA Thanks again, and to Stuckat1337. This is a temporary table that is created briefly to try to test some machine learning algorithms. It has a fairly limited number of rows, that will not be updated ever. I do however delete some erroneous rows from the list as part of the data preparation, which makes the count column non-contiguous. I am, just wanting to correct that prior to moving on to the next bit of code, which promptly drops the entire table in question. – joshAU Aug 14 '23 at 13:48
  • I think you're still missing the point: the next bit of code can easily calculate the row numbers without having to update the table first (and without inserting any initial values that are wrong if you perform any deletes and you just want to update later anyway). – Stuck at 1337 Aug 14 '23 at 13:50
  • Sigh.... @Stuckat1337. I store a value in a temp table, because I am reading it as a select statement from another main table. To clarify... I am running a ML query, which accesses a database of 7million rows. It selects matching data, and puts this limited data into a temporary table. For each given Date, Location and store, the COUNT value is contiguous. However, there are multiple rows, across different dates, locations and stores, that for one reason or the other, are unsuitable or erroneous, so the earlier code deletes them from the temp table. – joshAU Aug 14 '23 at 13:53
  • That then leaves the temp table with non-contiguous numbering in the count column, which I would like to rectify. Does that help to clarify it? Sorry if I am not being clear. – joshAU Aug 14 '23 at 13:54
  • So, again, why not `INSERT` the data with the correct value for `Count` at that point of *`INSERT`ing* it? – Thom A Aug 14 '23 at 13:58
  • 2
    https://dbfiddle.uk/agpR_nvg - it's clear that you _want_ to update the count column, and there's no confusion about what you want. What we're trying to explain to you is that you don't actually _need_ what you want, and it is far more efficient to _not_ bother storing or updating a column that seems _designed_ to change anyway. – Stuck at 1337 Aug 14 '23 at 14:02
  • Within each date/location/store group, how are you determining the counting order? If you originally had rows numbered 1,2,3 and delete #2, do you need to preserve the original sequence eg *not* row #3 becomes #1 and row #1 becomes row #2 – Stu Aug 14 '23 at 14:05
  • Sorry to both of you if I sound frustrated, and am not clear....Like I said, this is making my head hurt. I do appreciate your efforts and advice, believe me. – joshAU Aug 14 '23 at 14:06
  • @ThomA - I cannot insert the data with the correct value for Count at the point of inserting it as it is contiguous when I insert it from the main table. However, the code in question, after the insert into the temp table, is an attempt at improving the ML accuracy, and therefore may delete some rows in that temp table, for the particular attempt in question. – joshAU Aug 14 '23 at 14:10
  • @Stuckat1337 But... lol. but I do need to update the count, at this point in my code....as that is the value I am trying to predict in the code that follows this, immediately after... However, due to some data not being suitable for the current prediction method, I delete them, and hence why I want to retain the ordering, but make it contiguous. – joshAU Aug 14 '23 at 14:15
  • Re your query @Stu, if I have 1,2,3 for a specific date/location/store group, and I delete 2, then I want row 3 to become row 2, so yes, I want to preserve the original sequence. – joshAU Aug 14 '23 at 14:17
  • You could create the correct count if you removed the incorrect rows at the point of insert, rather than inserting then deleting though. As this is too long for a comment, I've put together a simple example of how this *could* work https://dbfiddle.uk/55T-fVU6 – GarethD Aug 14 '23 at 14:20
  • Thank you all... and my brain is about to explode... I will have a re-read of all this tomorrow, as its after midnight here in Tassie. I appreciate all the inputs and advice. – joshAU Aug 14 '23 at 14:33
  • @Stuckat1337 I'll have a look at the dbfiddle link suggestion, thank you. – joshAU Aug 14 '23 at 14:34
  • @GarethD thanks also for you advice, I'll have a look at it and try to make sense of it. Thank you. – joshAU Aug 14 '23 at 14:36

1 Answers1

1

Just throwing this out there as what perhaps what you think you want:

update n set Count = Newcount
from (
  select *, Row_Number() over(partition by date, location, store order by count) Newcount
  from mytemptable
)n;

I think you should be implementing a view to include the count column instead of having it as a column in your table, then querying the view will always have the correct count.

create view Myview as
select *, Row_Number() over(partition by date, location, store order by <your ordering criteria>) Count
from mytemptable;

Then just use the view instead of mytemptable;

Stu
  • 30,392
  • 6
  • 14
  • 33