-1

How should I assign a unique ID (UID) to a group of rows and a different UID where there is a sudden jump in ID values. For example; assign "1" to ID values ranging from 122 to 125 and a UID of "2" where ID values range from 507 to 511 in the table below.

Name ID UID
T1 122
T1 123
T1 124
T1 125
T1 507
T1 508
T1 509
T1 510
T1 511

To be honest i don't know where to start from.

  • Are you using pandas or is this data in a nested list or a dictionary or something? – Ari Cooper-Davis Mar 28 '22 at 10:06
  • 2
    [Find a simpler problem!](https://ericlippert.com/2014/03/21/find-a-simpler-problem/) – mkrieger1 Mar 28 '22 at 10:10
  • From @surviving-grad (not enough rep. to comment): *Is this a pandas dataframe? When you say "assign "1" to ID values ranging from 122 to 125 and a UID of "2" where ID values range from 507 to 511", do you want to group the dataframe based on some condition? If yes, you can try using pandas [`groupby()`](https://pandas.pydata.org/docs/reference/api/pandas.DataFrame.groupby.html) function* – Serge Ballesta Mar 28 '22 at 10:14
  • you want something called Unique ID (UID) to have a value of 1 for several rows and a value of 2 for another several rows. then why you call it "Unique ID" it's it's not unique? That will be a very misleading name. Could you modify your question and sill the UID column with the values you want? –  Mar 28 '22 at 10:15

1 Answers1

0

You do not want an unique id, but just a group id, where a group is a set of rows with consecutive ID values. Assuming pandas, you could use shift() to identify consecutive values, and then cumsum() to automatically build a group id:

df['UID'] = (df['ID'] != df['ID'].shift() + 1).cumsum()

It gives as expected:

df
  Name   ID  UID
0   T1  122    1
1   T1  123    1
2   T1  124    1
3   T1  125    1
4   T1  507    2
5   T1  508    2
6   T1  509    2
7   T1  510    2
8   T1  511    2
Serge Ballesta
  • 143,923
  • 11
  • 122
  • 252