Dataframe is below
ID ParentID Filter Text
0 98 97 NULL AA
1 99 98 NULL BB
2 100 99 NULL CC
3 107 100 1 DD
4 9999 1231 NULL EE
5 10000 1334 NULL FF
6 10001 850 2 GG
7 850 230 NULL HH
8 230 121 NULL II
9 121 96 NULL JJ
10 96 0 NULL KK
11 97 0 NULL LL
I need to add an additional column hierarchy like this:
ID ParentID Filter Text Hierarchy
0 98 97 NULL AA
1 99 98 NULL BB
2 100 99 NULL CC
3 107 100 1 DD DD/CC/BB/AA/LL
4 9999 1231 NULL EE
5 10000 1334 NULL FF
6 10001 850 2 GG GG/HH/II/JJ/KK
7 850 230 NULL HH
8 230 121 NULL II
9 121 96 NULL JJ
10 96 0 NULL KK
11 97 0 NULL LL
The rules I am looking at are below:
Only populate hierarchy column for rows which have filter value populated, the rest of the rows don't need hierarchy done.
When a row is found having filter value not null, lookup its parentID, then search this parentid in ID column. When found reclusively keep going up till, parent id is 0.
Trying to do this with itertools but the looping is taking too long as the original dataset is huge
4)Recordset size is ~ 200k
The below solution provided kindly by mozway seems to work but for a recorset of 200k records, it takes a lot of time. Is there a tweak that can be done to this to get to the solution faster ?