
I have the following input table (y):

parameter1 parameter2
1 12
2 23
3 66
4 98
5 90
6 14
7 7
8 56
9 1

I would like to randomly assign the values A1 to A9, one to each row. The output table should look like the following:

parameter1 parameter2 parameter3
1 12 A5
2 23 A2
3 66 A4
4 98 A8
5 90 A3
6 14 A7
7 7 A1
8 56 A9
9 1 A6
n = 9

TGn = round(len(y)/n)
idx = set(y.index // TGn)

y = y.apply(lambda x: x.sample(frac=1,random_state=1234)).reset_index(drop=True)
    
treatment_groups = [f"A{i}" for i in range(1, n+1)]
y['groupAfterRandomization'] = (y.index // TGn).map(dict(zip(idx, treatment_groups)))

The first row is not filled: its value prints as NaN. How do I fix this?

pjs
MuSu18

1 Answer


Series.sample

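We can call `sample` with `frac=1` to shuffle the values of the column `parameter1`, then use `radd` to prepend the prefix `A` to each shuffled value. The trailing `.values` drops the shuffled index, so the labels are assigned by position instead of being realigned to their original rows.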

df['parameter3'] = df['parameter1'].sample(frac=1).astype(str).radd('A').values

   parameter1  parameter2 parameter3
0           1          12         A2
1           2          23         A8
2           3          66         A1
3           4          98         A4
4           5          90         A9
5           6          14         A3
6           7           7         A6
7           8          56         A7
8           9           1         A5
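If `parameter1` in the real data does not hold exactly the values 1 through 9 (for example, non-consecutive IDs or measurements), prefixing its values would give labels outside A1 to A9. A minimal sketch that builds the labels from the row count instead, assuming one label per row; the seed `1234`, the variable name `labels`, and the use of NumPy's `Generator.permutation` are illustrative choices and not part of the answer above:

```python
import numpy as np
import pandas as pd

# Rebuild the input table from the question.
df = pd.DataFrame({
    "parameter1": [1, 2, 3, 4, 5, 6, 7, 8, 9],
    "parameter2": [12, 23, 66, 98, 90, 14, 7, 56, 1],
})

# One label per row: A1 .. An, where n is the number of rows.
n = len(df)
labels = [f"A{i}" for i in range(1, n + 1)]

# Shuffle the labels; a seeded generator makes the result reproducible.
rng = np.random.default_rng(1234)
df["parameter3"] = rng.permutation(labels)

print(df)
```

Because the shuffled labels come back as a plain array, they are assigned by position, so no index alignment is involved and no row can end up as NaN.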
Shubham Sharma
  • Such an elegant solution. Just a quick question: I can't see the difference between the `add` and `radd` methods. – ashkangh Mar 23 '21 at 15:55
  • Thanks @ashkangh. To understand it better, let's consider an example: if you want to compute `series` + `some value`, you would normally use `add`, but if you want `some value` + `series`, you would use `radd`. – Shubham Sharma Mar 23 '21 at 16:00
  • Ohhhh. Awesome!! And please correct me if I'm wrong: both methods can also be applied to `strings`, and if so, they concatenate the respective values. Right? – ashkangh Mar 23 '21 at 16:11
  • Both `add` and `radd` are methods of the `pandas` `Series` object, so you need a `Series` object to use them. – Shubham Sharma Mar 23 '21 at 16:23
  • Thank you very much for the response! When I try to sample and concatenate with parameter1, I do not get parameter3 values between A1 and A9; instead it generates random numbers. How do I fix that? – MuSu18 Mar 23 '21 at 20:26
  • @MuSu18 Can you show the output that you are getting? – Shubham Sharma Mar 24 '21 at 12:58
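
The `add` versus `radd` distinction discussed in the comments above can be seen on a small Series of strings. This is only an illustrative sketch; the Series `s` and the names `suffixed` and `prefixed` are made up for the example:

```python
import pandas as pd

s = pd.Series(["1", "2", "3"])

suffixed = s.add("A")   # same as s + "A": the Series is the left operand
prefixed = s.radd("A")  # same as "A" + s: the Series is the right operand

print(suffixed.tolist())  # ['1A', '2A', '3A']
print(prefixed.tolist())  # ['A1', 'A2', 'A3']
```

For numbers the two give the same result because addition commutes; for strings the operand order decides whether `A` ends up as a suffix or a prefix.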