Shuffling multiple columns together in a DataFrame

Question

I have a DataFrame like this:

'a'                   'b'    'c'    'd'               'e'  'f'
'hello.text'           1      2      'hello2.text'     2   10
'hello3.text'          5      8      'hello4.text'     8   15

Now I need to shuffle or randomize the 'a', 'b', 'c' columns together, something like this:

'a'                   'b'    'c'    'd'               'e'  'f'
'hello3.text'          5      8      'hello2.text'     2   10
'hello.text'           1      2      'hello4.text'     8   15

How can I do this?

score 2 · Accepted Answer · answered Aug 19 '19 at 08:35

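Use np.random.permutation with DataFrame.apply to process each column separately, because the columns hold different types of data. Note that this shuffles each column independently, so values that started in the same row are not kept together: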

import numpy as np

cols = ['a','b','c']
# permute each column on its own; every column gets its own random order
df[cols] = df[cols].apply(lambda x: np.random.permutation(x))
print (df)
               a  b  c              d  e   f
0   'hello.text'  5  2  'hello2.text'  2  10
1  'hello3.text'  1  8  'hello4.text'  8  15
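
If the goal is to keep each row's 'a', 'b', 'c' values together, one option is to draw a single permutation of row positions and apply it to the whole block. A minimal sketch, assuming the sample data from the question; shuffle_columns_together and its seed argument are hypothetical names used for illustration, not part of pandas:

```python
import numpy as np
import pandas as pd


def shuffle_columns_together(df, cols, seed=None):
    """Return a copy of df where the rows of `cols` share one random order."""
    rng = np.random.default_rng(seed)
    perm = rng.permutation(len(df))  # one row order, reused for every column in cols
    out = df.copy()
    # Reorder the block by position, then restore the original index labels
    # so the assignment does not align the rows back to where they started.
    out[cols] = df[cols].iloc[perm].set_axis(df.index, axis=0)
    return out


# Sample data taken from the question
df = pd.DataFrame({
    "a": ["hello.text", "hello3.text"],
    "b": [1, 5],
    "c": [2, 8],
    "d": ["hello2.text", "hello4.text"],
    "e": [2, 8],
    "f": [10, 15],
})

shuffled = shuffle_columns_together(df, ["a", "b", "c"], seed=0)
print(shuffled)
```

With only two rows the permutation may well come out as the identity, so the result can look unchanged on some seeds. Columns 'd', 'e', 'f' are never touched.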

score 0 · Answer 2 · answered Aug 19 '19 at 09:47

Does randomizing the 'a', 'b', 'c' columns together mean shuffling the rows of only these specific columns? If so, the following does what you need:

cols = ['a','b','c']
df[cols] = df[cols].sample(frac=1.0, random_state=0).reset_index(drop=True)
print(df)

            a  b  c            d  e   f
0  hello3.text  5  8  hello2.text  2  10
1  hello.text  1  2  hello4.text  8  15

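You can control the randomization using the random_state parameter: a fixed value gives the same order on every run, and omitting it gives a different order each time.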
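
To see what random_state and reset_index(drop=True) each contribute, here is a small sketch. The five-row frame is made-up data for illustration only, and it assumes df has the default 0..n-1 index:

```python
import pandas as pd

# Hypothetical data, larger than the question's sample so the shuffle is visible
df = pd.DataFrame({
    "a": ["f0", "f1", "f2", "f3", "f4"],
    "b": [0, 1, 2, 3, 4],
    "c": [10, 11, 12, 13, 14],
    "d": ["g0", "g1", "g2", "g3", "g4"],
})
cols = ["a", "b", "c"]

# Same seed, same order: the two shuffles are identical
first = df[cols].sample(frac=1.0, random_state=0).reset_index(drop=True)
second = df[cols].sample(frac=1.0, random_state=0).reset_index(drop=True)
same_order = first.equals(second)

# Without reset_index the shuffled rows keep their old index labels,
# so the assignment aligns them back and the frame ends up unchanged
no_reset = df.copy()
no_reset[cols] = df[cols].sample(frac=1.0, random_state=0)
realigned = no_reset.equals(df)

print(same_order, realigned)
```

Both flags should come out True: the seed makes the shuffle repeatable, and dropping the index is what lets the new row order survive the assignment.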
