How to create dataframe by randomly selecting from another dataframe?

Question

DP 1      DP 2    DP 3     DP 4     DP 5     DP 6     DP 7     DP 8    DP 9    DP 10
(0.519)  (1.117)  (1.152)   0.772       1.490    (0.850)  (1.189)  (0.759)      
0.030    0.047     0.632   (0.608)     (0.322)   0.939     0.346    0.651       
1.290    (0.179)   0.006    0.850      (1.141)   0.758     0.682            
1.500    (1.228)   1.840   (1.594)     (0.282)   (0.907)                
(1.540)  0.689    (0.683)   0.005   0.543                   
(0.197)  (0.664)  (0.636)   0.878                       
(0.942)  0.764    (0.137)                           
0.693    1.647                              
0.197

I have above dataframe:

i need below dataframe using random value from above dataframe:

 DP 1       DP 2      DP 3    DP 4         DP 5     DP 6      DP 7     DP 8        DP 9   DP 10
     (0.664)    1.290    0.682    0.030      (0.683)  (0.636)    (0.683)   1.840     (1.540)    
     1.490     (0.907)   (0.850) (0.197)     (1.228)   0.682     1.290     0.939        
     0.047      0.682    0.346    0.689      (0.137)   1.490     0.197          
     0.047      0.878    0.651    0.047      0.047    (0.197)               
     (1.141)    0.758    0.878    1.490      0.651                  
     1.647      1.490    0.772    1.490                         
     (0.519)    0.693    0.346                          
     (0.137)    0.850                               
     0.197

I've tried this code :

df2= df1.sample(len(df1))

print(df2)

But Output is

     DP1       DP2       DP3       DP4       DP5       DP6       DP7       DP8  DP9
    OP8   0.735590  1.762630       NaN       NaN       NaN       NaN       NaN       NaN  NaN
    OP7  -0.999665  0.817949 -0.147698       NaN       NaN       NaN       NaN       NaN  NaN
    OP2   0.031430  0.049994  0.682040 -0.667445 -0.360034  1.089516  0.426642  0.916619  NaN
    OP3   1.368955 -0.191781  0.006623  0.932736 -1.277548  0.880056  0.841018       NaN  NaN
    OP1  -0.551065 -1.195305 -1.243199  0.847178  1.668630 -0.986300 -1.465904 -1.069986  NaN
    OP4   1.592201 -1.314628  1.985683 -1.749389 -0.315828 -1.052629       NaN       NaN  NaN
    OP6  -0.208647 -0.710424 -0.686654  0.963221       NaN       NaN       NaN       NaN  NaN
    OP10       NaN       NaN       NaN       NaN       NaN       NaN       NaN       NaN  NaN
    OP9   0.209244       NaN       NaN       NaN       NaN       NaN       NaN       NaN  NaN
    OP5  -1.635306  0.737937 -0.736907  0.005545  0.607974       NaN       NaN       NaN  NaN

i need to construct a dataframe that made of random selection of previous dataframe.And that should be in triangle only. — Devil, Apr 05 '21 at 11:33
What is dataframe? List of lists with decreasing size? Please be more specific or put some data — Michael Rovinsky, Apr 09 '21 at 05:33
I have a triangle dataframe and i would like to create another triangle dataframe by random selects of data from previous dataframe. I am getting the random data using sample() but its is not in the triangle shape. — Devil, Apr 09 '21 at 05:49

score 3 · Accepted Answer · answered Apr 09 '21 at 18:38

You can use np.random.choice() for the sampling.

Assuming df is something like this:

df = pd.DataFrame({'DP 1': ['(0.519)','0.030','1.290','1.500','(1.540)','(0.197)','(0.942)','0.693','0.197'],'DP 2': ['(1.117)','0.047','(0.179)','(1.228)','0.689','(0.664)','0.764','1.647',np.nan],'DP 3': ['(1.152)','0.632','0.006','1.840','(0.683)','(0.636)','(0.137)',np.nan,np.nan],'DP 4': ['0.772','(0.608)','0.850','(1.594)','0.005','0.878',np.nan,np.nan,np.nan],'DP 5': ['1.490','(0.322)','(1.141)','(0.282)','0.543',np.nan,np.nan,np.nan,np.nan],'DP 6': ['(0.850)','0.939','0.758','(0.907)',np.nan,np.nan,np.nan,np.nan,np.nan],'DP 7': ['(1.189)','0.346','0.682',np.nan,np.nan,np.nan,np.nan,np.nan,np.nan],'DP 8': ['(0.759)','0.651',np.nan,np.nan,np.nan,np.nan,np.nan,np.nan,np.nan],'DP 9': [np.nan,np.nan,np.nan,np.nan,np.nan,np.nan,np.nan,np.nan,np.nan],'DP 10': [np.nan,np.nan,np.nan,np.nan,np.nan,np.nan,np.nan,np.nan,np.nan]})

#       DP 1     DP 2     DP 3     DP 4     DP 5     DP 6     DP 7     DP 8     DP 9    DP 10
# 0  (0.519)  (1.117)  (1.152)    0.772    1.490  (0.850)  (1.189)  (0.759)      NaN      NaN
# 1    0.030    0.047    0.632  (0.608)  (0.322)    0.939    0.346    0.651      NaN      NaN
# 2    1.290  (0.179)    0.006    0.850  (1.141)    0.758    0.682      NaN      NaN      NaN
# 3    1.500  (1.228)    1.840  (1.594)  (0.282)  (0.907)      NaN      NaN      NaN      NaN
# 4  (1.540)    0.689  (0.683)    0.005    0.543      NaN      NaN      NaN      NaN      NaN
# 5  (0.197)  (0.664)  (0.636)    0.878      NaN      NaN      NaN      NaN      NaN      NaN
# 6  (0.942)    0.764  (0.137)      NaN      NaN      NaN      NaN      NaN      NaN      NaN
# 7    0.693    1.647      NaN      NaN      NaN      NaN      NaN      NaN      NaN      NaN
# 8    0.197      NaN      NaN      NaN      NaN      NaN      NaN      NaN      NaN      NaN

First extract the choices from all non-null values of df:

choices = df.values[~pd.isnull(df.values)]

# array(['(0.519)', '(1.117)', '(1.152)', '0.772', '1.490', '(0.850)',
#        '(1.189)', '(0.759)', '0.030', '0.047', '0.632', '(0.608)',
#        '(0.322)', '0.939', '0.346', '0.651', '1.290', '(0.179)', '0.006',
#        '0.850', '(1.141)', '0.758', '0.682', '1.500', '(1.228)', '1.840',
#        '(1.594)', '(0.282)', '(0.907)', '(1.540)', '0.689', '(0.683)',
#        '0.005', '0.543', '(0.197)', '(0.664)', '(0.636)', '0.878',
#        '(0.942)', '0.764', '(0.137)', '0.693', '1.647', '0.197'],
#       dtype=object)

Then take a np.random.choice() from choices for all non-null cells:

df = df.applymap(lambda x: np.random.choice(choices) if not pd.isnull(x) else x)

#       DP 1     DP 2     DP 3     DP 4     DP 5     DP 6     DP 7     DP 8     DP 9    DP 10
# 0  (0.179)    0.682    0.758  (1.152)  (0.137)  (1.152)    0.939  (0.759)      NaN      NaN
# 1    1.500  (1.152)  (0.197)    0.772    1.840    1.840    0.772  (0.850)      NaN      NaN
# 2    0.878    0.005  (1.540)    0.764  (0.519)    0.682  (1.152)      NaN      NaN      NaN
# 3    0.758  (0.137)    1.840    1.647    1.647  (0.942)      NaN      NaN      NaN      NaN
# 4    0.693  (0.683)  (0.759)    1.500  (0.197)      NaN      NaN      NaN      NaN      NaN
# 5    0.006  (0.137)    0.764  (1.117)      NaN      NaN      NaN      NaN      NaN      NaN
# 6  (0.664)    0.632  (1.141)      NaN      NaN      NaN      NaN      NaN      NaN      NaN
# 7    0.543  (0.664)      NaN      NaN      NaN      NaN      NaN      NaN      NaN      NaN
# 8  (0.137)      NaN      NaN      NaN      NaN      NaN      NaN      NaN      NaN      NaN

How to create dataframe by randomly selecting from another dataframe?

1 Answers1

Linked