Combing Columns after Transposing Columns Pandas Dataframes

Question

Suppose I have a set of data frames

df1 is

   ID       C1
0  0  0.000000
1  1  0.538516
2  2  0.509902
3  3  0.648074
4  4  0.141421

df2 is

  ID        C1
0  0  0.538516
1  1  0.000000
2  2  0.300000
3  3  0.331662
4  4  0.608276

and df3 is

  ID        C1
0  0  0.509902
1  1  0.300000
2  2  0.000000
3  3  0.244949
4  4  0.509902

I then go ahead and transpose these three data frames.

df1 = df1.T
df2 = df2.T
df3 = df3.T

Now the data frames are :

df1 is

          0         1         2         3         4
ID        0         1         2         3         4
C1        0  0.538516  0.509902  0.648074  0.141421

df2 is :

                 0         1    2         3         4
ID               0         1    2         3         4
C1               0.538516  0  0.3  0.331662  0.608276

and df3 is :

                 0    1    2         3         4
ID               0    1    2         3         4
C1          0.509902  0.3  0  0.244949  0.509902

Can I somehow combine all data frames to have

0            1         2         3         4
0          0.538516  0.509902  0.648074  0.141421
0.538516     0        0.3       0.331662  0.608276
0.509902     0.3      0         0.244949  0.509902

And then sort rows individually , so that each row in the resulting data frame is sorted ?

For instance the data frame with sorted rows would be

0  0.141421  0.509902  0.538516  0.648074
0  0.3       0.331662  0.538516  0.608276
0  0.244949  0.3       0.509902  0.509902

I'm having problems with concat since I've transposed the data frames.

All help is appreciated

MaxU - stand with Ukraine · Accepted Answer · 2017-01-29T21:37:46.077

2

UPDATE: sorted data set and without transposing:

In [64]: pd.DataFrame([np.sort(x['C1'].values) for x in dfs], columns=d1.index)
Out[64]:
     0         1         2         3         4
0  0.0  0.141421  0.509902  0.538516  0.648074
1  0.0  0.300000  0.331662  0.538516  0.608276
2  0.0  0.244949  0.300000  0.509902  0.509902

Explanation:

In [67]: d1
Out[67]:
   ID        C1
0   0  0.000000
1   1  0.538516
2   2  0.509902
3   3  0.648074
4   4  0.141421

In [68]: d1['C1'].values
Out[68]: array([ 0.      ,  0.538516,  0.509902,  0.648074,  0.141421])

In [69]: np.sort(d1['C1'].values)
Out[69]: array([ 0.      ,  0.141421,  0.509902,  0.538516,  0.648074])

In [70]: [np.sort(x['C1'].values) for x in dfs]
Out[70]:
[array([ 0.      ,  0.141421,  0.509902,  0.538516,  0.648074]),
 array([ 0.      ,  0.3     ,  0.331662,  0.538516,  0.608276]),
 array([ 0.      ,  0.244949,  0.3     ,  0.509902,  0.509902])]

now we can create a DataFrame from the list of sorted NumPy arrays, specifying d1.index as columns:

In [71]: pd.DataFrame([np.sort(x['C1'].values) for x in dfs], columns=d1.index)
Out[71]:
     0         1         2         3         4
0  0.0  0.141421  0.509902  0.538516  0.648074
1  0.0  0.300000  0.331662  0.538516  0.608276
2  0.0  0.244949  0.300000  0.509902  0.509902

edited Jan 29 '17 at 21:37

answered Jan 29 '17 at 20:56

MaxU - stand with Ukraine

205,989
36
386
419

Thanks for your answer! how would I do about sorting each row ? I ask this since the C1 has been lost. I'm sorry if this is a trivial question , I'm extremely new to pandas. – Hormigas Jan 29 '17 at 21:06
@Anant, you will have the same order as in the `dfs` list – MaxU - stand with Ukraine Jan 29 '17 at 21:07
is sorting not possible ? I sorted my transposed data frame based on C1. The sorted order however was lost on using concat. I think its because indices are ordered together on concatenation. Any way to get around this ? – Hormigas Jan 29 '17 at 21:11
1

@Anant, i'm not quite sure i understand correctly... Could you post your desired DF (if it differs from the posted one)? PS why do you say `C1` is lost? – MaxU - stand with Ukraine Jan 29 '17 at 21:13
I've added some details to the question. Thanks! – Hormigas Jan 29 '17 at 21:18
@Anant, i've updated my answer - is that what you want? – MaxU - stand with Ukraine Jan 29 '17 at 21:23
1

Yes! Thanks a ton for this. I'll need to work on understanding how this works though. Many Many Thanks – Hormigas Jan 29 '17 at 21:26

piRSquared · Answer 2 · 2017-01-29T21:36:43.633

Setup

df1 = pd.DataFrame({
        'C1': [0.0, 0.538516, 0.509902, 0.648074, 0.141421],
        'ID': [0, 1, 2, 3, 4]})
df2 = pd.DataFrame({
        'C1': [0.538516, 0.0, 0.3, 0.331662, 0.608276],
        'ID': [0, 1, 2, 3, 4]})
df3 = pd.DataFrame({
        'C1': [0.509902, 0.3, 0.0, 0.244949, 0.509902],
        'ID': [0, 1, 2, 3, 4]})

Solution(s)

1

It looks as though your ID column is important an maybe that's what you want to join on. I'd set that as your index and then combine.

To sort each row, I'd leave as columns and apply a sort_values

df = pd.concat(
    [d.set_index('ID') for d in [df1, df2, df3]], ignore_index=True, axis=1)
df.apply(lambda x: x.sort_values().values).T

ID    0         1         2         3         4
0   0.0  0.141421  0.509902  0.538516  0.648074
1   0.0  0.300000  0.331662  0.538516  0.608276
2   0.0  0.244949  0.300000  0.509902  0.509902

2

Get your dataframe and sort in numpy

df = pd.concat([d.set_index('ID').T for d in [df1, df2, df3]], ignore_index=True)

v = df.values
a = v.argsort(1)
i = np.arange(a.shape[0])[:, None]
df.loc[:] = v[i, a]

df

ID    0         1         2         3         4
0   0.0  0.141421  0.509902  0.538516  0.648074
1   0.0  0.300000  0.331662  0.538516  0.608276
2   0.0  0.244949  0.300000  0.509902  0.509902

Thanks for your answer! how would I do about sorting each row ? I ask this since the C1 has been lost. Also , If I sort each row on transposing , the sorted order is lost after I concatenate the dataframes — Hormigas, Jan 29 '17 at 21:12
@Anant this was not specified in your question. I do not understand exactly what you need from your comment alone. It would be better to follow up with another posted question. That way you can include more information in regards to specifically what you want to see. — piRSquared, Jan 29 '17 at 21:14

Combing Columns after Transposing Columns Pandas Dataframes

2 Answers2

Setup

Solution(s)

1

2