1

I want to align my data

I currently have it like this:

Yan     TNSeq   Kato    Eco-GeneOrth    Essential

accA    accA    accA        accA        accA    
accB    accB    accB        accB        accB    
accC    accC    accC        accC        accC    
accD    accD    accD        accD        accD    
aceF    acpP    acpP        alaS        aceF    
acpP    acpS    acpS        argA        acpP    
acpS    adk     adk         argB        acpS    

What I want is this:

Yan     TNSeq   Kato    Eco-GeneOrth    Essential

accA    accA    accA        accA        accA    
accB    accB    accB        accB        accB    
accC    accC    accC        accC        accC    
accD    accD    accD        accD        accD    
aceF    NaN     NaN         NaN         aceF
acpP    NaN     NaN         acpP        acpP
NaN     acpS    NaN         NaN         acpS

I've tried reindex and sort, but with no luck, and I'm out of ideas.

Basically, what I want is to align (or sort) the first four columns against the Essential column, so that the values in each row match.

Yared J.

2 Answers

1

UPDATE:

In [120]: df[df.apply(lambda x: x['Essential'] == x, axis=1)]
Out[120]:
    Yan TNSeq  Kato Eco-GeneOrth Essential
0  accA  accA  accA         accA      accA
1  accB  accB  accB         accB      accB
2  accC  accC  accC         accC      accC
3  accD  accD  accD         accD      accD
4  aceF   NaN   NaN          NaN      aceF
5  acpP   NaN   NaN          NaN      acpP
6  acpS   NaN   NaN          NaN      acpS

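Try this (it compares every column with the first column, Yan, which in this data holds the same values as Essential):

In [86]: df[df.apply(lambda x: x.iloc[0] == x, axis=1)]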
Out[86]:
    Yan TNSeq  Kato Eco-GeneOrth Essential
0  accA  accA  accA         accA      accA
1  accB  accB  accB         accB      accB
2  accC  accC  accC         accC      accC
3  accD  accD  accD         accD      accD
4  aceF   NaN   NaN          NaN      aceF
5  acpP   NaN   NaN          NaN      acpP
6  acpS   NaN   NaN          NaN      acpS

Data:

In [87]: df
Out[87]:
    Yan TNSeq  Kato Eco-GeneOrth Essential
0  accA  accA  accA         accA      accA
1  accB  accB  accB         accB      accB
2  accC  accC  accC         accC      accC
3  accD  accD  accD         accD      accD
4  aceF  acpP  acpP         alaS      aceF
5  acpP  acpS  acpS         argA      acpP
6  acpS   adk   adk         argB      acpS
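
For anyone who wants to reproduce this, here is a minimal, self-contained sketch. It assumes the frame is built by hand from the values printed above (the original post does not show how df was loaded), and the names mask and masked are only illustrative.

```python
import pandas as pd

# Assumption: the frame is rebuilt by hand from the values printed above.
df = pd.DataFrame({
    "Yan":          ["accA", "accB", "accC", "accD", "aceF", "acpP", "acpS"],
    "TNSeq":        ["accA", "accB", "accC", "accD", "acpP", "acpS", "adk"],
    "Kato":         ["accA", "accB", "accC", "accD", "acpP", "acpS", "adk"],
    "Eco-GeneOrth": ["accA", "accB", "accC", "accD", "alaS", "argA", "argB"],
    "Essential":    ["accA", "accB", "accC", "accD", "aceF", "acpP", "acpS"],
})

# Row by row, flag the cells that equal that row's Essential value.
mask = df.apply(lambda row: row == row["Essential"], axis=1)

# Indexing with a boolean frame keeps flagged cells and sets the rest to NaN.
masked = df[mask]
print(masked)
```

Note that this only blanks out cells that differ from Essential in the same row; it does not move any value to a different row.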
MaxU - stand with Ukraine
  • This seems right. I don't understand OP's expected output though. – piRSquared Jun 23 '16 at 21:56
  • @piRSquared, I think there was kind of a small typo in OP's desired DF – MaxU - stand with Ukraine Jun 23 '16 at 21:58
  • @MaxU Sorry if I'm not making any sense, English is not my native language. What I want is to align the first 4 columns with the last one (the Essential column), so that the data in the rows matches, and leave blank spaces (NaN) where the data doesn't match – Yared J. Jun 23 '16 at 22:22
0

IIUC, use eq and pass your column with axis=0, so that the comparison lines up on the row index; this creates a boolean mask of your entire df against that column:

In [49]:
df[df.eq(df['Essential'],axis=0)]

Out[49]:
    Yan TNSeq  Kato Eco-GeneOrth Essential
0  accA  accA  accA         accA      accA
1  accB  accB  accB         accB      accB
2  accC  accC  accC         accC      accC
3  accD  accD  accD         accD      accD
4  aceF   NaN   NaN          NaN      aceF
5  acpP   NaN   NaN          NaN      acpP
6  acpS   NaN   NaN          NaN      acpS
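
A minimal sketch of the same idea written with an explicit where call, which is what indexing with a boolean frame amounts to. The data is assumed to be the frame from the question, rebuilt by hand; mask and masked are illustrative names.

```python
import pandas as pd

# Assumption: same values as the frame in the question, typed in by hand.
df = pd.DataFrame({
    "Yan":          ["accA", "accB", "accC", "accD", "aceF", "acpP", "acpS"],
    "TNSeq":        ["accA", "accB", "accC", "accD", "acpP", "acpS", "adk"],
    "Kato":         ["accA", "accB", "accC", "accD", "acpP", "acpS", "adk"],
    "Eco-GeneOrth": ["accA", "accB", "accC", "accD", "alaS", "argA", "argB"],
    "Essential":    ["accA", "accB", "accC", "accD", "aceF", "acpP", "acpS"],
})

# axis=0 aligns the Essential Series with the row index, so each row of
# df is compared with its own Essential value.
mask = df.eq(df["Essential"], axis=0)

# where() keeps cells whose mask is True and fills the others with NaN.
masked = df.where(mask)
print(masked)
```

Compared with a row-wise apply, eq is vectorised, so it should scale better on large frames.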
EdChum
  • This works, but it's not what I wanted. I want to align the data so that it matches. Maybe I'm not being clear; English is not my native language. I want to sort the data, not replace it with blank spaces – Yared J. Jun 23 '16 at 22:42
  • Your comment is nonsensical: how can something work but not be what you wanted? You really need to explain what you want better rather than waste everybody's time – EdChum Jun 23 '16 at 22:44
  • Sorry. I accidentally hit enter without finishing and I was editing the comment, but I wasn't fast enough. Sorry if it looked rude – Yared J. Jun 23 '16 at 22:48
  • You're correct, you're not clear: there is no obvious correlation between your explanation and your desired output. For instance, can you explain, by editing your question, how you arrive at your last 2 rows? It's not obvious – EdChum Jun 23 '16 at 22:49
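
One possible reading of the clarification in the comments is that a gene should land in the row where Essential has the same name, whichever row it started in. The sketch below is only a guess at that intent, not something confirmed in the thread: for each column, it keeps the Essential name when that gene occurs anywhere in the column and writes NaN otherwise. The names essential and aligned are illustrative, and the frame is again rebuilt by hand.

```python
import pandas as pd

# Assumption: same values as the frame in the question, typed in by hand.
df = pd.DataFrame({
    "Yan":          ["accA", "accB", "accC", "accD", "aceF", "acpP", "acpS"],
    "TNSeq":        ["accA", "accB", "accC", "accD", "acpP", "acpS", "adk"],
    "Kato":         ["accA", "accB", "accC", "accD", "acpP", "acpS", "adk"],
    "Eco-GeneOrth": ["accA", "accB", "accC", "accD", "alaS", "argA", "argB"],
    "Essential":    ["accA", "accB", "accC", "accD", "aceF", "acpP", "acpS"],
})

essential = df["Essential"]

# For every column, keep the Essential gene in its own row if that gene
# appears anywhere in the column; otherwise leave NaN in that row.
aligned = pd.DataFrame({
    col: essential.where(essential.isin(df[col])) for col in df.columns
})
print(aligned)
```

With this reading, acpP and acpS in TNSeq and Kato move down one row to sit next to the matching Essential entries, while genes that never occur in Essential (adk, alaS, argA, argB) are dropped. The result does not reproduce the desired output in the question exactly, since that table lists acpP under Eco-GeneOrth although the sample data for that column does not contain it.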