
I have a large data frame that looks like this (simplified):

File 1:

1 ID   Apo1  Apo2  description Symbol
2 12   0.983 0.675 proteinA    AAG
3 34   0.876 0.123 ProteinB    BEH
4 54   0.432 0.445 proteinC    CFD
5 65   0.544 0.103 ProteinD    DDS

Now I want to sort this file out and keep only the rows that contain certain symbols in the "Symbol" column (the last column here).
The symbols that I want to extract are listed in file 2 (simplified):

File 2:

AAG
CFD
DDS

So I want to get a new file that looks like this, containing only the rows of file 1 whose Symbol column matches one of the symbols in file 2:

New file:

1 ID   Apo1  Apo2  description Symbol
2 12   0.983 0.675 proteinA    AAG
4 54   0.432 0.445 proteinC    CFD
5 65   0.544 0.103 ProteinD    DDS
etienne
Clem
  • It's not a sort, it's a filter. What tools have you got at your disposal? I.e., what environment are you working in? – Aaron Oct 07 '16 at 08:51
  • I use R. I have already tried the subset function: `GeneProtein <- subset(Genefile, UnigeneID == UnigeneF)`, where Genefile is file 1, UnigeneF is file 2, and UnigeneID is the column in Genefile that holds the symbol. – Clem Oct 11 '16 at 11:37
  • Possible duplicate of [Filtering a data frame on a vector](http://stackoverflow.com/questions/9350025/filtering-a-data-frame-on-a-vector) – acylam Oct 24 '16 at 00:45
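
For reference, here is a minimal sketch in R of the filter described above, along the lines of the linked duplicate. It assumes the symbol column is named `Symbol` as in the simplified table (the comment above calls it `UnigeneID`), and it rebuilds both files inline so the snippet runs on its own; the vector name `wanted` is made up for illustration. The likely problem with the `subset()` attempt is that `==` compares the two vectors element by element, while `%in%` tests whether each symbol occurs anywhere in the list.

```r
# File 1, rebuilt inline from the simplified example in the question.
Genefile <- data.frame(
  ID          = c(12, 34, 54, 65),
  Apo1        = c(0.983, 0.876, 0.432, 0.544),
  Apo2        = c(0.675, 0.123, 0.445, 0.103),
  description = c("proteinA", "ProteinB", "proteinC", "ProteinD"),
  Symbol      = c("AAG", "BEH", "CFD", "DDS"),
  stringsAsFactors = FALSE
)

# File 2 as a plain character vector of the symbols to keep
# (hypothetical name, not from the original post).
wanted <- c("AAG", "CFD", "DDS")

# Keep only the rows whose Symbol is a member of `wanted`.
# `%in%` returns one TRUE/FALSE per row of Genefile.
GeneProtein <- Genefile[Genefile$Symbol %in% wanted, ]

# Equivalent form using subset(), closer to the attempt in the comment:
# GeneProtein <- subset(Genefile, Symbol %in% wanted)

print(GeneProtein)
```

With real files, the two objects would presumably be loaded with something like `read.table()` (with `header = TRUE`) for file 1 and `readLines()` for file 2, and the result saved with `write.table()`; the exact arguments depend on how the files are delimited.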
