I currently have this DataFrame and I would like to group together the rows whose values appear in the list of another row.
dataframe =
foo [1,2,3,4]
bar [4]
bob [2]
ere [7]
I would like my dataframe to look like this:
dataframe =
foo,bar,bob [1,2,3,4]
ere [7]
Thank you!
This is the code to create the dataframe. The data comes from a FASTA-like file like this:
>foo
1
2
3
4
>bar
4
>bob
2
>ere
7
My code to create the df:
import itertools
import pandas as pd

input1 = "final.fasta"
# read all lines, stripping the trailing newline from each
with open(input1, "r") as fasta:
    records = [record.strip() for record in fasta]
# gets the numbers in a list (one sub-list per '>' header)
ids = [list(x[1]) for x in itertools.groupby(records, lambda x: '>' in x) if not x[0]]
# gets the names in a list
ref_seqs = [list(x[1]) for x in itertools.groupby(records, lambda x: '>' not in x) if not x[0]]
# transform into a df
df = pd.DataFrame({'refseq': ref_seqs, 'ids': ids})