5645-01B | 5645-01A | 2002-01A | 5325-01C
1812.999999 | 3265.00001 | 4723.000002 | 2190.999996
43.00000001 | 1 | 2.5 | 0
622 | 1783 | 2240.499994 | 1553.000002
1568.999996 | 850.0000002 | 757.9999998 | 948.9999999
This is a small part of my table. I need to remove the last letter (A/B/C) from each column name so I can match the columns against another dataframe. I used:
df1.columns = df1.columns.str.rstrip('A')
df1.columns = df1.columns.str.rstrip('B')
df1.columns = df1.columns.str.rstrip('C')
But the problem turned out to be duplicates. As you can see above, some columns have the same number but a different final letter (A, B or C). I need to keep only the latest version: if there is a column ending in C and another column with the same number ending in A or B, I have to drop the A/B column(s) completely and keep the C column with the C removed. For example, "5645-01B" must stay as 5645-01, while 5645-01A has to be deleted. I can't just strip the letters as I did, or remove every "A" column, because some "A" columns have no B or C counterpart and I must keep them. How do I find only the latest version of each column and keep it?
P.S. The top row contains the column names. Expected:
5645-01 | 2002-01 | 5325-01
1812.999999 | 4723.000002 | 2190.999996
43.00000001 | 2.5 | 0
622 | 2240.499994 | 1553.000002
1568.999996 | 757.9999998 | 948.9999999
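To make the goal concrete, here is a minimal sketch of one possible way to get from the sample table to the expected one. It assumes every column name ends in a single uppercase version letter and that a later letter always means a newer version; the helper names `cols`, `latest` and `result` are made up for illustration.

```python
import pandas as pd

# Sample data copied from the table above.
df1 = pd.DataFrame(
    [
        [1812.999999, 3265.00001, 4723.000002, 2190.999996],
        [43.00000001, 1, 2.5, 0],
        [622, 1783, 2240.499994, 1553.000002],
        [1568.999996, 850.0000002, 757.9999998, 948.9999999],
    ],
    columns=["5645-01B", "5645-01A", "2002-01A", "5325-01C"],
)

# Split each column name into its base ID and its trailing version letter.
# Assumption: the version is exactly one trailing uppercase letter.
cols = pd.DataFrame({"name": df1.columns})
cols["base"] = cols["name"].str.replace(r"[A-Z]$", "", regex=True)
cols["version"] = cols["name"].str[-1]

# For each base ID keep only the name with the highest letter,
# then restore the original left-to-right column order.
latest = (
    cols.sort_values("version")
    .drop_duplicates("base", keep="last")
    .sort_index()
)

# Select the surviving columns and rename them to the bare IDs.
result = df1[latest["name"].tolist()].copy()
result.columns = latest["base"].tolist()
print(result)
```

With the sample data this keeps 5645-01B, 2002-01A and 5325-01C, drops 5645-01A, and renames the survivors to the bare IDs.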
The code that I continue with:
df1=df1.transpose()
df2 = pd.read_csv('table3.csv', index_col=['SAMPLE_ID'])
df1 = df1[df1.index.isin(df2.index)]
df1['The_ID'] = df2['EGF']
print(df1.head())
After that it prints NaNs instead of numeric values. SAMPLE_ID is the index of df2; it is similar to the top row above with the numbers, but it doesn't include any letters, which is why I must remove them.
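For reference, assigning a Series as a new column aligns it on index labels, so any row whose label has no exact match in df2 gets NaN. Below is a small sketch of the follow-up step once the column names carry no letters. The contents of df2 are a hypothetical stand-in for table3.csv (the EGF values are invented), and `df1_t` is just an illustrative name for the transposed frame.

```python
import pandas as pd

# Columns already reduced to the latest version, with the letters removed.
df1 = pd.DataFrame(
    [[1812.999999, 4723.000002, 2190.999996],
     [43.00000001, 2.5, 0]],
    columns=["5645-01", "2002-01", "5325-01"],
)

# Hypothetical stand-in for table3.csv; the EGF values are invented.
df2 = pd.DataFrame(
    {"SAMPLE_ID": ["5645-01", "2002-01", "9999-99"], "EGF": [0.1, 0.2, 0.3]}
).set_index("SAMPLE_ID")

# After transposing, the sample IDs become the row index of df1_t.
df1_t = df1.transpose()

# Keep only the samples that also appear in df2.
df1_t = df1_t[df1_t.index.isin(df2.index)].copy()

# Column assignment aligns on the index labels, so the labels must match exactly.
df1_t["The_ID"] = df2["EGF"]
print(df1_t.head())
```

In this toy example both remaining rows find a matching SAMPLE_ID, so The_ID contains no NaN.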