
I'm trying to apply a function to every element in a column but I keep getting this error and I'm not sure how to fix it.

Code:

import pandas as pd
import pubchempy
import numpy as np

df = pd.read_csv("Data.tsv.txt", sep="\t")

.
.
.

df['CID'] = df['CID'].astype(str).apply(lambda x: x.replace('.0',''))

df['CID']= df['CID'].map(lambda x: get_properties(identifier=x, properties='MolecularWeight') if x>0 else pd.NA)

Error:

TypeError: '>' not supported between instances of 'str' and 'int'
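For reference, the error can be reproduced without pubchempy. The sketch below assumes the CID column was read as floats with some missing values (the three-row sample is hypothetical, not the real data). After `astype(str)` every element is a string, so `x>0` compares a `str` with an `int`. Keeping the column numeric is one possible way around it:

```python
import pandas as pd

# Hypothetical sample: CIDs come back as floats because one value is missing.
df = pd.DataFrame({"CID": [2244.0, None, 702.0]})

# After astype(str), every element is a string such as "2244.0" or "nan".
as_text = df["CID"].astype(str)

try:
    as_text.map(lambda x: x > 0)  # str > int is not defined
except TypeError as exc:
    message = str(exc)  # same TypeError as in the question

# One option: keep the column numeric. The nullable Int64 dtype stores
# whole numbers and marks missing values as <NA>.
cids = pd.to_numeric(df["CID"], errors="coerce").astype("Int64")

# Boolean mask of rows that hold a usable, positive CID.
valid = cids.notna() & (cids > 0)
```

With this approach the `.replace('.0','')` step should no longer be needed, since the integer dtype already drops the trailing `.0`.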

For context, get_properties() is a pubchempy function that fetches the requested property (in this case, 'MolecularWeight') directly from the PubChem website.

The inputs are:

pubchempy.get_properties(properties, identifier, namespace=u'cid', searchtype=None, as_dataframe=False, **kwargs)

Only the properties and identifier parameters are required; the rest are optional.
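To illustrate how such a lookup might be applied only to rows with a usable CID, here is a rough sketch. It uses `fake_get_properties`, a hypothetical offline stand-in for `pubchempy.get_properties`, and `molecular_weight`, a hypothetical helper. It also assumes the real function returns a list of dicts with the property value as a string, which is worth checking against the installed pubchempy version:

```python
import pandas as pd

def fake_get_properties(properties, identifier):
    # Hypothetical stand-in for pubchempy.get_properties so this runs offline.
    # Assumed return shape: a list with one dict per compound.
    table = {2244: "180.16", 702: "46.07"}  # aspirin and ethanol, for illustration
    return [{"CID": identifier, properties: table[identifier]}]

def molecular_weight(cid, lookup=fake_get_properties):
    # Skip missing or non-positive CIDs instead of sending them to the lookup.
    if pd.isna(cid) or cid <= 0:
        return pd.NA
    rows = lookup("MolecularWeight", int(cid))
    return float(rows[0]["MolecularWeight"])

# Nullable integer column, so missing CIDs stay <NA> rather than "nan".
cids = pd.Series([2244, pd.NA, 702], dtype="Int64")
weights = cids.map(molecular_weight)
```

Passing `lookup=pubchempy.get_properties` would swap in the real call, assuming its return shape matches the stand-in. Writing the result to a new column rather than back into `CID` keeps the original identifiers available.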

Small Data Sample: (image not included)

Thanks in advance!

  • hey there and welcome to SO. we have no idea what `get_properties()` is. you might want to explain that. – mechanical_meat May 17 '22 at 15:19
  • My bad about that, completely new to this stuff. Updated! – New_to_coding May 17 '22 at 15:38
  • ah so that's a specific library thing that i'm not familiar with. i should've asked this before: what is contained in the "CID" column? what datatype? if you can edit the question to include a small data sample that'd help. – mechanical_meat May 17 '22 at 16:00
  • My apologies for the late reply, I was trying a bunch of different stuff to see if I could get it to work. I changed it a bit but now I get a different error although I feel like this one should be easier to fix. I added a picture of the data! – New_to_coding May 17 '22 at 17:33
  • What is `df.dtypes` right after the `read_csv()` call? – yut23 May 18 '22 at 05:29
