-1

I want to select all rows where column A equals the string 'hey'. The problem is that column A can contain null/NaN values, so I get

TypeError: invalid type comparison.

when executing:

df.loc[df['A'] == 'hey']

I then made another condition:

df.loc[df['A'].notnull() & (df['A'] == 'hey')] 

Here I get the same error.

As a workaround, I replaced all the null values in column A with '', but that isn't elegant. Is there a clean way to first select the rows where column A isn't null, and then, from those, the rows equal to 'hey'?

Henry Ecker
Thor Bjorn

3 Answers

1

I suspect the column contains some numeric values, so try converting the values to strings, or compare against the underlying NumPy array:

newDf = df[df.A.astype(str) == 'hey']

Or:

newDf = df[df.A.values == 'hey']
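To make the two options above concrete, here is a minimal sketch on made-up data (the sample values are an assumption, not the asker's actual frame). It assumes a column that mixes strings with NaN, None and a float, and checks that both masks pick the same rows:

```python
import numpy as np
import pandas as pd

# Hypothetical sample data: strings mixed with NaN, None and a float.
df = pd.DataFrame({"A": ["hey", np.nan, None, 3.5, "hey"]})

# Option 1: compare string representations; df itself is not modified.
by_str = df[df["A"].astype(str) == "hey"]

# Option 2: compare the underlying NumPy object array element by element.
by_values = df[df["A"].values == "hey"]

print(by_str.index.tolist())
print(by_values.index.tolist())
```

Both selections should contain only the rows whose value is exactly 'hey'; the missing and numeric entries simply compare as not equal.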
jezrael
0

How about this?

df['A'] = df['A'].astype(str)
newDf = df[df.A == 'hey']

This should give you a new DataFrame with all rows where column A equals 'hey'.
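One caveat with assigning the converted column back: it overwrites the original values, and depending on the pandas version the missing entries may become literal strings such as 'nan'. A possible variation, sketched here on made-up data, builds the mask from a temporary converted Series and leaves the column untouched:

```python
import numpy as np
import pandas as pd

# Hypothetical sample data with two missing entries.
df = pd.DataFrame({"A": ["hey", np.nan, None, "somestring", "hey"]})

# Convert only for the comparison; nothing is assigned back to df["A"].
mask = df["A"].astype(str) == "hey"
new_df = df[mask]

# The original column still reports its missing values.
missing_after = int(df["A"].isna().sum())

print(new_df.index.tolist())
print(missing_after)
```

This keeps `isna()` / `notnull()` checks on the original column meaningful for later steps.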

Ankur Sinha
0

For null / NaN values, your logic is fine, as the example below shows. You should provide a minimal, verifiable example and state the versions of Python and pandas you are using.

import numpy as np
import pandas as pd

df = pd.DataFrame({'col': [np.nan, None, 'hey', 45.4352, 'somestring']})

print(df.loc[df['col'] == 'hey'])

   col
2  hey
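When putting together such a minimal example, it can help to report the pandas version, the column dtype, and which Python types the column actually holds. The following is only a diagnostic sketch on the same sample frame; the `type_counts` name is introduced here for illustration:

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({"col": [np.nan, None, "hey", 45.4352, "somestring"]})

# Version and dtype are the first things worth including in a bug report.
print(pd.__version__)
print(df["col"].dtype)

# Count the Python type of every element to spot unexpected non-string values.
type_counts = df["col"].map(lambda v: type(v).__name__).value_counts()
print(type_counts)
```

If the counts show only numeric types, the column is probably stored with a numeric dtype, which would be consistent with the comparison error on some pandas versions.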
jpp