
The df.loc indexer doesn't seem to work properly for my DataFrame. I think it has something to do with the reader library I have chosen. Because I'm importing a .sav file, a b'' prefix appears on every string value, so in the name column df['name'] shows b'Steve'.

I have used .str.decode('utf-8') to remove this prefix, but I still can't filter my df using, for example, df.loc[df['name'] == 'Sam']. What is going on here?

import pandas as pd
import savReaderWriter as sRW

# Read in Data
with sRW.SavReaderNp('C:/Users/Sam/Downloads/Data.sav') as reader:
    record = reader.all()
df = pd.DataFrame(record)
# Decode the byte strings in the name column
df['name'] = df['name'].str.decode('utf-8')
# Slice
df.loc[df['name'] == 'Sam']
Henry Ecker
  • What do you get when you run `print( (df['name'] == 'Sam').any() )`? – Craig Sep 22 '19 at 18:00
  • "False" appears –  Sep 22 '19 at 18:07
  • First you mention `b'Steve'` but later you check `'Sam'`, so which name are you really looking for? Maybe use `print()` to see what you have in column `"name"`; maybe you don't have `Sam` but there is `Steve`. If there is `Sam`, then check whether the name contains a space (or a similar character). You can check the length of the name. – furas Sep 22 '19 at 18:35
  • The DataFrame column reads as follows: b'SAM '. Once I apply df['name'].str.decode('utf-8'), the column reads SAM. But once I add loc[(df.name == 'SAM')], every column reads 'NaN'. –  Sep 22 '19 at 19:05
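
The last comment shows the raw value as b'SAM ', with a trailing space, which suggests the decoded strings are still padded, so an exact comparison with 'SAM' finds nothing. The sketch below reproduces that situation and strips the padding before filtering. The DataFrame is a hypothetical stand-in for the .sav data (the `age` column and the row values are invented for illustration), and it assumes the padding is ordinary whitespace.

```python
import pandas as pd

# Hypothetical stand-in for the .sav data: byte strings padded with a trailing space.
df = pd.DataFrame({"name": [b"SAM ", b"STEVE", b"SAM "], "age": [31, 45, 27]})

# Decoding removes the b'' prefix but keeps the trailing padding.
decoded = df["name"].str.decode("utf-8")
print(decoded.map(repr).tolist())   # repr() makes the trailing space visible
print((decoded == "SAM").any())     # exact match fails while the padding is present

# Strip surrounding whitespace so equality comparisons can match.
df["name"] = decoded.str.strip()
result = df.loc[df["name"] == "SAM"]
print(result)
```

The comparison is also case-sensitive, so 'Sam' would not match 'SAM' even after stripping; compare against the exact stored spelling, or normalise both sides first (for example with `.str.upper()`).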

0 Answers