I have a column Cities in a pandas DataFrame that contains many values written almost, but not exactly, the same way.

For example: "Example City", " Example City" and "Example City ".

This bothers me because when I look for the unique values in the column, it treats these cities as different.

FBruzzesi
  • 6,385
  • 3
  • 15
  • 37

1 Answer

If the problem is only whitespace at the start or end of the strings, you can use strip. If you also have runs of multiple spaces inside a value (e.g. "Example  City" instead of "Example City"), you can use replace with a regex:

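# Remove leading and trailing whitespace
df['Cities'] = df['Cities'].str.strip()
# Collapse runs of whitespace into one space (regex=True is required in pandas >= 2.0)
df['Cities'] = df['Cities'].str.replace(r'\s\s+', ' ', regex=True)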
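
To see the effect on the unique values, here is a minimal self-contained sketch. The sample data is made up for illustration, and it passes regex=True explicitly because recent pandas versions treat the pattern as a literal string by default:

```python
import pandas as pd

# Hypothetical sample data: the same city written four slightly different ways
df = pd.DataFrame(
    {"Cities": ["Example City", " Example City", "Example City ", "Example  City"]}
)

before = df["Cities"].nunique()  # each variant counts as a distinct value

df["Cities"] = (
    df["Cities"]
    .str.strip()                           # drop leading/trailing whitespace
    .str.replace(r"\s+", " ", regex=True)  # collapse inner whitespace runs to one space
)

after = df["Cities"].nunique()
print(before, after)  # 4 1
```

If the values can also differ in letter case, chaining .str.lower() or .str.title() in the same way should handle that too.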
FBruzzesi