4

I've seen this question asked a few times, but the suggested answers don't seem to be working for me. I've brought in a csv with read_csv and am trying to clean up the names, which are initially:

In [89]: data.columns
Out[89]: 
Index(['Node Number', 'X [ m ]', 'Y [ m ]', 'Z [ m ]',
       'Turbulence Kinetic Energy [ m^2 s^-2 ]', 'turbulenceIntensity',
       'Velocity u [ m s^-1 ]', 'Velocity v [ m s^-1 ]',
       'Velocity w [ m s^-1 ]', 'windspeedratio'],
      dtype='object')

The simplest suggestion I've found should be:

data.rename(columns=lambda x: x.strip(), inplace=True)

But if I try that, absolutely nothing changes. Same with

data.columns = data.columns.str.strip()

Any idea why?

Benjamin Brannon
  • 295
  • 1
  • 3
  • 13
  • 1
    You want to remove all the whitespace? `strip` only removes leading and trailing whitespace, of which there is none. – sytech Feb 22 '18 at 03:42

2 Answers2

8

Seems like you need replace all ' ' to ''

df.columns.str.replace(' ','')
Out[103]: 
Index(['NodeNumber', 'X[m]', 'Y[m]', 'Z[m]',
       'TurbulenceKineticEnergy[m^2s^-2]', 'turbulenceIntensity',
       'Velocityu[ms^-1]', 'Velocityv[ms^-1]', 'Velocityw[ms^-1]',
       'windspeedratio'],
      dtype='object')
BENY
  • 317,841
  • 20
  • 164
  • 234
  • 2
    This will work if all he wants to do is print but it won't work if he wants a permanent removal of white spaces from column names. You'll need to assign your line above back to df.columns. – hikerjobs Feb 22 '18 at 03:59
  • Ha, thanks that's an easy one then. I was reading the strip(0 documentation to quickly and missed that it was only leading and trailing whitespace. I was following this particular question which worked for them, but wasn't clear about only leading and trailing: https://stackoverflow.com/questions/21606987/how-can-i-strip-the-whitespace-from-pandas-dataframe-headers – Benjamin Brannon Feb 22 '18 at 04:06
  • @BenjaminBrannon yw~ :-) – BENY Feb 22 '18 at 04:44
1

Strip only removes leading and trailing whitespace. If you want to remove all whitespace (including all kinds of whitesapce like tabs, newlines, spaces, etc), the following will work

import string
def remove_whitespace(x):
    table = {ord(char): None for char in string.whitespace}
    return x.translate(table)

data.rename(columns=remove_whitespace, inplace=True)

As Brad mentions, the following can be used to the same effect.

df.columns.str.replace(r'\s+', '')
sytech
  • 29,298
  • 3
  • 45
  • 86