How to remove whitespace from df column headers (strip isn't working)

Question

I've seen this question asked a few times, but the suggested answers don't seem to be working for me. I've brought in a csv with read_csv and am trying to clean up the names, which are initially:

In [89]: data.columns
Out[89]: 
Index(['Node Number', 'X [ m ]', 'Y [ m ]', 'Z [ m ]',
       'Turbulence Kinetic Energy [ m^2 s^-2 ]', 'turbulenceIntensity',
       'Velocity u [ m s^-1 ]', 'Velocity v [ m s^-1 ]',
       'Velocity w [ m s^-1 ]', 'windspeedratio'],
      dtype='object')

The simplest suggestion I've found should be:

data.rename(columns=lambda x: x.strip(), inplace=True)

But if I try that, absolutely nothing changes. Same with

data.columns = data.columns.str.strip()

Any idea why?

You want to remove all the whitespace? `strip` only removes leading and trailing whitespace, of which there is none. — sytech, Feb 22 '18 at 03:42

score 8 · Accepted Answer · answered Feb 22 '18 at 03:43

8

Seems like you need replace all ' ' to ''

df.columns.str.replace(' ','')
Out[103]: 
Index(['NodeNumber', 'X[m]', 'Y[m]', 'Z[m]',
       'TurbulenceKineticEnergy[m^2s^-2]', 'turbulenceIntensity',
       'Velocityu[ms^-1]', 'Velocityv[ms^-1]', 'Velocityw[ms^-1]',
       'windspeedratio'],
      dtype='object')

answered Feb 22 '18 at 03:43

BENY

317,841
20
164
234

2

This will work if all he wants to do is print but it won't work if he wants a permanent removal of white spaces from column names. You'll need to assign your line above back to df.columns. – hikerjobs Feb 22 '18 at 03:59
Ha, thanks that's an easy one then. I was reading the strip(0 documentation to quickly and missed that it was only leading and trailing whitespace. I was following this particular question which worked for them, but wasn't clear about only leading and trailing: https://stackoverflow.com/questions/21606987/how-can-i-strip-the-whitespace-from-pandas-dataframe-headers – Benjamin Brannon Feb 22 '18 at 04:06
@BenjaminBrannon yw~ :-) – BENY Feb 22 '18 at 04:44

sytech · Answer 2 · 2018-02-23T15:27:59.557

1

Strip only removes leading and trailing whitespace. If you want to remove all whitespace (including all kinds of whitesapce like tabs, newlines, spaces, etc), the following will work

import string
def remove_whitespace(x):
    table = {ord(char): None for char in string.whitespace}
    return x.translate(table)

data.rename(columns=remove_whitespace, inplace=True)

As Brad mentions, the following can be used to the same effect.

df.columns.str.replace(r'\s+', '')

edited Feb 23 '18 at 15:27

answered Feb 22 '18 at 03:47

sytech

29,298
3
45
86

1

Thanks this works too, but the answer from @Wen was more succinct. – Benjamin Brannon Feb 22 '18 at 04:08
Sure thing. Since Wen covered the simple case, my intention was to demonstrate a solution that works for all whitespace, not just the space character. Cheers. – sytech Feb 22 '18 at 05:02
1

That would just be `f.columns.str.replace(r'\s+', '')` – Brad Solomon Feb 22 '18 at 14:51
Nice @BradSolomon - Wasn't aware it would take a regex pattern string. – sytech Feb 23 '18 at 15:12

How to remove whitespace from df column headers (strip isn't working)

2 Answers2