6

I have a CSV file that looks something like this -

    Location ID      Location Name
        3543459      A
         20541       B
          C320       C
           ...       ..

When I read the file using pd.read_csv, I get something like this -

    Location ID      Location Name
       03543459      A
       0020541       B
       000C320       C
           ...       ..

How can I avoid the leading zeros? I did some research, but all the questions I could find were about producing leading zeros in the DataFrame.

smci
Karvy1
  • 2
    `df["Location ID"].str.lstrip("0")`? – Rakesh Aug 03 '18 at 07:30
  • 1
    If you make `Location ID` an integer column, they'll disappear automatically. Is `Location ID` supposed to be integer or string (or Categorical)? – smci Aug 03 '18 at 07:35

4 Answers

10

Post-process the column with `str.lstrip`:

    df['Location ID'] = df['Location ID'].str.lstrip('0')
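
A minimal sketch of this approach end to end, assuming the IDs should stay as text. The inline CSV string and the `dtype` argument are illustrative assumptions, not part of the original question:

```python
import io

import pandas as pd

# Hypothetical CSV content mirroring the values shown in the question.
csv_text = "Location ID,Location Name\n03543459,A\n0020541,B\n000C320,C\n"

# Reading the column as str guarantees the .str accessor is available.
df = pd.read_csv(io.StringIO(csv_text), dtype={"Location ID": str})

# Strip every leading "0" from each ID.
df["Location ID"] = df["Location ID"].str.lstrip("0")
print(df["Location ID"].tolist())
```

Note that `lstrip("0")` turns an ID consisting only of zeros (such as `"000"`) into an empty string, so such values may need separate handling.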
jezrael
2

My column had mixed types; the line below worked for me:

    df['col'] = df['col'].apply(lambda x: x.lstrip('0') if isinstance(x, str) else x)
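
As a rough illustration of the mixed-type case, the sketch below uses a made-up column that holds both strings and integers. The helper name `strip_zeros` and the sample values are assumptions for demonstration only:

```python
import pandas as pd

# Hypothetical column mixing strings and integers.
df = pd.DataFrame({"col": ["000C320", 20541, "0020541"]})


def strip_zeros(value):
    """Strip leading zeros from strings; leave other types untouched."""
    return value.lstrip("0") if isinstance(value, str) else value


df["col"] = df["col"].apply(strip_zeros)
print(df["col"].tolist())
```

Non-string values pass through unchanged, which avoids the `AttributeError` that calling `lstrip` on an integer would raise.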
Hietsh Kumar
1

    df['Location ID'] = df['Location ID'].apply(lambda x: x.lstrip('0'))
U13-Forward
0

For anyone with more complex strings (e.g., 'AB00003423'), you can use Series.str.extract() and a regular expression:

    extractedNumbers = df.ID_col.str.extract(r'^[A-Z]+0+([0-9]+)$')

This will return a column of whatever is inside the parentheses (or "capture group(s)") of the regular expression.

By default a DataFrame is returned with one column per capture group; with a single capture group, pass `expand=False` to get a Series instead.
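
A small sketch of the extraction with `expand=False`, using made-up IDs (the sample values and the name `digits` are assumptions for illustration):

```python
import pandas as pd

# Hypothetical IDs: a letter prefix, zero padding, then the number.
df = pd.DataFrame({"ID_col": ["AB00003423", "XY0120", "Q7"]})

# expand=False returns a Series because the pattern has one capture group.
digits = df["ID_col"].str.extract(r"^[A-Z]+0+([0-9]+)$", expand=False)
print(digits)
```

With this pattern, rows that lack at least one padding zero after the letters (such as `"Q7"`) do not match and come back as NaN, and the letter prefix is not part of the captured result.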

johnDanger