I have a column in my dataframe that I would like to convert to datatype int. However, the conversion throws an error because some of the rows contain letters. I would like to create a new dataframe that keeps only the rows whose entries in this column are purely numeric (or at least contain no letters).

So my question is: Is there a way to do something like the following,

df=df[df['addzip'].str.contains("a")==False]

but with a list in place of the "a"? See the example below:

df=df[df['addzip'].str.contains(list(str(string.ascii_lowercase)+str(string.ascii_uppercase)))==False]

I know that this is very possible to do with an apply command, but I would like to keep this as vectorized as possible, so that is not what I am looking for. So far I haven't found any solutions anywhere else on Stack Overflow.

sfortney

1 Answer

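Just use a regular expression. The character class [a-zA-Z] matches any ASCII letter, fillna(False) treats missing values as non-matches, and ~ negates the mask so only rows without letters are kept: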

df = df[~df['addzip'].str.contains("[a-zA-Z]").fillna(False)]
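
To make the behaviour concrete, here is a minimal sketch on made-up data. Only the column name addzip comes from the question; the sample values and the names has_letter, clean and pattern are assumptions for illustration. It uses the na=False argument of str.contains, which should behave like the fillna(False) call above, and it also shows one way to build the pattern from a list of characters, which is what the question asked about.

```python
import re
import string

import pandas as pd

# Hypothetical sample data; only the column name 'addzip' is from the question.
df = pd.DataFrame(
    {"addzip": ["12345", "9021A", None, "60614", "abcde", "30301-1234"]}
)

# True where the entry contains at least one ASCII letter.
# na=False makes missing values count as "no letter found".
has_letter = df["addzip"].str.contains("[a-zA-Z]", na=False)

# Keep only the rows whose entry has no letters.
clean = df[~has_letter]

# The same test driven by a list of characters: escape each item and
# join the items with "|" so the regex matches any one of them.
letters = list(string.ascii_lowercase + string.ascii_uppercase)
pattern = "|".join(re.escape(ch) for ch in letters)
same_mask = df["addzip"].str.contains(pattern, na=False)

print(clean)
```

Note that this filter keeps rows with missing values and rows such as "30301-1234" that have no letters but are still not plain integers, so they would need separate handling before converting the column to int.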
Alex
    @Alex Excellent. This is exactly what I was looking for. :) Worked great. I knew there had to be a way to do this without resorting to an apply function. Thank you so much. – sfortney Jan 30 '15 at 22:46