Refer to the following code:

# import
import pandas as pd
import numpy as np
import string 

# create data frame
data = {'Name': ['Jas,on', 'Mo.lly', 'Ti;na', 'J:ake', '!Amy', "Myself"]}
df = pd.DataFrame(data, columns = ['Name'])
df

# get cleanName - Function
def getCleanName(pName):
    # Delete every punctuation character from the name
    vRetVals = pName.translate(str.maketrans("", "", string.punctuation))
    return vRetVals

# clean Name
print("PreClean Good Rows", df.shape[0] - df.Name.map(lambda v:v.isalpha()).sum())
df['Name'] = [getCleanName for n in df.Name]
print("PostClean Good Rows", df.shape[0] - df.Name.map(lambda v: v.isalpha()).sum())

Issue

The line below runs properly the first time it is executed:

print("PreClean Good Rows", df.shape[0] - df.Name.map(lambda v: v.isalpha()).sum())

When the same line is run a second time, it raises the following error:

AttributeError: 'function' object has no attribute 'isalpha'

Any ideas what is causing the issue?

martineau
Cyrus Lentin
    Where you have `[getCleanName for n in df.Name]`, perhaps you mean `[getCleanName(n) for n in df.Name]`. Otherwise you are just putting a load of references to the function into a list, instead of calling the function. – khelwood Mar 22 '19 at 15:48

1 Answer

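You forgot to call getCleanName, so your list ends up as a bunch of identical references to the function itself. A function object has no isalpha method, which is why the next v.isalpha() call raises the AttributeError. Change it to: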

df['Name'] = [getCleanName(n) for n in df.Name]
#                         ^^^ changed

to actually call the function and use the results.
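
As a rough illustration of the difference, here is a minimal sketch that uses a plain list in place of df.Name (an assumption, made only to keep the example free of pandas). The names not_called and called are hypothetical and do not appear in the question:

```python
import string

def getCleanName(pName):
    # Delete every punctuation character from the string.
    return pName.translate(str.maketrans("", "", string.punctuation))

# Plain list standing in for the df.Name column (assumption for this sketch).
names = ['Jas,on', 'Mo.lly', 'Ti;na', 'J:ake', '!Amy', 'Myself']

# Without parentheses: every element is the function object itself.
not_called = [getCleanName for n in names]

# With parentheses: every element is the cleaned string.
called = [getCleanName(n) for n in names]

# True: the list holds the same function object six times.
print(all(item is getCleanName for item in not_called))

# The cleaned strings, which do support .isalpha().
print(called)
```

Calling .isalpha() on any element of not_called would fail the same way as in the question, because the elements are functions, not strings.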

ShadowRanger