I have two functions: one creates a DataFrame from a CSV file and the other manipulates that DataFrame. The first time I pass the raw data through lsc_age(import_data(...)) there is no problem. However, on the second and every later call I get TypeError: 'DataFrame' object is not callable. Any ideas for how to solve the problem?

import pandas as pd

def import_data(csv,date1,date2):
    global data
    data = pd.read_csv(csv,header=1)
    data = data.iloc[:,[0,1,4,6,7,8,9,11]]
    data = data.dropna(how='all')
    data = data.rename(columns={"National: For Dates 9//1//"+date1+" - 8//31//"+date2:'event','Unnamed: 1':'time','Unnamed: 4':'points',\
              'Unnamed: 6':'name','Unnamed: 7':'age','Unnamed: 8':'lsc','Unnamed: 9':'club','Unnamed: 11':'date'})
    data = data.reset_index().drop('index',axis=1)
    data = data[data.time!='Time']
    data = data[data.points!='Power ']
    data = data[data['event']!="National: For Dates 9//1//"+date1+" - 8//31//"+date2]
    data = data[data['event']!='USA Swimming, Inc.']
    data = data.reset_index().drop('index',axis=1)
    for i in range(len(data)):
        # Rows whose event cell is short or empty inherit the event from the row above.
        if len(str(data['event'][i])) <= 3:
            data.loc[i, 'event'] = data.loc[i-1, 'event']
    data = data.dropna()
    age = []
    event = []
    gender = []
    for row in data.event:
        gender.append(row.split(' ')[0])
        if row[:9]=='Female 10':
            n = 4
            groups = row.split(' ')
            age.append(' '.join(groups[1:n]))
            event.append(' '.join(groups[n:]))
        elif row[:7]=='Male 10':
            n = 4
            groups = row.split(' ')
            age.append(' '.join(groups[1:n]))
            event.append(' '.join(groups[n:]))
        else:
            n = 2
            groups = row.split(' ')
            event.append(' '.join(groups[n:]))
            groups = row.split(' ')
            age.append(groups[1])
    data['age_group'] = age
    data['event_simp'] = event
    data['gender'] = gender
    data['year'] = date2
    return data

def lsc_age(data_two):
    global lsc, lsc_age, top, all_performers
    lsc = pd.DataFrame(data_two['event'].groupby(data_two['lsc']).count()).reset_index().sort_values(by='event',ascending=False)
    lsc_age = data_two.groupby(['year','age_group','lsc'])['event'].count().reset_index().sort_values(by=['age_group','event'],ascending=False)
    top = pd.concat([lsc_age[lsc_age.age_group=='10 & under'].head(),lsc_age[lsc_age.age_group=='11-12'].head(),\
                 lsc_age[lsc_age.age_group=='13-14'].head(),lsc_age[lsc_age.age_group=='15-16'].head(),\
                 lsc_age[lsc_age.age_group=='17-18'].head()],ignore_index=True)
    all_performers = pd.concat([lsc_age[lsc_age.age_group=='10 & under'],lsc_age[lsc_age.age_group=='11-12'],\
                            lsc_age[lsc_age.age_group=='13-14'],lsc_age[lsc_age.age_group=='15-16'],\
                            lsc_age[lsc_age.age_group=='17-18']],ignore_index=True)
    all_performers = all_performers.rename(columns={'event':'no. top 100'})
    all_performers['age_year_lsc'] = all_performers.age_group+' '+all_performers.year.astype(str)+' '+all_performers.lsc
    return all_performers

years = [i for i in range(2008,2018)]
for i in range(len(years)-1):
    lsc_age(import_data(str(years[i+1])+"national100.csv",\
    str(years[i]),str(years[i+1])))
Arpit Solanki
KriKors
  • Do you know which line the error is occurring on? – LondonRob Aug 23 '17 at 17:00
  • Please edit the question to include the full traceback of the error. – Arpit Solanki Aug 23 '17 at 17:03
  • As a side note, I would *strongly* discourage the use of globals. Either pass the arguments to the function or else use a class. – Alexander Aug 23 '17 at 17:07
  • 1) Don't give functions and data objects exactly the same name. 2) Don't make an object global and also return it from the same function. It all leads to confusion. – CrazyElf Aug 23 '17 at 17:09
  • @KriKors in case one of the answers helped to solve your problem, please mark the one you choose as accepted. Thanks. – silvanoe Aug 24 '17 at 15:49
  • Hi all - thanks for the feedback. Removing the global line solved the problem. Will read more about functions as objects. Much appreciated! – KriKors Aug 24 '17 at 22:27

1 Answer

During the first call to your function lsc_age(), the line

lsc_age = data_two.groupby(['year','age_group','lsc'])['event'].count().reset_index().sort_values(by=['age_group','event'],ascending=False)

overwrites the function object with a DataFrame. This happens because the function declares the name lsc_age as global with

global lsc, lsc_age, top, all_performers

Functions in Python are objects, and a function's name is just a variable bound to that object, so assigning to the name replaces the function.
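
The effect can be reproduced without pandas. The following is only a minimal sketch: summarize is a hypothetical stand-in for lsc_age(), and a plain list plays the role of the DataFrame, so the message reads 'list' object is not callable rather than 'DataFrame' object is not callable.

```python
def summarize(rows):
    # Hypothetical stand-in for lsc_age(); a list stands in for the DataFrame.
    # Declaring the function's own name as global means the assignment below
    # rebinds the module-level name instead of creating a local variable.
    global summarize
    summarize = [r.upper() for r in rows]
    return summarize

# First call works: the name still refers to the function.
first = summarize(["a", "b"])

# Second call fails: the name now refers to the list built by the first call.
try:
    summarize(["c"])
    error_message = None
except TypeError as exc:
    error_message = str(exc)

print(error_message)
```

After the first call, the name summarize no longer refers to anything callable, which is the same situation the loop in the question runs into on its second iteration.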

To solve your problem, avoid the global declarations; they do not seem to be necessary. Instead, pass your data in through the function's arguments and return the results.
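
One possible shape for this, sketched here with plain dictionaries instead of DataFrames so that it runs on its own: the function name count_by_lsc_and_age, the sample rows, and the LSC codes are all made up for illustration. The point is only that the results are returned rather than assigned to globals, and that no result shares a name with the function.

```python
from collections import Counter

def count_by_lsc_and_age(rows):
    """Return both tallies to the caller instead of storing them in globals."""
    # Local names differ from the function name, so nothing is overwritten.
    lsc_counts = Counter(row["lsc"] for row in rows)
    lsc_age_counts = Counter((row["age_group"], row["lsc"]) for row in rows)
    return lsc_counts, lsc_age_counts

# Hypothetical sample rows standing in for one year's imported data.
sample_rows = [
    {"lsc": "PV", "age_group": "11-12"},
    {"lsc": "PV", "age_group": "13-14"},
    {"lsc": "NC", "age_group": "11-12"},
]

# The function can be called repeatedly; each year's results are kept by the caller.
results = {}
for year in (2009, 2010):
    results[year] = count_by_lsc_and_age(sample_rows)
```

In the pandas version, the same idea would mean returning the DataFrames that are currently declared global and collecting them in the loop, for example in a dictionary keyed by year.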

silvanoe