I have a pandas DataFrame with a column (let's say col3) containing numbers. Each number appears in multiple rows, and I want to run a function on the rows of each number separately.

So I wrote each number once into an array like this:

l = df.col3.unique()

Then a for loop is used to run a function for each number:

for i in l:
   a,b = func(df[df.col3 == i])

So on each run, the function gets the rows where col3 equals the current value of i. The function returns two data frames (a and b in this case), and I need to keep both returned data frames from every run.

I want to be able to identify them properly. For that, I would like to save the returned data frames within the loop like this:

First run: a123, b123; second run: a456, b456; third run: a789, b789

That is, the name of each data frame contains the current value of i.

I have already read that I should not use global variables for dynamic variable names, but I do not know how to achieve this another way.

Thank you :)

MaMo
  • How do you use these data frames? – Lambda Mar 20 '18 at 03:16
  • I will compare the results. Why is that important? I just want the current value of i within a run to be part of the names of the two data frames. – MaMo Mar 20 '18 at 10:17
  • Why is the name of the variables so important? You can use a dict with the col3's value as the key to save the dataframes. – Lambda Mar 20 '18 at 10:48
  • I need it because if I just call them df1, df2, ... then I always have to look into them to remember which one contains what data. With the names, I would immediately see which data is in data frame a123. Yes, I read about dictionaries, but I can't manage to use them properly for my issue. That is why I'm asking this community. – MaMo Mar 20 '18 at 14:26

1 Answer

Solution A (recommended):

dfs = {}

for i in l:
    dfs["a"+str(i)], dfs["b"+str(i)] = func(df[df.col3 == i])
...

And then you can use the dataframes like this:

func2(dfs["a1"])  # dfs["a1"] is the first data frame returned by func for the rows where col3 == 1
...
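As a minimal, self-contained sketch of Solution A: the toy data and the body of func below are made up for illustration (the question does not show the real func), and only the column name col3 and the "a"/"b" key prefixes come from the post.

```python
import pandas as pd

# Toy data: col3 follows the question, the values are invented for this sketch.
df = pd.DataFrame({"col3": [123, 123, 456, 789, 789], "value": [1, 2, 3, 4, 5]})


def func(group):
    """Hypothetical stand-in for the question's func; returns two data frames."""
    a = group[["value"]].copy()                    # the rows of this group
    b = group[["value"]].sum().to_frame("total")   # a one-row summary
    return a, b


# One dict holds every result; the key carries the current value of i.
dfs = {}
for i in df.col3.unique():
    dfs["a" + str(i)], dfs["b" + str(i)] = func(df[df.col3 == i])

# The keys play the role of the wished-for variable names.
print(sorted(dfs))
print(dfs["a123"])
```

Looking up dfs["a123"] is then as readable as a variable named a123, and the whole collection can still be looped over or filtered by key.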

Solution B (not recommended):

If you absolutely want to use local variables, you need:

for i in l:
    locals()["a"+str(i)], locals()["b"+str(i)] = func(df[df.col3 == i])

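And then you can use the dataframes with their variable names a1, b1 etc. Note that this only works at module level, where locals() is the same dictionary as globals(). Inside a function, writing to locals() does not create local variables.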
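The scope limitation of Solution B can be checked with a small sketch. Plain strings stand in for the data frames here, and the names make_in_function, a_local and a1 are only illustrative.

```python
def make_in_function():
    # Inside a function, locals() does not give write access to the real
    # local namespace, so this assignment does not create a variable.
    locals()["a_local"] = "frame"
    try:
        return a_local  # looked up as a global name, which does not exist
    except NameError:
        return None


in_function = make_in_function()

# globals() is the real module namespace, so the same trick creates a
# usable name there. It also shows why this is fragile: the name a1
# exists only after this line has run.
globals()["a1"] = "frame"
at_module_level = a1

print(in_function)      # None
print(at_module_level)  # frame
```

This is one reason the dictionary approach is usually preferred: it behaves the same way in a script, in a function, and in a notebook.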

Lambda
  • Used your first solution and split the list afterwards to get separate dataframes, as I need them. Thanks a lot! :) – MaMo Mar 21 '18 at 10:30