Regards,

Apologies if this question appears to be a duplicate of other questions, but I could not find an answer that addresses my exact problem.

I split a dataframe, called "data", into multiple subsets that are stored in a dictionary of dataframes named "dfs" as follows:

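# Partition DF

dfs = {}
chunk = 5

# Step through start offsets so that no empty trailing chunk is created
# when the number of rows is an exact multiple of chunk.
for n, start in enumerate(range(0, data.shape[0], chunk)):
    df_temp = data.iloc[start:start + chunk]
    df_temp = df_temp.reset_index(drop=True)
    dfs[n] = df_temp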

Now, I would like to apply a pre-defined helper function called "fun_c" to EACH of the dataframes (that are stored in the dictionary object called "dfs").

Is it correct to apply the function to all of the dataframes in "dfs" in one go, as follows?

result = fun_c(dfs)

If not, what would be the correct way of doing this?

Redowan Delowar
john_mon
  • sounds like you just want a function for a single dataframe and you can iterate through the dataframes so "fun_c(df)" makes more sense to me. i recommend just trying it out and maybe the question will answer itself. – dtc Feb 09 '20 at 07:50

2 Answers

Since you want this:

Now, I would like to apply a pre-defined helper function called "fun_c" to EACH of the dataframes (that are stored in the dictionary object called "dfs").

Let's say your dataframe dict looks like this and your helper function takes in a single dataframe.

dfs = {0: df0, 1: df1, 2: df2, 3: df3}

Let's iterate through the dictionary, apply the fun_c function on each of the dataframes, and save the results in another dictionary having the same keys:

dfs_result = {k: fun_c(v) for k, v in dfs.items()}
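
To make the comprehension above concrete, here is a minimal runnable sketch. The body of fun_c and the two small frames are hypothetical stand-ins, since the question does not show what the real helper does; the only assumption is that it takes one dataframe and returns a result.

```python
import pandas as pd


def fun_c(df):
    # Hypothetical helper: returns a copy with an extra "total" column.
    out = df.copy()
    out["total"] = out["a"] + out["b"]
    return out


# Hypothetical stand-in for the dictionary of dataframes.
dfs = {
    0: pd.DataFrame({"a": [1, 2], "b": [10, 20]}),
    1: pd.DataFrame({"a": [3], "b": [30]}),
}

# Call the helper once per frame and keep the original keys.
dfs_result = {k: fun_c(v) for k, v in dfs.items()}

print(dfs_result[0]["total"].tolist())  # [11, 22]
```

Because this particular helper works on a copy, the frames in dfs are left unchanged and dfs_result holds the processed versions under the same keys.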
Redowan Delowar

It depends on the output you're looking for:

  • If you want a dict in the output, apply the function to each value and keep its key:
result = {key: fun_c(val) for key, val in dfs.items()}
  • If you want a list of dataframes/values in the output, apply the function to each dict value:
result = [fun_c(val) for val in dfs.values()]
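
The difference between the two output shapes, and between iterating over items and values, can be seen in a small sketch. The helper here is a hypothetical one that just counts rows, and the string keys are made up for illustration.

```python
import pandas as pd


def fun_c(df):
    # Hypothetical helper: number of rows in a single frame.
    return len(df)


dfs = {
    "first": pd.DataFrame({"a": [1, 2, 3]}),
    "second": pd.DataFrame({"a": [4]}),
}

# Dict output: one result per key.
as_dict = {key: fun_c(val) for key, val in dfs.items()}

# List output: results only, in insertion order.
as_list = [fun_c(val) for val in dfs.values()]

# Iterating over .items() yields (key, frame) tuples, not frames.
first_item = next(iter(dfs.items()))

print(as_dict)                     # {'first': 3, 'second': 1}
print(as_list)                     # [3, 1]
print(type(first_item).__name__)   # tuple
```

Passing the tuples from dfs.items() straight to a helper that expects a dataframe usually raises an error, and with a helper like this one it would silently return 2 for every entry, so dfs.values() is the safer choice for the list form.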

Passing the whole dictionary, as in fun_c(dfs), isn't wrong either, as long as the helper function iterates over the dictionary itself:

def fun_c(dfs):

    result = None
    # either
    for key, val in dfs.items():
        pass
    # or
    for val in dfs.values():
        pass
    return result
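
As one possible way to fill in that skeleton, the sketch below has the helper accept the whole dictionary and build a result per key. The per-frame work (the mean of a column named "a") is purely an assumption for illustration.

```python
import pandas as pd


def fun_c(dfs):
    # Hypothetical body: compute the mean of column "a" for every frame.
    result = {}
    for key, val in dfs.items():
        result[key] = val["a"].mean()
    return result


# Hypothetical stand-in for the dictionary of dataframes.
dfs = {
    0: pd.DataFrame({"a": [1.0, 3.0]}),
    1: pd.DataFrame({"a": [10.0]}),
}

result = fun_c(dfs)
print(result)
```

With this design the call fun_c(dfs) from the question works as written, at the cost of tying the helper to the dictionary layout rather than to a single dataframe.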

Let me know if this helps!

Ramsha Siddiqui