I need to run the same processing on each month of a DataFrame. The data has to be filtered month by month, otherwise the results will be wrong.

My idea was to loop over the months, append the resulting DataFrames, and finally generate what I need for my reports, but it seems the variables cannot be named with f-strings, and I am out of ideas:


for i, a, m in it.zip_longest(range(1,3,1), range(1,3,1), range(1, 13,1)):
   f"df{i}" = df\
        .select("var0","var1","var2","varN","datetime")
        .filter((col("datetime") == f"202{str(a)}-{str(m).zfill(2)}-01") & 
                   col("datetime")==f"202{str(a)}-{str(m).zfill(2)}-01")

The idea is to create DataFrames sequentially, one per month, starting at 2022-01 and running through 2022-12, then continuing from 2023-01 through 2023-12.

I have the following error:

SyntaxError: can't assign to function call
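
Separately from the error, the month sequence described above (2022-01 through 2023-12) is a nested year/month iteration. `zip_longest` pairs its inputs positionally, so `i` and `a` become `None` after the second element instead of repeating for each month. A minimal sketch of the intended sequence, assuming only the standard library (the helper name `month_starts` is hypothetical, not part of the original code):

```python
import itertools

# Hypothetical helper: first-of-month date strings for every month
# from January of first_year through December of last_year.
def month_starts(first_year, last_year):
    return [
        f"{year}-{month:02d}-01"
        for year, month in itertools.product(
            range(first_year, last_year + 1), range(1, 13)
        )
    ]

starts = month_starts(2022, 2023)
print(starts[0], starts[11], starts[12], starts[-1])
# 2022-01-01 2022-12-01 2023-01-01 2023-12-01
```

`itertools.product` varies the last range fastest, which gives all twelve months of 2022 before moving on to 2023.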

  • `f"df{i}"` creates a string, not a dynamically named variable. – JonSG Mar 23 '23 at 13:37
  • Does this answer your question? [How do I create variable variables?](https://stackoverflow.com/questions/1373164/how-do-i-create-variable-variables) – JonSG Mar 23 '23 at 13:39
  • Will definitely take a look at using dicts for this problem, but won't storing a DF as a dict value cause trouble down the chain? I will need to convert it to a pandas DF to create graphs @JonSG – Cesar Pereira Mar 23 '23 at 14:03

1 Answer

So I found a solution that does what I need:

from pyspark.sql.functions import col, month, year

for a in range(22, 24):
    for m in range(1, 13):
        # Keep only the rows dated in month m of year 20{a}
        globals()[f'df_{a}_{m}'] = df\
        .filter((year(col("datetime")) == 2000 + a) & (month(col("datetime")) == m))\
        .select("var")

This creates a new DataFrame for each combination of year and month in the specified ranges (`df_22_1` through `df_23_12`). You can then add whatever processing you need inside the loop.
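
As suggested in the comments, a dictionary keyed by `(year, month)` is a possible alternative to writing into `globals()`. The sketch below illustrates only the pattern and runs without Spark: the `rows` list and its values are made-up stand-in data, and the list comprehension stands in for the Spark filter. With PySpark, each dictionary value would instead be the filtered DataFrame, which could later be converted with `monthly[(2022, 1)].toPandas()`.

```python
from datetime import date

# Stand-in data (hypothetical): (value, day) tuples instead of Spark rows.
rows = [
    ("a", date(2022, 1, 1)),
    ("b", date(2022, 1, 31)),
    ("c", date(2022, 2, 15)),
    ("d", date(2023, 12, 1)),
]

# One dictionary entry per (year, month) instead of one global variable each.
monthly = {}
for year in range(2022, 2024):
    for month in range(1, 13):
        monthly[(year, month)] = [
            value
            for value, day in rows
            if day.year == year and day.month == month
        ]

print(monthly[(2022, 1)])  # ['a', 'b']
print(len(monthly))        # 24
```

Keeping everything in one dictionary makes it easy to loop over all months later with `for (year, month), subset in monthly.items():`, which is awkward to do with dynamically named globals.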