I'm trying to merge two DataFrames. Because of the BlobTrigger, I have to check which file is being read. I also made the function async because execution seemed to jump from one line to another (multithreading); now Python executes the commands line by line, which makes it easier for me to follow, but if that's redundant please tell me. When it gets to pd.merge I get this error:

local variable 'Deb' referenced before assignment

async def main(myblob: func.InputStream,  outputblob: func.Out[str]) -> None:
  if  myblob.name.__contains__("Deb"):
      logging.info("Deb was found")
      Deb = read_excel_files("x", "Deb.xlsx")
      logging.info("Starting cleaning Process")        
      .....
      logging.info("Cleaning Deb is finished")
  if myblob.name.__contains__("Sach"):
      logging.info("Sach was found")
      Sach = read_excel_files("x", "Sach.xlsx")
      logging.info("Starting cleaning Process")
      ........
      logging.info("Cleaning Sach is finished")
      Konten = pd.merge(Sach, Deb, how="outer")
      outputblob.set(Konten.to_string())
      logging.info("Konten is uploaded")
    

I thought variables assigned in the first if could be accessed in the second if. I have just observed that after this line

Sach = read_excel_files("x", "Sach.xlsx")

Deb, which had a value, becomes unassigned. Should I use .copy() instead?

Mostafa Bouzari
  • "i thought the Variables that's been used in first IF can be Accessed in second IF" ... only if the first if check passed; otherwise that code did not run and the variable is unbound – Anentropic Aug 04 '22 at 15:38
  • @Anentropic the second if always gets executed – Mostafa Bouzari Aug 04 '22 at 15:38
  • Say the first `if` statement doesn't run because its condition is `False`; then the `Deb` variable was never defined. So the second `if`, even though its condition is `True` and it is executed, has no idea what the variable `Deb` is (hence "referenced before assignment"). The `merge` in your second `if` statement should only run if both `blob contains "Deb"` and `blob contains "Sach"` are `True`. – Michael S. Aug 04 '22 at 15:48
  • @MostafaBouzari yes, but if the code inside _first_ `if` is not executed then you can't access the `Deb` variable inside the second `if`, because it hasn't been set yet (it doesn't exist) – Anentropic Aug 04 '22 at 15:49
  • I just ran my code again and observed that after this line, Sach = read_excel_files("x", "Sach.xlsx"), Deb, which had a value, becomes unassigned. Should I use .copy() instead? – Mostafa Bouzari Aug 04 '22 at 15:50
  • @MichaelS. I think that when it loads the next file, all the data is lost in the BlobTrigger – Mostafa Bouzari Aug 04 '22 at 16:19
  • @Anentropic ..... – Mostafa Bouzari Aug 04 '22 at 16:19
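
The behaviour described in these comments can be reproduced without Azure or pandas. A minimal sketch, where `handle` and the string values are hypothetical stand-ins for `main` and the DataFrames:

```python
def handle(name: str) -> str:
    # deb is only bound when this branch runs.
    if "Deb" in name:
        deb = "deb data"
    if "Sach" in name:
        sach = "sach data"
        # If the first branch was skipped, deb was never assigned in this call.
        return sach + " + " + deb
    return "nothing to merge"


try:
    handle("container/Sach.xlsx")
    error_name = None
except UnboundLocalError as exc:
    # The exact message text differs between Python versions.
    error_name = type(exc).__name__
    print(error_name, exc)
```

Calling `handle` with a name that contains only "Sach" skips the first branch, so the name `deb` is unbound when the second branch reads it.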

1 Answer

Try adding an else after the first if statement that assigns an empty DataFrame (pd.DataFrame()) to Deb, and do the same for Sach after the second. Then move your merge under a third if statement that checks that neither Deb nor Sach is empty before attempting to merge.

Something like:

  if "Deb" in myblob.name:
      ...
      Deb = read_excel_files("x", "Deb.xlsx")
      ...
  else:
      Deb = pd.DataFrame()

  if "Sach" in myblob.name:
      ...
      Sach = read_excel_files("x", "Sach.xlsx")
      ...
  else:
      Sach = pd.DataFrame()

  if not Deb.empty and not Sach.empty:
      Konten = pd.merge(Sach, Deb, how="outer")
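
A self-contained version of the same guard that runs outside the Function may make it easier to test. This is only a sketch: the helper `merge_if_both` and the column names `Konto`, `Deb`, and `Sach` are made up for illustration and are not taken from the real files.

```python
import pandas as pd


def merge_if_both(deb: pd.DataFrame, sach: pd.DataFrame):
    # Merge only when both frames hold data; otherwise signal "not ready".
    if deb.empty or sach.empty:
        return None
    # With no `on=` argument, merge joins on the columns the frames share.
    return pd.merge(sach, deb, how="outer")


# Hypothetical sample data sharing the key column "Konto".
deb = pd.DataFrame({"Konto": [1, 2], "Deb": [10.0, 20.0]})
sach = pd.DataFrame({"Konto": [2, 3], "Sach": [5.0, 7.0]})

merged = merge_if_both(deb, sach)
skipped = merge_if_both(pd.DataFrame(), sach)
```

Checking `.empty` avoids the "truth value of a DataFrame is ambiguous" error that a plain `if Deb and Sach:` would raise.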
Joe Carboni
  • ValueError: The truth value of a DataFrame is ambiguous. Use a.empty, a.bool(), a.item(), a.any() or a.all(). – Mostafa Bouzari Aug 04 '22 at 16:07
  • For the above, setting the variables equal to an empty dataframe in the `else` statements and then checking `if not Deb.empty and not Sach.empty` makes more sense. – Michael S. Aug 04 '22 at 16:21
  • @MichaelS. AttributeError: 'NoneType' object has no attribute 'empty' – Mostafa Bouzari Aug 04 '22 at 16:42
  • Did you remember to set `Deb` and `Sach` equal to `pd.DataFrame()` (an empty dataframe) in the `else` statements? – Michael S. Aug 04 '22 at 16:44
  • @MichaelS. One of them is always empty, which means the BlobTrigger runs the code from the beginning once for each file I upload – Mostafa Bouzari Aug 04 '22 at 16:51
  • Good catch, everyone. I changed the assignments from None to an empty frame, and the third if now checks for emptiness. Should work now @MostafaBouzari – Joe Carboni Aug 05 '22 at 11:46
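
As the comments note, the trigger calls `main` once per uploaded blob, and local variables do not carry over between calls, so one frame is always empty. One possible workaround, sketched here under assumptions, is to load both inputs on every invocation and merge only once both exist. `STORE`, `load_table`, and `on_blob` are hypothetical stand-ins for the blob container, `read_excel_files`, and `main`; plain lists replace DataFrames so the sketch runs without Azure or pandas.

```python
from typing import Dict, List, Optional

# Hypothetical stand-in for the blob container: file name -> rows.
STORE: Dict[str, List[dict]] = {}


def load_table(filename: str) -> List[dict]:
    # Return the stored rows, or an empty list if the file is not uploaded yet.
    return STORE.get(filename, [])


def on_blob(name: str) -> Optional[List[dict]]:
    # Called once per uploaded blob; locals from earlier calls are gone.
    if "Deb" not in name and "Sach" not in name:
        return None
    # Load both inputs from storage rather than relying on a previous call.
    deb = load_table("Deb.xlsx")
    sach = load_table("Sach.xlsx")
    if not deb or not sach:
        return None  # the other file has not arrived yet
    return sach + deb  # placeholder for pd.merge(Sach, Deb, how="outer")


STORE["Deb.xlsx"] = [{"Konto": 1}]
first = on_blob("x/Deb.xlsx")    # Sach is still missing
STORE["Sach.xlsx"] = [{"Konto": 2}]
second = on_blob("x/Sach.xlsx")  # both files are now available
```

The first call returns nothing because only one file exists; the second call finds both and produces the combined result, regardless of which file arrived last.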