0

Scenario:

  1. Read an excel file with multiple sheets
  2. For each sheet at input read a few specific columns and manipulate the existing data
  3. Create an output dataframe
  4. Save this dataframe in one existing output file as a new sheet

Below is the snippet of my code:

for sheet in all_sheets: # all_sheets is the list of sheets from the input file
    print(sheet) # Here i transform the data but removed the 
    writer = pd.ExcelWriter('Output.xlsx', engine='xlsxwriter')
    df_final_4.to_excel(writer, sheet_name=sheet, index=True)
    writer.save()

Problem Statement:

  1. The file is written with only one sheet which contains the data for the last iteration.
  2. The previous iteration gets overwritten and only one sheet remains in the output file.
  3. I have tried multiple approches but each yeild the same result.

Can anyone please guide me where i am going wrong?

Other approches:

for sheet in all_sheets:
    writer = pd.ExcelWriter('Output.xlsx', engine='xlsxwriter')
    df_final_4.to_excel(writer, sheet_name=sheet, encoding='utf-8', index=True)
    writer.save()

for sheet in all_sheets[8:10]:
    print(sheet)
    writer = pd.ExcelWriter('newfile.xlsx', engine='xlsxwriter')
    df_final_4.to_excel(writer, sheet_name=sheet, index=True)
    writer.save()

wb = openpyxl.Workbook()
wb.save('Output.xlsx')
book = load_workbook('Output.xlsx')
writer = pd.ExcelWriter('Output.xlsx', engine='openpyxl') 
writer.book = book
writer.sheets = dict((ws.title, ws) for ws in book.worksheets)
df_final_4.to_excel(writer, sheet, index=True)
writer.save()
writer.close()
    
RSM
  • 645
  • 10
  • 25
  • in your for loops, try pulling writer.save() out. of the loop. That should be done after all the sheets have been written. – Jonathan Leon Jun 20 '22 at 04:16
  • tried that, but it still keeps overwriting previous sheets resulting in a workbook with only one sheet with the final iteration name. – RSM Jun 20 '22 at 04:40

2 Answers2

2

I was able to solve the issue by changing the sequence of the code:

writer = ExcelWriter('Output.xlsx')
for sheet in all_sheets:
    <do calculations>
    <do manipulations>
    <do slicing>
    
    df.to_excel(writer, sheet_name=sheet)
writer.save()

This is basically because we want to write multiple sheets in the same file so the initialization or definition of the writer should be only done once.

However, in case, only one sheet is required to be written then the writer can be defined anywhere inside the loop.

RSM
  • 645
  • 10
  • 25
0

I managed to solve this using dictionaries. create a dictionary with basic keys and the below code hellped:

df_dict = {}
for i in range(len(df)):
    df_dict[i] =  df[i]
writer = pd.ExcelWriter(path,engine = 'xlsxwriter')

for name,value in df_dict.items():
    value.to_excel(writer,sheet_name = 'x'+str(name))
    print("sheet {} written".format(name))

writer.save()

this created new sheets with the multiple dataframes.