
My coworker asked me to look at his code and try to make it go faster. The code reads some large Excel sheets, does some operations on the data, and then writes the result to an Excel sheet.

I realized the main problem in the code was the pd.ExcelFile() call: besides being slow in itself, it was being called many more times than necessary. Fixing this gave a 12x speedup.
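
A minimal sketch of the "open each workbook once and reuse it" idea, not the actual code. `make_cached_opener`, `fake_open` and the file name are hypothetical; `fake_open` is only a stand-in for an expensive call such as pd.ExcelFile so that the snippet runs without any Excel file.

```python
from typing import Any, Callable, Dict


def make_cached_opener(opener: Callable[[str], Any]) -> Callable[[str], Any]:
    """Wrap an expensive opener so that each path is opened only once."""
    cache: Dict[str, Any] = {}

    def open_once(path: str) -> Any:
        # Call the expensive opener only the first time a path is seen.
        if path not in cache:
            cache[path] = opener(path)
        return cache[path]

    return open_once


# Hypothetical stand-in for the expensive call; it records every invocation.
calls = []


def fake_open(path: str) -> dict:
    calls.append(path)
    return {"path": path}


open_workbook = make_cached_opener(fake_open)
first = open_workbook("segments.xlsx")   # opener runs here
second = open_workbook("segments.xlsx")  # served from the cache
```

With the real library, the assumption is that pd.ExcelFile would be passed in place of `fake_open`, so that later `.parse()` calls reuse the same opened object.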

Nevertheless, when running this on my coworker's computer, the improvement is not there. I profiled the code on his PC and found that on his machine the most expensive operation is building a dictionary with one entry for every sheet of the Excel file. Something like:

dts = {sheet_name: road_segment_file[i].parse(sheet_name) for sheet_name in road_segment_file[i].sheet_names}
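
To check whether the time goes into the dictionary itself or into the parse calls that fill it, the comprehension can be unrolled and each call timed separately. This is only a sketch: `timed_parse_all` and `FakeWorkbook` are hypothetical names, and the workbook is assumed to be any object with a `sheet_names` attribute and a `parse` method, as road_segment_file[i] has above.

```python
import time
from typing import Any, Dict, Tuple


def timed_parse_all(workbook: Any) -> Tuple[Dict[str, Any], Dict[str, float]]:
    """Build the same sheet dictionary, recording seconds spent per parse call."""
    sheets: Dict[str, Any] = {}
    seconds: Dict[str, float] = {}
    for name in workbook.sheet_names:
        start = time.perf_counter()
        sheets[name] = workbook.parse(name)  # the potentially slow part
        seconds[name] = time.perf_counter() - start
    return sheets, seconds


class FakeWorkbook:
    """Hypothetical stand-in so the sketch runs without an Excel file."""

    sheet_names = ["a", "b"]

    def parse(self, name: str) -> list:
        return [name] * 3


sheets, seconds = timed_parse_all(FakeWorkbook())
```

Comparing the `seconds` values from both machines would show whether individual sheets account for the difference.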

I'm using Ubuntu and running Python 3.7.6 from the shell, while he's using Windows and running Python 3.8.5 with Spyder.

I have two questions then:

  1. Any ideas on why the dictionary creation is so much slower on his machine?
  2. How can dictionary creation be sped up in general?

Thank you!

benr