0

I have a large Excel/vba workbook (~70 sheets, most of them data tables computing loads of values) that will need converting into an actual language, so it can be packaged as a web application. Some of the logic is in VBA, however most of it is actually in Excel cells/formulas/tables.

It feels large enough to consider a programmatic approach to this. So I have been looking at Pyxll. What I have in mind is along the following lines:

  1. Write a python/pyxll script that will convert sheets/tables into classes
  2. For each of those tables, columns would become either other classes and/or attributes
  3. The formulas in the cell would be extracted, named ranges would be converted to variables etc.
  4. Write out the python files based on this processing.
  5. Manual editing/code writing to make the skeleton created actually work as per the original Excel program.

This seems like a viable approach, given that pyxll claims to be able to fully interact with the workbook and to be able to do pretty much anything that VBA can do. However the docs/use case seem to be mostly explained in terms of going the other way around (e.g. implement stuff in python that can be called within Excel).

Any pyxll power-use out there? Is the plan outlined above realistic, or am I more likely to have an easier time just re-writing everything from scratch?

logicOnAbstractions
  • 2,178
  • 4
  • 25
  • 37
  • I agree that pyxll is mostly about the other way - writing Excel UDFs and GUI components in Python. It's also not free. To extract data from sheets you could manipulate Excel via the COM interface using [pywin32](https://pypi.org/project/pywin32/), [openpyxl](https://openpyxl.readthedocs.io/en/stable/) to read xlsx directly or [xlrd](https://pypi.org/project/xlrd/) for xls. – stevecu Jul 23 '21 at 23:20
  • @stevecu I can pay for a license if need be for this one-off operation. Thanks for the alternatives mentioned. Any of the 1st 2 you'd recommand over the other/reasons to do so (xlsx files involved)? Considering it's a one-off as well. The smaller the learning curve the better I'd say, as long as I can access all formulas/VBA/tables defined in excel I'm happy. – logicOnAbstractions Jul 25 '21 at 14:50
  • 1
    Openpyxl (or [xlsxwriter](https://xlsxwriter.readthedocs.io/)) can't get at the VBA modules, but Excel's COM API (with the same syntax as VBA's Application object) can via the [CodeModule](https://learn.microsoft.com/en-us/office/vba/language/reference/visual-basic-add-in-model/properties-visual-basic-add-in-model#codemodule) property. Then it will be easier to go back to a python xlsx reader library. You'll need a formula parser which [openpyxl ](https://openpyxl.readthedocs.io/en/stable/formula.html?highlight=token#parsing-formulas) can do. – stevecu Jul 26 '21 at 00:02

0 Answers0