I have to port an algorithm from an Excel sheet to python code but I have to reverse engineer the algorithm from the Excel file.
The Excel sheet is quite complicated, it contains many cells in which there are formulas that refer to other cells (that can also contains a formula or a constant).
My idea is to analyze with a python script the sheet building a sort of table of dependencies between cells, that is:
A1 depends on B4,C5,E7 formula: "=sqrt(B4)+C5*E7"
A2 depends on B5,C6 formula: "=sin(B5)*C6"
...
The xlrd python module allows to read an XLS workbook but at the moment I can access to the value of a cell, not the formula.
For example, with the following code I can get simply the value of a cell:
import xlrd
#open the .xls file
xlsname="test.xls"
book = xlrd.open_workbook(xlsname)
#build a dictionary of the names->sheets of the book
sd={}
for s in book.sheets():
sd[s.name]=s
#obtain Sheet "Foglio 1" from sheet names dictionary
sheet=sd["Foglio 1"]
#print value of the cell J141
print sheet.cell(142,9)
Anyway, It seems to have no way to get the formul from the Cell object returned by the .cell(...) method. In documentation they say that it is possible to get a string version of the formula (in english because there is no information about function name translation stored in the Excel file). They speak about formulas (expressions) in the Name and Operand classes, anyway I cannot understand how to get the instances of these classes by the Cell class instance that must contains them.
Could you suggest a code snippet that gets the formula text from a cell?