I'm having difficulties in reading a particular cell value from Excel in xlrd. Whatever value I'm reading (date value) is getting converted to a number. I know there are solutions to convert it into a python date format, but can I read directly the string value in xlrd?
-
2That's the way Excel works. Sadly, many values are stored by Excel as numbers. Please provide the details of the cell value you're reading in XLRD. Print the cell type in particular and included it in your question. – S.Lott Apr 18 '11 at 20:09
3 Answers
xlrd does NOT convert dates to float. Excel stores dates as floats.
Quoting from the xlrd documentation (scroll down a page):
Dates in Excel spreadsheets
In reality, there are no such things. What you have are floating point numbers and pious hope. There are several problems with Excel dates:
(1) Dates are not stored as a separate data type; they are stored as floating point numbers and you have to rely on (a) the "number format" applied to them in Excel and/or (b) knowing which cells are supposed to have dates in them. This module helps with (a) by inspecting the format that has been applied to each number cell; if it appears to be a date format, the cell is classified as a date rather than a number.
(2) ... When using this package’s
xldate_as_tuple()
function to convert numbers from a workbook, you must use thedatemode
attribute of theBook
object.
See also the section on the Cell class to learn about the type of cells, and the various Sheet methods which extract the type of a cell (text, number, date, boolean, etc).
Check out python-excel.org for info on other Python Excel packages.

- 5,113
- 1
- 37
- 51

- 81,303
- 11
- 141
- 189
-
I wonder how Excel knows when cells should be interpreted as dates -- it appears that xlrd is forced to guess. – Brent Bradburn Dec 21 '12 at 02:31
-
@nobar: Excel has exactly the same information available as xlrd does. Both rely on introspecting the "number format. – John Machin Dec 28 '12 at 20:39
-
4Actually, excel files store a cell type in addition to each cell value. Each type is stored as an integer, which you can access in xlrd with sheet.cell_type(rowNumber, columnNumber). 0 is blank, 1 is text, 2 is a number, 3 is a date. – Brian Mar 08 '13 at 21:16
-
See https://secure.simplistix.co.uk/svn/xlrd/trunk/xlrd/doc/xlrd.html#sheet.Cell-class for more. – Brian Mar 08 '13 at 21:49
-
1@Brian: Actually, excel files have no such thing. The 0 is empty (NOT blank), 1 is text, etc is derived by xlrd from information available in the file: for XLS files, floats are stored in NUMBER or RK or MULRK records, test is stored directly in LABEL records or indirectly in a shared string table accessed via LABELSST records, no record at all -> "empty", blank (formatted but no data) cells appear in BLANK or MULBLANKS records, etc. In XLSX files, the types are available in element attributes in the XML. – John Machin Mar 09 '13 at 03:35
-
1Ah, I've only been dealing with xlsx files and didn't realize xls didn't store that. Thanks for clarifying, John. – Brian Mar 11 '13 at 14:21
-
1@Brian: ... and the types are NOT stored as integers in XLSX files. The possible values of the t (type) attribute of the c (cell) element are: n, s, str, b, e, inlineStr. They are more of how to unpick the XML element than what data type. The integers have to be derived, just as with XLS files. – John Machin Mar 11 '13 at 20:53
well, as you say:
# reading from a xls file (no .xlsx files, no writing!)
import xlrd # install xlrd from http://pypi.python.org/pypi/xlrd
wb = xlrd.open_workbook("YOUR_FILE.xls") # xls file to read from
sh1 = wb.sheet_by_index(0) # first sheet in workbook
sh2 = wb.sheet_by_name('colors') # sheet called colors
# print all rows in first sheet
print "content of", sh1.name # name of sheet
for rownum in range(sh1.nrows): # sh1.nrows -> number of rows (ncols -> num columns)
print sh1.row_values(rownum)
# rowx and colx (x for Excel) start at 1!
print "row3 col 2:", sh1.cell(rowx=3,colx=2).value
col = sh1.col_values(0) # column 0 as a list of string or numbers
print '"A" column content:' # python index 0, 1.colunm, called A
for cell in col: print cell
print sh1.col_values(1) # 2. column, note mix of string (header) and numbers!
FOR THIS EXAMPLE THE XLS is:
sheet 1:listing
name latitude longitude status color date
Mount Hood 45.3736 121.6925 active red 01-ene-01
Mount Jefferson 44.6744 121.7978 dormant yellow 23-sep-05
Three-Fingered 44.478 121.8442 extinct green
Mount Washington 4.3325 121.8372 extinct green
South Sister 44.1036 121.7681 active red
Diamond Peak 43.5206 122.1486 extinct green
Mount Thielsen 43.1531 122.0658 extinct green
Mount Scott 42.923 122.0163 dormant yellow
Mount McLoughlin 2.445 122.3142 dormant yellow
sheet 2:colors
status color
active red
dormant yellow
extinct green

- 24,683
- 80
- 235
- 386
-
When i do a sh1.row_values(rownum) I'm getting list with the values already converted to floating point. – Rnet Apr 18 '11 at 20:02
Excel stores dates as numbers both internally and in .xls files and then formats them accordingly when displaying. Thus, if you read them naively with xlrd, you will get either numbers or strings. What you should do is check what the type of a cell is and then convert the number yourself. Either using xlrd's built-in functions, such as xldate_as_tuple()
, or your own function.
Refer to this question for some more details.