I'm using xlrd to sniff through a directory structure and pull out spreadsheets, reading the second row (row 1) down to "do stuff." The issue is that I don't know how to stop the reading/printing at the first empty row. I understand that rows are not "empty" objects, but I'd appreciate a little help showing how to check if all cells are empty. Here is the code I'm working with:
import xlrd
import os
def excel_file_filter(filename, extensions=['.xls', '.xlsx']):
return any(filename.endswith(e) for e in extensions)
def get_filenames(root):
filename_list = []
for path, subdirs, files in os.walk(root):
for filename in filter(excel_file_filter, files):
filename_list.append(os.path.join(path, filename))
return filename_list
spreadsheets = get_filenames('C:\\Temp')
for s in spreadsheets:
with xlrd.open_workbook(s) as wb:
cs = wb.sheet_by_index(0)
num_cols = cs.ncols
for row_index in range(1, cs.nrows):
print('Row: {}'.format(row_index))
for col_index in range(0, num_cols):
cell_object = cs.cell(row_index, col_index)
if cell_obj is not xlrd.empty_cell:
print('Col #: {} | Value: {}'.format(col_index, cell_obj))
What ends up happening is that it prints all the way through almost 1000 rows, when only the first say, 25 rows have content in them. The amount of content between the spreadsheets varies, so a generic solution (that does not rely on other optional libraries) that helps me understand how to detect an empty row and then break, would be appreciated.