
I am unfamiliar with parsing Excel files, and I have been told that I should use xlrd.

I want to pull out all the rows that contain the keyword "predicted".

The data is on Sheet1 and has 4 columns (A-D). For the rows that contain the keyword, I need columns C and D (although the whole row would work if that is easier).

The file is large (over 1700 rows) and is in .xlsx format. I am working in Enthought Canopy with Python 3.3.

Thank you for any help.

M.Smith12

1 Answer


The following should help to get you started:

import xlrd
from operator import itemgetter

keyword = "predicted"   # The keyword to search for

workbook = xlrd.open_workbook(r"input.xlsx")
sheet = workbook.sheet_by_index(0)
rows = [sheet.row(row) for row in range(sheet.nrows)]   # Read in all rows

# Keep only the rows where the keyword appears in column C or D (indexes 2 and 3)
get_columns = itemgetter(2, 3)
rows = [row for row in rows if keyword in ' '.join(str(col.value) for col in get_columns(row))]

for row in rows:
    print(row)

It opens the first sheet in the xlsx file and creates a list containing all of the rows. For each row, it joins the cell values from the required columns (C and D, which are indexes 2 and 3) into one string and checks whether the keyword is present. If it is, the row is kept. Lastly, it prints all of the matching rows.
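The same steps can also be sketched with the matching test pulled out into a small function, which makes it easy to try on plain data. This is only a sketch under a few assumptions: the names row_matches and read_matching_rows are made up for illustration, the reading part uses openpyxl instead of xlrd (xlrd 2.0 and later only read .xls files, so this swap assumes openpyxl is available to you), and each cell is checked on its own rather than joining C and D into one string.

```python
def row_matches(values, keyword, columns=(2, 3)):
    """Return True if keyword occurs in any of the chosen columns (0-based)."""
    picked = (values[i] for i in columns if i < len(values))
    # Empty cells come back as None, so skip them before the text search.
    return any(keyword in str(v) for v in picked if v is not None)


def read_matching_rows(path, keyword, columns=(2, 3)):
    """Yield each matching row of the first sheet as a tuple of plain values."""
    # Imported lazily; assumes openpyxl is installed (not part of the original answer).
    from openpyxl import load_workbook

    workbook = load_workbook(path, read_only=True)
    try:
        sheet = workbook.worksheets[0]
        for values in sheet.iter_rows(values_only=True):
            if row_matches(values, keyword, columns):
                yield values
    finally:
        workbook.close()


# Trying the matching test on in-memory rows, no workbook needed.
sample = [
    ("id1", "x", "predicted protein", 0.9),
    ("id2", "y", "known protein", 0.1),
]
matches = [row for row in sample if row_matches(row, "predicted")]
```

With a real file, something like `for row in read_matching_rows("input.xlsx", "predicted"): print(row[2], row[3])` would print columns C and D of each matching row.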

Note that the entries in rows are still xlrd Cell objects, so you can do further processing on them, for example by reading each cell's value attribute.
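As a rough illustration of that further processing, the snippet below pulls the plain values of columns C and D out of each row. To keep it runnable without a workbook, it uses a stand-in FakeCell type (hypothetical, not part of xlrd) that has the same ctype and value attributes as an xlrd cell; with the real rows list from above you would pass those rows in instead.

```python
from collections import namedtuple

# Hypothetical stand-in for xlrd's Cell, so the example runs without a workbook.
# In xlrd, ctype 1 means text and ctype 2 means number.
FakeCell = namedtuple("FakeCell", ["ctype", "value"])


def columns_c_and_d(row):
    """Return the plain values of columns C and D (indexes 2 and 3) of one row."""
    return row[2].value, row[3].value


rows = [
    [FakeCell(1, "a"), FakeCell(1, "b"), FakeCell(1, "predicted gene"), FakeCell(2, 0.87)],
]

pairs = [columns_c_and_d(row) for row in rows]
print(pairs)   # [('predicted gene', 0.87)]
```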

Martin Evans