KeyError when filtering excel data with pandas

Question

I am trying to read and filter data from an Excel file in Python. I used the code below:

import pandas as pd
import numpy as np
df = pd.read_excel('file.xlsx') 
df['apples'] = (pd.cut(df['apples'],bins=[-np.inf,2,5,np.inf],labels=['WOW','ok','BOB']))
print(df)

This is my excel file

But it raises KeyError: 'apples'. Do you have any advice on how I can fix this?

When I tried this, I got the error "TypeError: list indices must be integers, not list" @MaxU — OykuA, Feb 17 '17 at 11:49
So please provide the requested output to find out what is actually being read. — languitar, Feb 17 '17 at 11:57
@OykuA, I'd suggest posting a link to the uploaded Excel file so we can reproduce this error... — MaxU - stand with Ukraine, Feb 17 '17 at 12:06
My actual data looks almost like this: https://i.stack.imgur.com/iXhRt.png @languitar — OykuA, Feb 17 '17 at 12:06
@OykuA, do you expect people to type your data manually from that screenshot? ;-) — MaxU - stand with Ukraine, Feb 17 '17 at 12:10
It seems you need `df = pd.read_excel('file.xlsx', skiprows=1)` or `df = pd.read_excel('file.xlsx', header=1)` — jezrael, Feb 17 '17 at 12:11
@OykuA, please read [how to make good reproducible pandas examples](http://stackoverflow.com/questions/20109391/how-to-make-good-reproducible-pandas-examples) — MaxU - stand with Ukraine, Feb 17 '17 at 12:14
Thank you for your response @jezrael ` df = pd.read_excel('file.xlsx', skiprows=1, parse_cols= "Q") ` worked. — OykuA, Feb 17 '17 at 12:33

score 1 · Answer 1 · answered Feb 17 '17 at 12:01

Do you also want to modify the xlsx file, or do you just want to read it and apply some code to it? In the second case you could drop the column:

df = df.drop(columns=['apples'])

And then build an input array from the other columns:

inputX = df.loc[:, ['oranges', 'lemons']].to_numpy()

It depends on what you want to do with it.
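For reference, here is a minimal sketch of dropping a column and converting the remaining ones to a NumPy array. The data is made up for illustration, and the `features` variable is a hypothetical name; only the column names come from this answer. Note that `drop` looks at row labels unless you pass `columns=` (or `axis=1`), and that `to_numpy()` is the replacement for `as_matrix()`, which was removed in pandas 1.0.

```python
import pandas as pd

# Hypothetical data; the real values live in the asker's Excel file.
df = pd.DataFrame({
    'apples': [1, 3, 6],
    'oranges': [4, 5, 2],
    'lemons': [7, 8, 9],
})

# columns= targets column labels; a bare drop(['apples']) would search the row index.
features = df.drop(columns=['apples'])

# Select the remaining columns explicitly and convert them to a 2-D NumPy array.
inputX = features.loc[:, ['oranges', 'lemons']].to_numpy()
print(inputX.shape)
```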

score 1 · Accepted Answer · answered Feb 17 '17 at 12:38

The problem is that your header spans 2 rows, so by default the DataFrame's column names are taken from the first row.

So you need to skip this first row:

df = pd.read_excel('file.xlsx', skiprows=1)

Or:

df = pd.read_excel('file.xlsx', header=1)
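To see the effect without the original workbook, here is a small sketch that mimics a two-row header. It is an assumption-laden stand-in: the sheet contents are invented, and `pd.read_csv` on an in-memory string is used in place of `pd.read_excel` so the example runs without an Excel file. Both readers accept `header` and `skiprows`, which is the only behaviour relied on here.

```python
import io

import numpy as np
import pandas as pd

# Hypothetical sheet contents: a title row sitting above the real header row.
raw = "Fruit report,,\napples,oranges,lemons\n1,4,7\n3,5,8\n6,2,9\n"

# Default header=0: the title row becomes the column names, so 'apples' is not a column.
bad = pd.read_csv(io.StringIO(raw))
print(list(bad.columns))

# header=1: the second row supplies the column names.
df = pd.read_csv(io.StringIO(raw), header=1)
print(list(df.columns))

# skiprows=1 drops the title row first, which gives the same columns here.
same = pd.read_csv(io.StringIO(raw), skiprows=1)

# With 'apples' present, the binning from the question no longer raises KeyError.
df['apples'] = pd.cut(df['apples'], bins=[-np.inf, 2, 5, np.inf],
                      labels=['WOW', 'ok', 'BOB'])
print(df)
```

Printing `df.columns` right after reading the file is a quick way to check which row pandas actually used as the header.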

answered Feb 17 '17 at 12:38

jezrael
