Skipping range of rows after header through pandas.read_excel

Question

I know the argument usecols in pandas.read_excel() allows you to select specific columns.

Say, I read an Excel file in with pandas.read_excel(). My excel spreadsheet has 1161 rows. I want to keep the 1st row (with index 0), and skip rows 2:337. Seems like the argument skiprows works only when 0 indexing is involved. I tried several different ways but my code always produces an output where all my 1161 rows are read rather than only after the 337th row on. Such as this:

documentationscore_dataframe = pd.read_excel("Documentation Score Card_17DEC2015 Rev 2 17JAN2017.xlsx",
                                        sheet_name = "Sheet1",
                                        skiprows = "336",
                                        usecols = "H:BD")

Here is another attempt:

documentationscore_dataframe = pd.read_excel("Documentation Score Card_17DEC2015 Rev 2 17JAN2017.xlsx",
                                        sheet_name = "Sheet1",
                                        skiprows = "1:336",
                                        usecols = "H:BD")

I would like the dataframe to exclude rows 2 through 337 in the original Excel import.

jpp · Accepted Answer · 2018-04-12T16:05:06.967

34

As per the documentation for pandas.read_excel, skiprows must be list-like.

Try this instead to exclude rows 1 to 336 inclusive:

df = pd.read_excel("file.xlsx",
                   sheet_name = "Sheet1",
                   skiprows = range(1, 337),
                   usecols = "H:BD")

Note: range constructor is considered list-like for this purpose, so no explicit list conversion is necessary.

edited Apr 12 '18 at 16:05

answered Apr 12 '18 at 16:01

jpp

159,742
34
281
339

4

+1 for expliciting it's list-like, resolved my problem to read the header but skip first row just after it (`pd.read_excel(path, skiprows=[1])`) – mrbTT Feb 19 '19 at 14:10

score 0 · Answer 2 · answered Feb 17 '23 at 23:57

0

You can also pass a function to skiprows=. For example, to skip the first 336 rows (after the header row):

df = pd.read_excel('Book1.xlsx', sheet_name='Sheet1', skiprows=lambda x: 1<=x<=336)

answered Feb 17 '23 at 23:57

cottontail

10,268
18
50
51

Skipping range of rows after header through pandas.read_excel

2 Answers2

Linked