Pandas read excel sheet with multiple header in row and columns and merged cells

Question

I'm new using pandas and I'm tring to use pandas read excel to work with a file as a df. The spreadsheet looks like this:

Excel Matrix

The problem is that this file contains double headers in the colums and rows and the first header for each of them include merged cells. I tried this:

file = 'country_sector_py.xlsx' 

matrix = pd.read_excel(file, sheet_name = 'matrix', header=[0, 1], index_col=[0, 1])

the error I get is "ValueError: Length of new names must be 1, got 2." I've read some related posts that says it's due to some the headers are repeated, but I haven't been able to solve it. any guide would be much appreciated.

References:

Pandas read excel sheet with multiple header when first column is empty

Error when using pandas read_excel(header=[0,1])

MDR · Accepted Answer · 2021-08-07T20:34:31.490

Not an answer, but to post more details than comments allow...

Using your code I cannot recreate.

import pandas as pd

df = pd.read_excel('matrix.xlsx', sheet_name = 'matrix', header=[0,1], index_col=[0, 1])

df

Worst I get is copying 'region 2' twice doesn't show again and also messes up the sub-columns numbering. Example:

Must be something else in your file. Share it if you can, else look around inside it, or even open and perhaps save as a different Excel version (maybe XLSM or if that then not than).

Maybe worth checking the version of Pandas with pip show pandas

>>>># pip show pandas
Name: pandas
Version: 1.3.0
Summary: Powerful data structures for data analysis, time series, and statistics
Home-page: https://pandas.pydata.org
Author: The Pandas Development Team
Author-email: pandas-dev@python.org

It was because I had an old version of pandas, thank you – Cris_LbpdA Aug 09 '21 at 14:53 — Cris_LbpdA, Aug 09 '21 at 14:53

Pandas read excel sheet with multiple header in row and columns and merged cells

1 Answers1