0

I'm new using pandas and I'm tring to use pandas read excel to work with a file as a df. The spreadsheet looks like this:

Excel Matrix

The problem is that this file contains double headers in the colums and rows and the first header for each of them include merged cells. I tried this:

file = 'country_sector_py.xlsx' 

matrix = pd.read_excel(file, sheet_name = 'matrix', header=[0, 1], index_col=[0, 1])

the error I get is "ValueError: Length of new names must be 1, got 2." I've read some related posts that says it's due to some the headers are repeated, but I haven't been able to solve it. any guide would be much appreciated.

References:

Pandas read excel sheet with multiple header when first column is empty

Error when using pandas read_excel(header=[0,1])

Cris_LbpdA
  • 3
  • 1
  • 2

1 Answers1

1

Not an answer, but to post more details than comments allow...

Using your code I cannot recreate.

import pandas as pd

df = pd.read_excel('matrix.xlsx', sheet_name = 'matrix', header=[0,1], index_col=[0, 1])

df

Worst I get is copying 'region 2' twice doesn't show again and also messes up the sub-columns numbering. Example:

enter image description here

Must be something else in your file. Share it if you can, else look around inside it, or even open and perhaps save as a different Excel version (maybe XLSM or if that then not than).

Maybe worth checking the version of Pandas with pip show pandas

>>>># pip show pandas
Name: pandas
Version: 1.3.0
Summary: Powerful data structures for data analysis, time series, and statistics
Home-page: https://pandas.pydata.org
Author: The Pandas Development Team
Author-email: pandas-dev@python.org
MDR
  • 2,610
  • 1
  • 8
  • 18