I have the file shown below.
As you can see, it has several layers of headers. How can I read this file in R or Python so that I get it in a proper format for processing?
You can skip the header rows and specify the column names yourself when reading the file with pandas.
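import pandas as pd
file_name = r"/foo/bar/data.xlsx"
# One name per data column, in the order the columns appear in the sheet.
columns = ["Foo", "Bar", "Baz"]
# Skip the first 7 rows, treat nothing as a header, and use the names above.
df = pd.read_excel(file_name, header=None, skiprows=7, names=columns)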
To set multi-level columns:
df = pd.DataFrame({'Foo':[1,2,3],'Bar':[2,4,6], "Baz": [3, 6, 9]})
columns = [("Cereals", "Rice", "Autumn"), ("Cereals", "Rice", "Summer"), ("Cereals", "Wheat", "Winter")]
df.columns = pd.MultiIndex.from_tuples(columns)
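Once the columns are a MultiIndex, you can select by level or collapse the levels back into single names for processing. A minimal sketch, reusing the example frame from above; the level names "group", "crop" and "season" are made up for illustration:

```python
import pandas as pd

df = pd.DataFrame({"Foo": [1, 2, 3], "Bar": [2, 4, 6], "Baz": [3, 6, 9]})
df.columns = pd.MultiIndex.from_tuples(
    [
        ("Cereals", "Rice", "Autumn"),
        ("Cereals", "Rice", "Summer"),
        ("Cereals", "Wheat", "Winter"),
    ],
    names=["group", "crop", "season"],  # hypothetical level names
)

# Pick every column whose "crop" level is "Rice"; that level is dropped from the result.
rice = df.xs("Rice", axis=1, level="crop")

# Collapse the three levels into one flat name per column.
flat = df.copy()
flat.columns = ["_".join(col) for col in df.columns]
```

The flat names are easier to work with if later steps expect ordinary string column labels.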
In pandas, you could look at hierarchical indexing (MultiIndex): http://pandas.pydata.org/pandas-docs/stable/advanced.html
But since you're after proper headings, do as "Batman" said above: read the file in and apply your own column headings.
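If you would rather keep the layered headings than replace them, pandas can also build the MultiIndex itself when `header` is given a list of row positions. A minimal sketch, assuming three header rows; it uses `read_csv` on an in-memory string so it runs without a file (the sample data is invented), and `read_excel` accepts the same `header` list:

```python
import io
import pandas as pd

# Invented sample: three header rows followed by the data rows.
raw = """Cereals,Cereals,Cereals
Rice,Rice,Wheat
Autumn,Summer,Winter
1,2,3
2,4,6
3,6,9
"""

# header=[0, 1, 2] tells pandas to combine rows 0-2 into a three-level MultiIndex.
df = pd.read_csv(io.StringIO(raw), header=[0, 1, 2])
print(df.columns.tolist())
```

In a real spreadsheet the row positions depend on how many lines come before the headings, and blank cells left by merged headers may need cleaning up afterwards.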