I have an ascii file which contains in the first column the id of the day and in the following columns it contains values of each hour of the day (24 hours/columns). A sample of the file is shown below.
day h1 h2 h3 h4 h5 h6 h7 h8 h9 h10 h11 h12 h13 h14 h15 h16 h17 h18 h19 h20 h21 h22 h23 h24
1 0.4 0.4 0.2 0.2 0.4 0.8 0.9 0.8 1.2 1.4 1.8 1.4 1.7 2 1.8 2 1.5 1.4 1.4 1.1 1.5 1.5 1.2 0.9
2 0.7 0.6 0.3 0.4 0.8 1.4 2 1.7 1.5 1 1.2 0.8 0.8 0.8 0.6 2.2 3.2 2.7 4 5.2 6 5.6 5.4 5.6
I want to create a file that all the data values (24 hours) will be written in one column:
day hour val
1 h1 0.4
1 h2 0.4
1 h3 0.2
1 h4 0.2
1 h5 0.4
1 h6 0.8
1 h7 0.9
1 h8 0.8
1 h9 1.2
1 h10 1.4
1 h11 1.8
1 h12 1.4
1 h13 1.7
1 h14 2
1 h15 1.8
1 h16 2
1 h17 1.5
1 h18 1.4
1 h19 1.4
1 h20 1.1
1 h21 1.5
1 h22 1.5
1 h23 1.2
1 h24 0.9
2 h1 0.7
2 h2 0.6
2 h3 0.3
2 h4 0.4
2 h5 0.8
2 h6 1.4
2 h7 2
2 h8 1.7
2 h9 1.5
2 h10 1
2 h11 1.2
2 h12 0.8
2 h13 0.8
2 h14 0.8
2 h15 0.6
2 h16 2.2
2 h17 3.2
2 h18 2.7
2 h19 4
2 h20 5.2
2 h21 6
2 h22 5.6
2 h23 5.4
2 h24 5.6
I use df.melt as following :
df = pd.read_csv(i)
df_transposed = df.melt(id_vars='day', var_name='hour', value_name='val').sort_values('day')
df_transposed.to_csv(outfile)
but I get the following error:
df_transposed = df.melt(id_vars='day', var_name='hour', value_name='val').sort_values('day')
File "/usr/local/lib/python3.9/site-packages/pandas/core/frame.py", line 7166, in melt
return melt(
File "/usr/local/lib/python3.9/site-packages/pandas/core/reshape/melt.py", line 122, in melt
id_data = frame.pop(col)
File "/usr/local/lib/python3.9/site-packages/pandas/core/frame.py", line 4371, in pop
return super().pop(item=item)
File "/usr/local/lib/python3.9/site-packages/pandas/core/generic.py", line 661, in pop
result = self[item]
File "/usr/local/lib/python3.9/site-packages/pandas/core/frame.py", line 2906, in __getitem__
indexer = self.columns.get_loc(key)
File "/usr/local/lib/python3.9/site-packages/pandas/core/indexes/base.py", line 2900, in get_loc
raise KeyError(key) from err
KeyError: 'day'
Do I have to define in another way the df.melt command?
Edit:
The df.columns command gives the following:
Index(['day\t"H1"\t"H2"\t"H3"\t"H4"\t"H5"\t"H6"\t"H7"\t"H8"\t"H9"\t"H10"\t"H11"\t"H12"\t"H13"\t"H14"\t"H15"\t"H16"\t"H17"\t"H18"\t"H19"\t"H20"\t"H21"\t"H22"\t"H23"\t"H24"'], dtype='object')
Traceback (most recent call last):
File "/usr/local/lib/python3.9/site-packages/pandas/core/indexes/base.py", line 2898, in get_loc
return self._engine.get_loc(casted_key)
File "pandas/_libs/index.pyx", line 70, in pandas._libs.index.IndexEngine.get_loc
File "pandas/_libs/index.pyx", line 101, in pandas._libs.index.IndexEngine.get_loc
File "pandas/_libs/hashtable_class_helper.pxi", line 1675, in pandas._libs.hashtable.PyObjectHashTable.get_item
File "pandas/_libs/hashtable_class_helper.pxi", line 1683, in pandas._libs.hashtable.PyObjectHashTable.get_item
KeyError: 'day'
Edit2: The above problem solved with this:
df = pd.read_csv(i, sep='\t')
print (df.columns)
df_transposed = df.melt(id_vars='day', var_name='hour', value_name='val').sort_values('day')
df_transposed.to_csv(outfile, sep='\t')
However, I want also to order the output values by day and hour. I tried the following:
df.melt(id_vars='day', var_name='hour', value_name='val').sort_values(['day','hour'],ascending=True)
but the output file is something like below. Is it possible to order the hours by H1 to H24 and not alphabetically?
day hour val
0 1 H1 0.4
3285 1 H10 -9999.0
3650 1 H11 -9999.0
4015 1 H12 -9999.0
4380 1 H13 -9999.0
4745 1 H14 -9999.0
5110 1 H15 -9999.0
5475 1 H16 -9999.0
5840 1 H17 -9999.0
6205 1 H18 -9999.0
6570 1 H19 -9999.0
365 1 H2 0.4
6935 1 H20 -9999.0
7300 1 H21 -9999.0
7665 1 H22 -9999.0
8030 1 H23 -9999.0
8395 1 H24 -9999.0
730 1 H3 0.6
1095 1 H4 0.6
1460 1 H5 0.4
1825 1 H6 0.4
2190 1 H7 0.3
2555 1 H8 -9999.0
2920 1 H9 -9999.0
1 2 H1 -9999.0
3286 2 H10 -9999.0
3651 2 H11 -9999.0
4016 2 H12 -9999.0
4381 2 H13 -9999.0
4746 2 H14 -9999.0
5111 2 H15 -9999.0
5476 2 H16 -9999.0
5841 2 H17 -9999.0
6206 2 H18 -9999.0
6571 2 H19 -9999.0
366 2 H2 -9999.0
6936 2 H20 -9999.0
7301 2 H21 -9999.0
7666 2 H22 -9999.0
8031 2 H23 -9999.0
8396 2 H24 -9999.0
731 2 H3 -9999.0
1096 2 H4 -9999.0
1461 2 H5 -9999.0
1826 2 H6 -9999.0
2191 2 H7 -9999.0
2556 2 H8 -9999.0
2921 2 H9 -9999.0