I have a list of dictionaries with spectral data inside the responses field. I also have an array of wavelengths for labelling the columns of the spectral data. The input list looks like this:
import numpy as np

data = [
    {
        'date': '2018-01-01',
        'measurement': 100,
        'responses': [
            (1, 1, np.array([1, 2, 3])),
            (2, 1, np.array([4, 5, 6])),
        ],
    },
    {
        'date': '2018-01-02',
        'measurement': 200,
        'responses': [
            (3, 1, np.array([5, 6, 7])),
            (4, 1, np.array([8, 9, 10])),
        ],
    },
]
And the wavelengths to use as the matching column names:
wavelengths = [400, 401, 402]
I would like to convert this list to two pandas dataframes:
- One where the arrays in the response tuples are averaged per dictionary, and
- One where they are kept separate (one row per tuple), with the first two numbers of each tuple in responses included as the prong and scan columns.
The desired output for both is below:
__Average Dataframe__
index | date         | measurement | 400 | 401 | 402 |
0     | '2018-01-01' | 100         | 2.5 | 3.5 | 4.5 |
1     | '2018-01-02' | 200         | 6.5 | 7.5 | 8.5 |
__Separate Dataframe__
index | date         | measurement | prong | scan | 400 | 401 | 402 |
0     | '2018-01-01' | 100         | 1     | 1    | 1   | 2   | 3   |
1     | '2018-01-01' | 100         | 2     | 1    | 4   | 5   | 6   |
2     | '2018-01-02' | 200         | 3     | 1    | 5   | 6   | 7   |
3     | '2018-01-02' | 200         | 4     | 1    | 8   | 9   | 10  |
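For illustration, a minimal loop-based sketch that should produce both frames from the input above is shown here. The helper names build_separate and build_average are hypothetical (they are not pandas functions), and this is only a baseline under the assumption that every spectrum has the same length as wavelengths; it is probably not the most efficient approach.

```python
import numpy as np
import pandas as pd

data = [
    {'date': '2018-01-01', 'measurement': 100,
     'responses': [(1, 1, np.array([1, 2, 3])),
                   (2, 1, np.array([4, 5, 6]))]},
    {'date': '2018-01-02', 'measurement': 200,
     'responses': [(3, 1, np.array([5, 6, 7])),
                   (4, 1, np.array([8, 9, 10]))]},
]
wavelengths = [400, 401, 402]


def build_separate(data, wavelengths):
    """Hypothetical helper: one row per (prong, scan, spectrum) tuple."""
    rows = []
    for record in data:
        for prong, scan, spectrum in record['responses']:
            row = {'date': record['date'],
                   'measurement': record['measurement'],
                   'prong': prong,
                   'scan': scan}
            # Pair each wavelength label with its spectral value.
            row.update(zip(wavelengths, spectrum))
            rows.append(row)
    return pd.DataFrame(rows)


def build_average(data, wavelengths):
    """Hypothetical helper: one row per dictionary, spectra averaged."""
    rows = []
    for record in data:
        spectra = [spectrum for _, _, spectrum in record['responses']]
        # Element-wise mean across all spectra of this record.
        mean_spectrum = np.mean(spectra, axis=0)
        row = {'date': record['date'],
               'measurement': record['measurement']}
        row.update(zip(wavelengths, mean_spectrum))
        rows.append(row)
    return pd.DataFrame(rows)


separate_df = build_separate(data, wavelengths)
average_df = build_average(data, wavelengths)
```

This builds plain Python dictionaries first and hands them to the DataFrame constructor once, rather than appending to a dataframe row by row.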
What is the most efficient way to do this in pandas?