How do I linearly down-sample one dataframe of counts at some distribution of diameters, logged at the lower bound (so the first entry is 0 counts between 296.54 and 303.14 nm, the second entry is 1 count between 303.14 and 311.88 nm, etc.):
296.54 303.14 311.88 320.87 ... 359.49 369.86 380.52 391.49
a 0 1 2 3 ... 7 8 9 10
b 11 12 13 14 ... 18 19 20 21
c 22 23 24 25 ... 29 30 31 32
d 33 34 35 36 ... 40 41 42 43
e 44 45 46 47 ... 51 52 53 54
f 55 56 57 58 ... 62 63 64 65
g 66 67 68 69 ... 73 74 75 76
h 77 78 79 80 ... 84 85 86 87
i 88 89 90 91 ... 95 96 97 98
j 99 100 101 102 ... 106 107 108 109
to a new dataframe by resampling the counts onto a coarser set of diameters, like this (first entry is counts between 300 and 325 nm, etc.):
300 325 350 375
a 4.34 (interp. sum btwn 300 and 325)   btwn 325 and 350   btwn 350 and 375   btwn 375 and 400
b and so on
c
d
e
f
g
h
i
j
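(To spell out the 4.34 for row a: assuming counts are spread uniformly within each source bin, the 300-325 bin picks up 0 × (303.14 − 300)/(303.14 − 296.54) from the partially covered first bin, all of the 1 and 2 counts from the two bins fully inside, and 3 × (325 − 320.87)/(330.12 − 320.87) ≈ 1.34 from the partially covered 320.87-330.12 bin, giving 0 + 1 + 2 + 1.34 ≈ 4.34.)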
Is there a Pandas function like interpolate, but for downsampling by a linear sum rather than upsampling?
I tried something like this:
import string
import numpy as np
import pandas as pd

# 10 rows (a-j) x 11 diameter columns, counts 0-109
test_array = np.arange(110).reshape(10, 11)
index_list = list(string.ascii_lowercase)[:10]
df = pd.DataFrame(test_array, index=index_list)
df.columns = [296.54, 303.14, 311.88, 320.87, 330.12, 339.63,
              349.42, 359.49, 369.86, 380.52, 391.49]
new_columns = [300, 325, 350, 375]
new_df = df.groupby(new_columns, axis=1).sum()
But that doesn't work: it raises the obvious KeyError, since the four new edges aren't existing column labels that groupby can find. One proposed solution was to group by index location instead, but that doesn't interpolate across the partially overlapping bins.
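To illustrate the maths I'm after, here's a minimal sketch of one approach I'm imagining (continuing from the df above): build the cumulative count curve at the old bin edges, linearly interpolate it at the new edges with np.interp, then take differences. The rebin_counts helper is my own, and the 402.77 upper edge of the last source bin is a made-up assumption, since my data only lists lower bounds:

import numpy as np

def rebin_counts(df, old_upper, new_edges):
    # Columns of df are the lower edges of the source bins; old_upper is
    # the assumed upper edge of the last source bin. new_edges lists the
    # target bin edges (lower edges plus one final upper edge).
    old_edges = np.append(df.columns.to_numpy(dtype=float), old_upper)
    new_edges = np.asarray(new_edges, dtype=float)

    def rebin_row(row):
        # Cumulative counts at each old edge: 0 at the first edge,
        # then a running total of the bin counts.
        cum = np.concatenate(([0.0], np.cumsum(row.to_numpy(dtype=float))))
        # Interpolate the cumulative curve at the new edges and take
        # differences to get the counts falling in each new bin.
        return np.diff(np.interp(new_edges, old_edges, cum))

    out = df.apply(rebin_row, axis=1, result_type="expand")
    out.columns = new_edges[:-1]  # label each new bin by its lower edge
    return out

# Bins 300-325, 325-350, 350-375, 375-400; 402.77 is an assumed upper
# edge for the last source bin (my real data has the next diameter).
new_df = rebin_counts(df, old_upper=402.77, new_edges=[300, 325, 350, 375, 400])
print(new_df)  # row a starts with 4.34, as in the table above

(np.interp clamps outside the old edge range, so any part of a new bin lying beyond the data simply contributes zero counts.) Is there a cleaner built-in way to do this?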
Many thanks.