I have a large .txt file and I'd like to read each column as a list. The file has 11 columns of whitespace-delimited floats; the first line (of a few thousand) is:
0.49406565E-323 0.29532530E+003 0.89244837E+001 0.20901651E-002 0.34989878E+001 0.11594090E+000 0.34025716E-001 0.33723126E+001 0.27954433E+000 0.80757378E-001 0.50813056E+001
I'm reading my file like this:
import pandas

colnames = ['weight', 'likelihood', 'A_0', 'w_0', 'p_0', 'A_1', 'w_1', 'p_1', 'A_2', 'w_2', 'p_2']
data = pandas.read_csv('data.txt', names=colnames)
weights = data.weight.tolist()
A_0 = data.A_0.tolist()
The first column is the weight and the rest are parameters; I want to compute a weighted average of each parameter with respect to the weights.
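That is, for each parameter column x I'm after sum(w_i * x_i) / sum(w_i), with w the weight column (assuming I have the weighted-average definition right).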
But if I print weights, for example, it returns the entire file, and weights[0] is the entire first row of the file rather than a single value.
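To show what I mean, printing the first entry gives back (roughly) the whole line as one value:

print(weights[0])
# prints something like the entire first line as a single entry:
# 0.49406565E-323 0.29532530E+003 ... 0.50813056E+001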
For completeness, my weighted average would be something like:

import numpy
weighted_A_0 = numpy.average(A_0, weights=weights)
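Extended to all nine parameters, I imagine something along these lines (just a sketch, assuming the columns get read in correctly as floats):

import numpy
import pandas

colnames = ['weight', 'likelihood', 'A_0', 'w_0', 'p_0',
            'A_1', 'w_1', 'p_1', 'A_2', 'w_2', 'p_2']
data = pandas.read_csv('data.txt', names=colnames)

weights = data.weight.tolist()
# weighted average of every parameter column (everything after 'likelihood'),
# all weighted by the first column
weighted = {name: numpy.average(data[name].tolist(), weights=weights)
            for name in colnames[2:]}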
Perhaps there's a neater way with pandas and numpy?
Thanks!