I have a simple HDF5 file (created by PyTables) with ten columns and 100,000 rows. I need to apply a simple linear equation to every value, with different parameters per column, and write the results to CSV.
My naive approach was to loop over the table:
for row in table.iterrows():
    print "%f,%f,..." % (row['a'] * 1.0 + 2.0, row['b'] * 3.0 + 4.0, ...)
But I wonder whether it would be more efficient to select whole columns, compute each one as an array, and then iterate over the resulting arrays:
a = numpy.add(numpy.multiply(table.cols.a, 1.0), 2.0)
b = numpy.add(numpy.multiply(table.cols.b, 3.0), 4.0)
But this seems to be even slower.
What is the best way to do this?
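For reference, here is a minimal, self-contained sketch of the column-wise idea, assuming the whole table fits in memory. A small structured NumPy array stands in for what `table.read()` would return, so the snippet runs without an HDF5 file. The names `PARAMS` and `transform` are hypothetical, and only the parameters for columns `a` and `b` come from the question. Whether this is actually faster on the real file is something I have not measured.

```python
import io

import numpy

# Hypothetical per-column (slope, offset) pairs; only 'a' and 'b' are from the question.
PARAMS = {'a': (1.0, 2.0), 'b': (3.0, 4.0)}


def transform(data, params=PARAMS):
    """Apply value * slope + offset to each named column of a structured array."""
    names = sorted(params)
    out = numpy.empty((len(data), len(names)))
    for j, name in enumerate(names):
        slope, offset = params[name]
        # One vectorized operation per column instead of one per row.
        out[:, j] = data[name] * slope + offset
    return out


# Stand-in for table.read(): a structured array with the same column names.
data = numpy.zeros(3, dtype=[('a', 'f8'), ('b', 'f8')])
data['a'] = [0.0, 1.0, 2.0]
data['b'] = [1.0, 2.0, 3.0]

result = transform(data)

# Write all rows in one call; a real file name could be passed instead of the buffer.
buf = io.StringIO()
numpy.savetxt(buf, result, fmt='%f', delimiter=',')
csv_text = buf.getvalue()
```

With the real file, `data` would presumably be replaced by `table.read()`, so that the HDF5 data is pulled into memory once rather than touched element by element.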