I'm looking for a way to retrieve sorted records from an HDF5
table using PyTables. Here is a Python MWE:
import tables
import numpy as np
class Measurement(tables.IsDescription):
    time = tables.Float64Col()
    value = tables.Float64Col()
h5 = tables.open_file('test.hdf', 'w')
h5.create_table('/', 'test', Measurement)
table = h5.root.test
data = np.array([(0, 6), (5, 1), (1, 8)], dtype=[('time', '<f8'), ('value', '<f8')])
table.append(data)
table.cols.time.create_csindex()
Now I'd like to retrieve all records with time > 0, sorted by time. If I do:
table.read_where('time > 0')
then I get:
array([(5.0, 1.0), (1.0, 8.0)], dtype=[('time', '<f8'), ('value', '<f8')])
which is not sorted by time. If I attempt to use read_sorted, I get the entire table instead of a subset (there's no condition argument to read_sorted).
What is the common practice? Should I ensure that my tables are stored sorted in the database? Or should I sort the retrieved set myself after read_where?
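For reference, the second option (sorting after read_where) would look roughly like the sketch below. It uses NumPy's np.sort with the order argument on the structured array; the array is hard-coded here as a stand-in for the read_where result so the snippet runs on its own, and the names subset and sorted_subset are just for illustration.

```python
import numpy as np

# Stand-in for the array returned by table.read_where('time > 0') above.
subset = np.array([(5.0, 1.0), (1.0, 8.0)],
                  dtype=[('time', '<f8'), ('value', '<f8')])

# Sort the structured array in memory by its 'time' field.
sorted_subset = np.sort(subset, order='time')
print(sorted_subset)
```

This sorts the result in memory and makes no use of the CSI index on the time column, which is part of why I'm unsure whether it is the intended approach.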