I'm trying to apply a simple filter to an HDF5-formatted table using pandas. It works fine when I query by the 'subject' column alone:
> df_test = pd.read_hdf(result_file, where=['subject==andrew'])
> print(df_test)
This gives the output:
        subject  condition        time  pupil_diam  luminance   gaze_x  gaze_y
...         ...        ...         ...         ...        ...      ...     ...
180519   andrew      light  5885480250        2.50   0.768958  1723.85  267.11
180520   andrew      light  5885482247        2.50   0.769088  1723.33  266.81
180521   andrew      light  5885484249        2.51   0.769405  1718.93  267.91
It also works when I query by the 'luminance' column alone:
> df_test = pd.read_hdf(params['result_file'], where=['luminance>0'])
> print(df_test)
       subject  condition        time  pupil_diam  luminance  gaze_x  gaze_y
79005     mary      light  3813968998        3.22   0.225418  257.11  761.28
79006     mary      light  3813970992        3.22   0.227119  256.38  761.13
79007     mary      light  3813972992        3.21   0.227119  256.13  760.53
...        ...        ...         ...         ...        ...     ...     ...
But combining the two conditions with "&" gives an empty result, even though, as the output above shows, there are definitely rows where both conditions are true:
> df_test = pd.read_hdf(params['result_file'], where=['subject==andrew & luminance>0'])
> print(df_test)
Empty DataFrame
Columns: [subject, condition, time, pupil_diam, luminance, gaze_x, gaze_y]
Index: []
However, the same combined query works when I filter on 'mary' instead of 'andrew':
> df_test = pd.read_hdf(params['result_file'], where=['subject==mary & luminance>0'])
> print(df_test)
       subject  condition        time  pupil_diam  luminance  gaze_x  gaze_y
79005     mary      light  3813968998        3.22   0.225418  257.11  761.28
79006     mary      light  3813970992        3.22   0.227119  256.38  761.13
79007     mary      light  3813972992        3.21   0.227119  256.13  760.53
...        ...        ...         ...         ...        ...     ...     ...
I'm new to pandas, so I may be missing something about the syntax, but I haven't yet found a decent solution or explanation in the docs or online forums.
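
For reference, here is a minimal in-memory sketch of the result I expect from the combined query. It assumes a small hypothetical DataFrame (the zero-luminance rows are made up for illustration; the non-zero values are copied from the output above) and uses plain boolean masks instead of `where`, so it only shows the intended logic, not the HDF5 behaviour:

```python
import pandas as pd

# Hypothetical sample rows with a subset of the table's columns.
# The 0.0 luminance values are invented purely for illustration.
df = pd.DataFrame(
    {
        "subject": ["andrew", "andrew", "mary", "mary"],
        "condition": ["light", "light", "light", "light"],
        "luminance": [0.768958, 0.0, 0.225418, 0.0],
    }
)

# In-memory equivalent of where=['subject==andrew & luminance>0']:
# keep rows where both conditions hold.
expected = df[(df["subject"] == "andrew") & (df["luminance"] > 0)]
print(expected)
```

Loading the whole table with `pd.read_hdf` and applying the same mask would be one way to cross-check the on-disk query, assuming the table fits in memory.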