I have just created and filled my first PyTables file, and I ran into a problem when querying the data. The table has a column ic_name of type StringCol(500), and I have created an index on it. The following code works fine:

count = 0
for x in f.root.raw.projects:
    if x['ic_name']=="XXX":
        count += 1

Afterwards, count is a bit more than 200,000, which is the correct value. To speed up the query, I created the index and tried to query like this:

f.root.raw.projects.where('ic_name == "XXX"')

Now I get back only 180 results!? Any hint what might be going on here?

Achim
  • Can you provide a minimal script which reproduces the error? It is difficult to know what exactly the problem is given the information provided. My initial guess is that the column has a length of 500 but you are comparing it against a length 3 string so of course it will fail. However this doesn't explain why *some* cases succeed. You might also try running a null terminated condition, `'ic_name == "XXX\0"'`. – Anthony Scopatz Oct 28 '13 at 21:54
  • Did you figure this out? – Joel Vroom Nov 14 '13 at 19:52

1 Answer

You didn't provide a test script, so I'm not sure whether this applies to your problem, but I recently ran into something similar: searches for string values sometimes succeeded and sometimes did not. My solution was to call flush() on the table after adding any data (in your example: f.root.raw.projects.flush()).
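
To illustrate the flush-then-query order, here is a minimal sketch, assuming Python 3 and a reasonably recent PyTables. The Project schema, the temporary file, and the count_matches helper are made up for the illustration and are not taken from the question; only the /raw/projects layout and the ic_name column mirror it.

```python
import os
import tempfile

import tables


class Project(tables.IsDescription):
    # Hypothetical one-column schema mirroring the question's ic_name column.
    ic_name = tables.StringCol(500)


def count_matches(n_match=200, n_other=100):
    """Write rows, flush, index, then count matches by loop and by where()."""
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "demo.h5")  # throwaway file for the demo
        with tables.open_file(path, mode="w") as f:
            table = f.create_table("/raw", "projects", Project,
                                   createparents=True)
            row = table.row
            for i in range(n_match + n_other):
                row["ic_name"] = b"XXX" if i < n_match else b"YYY"
                row.append()
            # Push buffered rows to disk before indexing or querying.
            table.flush()
            table.cols.ic_name.create_index()

            # Plain Python iteration, as in the question.
            by_loop = sum(1 for r in table if r["ic_name"] == b"XXX")
            # In-kernel query; note the bytes literal inside the condition.
            by_where = sum(1 for _ in table.where('ic_name == b"XXX"'))
    return by_loop, by_where
```

If the two counts returned by count_matches() differ on your data, that would point at the table or index rather than at the query string.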

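Another possible cause: under Python 3, the search expression has to compare a string column against bytes, not str. In your example, any of the following should work (in the last one, encoding stands for the name of the codec the data was written with, e.g. 'utf-8'):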

f.root.raw.projects.where('ic_name == b"XXX"')

f.root.raw.projects.where('ic_name == ' + str("XXX".encode()))

f.root.raw.projects.where('ic_name == ' + str("XXX".encode(encoding)))
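
To see what the concatenated variants actually pass to where(), here is a small sketch using only the standard library, assuming Python 3. The bytes_condition helper is a hypothetical name introduced for this illustration, not part of PyTables.

```python
def bytes_condition(column, value, encoding="utf-8"):
    """Build a condition string comparing `column` with `value` as bytes."""
    # In Python 3, repr() of a bytes object is a literal such as b'XXX',
    # which is the same text that str("XXX".encode()) produces.
    return "{} == {!r}".format(column, value.encode(encoding))


print(bytes_condition("ic_name", "XXX"))  # prints: ic_name == b'XXX'
```

The resulting string could then be passed to f.root.raw.projects.where(...) in place of the hand-written literal.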
rlat