Questions tagged [pytables]

A Python library for working with extremely large hierarchical (HDF5) datasets.

PyTables is a package for managing hierarchical (HDF5) datasets, designed to cope efficiently and easily with extremely large amounts of data. PyTables is available as a free download.

PyTables is built on top of the HDF5 library, using the Python language and the NumPy package. It features an object-oriented interface that, combined with C extensions for the performance-critical parts of the code (generated using Cython), makes it a fast yet extremely easy-to-use tool for interactively browsing, processing, and searching very large amounts of data.

Links to get started:
- Documentation
- Tutorials
- Library Reference
- Downloads
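As a rough illustration of the object-oriented interface described above, here is a minimal sketch that creates a small table and queries it. The `Reading` schema, the `detector` group, the `readings` table name, and the `build_and_query` helper are hypothetical names chosen for this example; it assumes PyTables 3.x is installed and importable as `tables`.

```python
import os
import tempfile

import tables


class Reading(tables.IsDescription):
    # Hypothetical schema, for illustration only
    sensor = tables.StringCol(16)   # fixed-width 16-byte string
    idnumber = tables.Int64Col()    # signed 64-bit integer
    value = tables.Float64Col()     # double-precision float


def build_and_query(path):
    """Write five rows, then return the idnumbers of rows with value > 2.0."""
    with tables.open_file(path, mode="w", title="Demo file") as h5:
        group = h5.create_group("/", "detector", "Detector data")
        table = h5.create_table(group, "readings", Reading, "Example readings")
        row = table.row
        for i in range(5):
            row["sensor"] = f"s{i}".encode()
            row["idnumber"] = i
            row["value"] = float(i)
            row.append()
        table.flush()  # push buffered rows to disk

    with tables.open_file(path, mode="r") as h5:
        table = h5.root.detector.readings
        # The condition string is evaluated in-kernel, row by row
        return sorted(int(r["idnumber"]) for r in table.where("value > 2.0"))


tmpdir = tempfile.mkdtemp()
selected = build_and_query(os.path.join(tmpdir, "demo.h5"))
print(selected)
```

The query condition is passed as a string so that PyTables can evaluate it without loading the whole table into memory, which is the main reason to prefer `where()` over a plain Python loop on large tables.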

617 questions
0
votes
1 answer

Is there a way to view a list of PyTable file marks?

The PyTables File class offers get_current_mark(), goto(mark), undo(mark), mark(), etc., but is there a way to see a list of all available marks? Reference: http://pytables.github.io/usersguide/libref/file_class.html
Joel Vroom
  • 1,611
  • 1
  • 16
  • 30
0
votes
1 answer

Set type in pytables

I have data in the following form: "blue red" "blue magenta cyan" "yellow red" "black" The max number of elements in each row is 10 but there can be thousands of labels/categories/colors. I would like to insert this data somehow in a pytables…
elyase
  • 39,479
  • 12
  • 112
  • 119
0
votes
1 answer

Cannot open matlab files using the latest HDF5

I recently upgraded tables on my Python installation and some strange things seem to be happening with the HDF5 libraries. I've got a bunch of data that was originally saved as a .mat file, which uses the HDF5 format. I've been reading this into…
choldgraf
  • 3,539
  • 4
  • 22
  • 27
0
votes
1 answer

String comparison in PyTables / Numexpr

I have just created and filled my first PyTables file. Trying to query the data, I ran into a problem. There is a column ic_name which is of type StringCol(500) and I have created an index for this column. The following code works fine: count =…
Achim
  • 15,415
  • 15
  • 80
  • 144
0
votes
1 answer

knn search using HDF5

I'm trying to do kNN search on big data with limited memory. I'm using HDF5 and Python. I tried brute-force linear search (using PyTables) and kd-tree search (using sklearn). It's surprising, but the kd-tree method takes more time (maybe kd-tree will work…
mrgloom
  • 20,061
  • 36
  • 171
  • 301
0
votes
1 answer

pytables: how to fill in a table row with binary data

I have a bunch of binary data in N-byte chunks, where each chunk corresponds exactly to one row of a PyTables table. Right now I am parsing each chunk into fields, writing them to the various fields in the table row, and appending them to the…
Jason S
  • 184,598
  • 164
  • 608
  • 970
0
votes
1 answer

get information about pytables metadata

Is there any way to iterate over the fields of a table metaclass object? (NOT the table itself, I need to do some preliminary analysis before a table is even instantiated) I'm not really familiar with metaclasses in Python, so this is mystery stuff…
Jason S
  • 184,598
  • 164
  • 608
  • 970
0
votes
1 answer

Python PyTables API Bridge for Version 2.3.1 and 3.0.0

Did somebody already implement an open source bridge to make python programs work with PyTables 2.3.1 and PyTables 3.0.0 at the same time? Although PyTables promises to work with the old API until 3.1.0, I encountered some glitches. For example,…
SmCaterpillar
  • 6,683
  • 7
  • 42
  • 70
0
votes
1 answer

python pandas native select_as_multiple

Suppose I have a DataFrame that is block sparse. By this I mean that there are groups of rows that have disjoint sets of non-null columns. Storing this as a huge table will use more memory in the values (NaN filling) and unstacking the table to rows…
mathtick
  • 6,487
  • 13
  • 56
  • 101
0
votes
1 answer

Time complexity of pytables File.get_node() operation

What is the time complexity of the PyTables File operation get_node? Let's say I query mynode = myfile.get_node(where='group0/group1/..../groupN', name='mynode'). How does this operation scale with N, the number of parent groups of mynode…
SmCaterpillar
  • 6,683
  • 7
  • 42
  • 70
0
votes
0 answers

What's the pythonic way to reconstruct a serialized object

The more I work with Python, the more I want to do it the Pythonic way, i.e. try to avoid isinstance queries etc. I am developing a framework for scientific parameter exploration for numeric simulation. I work with two hard constraints: First, I…
SmCaterpillar
  • 6,683
  • 7
  • 42
  • 70
0
votes
1 answer

Pytables, HDF5 Attribute setting and deletion

I am working a lot with pytables and HDF5 data and I have a question regarding the attributes of nodes (the attributes you access via pytables 'node._v_attrs' property). Assume that I set such an attribute of an hdf5 node. I do that over and over…
SmCaterpillar
  • 6,683
  • 7
  • 42
  • 70
0
votes
1 answer

pyTables py2exe example does not run

I have this code based on this example: from tables import * class Particle(IsDescription): name = StringCol(16) # 16-character String idnumber = Int64Col() # Signed 64-bit integer ADCcount = UInt16Col() # Unsigned short integer …
Pablo
  • 983
  • 10
  • 24
0
votes
0 answers

pandas HDFStore.append data_columns after append operation

I am new to data analysis and Python and have virtually no experience with NumPy, PyTables, pandas, etc. I am reading a CSV file into a DataFrame chunk by chunk and appending it to an HDFStore, as the entire data cannot fit in memory. In my append…
smartexpert
  • 2,625
  • 3
  • 24
  • 41
0
votes
1 answer

Checking for list membership with Pytables where method

I'm trying to select rows based on multiple criteria that cannot be easily expressed with the conditional statements that [pytables allows](http://pytables.github.io/usersguide/condition_syntax.html). I also don't want to format a really long…
aarslan
  • 159
  • 1
  • 11