Questions tagged [pytables]

A Python library for working with extremely large hierarchical (HDF5) datasets.

PyTables is a package for managing hierarchical (HDF5) datasets, designed to cope efficiently and easily with extremely large amounts of data. PyTables is available as a free download.

PyTables is built on top of the HDF5 library, using the Python language and the NumPy package. It features an object-oriented interface that, combined with C extensions for the performance-critical parts of the code (generated using Cython), makes it a fast yet extremely easy-to-use tool for interactively browsing, processing, and searching very large amounts of data.

Links to get started:
- Documentation
- Tutorials
- Library Reference
- Downloads

617 questions
0
votes
1 answer

Conditional expression in PyTables where method

I want to use a conditional expression in the PyTables where method. In SQL, I would use a CASE expression (PostgreSQL, "CASE WHEN a=b THEN 1 ELSE 0"); in plain Python, I would use the conditional expression "1 if a==b else 0". But I couldn't find how it can be…
Hyungyong Kim
0
votes
1 answer

fastest way to add multiple pytables arrays of the same shape

I have several large tables.carray data structures of the same shape (300000x300000). I want to add all the data and store it in a master matrix. Right now, I create a new carray and fill it with a simple loop: shape = (300000,300000) #... open all…
haehn
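A rough sketch of the block-wise idea behind this question, assuming the arrays are too large to hold in memory at once. Plain nested lists stand in for the carrays here, and `add_blockwise`, `sources`, `target`, and `block_rows` are hypothetical names, not PyTables API; with real on-disk arrays each slice would be one read or write of a row block.

```python
def add_blockwise(sources, target, block_rows):
    """Accumulate equally shaped 2-D sources into target, block_rows rows at a time."""
    n_rows = len(target)
    for start in range(0, n_rows, block_rows):
        stop = min(start + block_rows, n_rows)
        # Start from zeros for this block, then add each source's rows to it.
        block = [[0] * len(target[r]) for r in range(start, stop)]
        for src in sources:
            for k, row in enumerate(src[start:stop]):
                block[k] = [acc + v for acc, v in zip(block[k], row)]
        # Write the finished block back in a single slice assignment.
        target[start:stop] = block


# Tiny stand-in data; the question's arrays are 300000 x 300000.
a = [[1, 2], [3, 4], [5, 6]]
b = [[10, 20], [30, 40], [50, 60]]
out = [[0, 0] for _ in range(3)]
add_blockwise([a, b], out, block_rows=2)
```

Only one block of rows per source is touched at a time, so peak memory scales with `block_rows` rather than with the full matrix.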
0
votes
1 answer

How can I define a nested node structure using the same type for parent and child in PyTables?

Based on the PyTables documentation, it appears that the only way to define nested types is either to create a class level/static field with the nested type instance or to define the nested class in the parent class. The problem is, a very common…
mahonya
0
votes
1 answer

Using PyTables from a cython module

I am solving a set of Coupled ODEs and facing two problems: speed and memory storage. As such I use cython_gsl to create a module which solves my ODEs. Until now I had simply written the data to a .txt file but I think it will be more useful to use…
Greg
0
votes
1 answer

Ordering of nested structures in PyTable table

Suppose I have the following PyTable column descriptor: import numpy as np import tables as pt class Date_t(pt.IsDescription): year = pt.Int32Col(shape=(), dflt=2013, pos=0) month = pt.Int32Col(shape=(), dflt=1, pos=1) day =…
Joel Vroom
0
votes
1 answer

HDFStore error 'correct atom type -> [dtype->uint64'

I'm using read_hdf for the first time and love it. I want to use it to combine a bunch of smaller *.h5 files into one big file. I plan on calling append() of an HDFStore, and will later add chunking to conserve memory. The example table looks like this: Int64Index: 220189…
Jim Knoll
0
votes
1 answer

reverse iterate pytable with itersorted and negative step produces OverflowError

I'm trying to sort a pytable according to a column, and then reverse iterate with itersorted using a negative step. This is possible according to the…
martinako
0
votes
1 answer

How to use easy_install to install locally?

I am trying to install the PyTables package using easy_install. My problem is that I am not root on the system and am not allowed to write to the /usr/local/lib/python2.7/dist-packages/ directory. To solve this problem I decided to install locally. For that I…
Roman
0
votes
1 answer

PyTables - condition syntax - string slicing possible?

I naively tried this while querying a table: rows = [ x['title'] for x in table.where("""title[-11:] == 'string ends'""") ] resulting in: TypeError: 'VariableNode' object has no attribute '__getitem__'. Reading up on the Condition Syntax doc, there is no…
devboell
0
votes
1 answer

PyTables in-kernel search on Time64Col

I'm using PyTables 2.4.0 and Python 2.7. I've got a database that contains the following typical table: /anc/asc_wind_speed (Table(87591,), shuffle, blosc(3)) 'Wind speed' description := { "value_seconds": Time64Col(shape=(), dflt=0.0, pos=0), …
0
votes
1 answer

Pytables on Enthought Python for OS X 10.8.2

I've been struggling to get pytables and the underlying HDF5 library working on python in OS X, so thought I'd give the Enthought distribution a go (which will also greatly simplify deployment across platforms later on). I installed EPD 7.3 for…
user2205880
0
votes
3 answers

How to determine size (in bytes) of a PyTables array?

How can I determine the size (in bytes) of a PyTables Array?
ChaimKut
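As a minimal sketch of the arithmetic involved: the uncompressed size is the element count times the bytes per element. The helper `array_nbytes` is a hypothetical name; for a PyTables array, the two inputs would presumably come from its `shape` and `atom.itemsize` attributes, and the size on disk can differ when compression is enabled.

```python
from math import prod


def array_nbytes(shape, itemsize):
    """Uncompressed size in bytes: number of elements times bytes per element."""
    return prod(shape) * itemsize


# Hypothetical example: a 1000 x 500 array of 8-byte floats.
size = array_nbytes((1000, 500), 8)
print(size)
```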
0
votes
1 answer

How to change an HDF5 table title (created using PyTables)

I was wondering if there is a way to change the title of an HDF5 table that I created in my Python code using PyTables. I gave the wrong title string and need to change it now, so that when I open it again in Python, I can distinguish it from…
Reza
0
votes
2 answers

Efficiently store a large sparse matrix (float)

I am looking for a solution to store about 10 million floating point (double precision) numbers of a sparse matrix. The matrix is actually a two-dimensional triangular matrix consisting of 1 million by 1 million elements. The element (i,j) is the…
nopper
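To make the trade-off in this question concrete, here is a back-of-the-envelope comparison plus a small coordinate-list (COO) sketch using only the standard library. The figures come from the question; the `CooMatrix` class and the choice of 4-byte indices are assumptions for illustration, not something the question or PyTables prescribes.

```python
from array import array

N = 1_000_000        # matrix is N x N (from the question)
NNZ = 10_000_000     # roughly 10 million stored values (from the question)

dense_bytes = N * N * 8          # every cell stored as an 8-byte double
coo_bytes = NNZ * (4 + 4 + 8)    # assumed: 4-byte row + 4-byte column + 8-byte value


class CooMatrix:
    """Coordinate-list storage: three parallel typed arrays of equal length."""

    def __init__(self):
        self.rows = array("I")   # unsigned int row indices
        self.cols = array("I")   # unsigned int column indices
        self.vals = array("d")   # double-precision values

    def add(self, i, j, value):
        self.rows.append(i)
        self.cols.append(j)
        self.vals.append(value)

    def get(self, i, j):
        # Linear scan, fine for a sketch; a real store would index or sort.
        for r, c, v in zip(self.rows, self.cols, self.vals):
            if r == i and c == j:
                return v
        return 0.0               # cells that were never stored read as zero


m = CooMatrix()
m.add(5, 2, 1.5)
```

Under these assumptions the dense layout needs about 8 TB while the three columns need about 160 MB, which is why storing only the non-zero entries as (row, column, value) records is the usual starting point.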
0
votes
1 answer

Nested Iteration of HDF5 using PyTables

I have a fairly large dataset that I store in HDF5 and access using PyTables. One operation I need to perform on this dataset is a pairwise comparison between each pair of elements. This requires two loops, one to iterate over each element, and an…
dvreed77
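One way to picture the two loops described here is to walk the data in blocks, so that each unordered pair is compared exactly once and each read covers a contiguous range. This is only a sketch: a plain list stands in for the on-disk table, and `pairwise_blocks`, `pairwise_compare`, and `compare` are hypothetical names.

```python
def pairwise_blocks(n_rows, block_size):
    """Yield (a0, a1, b0, b1) row ranges covering each unordered block pair once."""
    for a in range(0, n_rows, block_size):
        for b in range(a, n_rows, block_size):
            yield a, min(a + block_size, n_rows), b, min(b + block_size, n_rows)


def pairwise_compare(data, block_size, compare):
    """Return (i, j, compare(data[i], data[j])) for every pair with i < j."""
    results = []
    for a0, a1, b0, b1 in pairwise_blocks(len(data), block_size):
        block_a = data[a0:a1]    # with an on-disk table this would be one slice read
        block_b = data[b0:b1]
        for i, x in enumerate(block_a, start=a0):
            for j, y in enumerate(block_b, start=b0):
                if i < j:        # skip self-pairs and mirrored duplicates
                    results.append((i, j, compare(x, y)))
    return results


pairs = pairwise_compare([1, 4, 9, 16, 25], block_size=2,
                         compare=lambda x, y: abs(x - y))
```

The block size trades memory for the number of reads: larger blocks mean fewer slice reads but more rows held in memory at once.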