Questions tagged [pytables]

A Python library for working with extremely large hierarchical (HDF5) datasets.

PyTables is a package for managing hierarchical (HDF5) datasets, designed to cope efficiently and easily with extremely large amounts of data. PyTables is available as a free download.

PyTables is built on top of the HDF5 library, using the Python language and the NumPy package. It features an object-oriented interface that, combined with C extensions for the performance-critical parts of the code (generated using Cython), makes it a fast yet extremely easy-to-use tool for interactively browsing, processing, and searching very large amounts of data.

Links to get started:
- Documentation
- Tutorials
- Library Reference
- Downloads
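
As a rough illustration of the object-oriented interface described above, here is a minimal sketch that writes a small table and queries it. The `Reading` class, its column names, the `readings` node, and the `demo.h5` file name are all made up for this example; only the `tables` calls themselves come from PyTables.

```python
import os
import tempfile

import tables


class Reading(tables.IsDescription):
    # Hypothetical columns, chosen only for this illustration.
    name = tables.StringCol(20)
    value = tables.Float64Col()


def build_and_query(path):
    """Write three rows to an HDF5 table, then select those with value > 1.0."""
    with tables.open_file(path, mode="w") as h5:
        table = h5.create_table("/", "readings", Reading, title="Demo table")
        row = table.row
        for name, value in [("a", 0.5), ("b", 1.5), ("c", 2.5)]:
            row["name"] = name
            row["value"] = value
            row.append()
        table.flush()
        # The condition string is evaluated by PyTables itself, not in a Python loop.
        return [r["name"].decode() for r in table.where("value > 1.0")]


with tempfile.TemporaryDirectory() as tmp:
    selected = build_and_query(os.path.join(tmp, "demo.h5"))
```

Here `selected` holds the names of the rows that matched the condition. String columns come back as bytes under Python 3, hence the `decode()` call.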

617 questions
0 votes, 1 answer

Define column names of a table using pytables using a for loop inside the class definition

We know that if we need to define the column names of a table using pytables, we can do it in the following way: class Project(IsDescription): alpha = StringCol(20) beta = StringCol(20) gamma = StringCol(20) where alpha, beta and gamma…
Zaman
0 votes, 1 answer

having problems installing pytables

I'm trying to install PyTables, but first I must install numpy and numexpr on my Windows 7 machine. I tried to install numexpr-2.2.2 and this is what happened: Warning: Assuming default configuration (numexpr\tests/{setup_tests,setup}.py was not…
dmb
0 votes, 1 answer

Storing ragged (variable length) arrays objects w/ pytables

I'll preface this question by noting that I'm happy to consider alternatives to pytables, but I would prefer to use pytables in order to benefit from the numexpr features. I'm looking for a solution for storing/exploring/analyzing my data, for…
chase
0 votes, 1 answer

Issues with using argparse with listcomprehensions

I'm using list comprehension to find specific datasets within a PyTable. However when trying to combine with arguments from argparser it returns no values. Here is the section of code: if args.Scount: print args.Scount, args.Scount[0],…
Joe McGuire
0 votes, 1 answer

Query term is not valid [[Condition : [None]]]

I can't seem to be able to query the simplest DataFrame in an HDFStore: In [1]: import pandas as pd pd.__version__ Out[1]: '0.15.1' In [2]: df = pd.DataFrame.from_dict({'A':[1,2],'B':[100,200], 'C':[42,11]}) df_a = df.set_index('A') df_a Out[2]:…
roldugin
0 votes, 0 answers

hdf5 error when format=table, pandas pytables

It seems that I get an error when format=table but no error with format=fixed. Here is the command. What's weird is that it still seems to load the data. I just have to figure out a way to move past this. And it would give me peace of mind to not…
user3659451
0 votes, 1 answer

Python ORM to NumPy arrays

I am building a data simulation framework with a numpy ORM, where it is much more convenient to work with classes and objects instead of numpy arrays directly. Nevertheless, the output of the simulation should be a numpy array. Also blockz is quite…
Cron Merdek
0 votes, 1 answer

save currencies to pandas store with precision (as Decimal?)

In pandas I work a lot with currencies. Up to this point I've been using the default floats, but dealing with the lack of precision is annoying and error prone. I'm trying to switch over to using Decimal for some pieces, which while it likely makes…
fantabolous
0 votes, 1 answer

Pandas reading HDFStore in Bottle - DeprecationWarning?

I am attempting to read a few Pandas created HDF5 files in a simple web application using Bottle. In doing so, I'm receiving a DeprecationWarning when reading an HDFStore that was created outside of the Bottle app server. Environment: OSX: 10.9.4…
bazel
0 votes, 0 answers

Using Pytables with Pandas or just Numpy?

Here's my use case: 1. Initially, I have around 20GB of JSON files that I need to store for processing. I'll parse them and the initial table would be like: requestId A B C Ap Bp …
user1265125
0 votes, 1 answer

Efficient calculation on complete columns (pytables, hdf5, numpy)

I have a simple HDF5 file (created by PyTables) with ten columns and 100000 rows. For every value I have to apply a simple linear equation, with different parameters per column and write the stuff to CSV. My naive approach was to loop over the…
user923543
0 votes, 1 answer

How to use ptdump

This is almost certainly another "feel like an idiot" question, but I'm at a loss here. Trying to simply get ptdump to work (even just ptdump -h). Python 3.4.1 was originally installed on this Windows machine using the Anaconda…
dan_g
0 votes, 1 answer

Correct way to deal with a list of associated data items associated with several index values with pandas/pytables

I was wondering what the correct way to deal with storing/reading through a list of items such as the following example dealing with a rockstar, where the list is known to hold a maximum number of values to hdf5: Date_of_Birth Bands[] - where the…
Cenoc
0 votes, 3 answers

pytables crash with threads

The following code shows a problem in the interaction between pytables and threading. I'm creating an HDF file and reading it with 100 concurrent threads: import threading import pandas as pd from pandas.io.pytables import HDFStore,…
Emanuele Paolini
0 votes, 1 answer

Benefits of Pytables / databases over file system for data organization?

I'm currently in the process of trying to redesign the general workflow of my lab, and am coming up against a conceptual roadblock that is largely due to my general lack of knowledge in this subject. Our data currently is organized in a typical…
dan_g