
I am creating and filling a PyTables CArray in the following way:

import tables as tb

# a, b are scipy.sparse.csr_matrix instances
l = a.shape[0]   # rows of the result a.dot(b)
n = b.shape[1]   # columns of the result a.dot(b)
bl = 2048        # block width in columns

f = tb.open_file('../data/pickle/dot2.h5', 'w')
filters = tb.Filters(complevel=1, complib='blosc')
out = f.create_carray(f.root, 'out', tb.Atom.from_dtype(a.dtype),
        shape=(l, n), filters=filters)

for i in range(0, l, bl):
    out[:,i:min(i+bl, l)] = (a.dot(b[:,i:min(i+bl, l)])).toarray()
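
For context, here is a self-contained toy version of the same pattern. The path, shapes, density, and block size are made up for illustration (and the shapes are chosen so that the number of result columns equals l, matching the loop bound above); the full-scale script follows the same structure:

import numpy as np
import scipy.sparse as sp
import tables as tb

# toy stand-ins for the real sparse inputs (made-up shapes and density)
a = sp.rand(400, 300, density=0.01, format='csr')
b = sp.rand(300, 400, density=0.01, format='csr')
l, n = a.shape[0], b.shape[1]

f = tb.open_file('/tmp/dot_small.h5', 'w')
filters = tb.Filters(complevel=1, complib='blosc')
out = f.create_carray(f.root, 'out', tb.Atom.from_dtype(a.dtype),
        shape=(l, n), filters=filters)

bl = 128  # small block width for the toy case
for i in range(0, l, bl):
    out[:, i:min(i+bl, l)] = a.dot(b[:, i:min(i+bl, l)]).toarray()

# sanity check of the blocked loop against the dense result
assert np.allclose(out[:], a.dot(b).toarray())
f.close()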

The script was running fine for nearly two days (I estimated that it would need at least 4 days more).

However, suddenly I received this error stack trace:

File "prepare_data.py", line 168, in _tables_dot
out[:,i:min(i+bl, l)] = (a.dot(b[:,i:min(i+bl, l)])).toarray()
File "/home/psinger/venv/local/lib/python2.7/site-packages/tables/array.py", line 719, in __setitem__
self._write_slice(startl, stopl, stepl, shape, nparr)
File "/home/psinger/venv/local/lib/python2.7/site-packages/tables/array.py", line 809, in _write_slice
self._g_write_slice(startl, stepl, countl, nparr)
File "hdf5extension.pyx", line 1678, in tables.hdf5extension.Array._g_write_slice (tables/hdf5extension.c:16287)
tables.exceptions.HDF5ExtError: HDF5 error back trace

File "../../../src/H5Dio.c", line 266, in H5Dwrite
can't write data
File "../../../src/H5Dio.c", line 671, in H5D_write
can't write data
File "../../../src/H5Dchunk.c", line 1840, in H5D_chunk_write
error looking up chunk address
File "../../../src/H5Dchunk.c", line 2299, in H5D_chunk_lookup
can't query chunk address
File "../../../src/H5Dbtree.c", line 998, in H5D_btree_idx_get_addr
can't get chunk info
File "../../../src/H5B.c", line 362, in H5B_find
can't lookup key in subtree
File "../../../src/H5B.c", line 340, in H5B_find
unable to load B-tree node
File "../../../src/H5AC.c", line 1322, in H5AC_protect
H5C_protect() failed.
File "../../../src/H5C.c", line 3567, in H5C_protect
can't load entry
File "../../../src/H5C.c", line 7957, in H5C_load_entry
unable to load entry
File "../../../src/H5Bcache.c", line 143, in H5B_load
wrong B-tree signature

End of HDF5 error back trace

Internal error modifying the elements (H5ARRAYwrite_records returned errorcode -6)

I am really clueless about what the problem is, as it was running fine for about a quarter of the dataset. There is enough disk space available.

  • IIRC that sort of error can be caused by file corruption. Can you check the integrity of the HDF5 file by opening it using an HDF5 viewer (e.g. [ViTables](http://vitables.org/)), or by using [h5check](http://www.hdfgroup.org/products/hdf5_tools/h5check.html)? (A rough readability check with PyTables is sketched below these comments.) – ali_m Aug 07 '14 at 14:22
  • With this error it is corrupt now anyhow... – fsociety Aug 07 '14 at 14:29
  • You aren't trying to write to the same file in multiple threads, are you? – ali_m Aug 07 '14 at 14:33
  • No, I am not. I know that this doesn't work out of the box with PyTables. – fsociety Aug 07 '14 at 16:59
  • Is the file stored locally or remotely? – ali_m Aug 07 '14 at 17:05
  • Remotely, but the code is also executed remotely, so effectively the file is local to the code. – fsociety Aug 07 '14 at 17:57
  • Maybe the scipy dot function does some multithreading? But I think the scipy implementation does not do that, compared to numpy's dot function. Anyhow, the writing should not happen simultaneously then. I could store the result of the dot beforehand and then assign the result, but this is a memory overhead. – fsociety Aug 07 '14 at 19:08
  • Let us [continue this discussion in chat](http://chat.stackoverflow.com/rooms/58908/discussion-between-ali-m-and-ph-singer). – ali_m Aug 07 '14 at 19:10
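
Following ali_m's suggestion about checking the file's integrity, a crude readability probe using only PyTables could look roughly like the sketch below (h5check is the more thorough option, and a full read is not guaranteed to catch every kind of corruption):

import tables as tb

# walk every array node and force a full read; a corrupted chunk
# typically raises an HDF5ExtError on access
with tb.open_file('../data/pickle/dot2.h5', 'r') as f:
    for node in f.walk_nodes('/', classname='Array'):  # CArray is an Array subclass
        try:
            node[:]  # full read of the dataset
            print(node._v_pathname + ': readable')
        except tb.HDF5ExtError as e:
            print(node._v_pathname + ': read failed: ' + str(e))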

0 Answers