I run a third-party program (referred to here as program B, for simplicity) through a script that is driven by a Python program (the main program). To give an overview of what I am trying to do, here is a simplified list of the tasks the main program executes:

  1. Execute the program B and wait for it to finish.

  2. Once B has finished, read B's outputs, which B has stored in an ASCII file.

  3. Format the outputs into a set of 1-dimensional arrays, which are then stored in an HDF5 file using the PyTables module.

  4. Go back to step 1, using a new set of parameters for B, until an exit condition is True.
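The four steps above can be sketched roughly as the loop below. This is only an illustration: `run_loop` and every callable passed to it (`run_b`, `read_outputs`, `store`, `next_params`, `should_stop`) are hypothetical placeholders, not part of the real main program, and `max_iter` is an assumed safety limit.

```python
from typing import Callable, Dict, List

Params = Dict[str, float]
Outputs = List[List[float]]  # one 1-D array per field; lengths may differ


def run_loop(
    params: Params,
    run_b: Callable[[Params], str],
    read_outputs: Callable[[str], Outputs],
    store: Callable[[Outputs], None],
    next_params: Callable[[Params, Outputs], Params],
    should_stop: Callable[[Params, Outputs], bool],
    max_iter: int = 1000,
) -> int:
    """Repeat steps 1-4 until should_stop() is true; return the iteration count."""
    for iteration in range(1, max_iter + 1):
        path = run_b(params)           # step 1: run B and wait for it to finish
        outputs = read_outputs(path)   # step 2: parse the ASCII file written by B
        store(outputs)                 # step 3: write the arrays to the HDF5 file
        if should_stop(params, outputs):
            return iteration
        params = next_params(params, outputs)  # step 4: new parameters for B
    return max_iter  # assumed safety limit reached without the exit condition
```

In practice `run_b` could wrap `subprocess.run([...], check=True)` so that the loop blocks until B exits, and `store` is where the variable-shape problem described below appears.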

My problem is in step 3. PyTables seems to handle tables of known shape very well. In my case, I know the shape of B's outputs only after B has run, and that shape varies from one iteration to the next.

Below is the code I wrote for handling fixed-shape outputs of B, based on solutions posted on Stack Overflow for similar (but not identical) issues. It is not satisfactory in my case because it requires the shapes to stay the same. So my question is: how would you adapt this code when each row has a different shape? I saw some possibilities in another post (In PyTables, how to create nested array of variable length?), but I am not yet familiar with EArray and VLArray. Furthermore, they do not seem to be very efficient methods.

def makemytable1D(filepointer, group, tablename, labels, shapes):
    #Declare the dictionary
    template = {}
    # make all columns
    for i in np.arange(len(labels)):
        template[labels[i]]=tables.Float64Col(shape=shapes[i], pos=i)

    table = filepointer.create_table(group,tablename,template)
    return table, template

def fillmytable1D(table, labels, data, Ndata):
   tablerow=table.row
   for i in np.arange(Ndata):
       tablerow[labels[i]]=data[i]
   tablerow.append()
   table.flush()

# ----------- Execution -----------
import numpy as np
import tables

labels=np.array(['Field1','Field2','Field3','Field4','Field5']) # example of labels
data=[[0,1], [2,2,2,2], [3,3,3], [4,4], [5,5]] # example of data (ragged, so kept as a plain list; recent NumPy rejects this in np.array unless dtype=object)
shapes=[]
for d in data:
    shapes.append(np.array(d).shape)
Ndata=len(data)
try:
   saveFile=tables.open_file('save.hdf','w')
   group=saveFile.create_group('/', 'group1', 'Model 1')
   tab, template=makemytable1D(saveFile, group, 'test', labels, shapes)
   # The iteration. In my real problem, the shape of data varies at each
   # iteration, so this fixed-shape example would not work there.
   for i in range(10):
       fillmytable1D(tab, labels, data, Ndata)
finally:
   saveFile.close()
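
For the variable-shape case asked about above, one possible direction is a minimal sketch with one `VLArray` per field, where each iteration appends one row of arbitrary length. This is an assumption about what might fit, not a benchmarked solution. The helper names `make_vlarrays` and `append_iteration`, the file name `save_vl.h5`, and the sample data are hypothetical.

```python
import os
import tempfile

import numpy as np
import tables


def make_vlarrays(h5file, group, labels):
    """Create one variable-length array (VLArray) per field label."""
    return {
        label: h5file.create_vlarray(group, label, atom=tables.Float64Atom())
        for label in labels
    }


def append_iteration(vlarrays, labels, data):
    """Append one row per field; each row may have a different length."""
    for label, values in zip(labels, data):
        vlarrays[label].append(np.asarray(values, dtype=np.float64))


labels = ["Field1", "Field2", "Field3"]
path = os.path.join(tempfile.mkdtemp(), "save_vl.h5")  # hypothetical file name

with tables.open_file(path, "w") as h5:
    group = h5.create_group("/", "group1", "Model 1")
    vlas = make_vlarrays(h5, group, labels)
    # Two iterations whose outputs have different lengths.
    append_iteration(vlas, labels, [[0, 1], [2, 2, 2, 2], [3, 3, 3]])
    append_iteration(vlas, labels, [[0], [2, 2], [3, 3, 3, 3, 3]])

# Read the file back and record the length of every stored row.
with tables.open_file(path, "r") as h5:
    row_lengths = {
        label: [len(row) for row in h5.get_node("/group1", label)]
        for label in labels
    }
```

With this layout, row `i` of every array corresponds to iteration `i`, so the fields stay aligned by index even though they no longer share a single table.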
tomahna