I have data in a large HDF5 file, and the class I would like to use (igraph.Graph) seems to insist on a list of tuples in its init function. I have tried for loops, list(dataset), read_direct(dataset).tolist(), and [mylist.append(tuple(x)) for x in dataset]. All of them have been too slow to be useful. So far, things have mostly been CPU bound, although there is some waiting for I/O, and the 40 GB RAM + 40 GB swap I am working with can be limiting. It seems strange to me that there would be no fast way to do this, but maybe it is a sign that it is time to move to C/C++.
(I know that questions about converting NumPy arrays to lists have been asked before. My problem is at a large enough scale that those solutions seem to be too slow.)
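For concreteness, here is roughly what the attempts look like. This is only a sketch: a tiny in-memory NumPy array stands in for the HDF5 dataset, and the variable names and the two-column edge layout are assumptions made for illustration, not the real data.

```python
import numpy as np

# Stand-in for the HDF5 dataset: an (n, 2) integer array, one edge per row.
# (Assumption: the real dataset has this shape; here it is just 3 rows.)
edges = np.array([[0, 1], [1, 2], [2, 0]], dtype=np.int64)

# Attempt: explicit for loop, building one tuple per row.
looped = []
for row in edges:
    looped.append(tuple(row))

# Attempt: list(dataset). This gives a list of row arrays, not tuples,
# so it still needs a per-row conversion afterwards.
as_rows = list(edges)

# Attempt: tolist() on the whole array. This gives a list of lists of
# plain Python ints, which then has to be turned into tuples.
as_lists = edges.tolist()
from_lists = [tuple(pair) for pair in as_lists]

# Attempt: list comprehension straight over the rows.
comprehension = [tuple(x) for x in edges]
```

Every variant ends up creating one Python object per row (plus one per element), which is where the time and memory seem to go at my scale.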