
I have a file in .npz format. The data is stored in dictionary format and looks like:

{'ffa7e85e21c9000215574a8e2c24c30d': array([[ 0.07772359,  0.04581502, -0.00930751, ..., -0.05222392,
          0.02600432,  0.00974964],
        [ 0.1211272 , -0.0978327 ,  0.01816959, ..., -0.02647112,
         -0.02802687, -0.01136648]], dtype=float32),
 'ffad907e58ea5bdaf66470214c6040a9': array([[-0.00462537,  0.04290746,  0.04099328, ...,  0.00076487,
          0.03863411,  0.02304979],
        [ 0.03177807, -0.00942735,  0.06652466, ..., -0.01213444,
         -0.05064949, -0.00099202]], dtype=float32)}

I am loading this data from the file as below:

import numpy as np

dc_data = np.load('file.npz')
val = list(zip(*dc_data.values()))
ids = list(dc_data.keys())

but as the data is very large, loading takes a long time. Could anyone help me load file.npz in a more efficient way?

  • The npz is a zip archive. Each `dc_data[key]` is loaded individually. There's no "efficient" way around that. – hpaulj Sep 08 '21 at 11:35
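
Since each key is loaded individually, you also don't pay for arrays you never touch. A minimal sketch, assuming only a subset of the keys is actually needed (`wanted` is a hypothetical name; the key is copied from the sample data above):

import numpy as np

# np.load on an .npz returns an NpzFile: it reads only the zip index up
# front, and each array is decompressed the first time its key is accessed.
dc_data = np.load('file.npz')

wanted = ['ffa7e85e21c9000215574a8e2c24c30d']  # hypothetical subset of keys
subset = {k: dc_data[k] for k in wanted}       # only these members are read
dc_data.close()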

1 Answer


I suggest you use Python's context manager `with`:

with np.load('file.npz') as d:
    k = list(d.keys())
    v = list(d.values())  # materialize the arrays before the file is closed
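
The `list(...)` calls matter: `NpzFile` loads lazily, and the underlying zip file is closed when the block exits, so anything that touches the arrays has to run inside it. A minimal sketch reproducing the `ids`/`val` pairing from the question under that constraint:

import numpy as np

with np.load('file.npz') as dc_data:
    ids = list(dc_data.keys())
    # each array is decompressed here, while the archive is still open
    val = list(zip(*dc_data.values()))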