
It is strange that there are no such questions on Stack Overflow yet.

I am using MXNet and trying to run the VQA example on my PC, using mx.io.DataIter to read the data, but I get:

mxnet.base.MXNetError: [11:05:07] c:\projects\mxnet-distro-win\mxnet-build\src\storage\./cpu_device_storage.h:70: Failed to allocate CPU Memory

at these two lines:

File "E:\PyProjects\TestVqa\VQAtrainIter.py", line 71, in reset
    self.nd_img.append(mx.ndarray.array(self.img, dtype=self.dtype))
File "E:\PyProjects\Pro\venv\lib\site-packages\mxnet\ndarray\utils.py", line 146, in array
    return _array(source_array, ctx=ctx, dtype=dtype)

The first frame is my code and the second is the library code.

So what exactly does "Failed to allocate CPU memory" mean?

The relevant part of VQAtrainIter.py is as below:

def reset(self):
    self.curr_idx = 0
    self.nd_text = []
    self.nd_img = []
    self.ndlabel = []
    for buck in self.data:
        label = self.answer  # self.answer and self.img are loaded from .npy files
        self.nd_text.append(mx.ndarray.array(buck, dtype=self.dtype))
        self.nd_img.append(mx.ndarray.array(self.img, dtype=self.dtype))
        self.ndlabel.append(mx.ndarray.array(label, dtype=self.dtype))
Litchy
  • One of the possibilities is a genuine out-of-memory, another is an alignment issue. It's hard to say without exact repro steps. I can try to repro if you post VQAtrainIter.py and any other material needed to repro. – Vishaal Aug 27 '18 at 21:55
  • @VishaalKapoor If I use a smaller `.npz` file, this error goes away, so it seems to be a memory insufficiency. But according to the description of `DataIter`, the method gets the data by batch. Anyway, I posted the code. Also, if I use a machine with more memory and more CPUs, the error goes away as well. – Litchy Aug 31 '18 at 03:44
  • what batch size are you using with DataIterator? You could try using a smaller batch size. – Vishaal Aug 31 '18 at 20:33
  • @VishaalKapoor The original batch size was 64; I changed it to 10, then to 1, and the error is still `Failed to allocate CPU memory` – Litchy Sep 01 '18 at 02:10
  • Maybe it is a configuration problem with my PC? The remote Linux machine does not have this problem... – Litchy Sep 01 '18 at 02:11

1 Answer


Based on what you mentioned, you're using npz, which is always loaded fully into memory. DataIter does fetch the data by batch, but if its source is an npz file, which itself lives entirely in memory, then everything still has to fit into memory. What DataIter does is spare you from needing twice as much memory: it copies only a small part of the data from the numpy array into an mxnet.nd.NDArray for processing.
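The distinction can be sketched with plain NumPy (a hypothetical `batch_iter` helper standing in for what a batched data iterator does internally, with MXNet-specific calls left out): the full array must already be resident in memory, and the iterator only copies one batch-sized slice at a time, so the extra allocation per step is one batch rather than a second full copy of the dataset.

```python
import numpy as np

def batch_iter(data, batch_size):
    # Yield one batch-sized copy at a time. The source array stays
    # resident in memory, but the additional allocation per step is
    # only batch_size rows (with MXNet you would wrap each slice in
    # mx.nd.array at this point).
    for start in range(0, len(data), batch_size):
        yield np.array(data[start:start + batch_size])

# The whole dataset is in memory once it is read from the .npz file.
data = np.arange(12, dtype=np.float32).reshape(6, 2)
batches = list(batch_iter(data, batch_size=4))  # 4 rows, then the 2 remaining
```

This is why shrinking the batch size does not cure the error here: the large allocation is the npz contents themselves, not the per-batch copies.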

Sina Afrooze
  • First, do you mean that `DataIter` saves at most half of the memory compared to not using it? Then if I have a really large dataset, I still have to split it into many different `npz` files. Second, the `npz` I use is really small, but it can still sometimes cause the `Failed to allocate CPU memory` problem - occasionally, not always. So I suspect it is a CPU-usage problem rather than the size of the data? – Litchy Sep 19 '18 at 02:01