
In PyTorch, I want to save the output in every epoch for later calculation, but this leads to an out-of-memory error after several epochs. The code is like below:

    L=[]
    optimizer.zero_grad()
    for i, (input, target) in enumerate(train_loader):
        output = model(input)
        L.append(output)
    # update my model to minimize a loss function; list L will be used here

I know the reason is that PyTorch saves the computation graph for every output, in every epoch. But the loss function can only be calculated after obtaining all of the prediction results.
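
To illustrate, each stored `output` still references its autograd graph (a quick check, using the same placeholder names as above):

    output = model(input)
    print(output.grad_fn)        # not None: the autograd graph is still attached
    print(output.requires_grad)  # True, so buffers needed for backward are kept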

Is there a way I can train my model?

  • "the loss function can only be calculated after obtaining all of the prediction results" - what do you mean by this? Are you calculating loss and updating parameters only once after whole `train_loader`'s data passed? If so, could you post this code as well? – Szymon Maszke Sep 17 '20 at 15:17
  • Yes, I do mean it. The loss funciton is loke L=L(torch.sum(L), Label) – zixunsilu Sep 18 '20 at 05:26
  • So what's the point of appending outputs to `list` if you sum it anyway? What's `L`, here it is a `list`, you can't compute loss function via `list`... Could you please add the relevant part of code as well? – Szymon Maszke Sep 18 '20 at 08:09
  • As I see it, appending output to a list or summing actually make no difference. Both of them will cause OOM. – zixunsilu Sep 18 '20 at 13:51

1 Answer


Are you training on a GPU?

If so, you could move the output to main memory like this:

    L.append(output.detach().cpu())
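
For reference, a minimal sketch of how that could slot into the loop from the question (`model`, `optimizer` and `train_loader` are the same placeholders as above; how the loss is handled afterwards is left out):

    import torch

    L = []
    optimizer.zero_grad()
    for i, (input, target) in enumerate(train_loader):
        output = model(input)
        # .detach() drops the computation graph and .cpu() moves the tensor
        # to main memory, so GPU memory no longer grows with each batch
        L.append(output.detach().cpu())

    # the stored tensors are plain data now; anything computed from them
    # (e.g. a sum) will not backpropagate into the model
    total = torch.cat(L, dim=0).sum()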
ezekiel