
In PyTorch, I want to save the output in every epoch for later calculation, but this leads to an out-of-memory error after several epochs. The code is like below:

    L=[]
    optimizer.zero_grad()
    for i, (input, target) in enumerate(train_loader):
        output = model(input)
        L.append(output)
    # update my model to minimize a loss function; list L will be used here

I know the reason is that PyTorch saves the computation graph for every output, in every epoch. But the loss function can only be calculated after obtaining all of the prediction results.
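
To illustrate, each stored `output` still references its autograd graph (a quick check, using the same placeholder names as above):

    output = model(input)
    print(output.grad_fn)        # not None: the autograd graph is still attached
    print(output.requires_grad)  # True, so buffers needed for backward are kept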

Is there a way I can train my model?

  • "the loss function can only be calculated after obtaining all of the prediction results" - what do you mean by this? Are you calculating loss and updating parameters only once after whole `train_loader`'s data passed? If so, could you post this code as well? – Szymon Maszke Sep 17 '20 at 15:17
  • Yes, I do mean it. The loss funciton is loke L=L(torch.sum(L), Label) – zixunsilu Sep 18 '20 at 05:26
  • So what's the point of appending outputs to `list` if you sum it anyway? What's `L`, here it is a `list`, you can't compute loss function via `list`... Could you please add the relevant part of code as well? – Szymon Maszke Sep 18 '20 at 08:09
  • As I see it, appending output to a list or summing actually make no difference. Both of them will cause OOM. – zixunsilu Sep 18 '20 at 13:51

1 Answer


Are you training on a GPU?

If so, you could move the output to main memory like this:

    L.append(output.detach().cpu())
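
For reference, a minimal sketch of how that could slot into the loop from the question (`model`, `optimizer` and `train_loader` are the same placeholders as above; how the loss is handled afterwards is left out):

    import torch

    L = []
    optimizer.zero_grad()
    for i, (input, target) in enumerate(train_loader):
        output = model(input)
        # .detach() drops the computation graph and .cpu() moves the tensor
        # to main memory, so GPU memory no longer grows with each batch
        L.append(output.detach().cpu())

    # the stored tensors are plain data now; anything computed from them
    # (e.g. a sum) will not backpropagate into the model
    total = torch.cat(L, dim=0).sum()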
ezekiel