Questions tagged [pytorch-lightning]

585 questions
15
votes
7 answers

Unable to import pytorch_lightning on google colab

I have done the following:
!pip install pytorch_lightning -qqq
import pytorch_lightning
But I get the following error: ImportError Traceback (most recent call last) in () ----> 1…
14
votes
1 answer

PyTorch Lightning move tensor to correct device in validation_epoch_end

I would like to create a new tensor in the validation_epoch_end method of a LightningModule. The official docs (page 48) state that we should avoid direct .cuda() or .to(device) calls: There are no .cuda() or .to() calls. . . Lightning…
Szymon Knop
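One common way to avoid hard-coded .cuda() or .to(device) calls is to create the new tensor on the device of a tensor that already exists. The sketch below uses plain PyTorch only; `epoch_mean` and `step_outputs` are hypothetical names, and inside a LightningModule the module's `device` property could serve the same purpose.

```python
import torch


def epoch_mean(step_outputs):
    """Average a list of scalar tensors without naming a device explicitly."""
    stacked = torch.stack(step_outputs)
    # Create the new tensor on the same device and dtype as the existing data.
    weights = torch.ones(len(step_outputs), device=stacked.device, dtype=stacked.dtype)
    return (stacked * weights).sum() / weights.sum()


# Hypothetical per-step outputs, as they might arrive at the end of an epoch.
outputs = [torch.tensor(0.5), torch.tensor(1.5)]
result = epoch_mean(outputs)
```

Because the device is read from existing data, the same function should work unchanged on CPU and GPU.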
10
votes
5 answers

Pytorch Lightning duplicates main script in ddp mode

When I launch my main script on the cluster in ddp mode (2 GPUs), PyTorch Lightning duplicates whatever is executed in the main script, e.g. prints or other logic. I need some extended training logic, which I would like to handle myself. E.g. do…
dlsf
10
votes
1 answer

RuntimeError: Given groups=1, weight of size [32, 3, 16, 16, 16], expected input[100, 16, 16, 16, 3] to have 3 channels, but got 16 channels instead

RuntimeError: Given groups=1, weight of size [32, 3, 16, 16, 16], expected input[100, 16, 16, 16, 3] to have 3 channels, but got 16 channels instead This is the portion of code where I think the problem is. def __init__(self): …
Red
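The shapes in this error suggest the input is laid out channels-last (batch, depth, height, width, channels), while nn.Conv3d expects the channel axis directly after the batch axis. A minimal sketch of one possible fix, assuming that layout; the layer is rebuilt from the weight size in the message, and the batch size is reduced to 2 to keep the example light:

```python
import torch
from torch import nn

# Weight of size [32, 3, 16, 16, 16]: 32 output channels, 3 input channels, 16^3 kernel.
conv = nn.Conv3d(in_channels=3, out_channels=32, kernel_size=16)

# Assumed channels-last input, mirroring the shape [100, 16, 16, 16, 3] from the error.
x = torch.randn(2, 16, 16, 16, 3)

# Move the channel axis to position 1: (N, D, H, W, C) -> (N, C, D, H, W).
x_channels_first = x.permute(0, 4, 1, 2, 3).contiguous()

out = conv(x_channels_first)
```

Whether permute is the right fix depends on how the data was actually stored; if the 3 is not a channel axis, the layer definition would need to change instead.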
9
votes
0 answers

VS Code Python TensorBoard integration doesn't work

The Python extension for VS Code recently released TensorBoard integration, but it doesn't seem to work for me. Whenever I run "Python: Launch TensorBoard" from the command palette, I get this screen. It's the TensorBoard page with the default…
9
votes
4 answers

output prediction of pytorch lightning model

This is potentially a very easy question. I just started with PyTorch Lightning and can't figure out how to get the output of my model after training. I am interested in the predictions for both y_train and y_test as an array of some sort (PyTorch…
Tom S
8
votes
2 answers

PyTorch Lightning training console output is weird

When training a PyTorch Lightning model in a Jupyter Notebook, the console log output is awkward: Epoch 0: 100%|█████████▉| 2315/2318 [02:05<00:00, 18.41it/s, loss=1.69, v_num=26, acc=0.562] Validating: 0it [00:00, ?it/s] Validating: 0%| …
Jivan
8
votes
1 answer

Proper way to log things when using Pytorch Lightning DDP

I was wondering what the proper way of logging metrics is when using DDP. I noticed that if I print something inside validation_epoch_end, it is printed twice when using 2 GPUs. I was expecting validation_epoch_end to be called only on…
Jovan Andonov
8
votes
3 answers

How to dump confusion matrix using TensorBoard logger in pytorch-lightning?

The official doc only states
>>> from pytorch_lightning.metrics import ConfusionMatrix
>>> target = torch.tensor([1, 1, 0, 0])
>>> preds = torch.tensor([0, 1, 0, 0])
>>> confmat = ConfusionMatrix(num_classes=2)
>>> confmat(preds, target)
This…
Gulzar
7
votes
2 answers

How to extract loss and accuracy from logger by each epoch in pytorch lightning?

I want to extract all the data to make the plot without TensorBoard. My understanding is that all logged loss and accuracy values are stored in a defined directory, since TensorBoard draws the line graph from them. %reload_ext tensorboard %tensorboard --logdir…
Wakame
7
votes
3 answers

RuntimeError: all elements of input should be between 0 and 1

I want to use an RNN with BiLSTM layers in PyTorch on protein embeddings. It worked with a Linear layer, but when I use a BiLSTM I get a runtime error. Sorry if it's not clear; this is my first post, and I will be grateful if someone can help…
KhalilBR
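This message is typically raised by binary cross-entropy (nn.BCELoss) when it receives raw model outputs rather than probabilities. Whether that is the cause here cannot be confirmed from the excerpt, so the sketch below is only an illustration under that assumption, with made-up logits and targets. It shows two equivalent options: pass raw logits to nn.BCEWithLogitsLoss, or apply a sigmoid before nn.BCELoss.

```python
import torch
from torch import nn

# Hypothetical raw outputs of a final layer (unbounded values) and binary targets.
logits = torch.tensor([[2.5], [-1.0], [0.3]])
targets = torch.tensor([[1.0], [0.0], [1.0]])

# Option 1: a loss that applies the sigmoid internally, so raw logits are accepted.
loss_from_logits = nn.BCEWithLogitsLoss()(logits, targets)

# Option 2: squash the logits into (0, 1) first, then use plain BCELoss.
probs = torch.sigmoid(logits)
loss_from_probs = nn.BCELoss()(probs, targets)
```

The first option is generally the more numerically stable of the two.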
7
votes
1 answer

Run validation on 1 GPU while Train on multi-GPU Pytorch Lightning

Is there any way to execute the validation_step method on a single GPU while running training_step on multiple GPUs using DDP? The reason is that there are several metrics I want to implement which require complete access to the data,…
pseudo_teetotaler
7
votes
2 answers

Validate on entire validation set when using ddp backend with PyTorch Lightning

I'm training an image classification model with PyTorch Lightning and running on a machine with more than one GPU, so I use ddp (DistributedDataParallel), the distributed backend recommended for best performance. This naturally splits up the dataset,…
6
votes
5 answers

Cannot import name 'rank_zero_only' from 'pytorch_lightning.utilities.distributed'

I am using the VQGAN+CLIP_(Zooming)_(z+quantize_method_with_addons).ipynb Google Repository, and when I run the cell "Loading of libraries and definitions" it raises an error: ImportError Traceback (most recent call…
6
votes
2 answers

How to convert a generator to a Pytorch Dataloader?

I have a generator that creates synthetic data. How can I convert this into a PyTorch dataloader?
Rylan Schaeffer
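One way to approach this is to wrap the generator in a torch.utils.data.IterableDataset and hand that to a DataLoader. The following is a minimal sketch, not a definitive answer; `GeneratorDataset` and `synthetic_samples` are hypothetical names, and the synthetic data is made up for illustration.

```python
import torch
from torch.utils.data import DataLoader, IterableDataset


class GeneratorDataset(IterableDataset):
    """Wraps a generator function so a DataLoader can iterate over it."""

    def __init__(self, generator_fn):
        # Store the function, not a generator object, so each epoch restarts cleanly.
        self.generator_fn = generator_fn

    def __iter__(self):
        return self.generator_fn()


def synthetic_samples(n=8):
    # Hypothetical synthetic data: a one-element feature tensor and a binary label.
    for i in range(n):
        yield torch.tensor([float(i)]), torch.tensor(i % 2)


loader = DataLoader(GeneratorDataset(synthetic_samples), batch_size=4)
batches = list(loader)
```

Note that with num_workers greater than 0, each worker would run its own copy of the generator, so the samples would need to be split across workers to avoid duplicates.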