Questions tagged [pytorch-lightning]
585 questions
15 votes, 7 answers
Unable to import pytorch_lightning on google colab
I have done the following:
!pip install pytorch_lightning -qqq
import pytorch_lightning
But I get the following error:
ImportError Traceback (most recent call last)
in ()
----> 1…

user15357068
14 votes, 1 answer
PyTorch Lightning move tensor to correct device in validation_epoch_end
I would like to create a new tensor in the validation_epoch_end method of a LightningModule. The official docs (page 48) state that we should avoid direct .cuda() or .to(device) calls:
There are no .cuda() or .to() calls. . . Lightning…

Szymon Knop
10 votes, 5 answers
PyTorch Lightning duplicates main script in ddp mode
When I launch my main script on the cluster in ddp mode (2 GPUs), PyTorch Lightning duplicates whatever is executed in the main script, e.g. prints or other logic. I need some extended training logic, which I would like to handle myself. E.g. do…

dlsf
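One common way to keep prints and other one-off logic from running once per process is to guard them with a rank check. The sketch below assumes the launcher exports a LOCAL_RANK environment variable in each DDP process (as torchrun and Lightning's subprocess-based DDP launcher typically do); the helper names is_rank_zero and print_once are hypothetical. LOCAL_RANK is per node, so a multi-node run would need a global rank instead.

```python
import os


def is_rank_zero(env=None):
    """Return True in the process whose LOCAL_RANK is 0 (or unset)."""
    env = os.environ if env is None else env
    # Assumption: the DDP launcher sets LOCAL_RANK in each spawned process;
    # the process that started the run may not have it set, so default to 0.
    return int(env.get("LOCAL_RANK", "0")) == 0


def print_once(*args, env=None, **kwargs):
    """Print only from the rank-zero process; return whether it printed."""
    if is_rank_zero(env):
        print(*args, **kwargs)
        return True
    return False


if __name__ == "__main__":
    print_once("This line appears once per node, not once per GPU.")
```

The same guard can wrap any extra logic in the main script that should run in a single process only.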
10 votes, 1 answer
RuntimeError: Given groups=1, weight of size [32, 3, 16, 16, 16], expected input[100, 16, 16, 16, 3] to have 3 channels, but got 16 channels instead
RuntimeError: Given groups=1, weight of size [32, 3, 16, 16, 16], expected input[100, 16, 16, 16, 3] to have 3 channels, but got 16 channels instead
This is the portion of the code where I think the problem is.
def __init__(self):
…

Red
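The shapes in the error message suggest the input is laid out channels-last, (N, D, H, W, C), while torch.nn.Conv3d expects channels-first, (N, C, D, H, W). The sketch below only does the shape arithmetic in plain Python to show which axis order resolves the mismatch; on a real tensor the same reordering would be x.permute(0, 4, 1, 2, 3). The helper name permuted_shape is hypothetical, and since the question's model code is not shown, the diagnosis is an assumption.

```python
def permuted_shape(shape, order):
    """Return the shape that results from reordering axes by `order`."""
    if sorted(order) != list(range(len(shape))):
        raise ValueError("order must be a permutation of the axis indices")
    return tuple(shape[axis] for axis in order)


# Shapes taken from the error message.
weight_shape = (32, 3, 16, 16, 16)   # (out_channels, in_channels, kD, kH, kW)
input_shape = (100, 16, 16, 16, 3)   # looks like (N, D, H, W, C)

# Conv3d reads axis 1 of the input as the channel axis.
expected_channels = weight_shape[1]
print(input_shape[1] == expected_channels)   # False: axis 1 holds 16, not 3

# Move the last axis (channels) to position 1: (N, C, D, H, W).
fixed_shape = permuted_shape(input_shape, (0, 4, 1, 2, 3))
print(fixed_shape)                            # (100, 3, 16, 16, 16)
print(fixed_shape[1] == expected_channels)    # True
```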
9 votes, 0 answers
VS Code Python TensorBoard integration doesn't work
The Python extension for VS Code recently released TensorBoard integration, but it doesn't seem to work for me.
Whenever I run "Python: Launch TensorBoard" from the command palette, I get
this screen. It's the TensorBoard page with the default…

David Davini
9 votes, 4 answers
output prediction of pytorch lightning model
This is potentially a very easy question. I just started with PyTorch Lightning and can't figure out how to get the output of my model after training.
I am interested in both predictions of y_train and y_test as an array of some sort (PyTorch…

Tom S
8 votes, 2 answers
PyTorch Lightning training console output is weird
When training a PyTorch Lightning model in a Jupyter Notebook, the console log output is awkward:
Epoch 0: 100%|█████████▉| 2315/2318 [02:05<00:00, 18.41it/s, loss=1.69, v_num=26, acc=0.562]
Validating: 0it [00:00, ?it/s]
Validating: 0%| …

Jivan
8 votes, 1 answer
Proper way to log things when using PyTorch Lightning DDP
I was wondering what the proper way to log metrics is when using DDP. I noticed that if I print something inside validation_epoch_end, it is printed twice when using 2 GPUs. I was expecting validation_epoch_end to be called only on…

Jovan Andonov
8 votes, 3 answers
How to dump confusion matrix using TensorBoard logger in pytorch-lightning?
The official doc only states
>>> from pytorch_lightning.metrics import ConfusionMatrix
>>> target = torch.tensor([1, 1, 0, 0])
>>> preds = torch.tensor([0, 1, 0, 0])
>>> confmat = ConfusionMatrix(num_classes=2)
>>> confmat(preds, target)
This…

Gulzar
7 votes, 2 answers
How to extract loss and accuracy from the logger for each epoch in PyTorch Lightning?
I want to extract all the logged data to make the plot myself, without TensorBoard. My understanding is that all logged loss and accuracy values are stored in a defined directory, since TensorBoard draws the line graph from them.
%reload_ext tensorboard
%tensorboard --logdir…

Wakame
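One option that avoids TensorBoard entirely is to log to a CSV file and read it back. The sketch below assumes the metrics were written by a CSV-style logger (for example Lightning's CSVLogger, which writes a metrics.csv) with an epoch column and one column per metric, where a cell is empty if that metric was not logged on that row. The column names train_loss and val_acc, the sample values, and the helper last_value_per_epoch are all hypothetical.

```python
import csv
import io


def last_value_per_epoch(csv_file, metric):
    """Map each epoch to the last non-empty value logged for `metric`."""
    per_epoch = {}
    for row in csv.DictReader(csv_file):
        value = row.get(metric)
        if not value or not row.get("epoch"):
            continue  # this row did not log the metric
        per_epoch[int(row["epoch"])] = float(value)
    return per_epoch


# Made-up sample in the assumed layout; a real file would be opened with open(...).
sample = io.StringIO(
    "epoch,step,train_loss,val_acc\n"
    "0,49,1.69,\n"
    "0,49,,0.56\n"
    "1,99,1.20,\n"
    "1,99,,0.71\n"
)
losses = last_value_per_epoch(sample, "train_loss")
print(losses)  # {0: 1.69, 1: 1.2}
```

The resulting dictionary can be passed to any plotting library, with epochs as the x values and the metric as the y values.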
7 votes, 3 answers
RuntimeError: all elements of input should be between 0 and 1
I want to use an RNN with BiLSTM layers in PyTorch on protein embeddings. It worked with a Linear layer, but when I use a BiLSTM I get a RuntimeError. Sorry if it's not clear; it's my first post, and I will be grateful if someone can help…

KhalilBR
7 votes, 1 answer
Run validation on 1 GPU while training on multiple GPUs in PyTorch Lightning
Is there any way to execute the validation_step method on a single GPU while running training_step on multiple GPUs using DDP?
The reason I want to do this is that there are several metrics I want to implement which require complete access to the data,…

pseudo_teetotaler
7 votes, 2 answers
Validate on entire validation set when using ddp backend with PyTorch Lightning
I'm training an image classification model with PyTorch Lightning on a machine with more than one GPU, so I use ddp (DistributedDataParallel), the recommended distributed backend for best performance. This naturally splits up the dataset,…

Alexander Pacha
6 votes, 5 answers
Cannot import name 'rank_zero_only' from 'pytorch_lightning.utilities.distributed'
I am using the VQGAN+CLIP_(Zooming)_(z+quantize_method_with_addons).ipynb Google Repository, and when I click the cell "Loading of libraries and definitions",
it raises an error:
ImportError Traceback (most recent call…

Ines Sieulle
6 votes, 2 answers
How to convert a generator to a PyTorch DataLoader?
I have a generator that creates synthetic data. How can I convert this into a PyTorch DataLoader?

Rylan Schaeffer