Questions tagged [multi-gpu]

This tag covers a single application's use of multiple graphics processing units (GPUs), whether in traditional (graphical) or general-purpose (GPGPU) applications.

387 questions
0
votes
2 answers

DataParallel multi-gpu RuntimeError: chunk expects at least a 1-dimensional tensor

I am trying to run my model on multiple GPUs using DataParallel by setting model = nn.DataParallel(model).cuda(), but every time I get this error: RuntimeError: chunk expects at least a 1-dimensional tensor (chunk at …
Anubhav Garg
  • 1
  • 1
  • 1
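
That chunk error usually means a 0-dimensional (scalar) tensor was passed into the wrapped forward(), since DataParallel scatters every input along dim 0. A minimal sketch of the idea, with a hypothetical stand-in model in place of the asker's:

```python
import torch
import torch.nn as nn

model = nn.Linear(10, 2)  # hypothetical stand-in for the asker's model
if torch.cuda.device_count() > 1:
    model = nn.DataParallel(model)  # replicates the module, scatters inputs on dim 0
if torch.cuda.is_available():
    model = model.cuda()

x = torch.randn(8, 10)  # batched input: dim 0 is the batch dimension
# x = torch.tensor(1.0)  # a 0-dim tensor like this triggers the chunk RuntimeError
if torch.cuda.is_available():
    x = x.cuda()
out = model(x)
```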
0
votes
0 answers

How to use multiple GPUs?

I want to use TensorFlow with multiple GPUs. Currently there are two GPUs, #0 and #1, and if you run nvidia-smi and look at the GPU utilization, only GPU #0 is used, at ~35%. I wonder how you can use both #0 and #1 at 99%. This code is added and executed before the…
MKH
  • 11
  • 4
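
For reference, a hedged sketch of one common way to drive both GPUs, assuming a TensorFlow version with the tf.distribute API; the model and random data below are placeholders, not the asker's code:

```python
import numpy as np
import tensorflow as tf

strategy = tf.distribute.MirroredStrategy()  # picks up all visible GPUs (#0 and #1)
print("Replicas in sync:", strategy.num_replicas_in_sync)

with strategy.scope():  # build and compile the model inside the scope
    model = tf.keras.Sequential([
        tf.keras.layers.Dense(64, activation="relu", input_shape=(32,)),
        tf.keras.layers.Dense(10),
    ])
    model.compile(
        optimizer="adam",
        loss=tf.keras.losses.SparseCategoricalCrossentropy(from_logits=True))

x = np.random.rand(1024, 32).astype("float32")
y = np.random.randint(0, 10, size=(1024,))
model.fit(x, y, batch_size=256, epochs=1)  # each global batch is split across GPUs
```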
0
votes
1 answer

tensorflow.GraphDef was modified concurrently during serialization

I use Python and TensorFlow. My machine has 4 Tesla V100 GPUs. When I set os.environ['CUDA_VISIBLE_DEVICES'] = '0' or os.environ['CUDA_VISIBLE_DEVICES'] = '0,1' or os.environ['CUDA_VISIBLE_DEVICES'] = '0,1,2', the code can run without any…
zhu
  • 1
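
One detail worth checking when debugging this kind of device-dependent failure: CUDA_VISIBLE_DEVICES only takes effect if it is set before TensorFlow initializes CUDA. A minimal sketch (the device-listing call is the TF 2.x API, an assumption since the asker's version isn't stated):

```python
import os
os.environ['CUDA_VISIBLE_DEVICES'] = '0,1,2,3'  # must be set before importing TF

import tensorflow as tf  # CUDA is initialized on first use, after the env var
print(tf.config.list_physical_devices('GPU'))   # TF 2.x API; expect 4 entries
```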
0
votes
1 answer

Tensorflow / keras multi_gpu_model is not split across more than one GPU

I've encountered the problem that I cannot successfully split my training batches across more than one GPU. If multi_gpu_model from tensorflow.keras.utils is used, TensorFlow allocates the full memory on all available (for example 2) GPUs, but only the…
johni07
  • 761
  • 1
  • 11
  • 31
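
Note that full memory allocation on every GPU is TensorFlow's default behavior and by itself doesn't show which GPUs compute. A hedged sketch, assuming a TF 1.x session config, that makes nvidia-smi numbers track actual use:

```python
import tensorflow as tf
from tensorflow.keras import backend as K

config = tf.ConfigProto()
config.gpu_options.allow_growth = True  # allocate GPU memory on demand, not all upfront
K.set_session(tf.Session(config=config))

# build the model, wrap it with multi_gpu_model(model, gpus=2), then nvidia-smi
# memory and utilization should reflect the work each replica actually does
```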
0
votes
1 answer

Tensorflow Multi-GPU loss

I am studying how to implement multi-GPU training in Tensorflow. Now I am reading this source as recommended in the documentation. As far as I understand, at line 178 the variable loss accounts for the loss of only one GPU (as the comment states). Thus, at…
lorenzop
  • 525
  • 4
  • 19
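
For context, a rough sketch of the tower pattern that source follows (TF 1.x graph mode; the tiny model and 2-GPU split are illustrative): each tower computes its loss on its own slice of the batch, and the gradients, not the losses, are averaged before the update.

```python
import tensorflow as tf

def tower_loss(x, y, reuse):
    # the first tower creates the variables; later towers reuse them
    with tf.variable_scope("model", reuse=reuse):
        logits = tf.layers.dense(x, 10, name="fc")
        return tf.reduce_mean(
            tf.nn.sparse_softmax_cross_entropy_with_logits(labels=y, logits=logits))

x = tf.placeholder(tf.float32, [None, 32])
y = tf.placeholder(tf.int32, [None])
xs, ys = tf.split(x, 2), tf.split(y, 2)  # one slice per GPU

opt = tf.train.GradientDescentOptimizer(0.01)
tower_grads = []
for i in range(2):
    with tf.device("/gpu:%d" % i):
        loss = tower_loss(xs[i], ys[i], reuse=(i > 0))
        tower_grads.append(opt.compute_gradients(loss))

# average the per-tower gradients variable by variable
avg_grads = []
for grads_and_vars in zip(*tower_grads):
    grads = [g for g, _ in grads_and_vars]
    avg_grads.append((tf.add_n(grads) / len(grads), grads_and_vars[0][1]))
train_op = opt.apply_gradients(avg_grads)
```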
0
votes
0 answers

Keras anomaly in its training time

I am using Keras in a multi-GPU setup, with the TensorFlow backend on 2 GPUs. I am using a generator (keras.utils.Sequence) to load my data in batch mode (BS = 64). Therefore I am using the fit_generator method, providing it with my train and validation data and…
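
A self-contained sketch of the setup the question describes, with random stand-in data (the shapes and model are assumptions; BS = 64 is from the question):

```python
import numpy as np
from tensorflow import keras

class RandomSequence(keras.utils.Sequence):
    """Batch-mode generator, as in the question (BS = 64)."""
    def __init__(self, n=6400, bs=64):
        self.x = np.random.rand(n, 32).astype("float32")
        self.y = np.random.randint(0, 10, size=(n,))
        self.bs = bs
    def __len__(self):
        return len(self.x) // self.bs
    def __getitem__(self, i):
        s = slice(i * self.bs, (i + 1) * self.bs)
        return self.x[s], self.y[s]

model = keras.Sequential([keras.layers.Dense(10, input_shape=(32,))])
model.compile(optimizer="adam",
              loss=keras.losses.SparseCategoricalCrossentropy(from_logits=True))
model.fit_generator(RandomSequence(), epochs=1)  # newer TF: fit() accepts Sequences
```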
0
votes
1 answer

Pytorch with 3 GPUs, can only use 2 of them to train

I have three 1080 Ti GPUs, but when training I can only use 2 of them. Code: device = torch.device("cuda" if torch.cuda.is_available() else "cpu") model.cuda() criterion = nn.CrossEntropyLoss().cuda() optimizer_conv =…
王康年
  • 11
  • 2
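
A hedged sketch of the usual checklist for this symptom: pass all three device ids to DataParallel explicitly and keep the batch size divisible by 3, otherwise one GPU can sit idle. The model below is a stand-in:

```python
import torch
import torch.nn as nn

device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
model = nn.Linear(10, 2)  # stand-in for the real model
if torch.cuda.device_count() >= 3:
    model = nn.DataParallel(model, device_ids=[0, 1, 2])  # use all three 1080 Tis
model = model.to(device)

x = torch.randn(33, 10, device=device)  # 33 samples -> 11 per GPU, none idle
out = model(x)
```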
0
votes
1 answer

Passing input in Tensorflow MirroredStrategy distributed computing

So I am following the example code for TensorFlow MirroredStrategy. However, I am getting the following error: raise ValueError('model_fn (%s) must include features argument.' % model_fn) ValueError: model_fn (
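
That ValueError comes from tf.estimator's signature check: the model_fn must take a parameter literally named features. A minimal sketch of a conforming model_fn wired to MirroredStrategy (TF 1.13-era APIs assumed; the one-layer model is illustrative):

```python
import tensorflow as tf

def model_fn(features, labels, mode):  # `features` is required by name
    logits = tf.layers.dense(features["x"], 10)
    loss = tf.losses.sparse_softmax_cross_entropy(labels=labels, logits=logits)
    train_op = tf.train.AdamOptimizer().minimize(
        loss, global_step=tf.train.get_or_create_global_step())
    return tf.estimator.EstimatorSpec(mode, loss=loss, train_op=train_op)

strategy = tf.distribute.MirroredStrategy()
config = tf.estimator.RunConfig(train_distribute=strategy)
estimator = tf.estimator.Estimator(model_fn=model_fn, config=config)
```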
0
votes
0 answers

Pytorch using multi-GPU / accuracy is too low (10%)

if torch.cuda.is_available(): for epoch in range(epoch_num): for i,(images, labels) in enumerate(trainloader): images=images.to(device) labels=labels.to(device) optimizer.zero_grad() #Forward Backward…
C. Eunbi
  • 11
  • 2
0
votes
0 answers

keras fine-tuning a pre-trained model does not change weights when using multi_gpu_model and layers.trainable = True

I'm loading the VGG16 pretrained model, adding a couple of dense layers, and fine-tuning the last 5 layers of the base VGG16. I'm training my model on multiple GPUs. I saved the model before and after training. The weights are the same in spite of…
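
A hedged sketch of the pattern usually recommended here (Keras with multi_gpu_model, TF 1.x era): set trainable flags before compiling the parallel wrapper, and save/inspect the template model, whose weights are shared with the wrapper. The layer counts and the dense head are assumptions:

```python
from tensorflow.keras import layers, models
from tensorflow.keras.applications import VGG16
from tensorflow.keras.utils import multi_gpu_model

base = VGG16(weights="imagenet", include_top=False, pooling="avg")
for layer in base.layers[:-5]:
    layer.trainable = False  # freeze everything except the last 5 layers

template = models.Sequential([base, layers.Dense(10, activation="softmax")])
parallel = multi_gpu_model(template, gpus=2)  # trainable flags set BEFORE compile
parallel.compile(optimizer="adam", loss="sparse_categorical_crossentropy")
# ... parallel.fit(...) ...
template.save_weights("finetuned.h5")  # save from the template, not the wrapper
```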
0
votes
0 answers

Why is multi-GPU faster than single GPU in Caffe training?

In the same hardware/software environment, with the same net and solver, the runs differ only in the command line. When the command line is: caffe-master/build/tools/caffe train --solver=solver_base.prototxt --gpu=6 it takes about 50 seconds per 100 iterations. While command…
HiYuan
  • 63
  • 4
0
votes
2 answers

Saved multi-GPU trained model loaded into single-GPU; inconsistent results

I'm seeing strange results when loading a saved model that was trained on multiple GPUs into a single GPU model. I'm operating in a shared environment so I'm doing training on 4 GPUs but running tests using a single GPU. What I'm seeing is the tests…
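
A common source of this mismatch (hedged, since the question doesn't show the saving code) is serializing the multi-GPU wrapper itself rather than the underlying single-GPU model. A sketch of the usually suggested pattern, with an illustrative toy model:

```python
from tensorflow.keras import layers, models
from tensorflow.keras.utils import multi_gpu_model

template = models.Sequential([layers.Dense(10, input_shape=(32,))])
parallel = multi_gpu_model(template, gpus=4)   # used on the 4-GPU training box
parallel.compile(optimizer="adam", loss="mse")
# ... parallel.fit(...) ...
template.save("model.h5")        # weights are shared, so this captures training

single = models.load_model("model.h5")  # loads cleanly on the single-GPU machine
```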
0
votes
0 answers

tf.layers.batch_normalization() gamma does not exist error when training on multiple GPUs

When I train a neural network on multiple GPUs, batch normalization is not working. I use tf.layers.batch_normalization(), but when I run, I get the following error: ValueError: Variable batch_normalization_4/gamma does not exist, or was not created…
BryanF
  • 111
  • 1
  • 5
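
This error typically appears when a tower is built with reuse=True before the batch-norm variables exist. A hedged TF 1.x sketch using tf.AUTO_REUSE, which creates gamma/beta on the first tower and reuses them afterwards (the toy layer sizes are assumptions):

```python
import tensorflow as tf

def tower(x, training):
    # AUTO_REUSE: create variables on the first call, reuse on later calls
    with tf.variable_scope("net", reuse=tf.AUTO_REUSE):
        h = tf.layers.dense(x, 64, name="fc")
        return tf.layers.batch_normalization(h, training=training, name="bn")

x = tf.placeholder(tf.float32, [None, 32])
halves = tf.split(x, 2)  # one slice per GPU/tower
outputs = [tower(h, training=True) for h in halves]
```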
0
votes
0 answers

How to do multi-GPU programming in TensorFlow

I am using the following code to do parallelization; however, I am not able to make it work. It throws some random errors. I am using Amazon AWS for training, and when I run this training, PuTTY stops responding. Code: for d in…
user169703
  • 66
  • 1
  • 5
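
For reference, a minimal working version of the `for d in ...` device-loop pattern the excerpt hints at (TF 1.x graph mode; device names and matrix sizes are illustrative). allow_soft_placement keeps the session from crashing if a listed GPU is absent:

```python
import tensorflow as tf

partial = []
for d in ["/gpu:0", "/gpu:1"]:
    with tf.device(d):  # ops created here are pinned to device d
        a = tf.random_uniform([1000, 1000])
        b = tf.random_uniform([1000, 1000])
        partial.append(tf.matmul(a, b))  # one matmul per GPU

with tf.device("/cpu:0"):
    total = tf.add_n(partial)  # gather the per-GPU results on the CPU

config = tf.ConfigProto(allow_soft_placement=True, log_device_placement=True)
with tf.Session(config=config) as sess:
    print(sess.run(total).shape)
```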
0
votes
2 answers

When to set reuse=True for multi-GPU training in tensorflow?

I am trying to train a network with TensorFlow with multiple towers. I set reuse = True for all the towers. But in the CIFAR-10 multi-GPU training example from the TensorFlow tutorials, the reuse variable is set after the first tower is created: with…
Mehraban
  • 3,164
  • 4
  • 37
  • 60
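
A compact sketch of the tutorial's convention (TF 1.x; the toy layer is illustrative): variables are created while building the first tower, and the scope is switched to reuse for every tower after it, so all towers share one set of weights.

```python
import tensorflow as tf

x = tf.placeholder(tf.float32, [None, 32])
slices = tf.split(x, 2)  # one slice per GPU/tower
outputs = []
with tf.variable_scope("model") as scope:
    for i, part in enumerate(slices):
        with tf.device("/gpu:%d" % i):
            outputs.append(tf.layers.dense(part, 10, name="fc"))
        scope.reuse_variables()  # towers after the first reuse, never create
```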