
I have a couple of questions about a stateful cuDNN LSTM model I'm trying to fit in R using the keras library. I have tensorflow-gpu installed and it seems to be running successfully. The first thing I'm wondering about is the training speed, which only seems to increase by a factor of about 1.3 when using the cuDNN LSTM instead of the ordinary LSTM. I have read of other cases where people got models that train 10 or even 15 times faster with the cuDNN LSTM compared to the normal LSTM. I will post some code below. I'm also wondering about the GPU memory usage: when the code runs, it only seems to take roughly 8 % of GPU memory, which seems a bit low. Could this be connected to the lack of speed-up?

dim(x.train) = (208, 1, 4) and dim(y.train) = (208, 1)

For the validation sets it's the same, except that 208 is replaced with 42.
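
To make the shapes concrete, here is a minimal sketch with dummy data of the same dimensions (for illustration only; my real data is different):

    # dummy arrays in the (samples, timesteps, features) / (samples, outputs) layout
    x.train     <- array(rnorm(208 * 1 * 4), dim = c(208, 1, 4))
    y.train     <- array(rnorm(208 * 1),     dim = c(208, 1))
    x.train_val <- array(rnorm(42 * 1 * 4),  dim = c(42, 1, 4))
    y.test      <- array(rnorm(42 * 1),      dim = c(42, 1))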

    library(keras)

    batch_size <- 1
    dropout <- 0.01   # dropout rate (value assumed; it was not shown in the original snippet)

    model <- keras_model_sequential()

    model %>%
      layer_cudnn_lstm(units = 1, batch_input_shape = c(1, 1, 4),
                       stateful = TRUE, return_sequences = FALSE) %>%
      layer_dropout(rate = dropout) %>%
      layer_dense(units = 1)   # one output to match dim(y.train); 'units = 0.01' was a typo

    model %>% compile(
      loss = 'mean_squared_error',
      optimizer = optimizer_adam(lr = 0.01, decay = 1e-8),
      metrics = c('mean_squared_error')
    )

    Epochs <- 500   # total passes over the data; fit() below is run once per epoch

    hist_temp <- model %>% fit(
      x.train, y.train,
      epochs = 1, batch_size = batch_size,
      verbose = 1, shuffle = FALSE,
      validation_data = list(x.train_val, y.test)
    )

    model %>% reset_states()
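
For completeness, the fit/reset pair is repeated once per epoch in a loop roughly like the following (sketched here, since the loop itself is not part of the snippet above):

    for (i in 1:Epochs) {
      hist_temp <- model %>% fit(
        x.train, y.train,
        epochs = 1, batch_size = batch_size,
        verbose = 1, shuffle = FALSE,
        validation_data = list(x.train_val, y.test)
      )
      # clear the LSTM cell/hidden states so the next epoch starts fresh
      model %>% reset_states()
    }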

I'm expecting it to be much faster and to make heavier demands on the GPU memory. What have I missed here?

ShadySam
1 Answer


This could have multiple reasons, for example:

  1. You have created a bottleneck while reading the data. You should check the CPU, memory and disk usage. You can also increase the batch size to raise the GPU usage, but you have a rather small sample size. Moreover, a batch size of 1 isn't really common ;)

  2. You have a very small network, so you don't profit from GPU acceleration as much. You can try to increase the size of the network to test whether the GPU usage increases (see the sketch after this list).
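
As a rough sketch of what I mean (the layer sizes and batch size here are only placeholders I made up to test GPU utilisation, not a recommendation):

    # Sketch: a wider/deeper network to test whether GPU utilisation goes up.
    # Note: for a stateful model the first entry of batch_input_shape must equal the
    # batch_size passed to fit(), and the sample counts should be divisible by it
    # (gcd(208, 42) = 2 here, which limits how far the batch size can be raised).
    batch_size <- 2

    model <- keras_model_sequential()

    model %>%
      layer_cudnn_lstm(units = 128, batch_input_shape = c(batch_size, 1, 4),
                       stateful = TRUE, return_sequences = TRUE) %>%
      layer_cudnn_lstm(units = 128, stateful = TRUE, return_sequences = FALSE) %>%
      layer_dropout(rate = 0.2) %>%
      layer_dense(units = 1)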

I hope this helps.

Fabian
  • Thanks for your input; there are some issues: 1. Increasing batch_size from 1 to 2 made GPU usage drop from roughly 8 % to 4 %, and the time needed to train the LSTM was cut in half as well. 2. Increasing the number of LSTM layers increased the time consumption and the number of model parameters but did not affect GPU usage at all, which seems really strange. How can I fully utilise the GPU? Perhaps the sample size is too small for an LSTM network in general. – ShadySam Jun 13 '19 at 14:30
  • This might actually be the case. Just to test a bigger sample size, you could artificially enlarge your dataset (e.g. copy and paste the samples multiple times; see the sketch below these comments) to check whether the GPU usage rises with a bigger sample size. – Fabian Jun 17 '19 at 06:15
  • As for the batch_size, I'm guessing that because your dataset is small, doubling the batch_size causes the GPU to idle more since the CPU isn't as busy. Could you maybe provide a screenshot of nvidia-smi and top (or Task Manager if you use Windows) to show how the different components of your machine are utilised while training? – Fabian Jun 17 '19 at 06:20
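
A sketch of the artificial enlargement suggested above (the replication factor is arbitrary; it just repeats the existing samples to see whether GPU usage rises):

    # repeat the sample indices, e.g. 50 times, and index along the first dimension
    times <- 50
    idx   <- rep(seq_len(dim(x.train)[1]), times = times)
    x.big <- x.train[idx, , , drop = FALSE]
    y.big <- y.train[idx, ,   drop = FALSE]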