Questions tagged [learning-rate]

83 questions
0 votes • 1 answer

Learning rate with the AdamW optimizer

I am training BERT (from Hugging Face) for sentiment analysis, which is an NLP task. My question concerns the learning rate. EPOCHS = 5 …
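The excerpt cuts off, but a common answer-style sketch of BERT fine-tuning with AdamW looks like the following; the warmup fraction, step counts, and lr=2e-5 are illustrative assumptions, not values from the question.

    # Minimal sketch of a typical BERT fine-tuning setup (all values illustrative).
    import torch
    from transformers import BertForSequenceClassification, get_linear_schedule_with_warmup

    model = BertForSequenceClassification.from_pretrained("bert-base-uncased", num_labels=2)

    EPOCHS = 5
    steps_per_epoch = 100                      # assumption: depends on data/batch size
    total_steps = EPOCHS * steps_per_epoch

    # AdamW with the small learning rate typical for fine-tuning (2e-5 to 5e-5)
    optimizer = torch.optim.AdamW(model.parameters(), lr=2e-5, weight_decay=0.01)

    # Linear warmup followed by linear decay, a common transformer schedule
    scheduler = get_linear_schedule_with_warmup(
        optimizer,
        num_warmup_steps=int(0.1 * total_steps),
        num_training_steps=total_steps,
    )

    # In the training loop, after loss.backward():
    #     optimizer.step(); scheduler.step(); optimizer.zero_grad()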
0 votes • 1 answer

Neural networks - a different learning rate for each weight

I have a few questions about the theory behind gradient descent in neural networks. First question: let's say we have 5 weights, one for each of the 5 features, and we want to compute the gradient. How does the algorithm do this internally? Does it…
Solmyros • 29 • 3
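On the title question: the gradient is just the vector of partial derivatives, one per weight, computed in a single backward pass, and frameworks do support per-group learning rates. A minimal PyTorch sketch, with the grouping chosen purely for illustration:

    # Sketch: different learning rates per parameter group in PyTorch.
    import torch

    w_features = torch.nn.Parameter(torch.randn(5))   # e.g. one weight per feature
    bias = torch.nn.Parameter(torch.randn(1))

    optimizer = torch.optim.SGD(
        [
            {"params": [w_features], "lr": 0.01},     # this group's own rate
            {"params": [bias]},                       # falls back to the default below
        ],
        lr=0.001,
    )

    # Adaptive optimizers (Adagrad, Adam) go further and effectively scale the
    # step for every individual weight based on its own gradient history.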
0 votes • 1 answer

Learning rate too small for multiple linear regression

I'm trying to build a multiple linear regression model for the Boston dataset in scikit-learn, using stochastic gradient descent (SGD) to optimize it. It seems I have to use a very small learning rate (0.000000001) to make the model learn. If I…
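Needing a learning rate around 1e-9 is usually a symptom of unscaled features rather than of SGD itself. A sketch of the standard fix; since load_boston was removed in scikit-learn 1.2, the California housing dataset stands in here:

    # Sketch: standardize features so SGD trains with an ordinary learning rate.
    from sklearn.datasets import fetch_california_housing
    from sklearn.linear_model import SGDRegressor
    from sklearn.pipeline import make_pipeline
    from sklearn.preprocessing import StandardScaler

    X, y = fetch_california_housing(return_X_y=True)

    # Large-magnitude raw features force a tiny learning rate; after
    # standardization the default eta0=0.01 is usually fine.
    model = make_pipeline(StandardScaler(), SGDRegressor(eta0=0.01, max_iter=1000))
    model.fit(X, y)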
0 votes • 1 answer

Learning rate too large: how does this affect the loss function for logistic regression with batch gradient descent?

Question: if the learning rate (α) is too large, what happens to the loss curve, and how does this affect the loss function over iterations? I've read somewhere that it may not converge, or that there could be many fluctuations in the curve; I would just…
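The "fluctuations or no convergence" intuition can be made concrete on a toy quadratic loss, where the divergence threshold is exact; a self-contained demo (the loss and rates are illustrative, not from the question):

    # Gradient descent on L(w) = w**2: the update is w <- w - lr*2w = (1 - 2*lr)*w,
    # so iterates shrink only while |1 - 2*lr| < 1, i.e. 0 < lr < 1.
    for lr in (0.1, 0.9, 1.1):
        w = 1.0
        for _ in range(10):
            w -= lr * 2 * w                    # gradient of w**2 is 2w
        print(f"lr={lr}: w after 10 steps = {w:.3g}")
    # lr=0.1 converges smoothly, lr=0.9 oscillates in sign while shrinking,
    # and lr=1.1 blows up -- the fluctuating / non-converging loss curves.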
0 votes • 1 answer

Validation loss bounces randomly when training a Keras model, regardless of the optimiser used

I am retraining an InceptionV3 model on 200 images, using the Adam optimiser: opt = Adam(lr=0.0001, decay=0.0001 / 100). I noticed the loss bounces, especially the validation loss. I thought that was down to the learning rate, as I saw in some answers…
owise • 1,055 • 16 • 28
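As an aside, the decay argument in the excerpt is from the old standalone-Keras Adam; in tf.keras the equivalent is a schedule object. A sketch, with the decay numbers purely illustrative:

    # Sketch: tf.keras replacement for Adam(lr=..., decay=...) via a schedule.
    import tensorflow as tf

    lr_schedule = tf.keras.optimizers.schedules.ExponentialDecay(
        initial_learning_rate=1e-4,
        decay_steps=1000,                      # illustrative
        decay_rate=0.96,
    )
    opt = tf.keras.optimizers.Adam(learning_rate=lr_schedule)
    # model.compile(optimizer=opt, loss="categorical_crossentropy", metrics=["accuracy"])

Also worth noting: with only 200 images the validation split is tiny, so some bounce in validation loss is expected regardless of the optimiser.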
0 votes • 1 answer

Learning rate has no effect

I'm using an MLP with Keras, optimized with SGD. I want to tune the learning rate, but it seems to have no effect whatsoever on training. I tried small learning rates (0.01) as well as very large ones (up to 1e28), and the effects are barely noticeable.…
R B • 1 • 1
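A learning rate of 1e28 that produces no visible change strongly suggests the rate never reaches the optimizer at all. One common Keras pitfall, shown as a hedged sketch (we cannot see the asker's code, so this is only one possibility):

    # Sketch: compiling with the string "sgd" ignores a configured SGD object.
    import tensorflow as tf

    model = tf.keras.Sequential([tf.keras.layers.Dense(1, input_shape=(4,))])
    opt = tf.keras.optimizers.SGD(learning_rate=0.5)  # the rate you meant to test

    # Pitfall: the string form uses Keras's default learning rate, not `opt`:
    # model.compile(optimizer="sgd", loss="mse")
    # Correct: pass the configured optimizer object itself.
    model.compile(optimizer=opt, loss="mse")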
-1 votes • 1 answer

Model loss remains unchanged

I would like to understand what could be responsible for this loss behaviour. Training a CNN with 6 hidden layers, the loss shoots up from around 1.8 to above 12 after the first epoch and remains constant for the remaining 99…
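A loss that jumps and then flatlines often means the first updates were large enough to push the network into a saturated region it cannot leave. Two common first checks, sketched for a tf.keras model (an assumption, since the excerpt does not name the framework):

    # Sketch: lower the learning rate and clip gradient norms as first checks.
    import tensorflow as tf

    opt = tf.keras.optimizers.Adam(learning_rate=1e-4, clipnorm=1.0)
    # model.compile(optimizer=opt, loss="sparse_categorical_crossentropy", metrics=["accuracy"])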
-2 votes • 3 answers

Why does the learning rate change with torch.optim.SGD?

With SGD, the learning rate should not change across epochs, but it does. Please help me understand why this happens and how to prevent the LR from changing. import torch; params = [torch.nn.Parameter(torch.randn(1, 1))]; optimizer = torch.optim.SGD(params, …
Alex • 71 • 1 • 6
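Plain torch.optim.SGD never changes its learning rate by itself; something in the loop, usually a scheduler, must be stepping it. The question's snippet is truncated, so here is a self-contained demo that isolates the behaviour:

    # Demo: SGD keeps its lr fixed; only the scheduler call changes it.
    import torch

    params = [torch.nn.Parameter(torch.randn(1, 1))]
    optimizer = torch.optim.SGD(params, lr=0.1)
    scheduler = torch.optim.lr_scheduler.StepLR(optimizer, step_size=1, gamma=0.5)

    for epoch in range(3):
        optimizer.step()
        print(epoch, optimizer.param_groups[0]["lr"])   # 0.1, 0.05, 0.025
        scheduler.step()                                # this is what halves the lr

    # Remove scheduler.step() and the printed lr stays at 0.1 forever.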