Questions tagged [deep-learning]

Deep learning is an area of machine learning whose goal is to learn complex functions using special neural network architectures that are "deep" (consisting of many layers). Use this tag for questions about the implementation of deep learning architectures. General machine learning questions should be tagged "machine-learning". Including a tag for the relevant software library (e.g., "keras", "tensorflow", "pytorch", "fast.ai") is helpful.

Deep learning is a branch of machine learning that builds neural networks with many layers (hence the term "deep") to learn complex functions.

Deep neural network architectures allow more complex tasks to be learned: beyond providing more layers in which to perform transformations, the greater depth and more complex architectures allow a hierarchical organization of functionality to emerge.

Deep learning was introduced into machine learning research with the intention of moving machine learning closer to artificial intelligence. A significant impact of deep learning lies in feature learning, which removes much of the manual feature engineering required by non-deep neural networks.

NOTE: If you want to use this tag for a question that does not directly concern implementation, consider posting on Cross Validated, Data Science, or Artificial Intelligence instead; otherwise your question is probably off-topic. Please choose one site only and do not cross-post to more than one; see "Is cross-posting a question on multiple Stack Exchange sites permitted if the question is on-topic for each site?" (tl;dr: no).

Resources

Papers

Deep Learning in Neural Networks: An Overview

Books

Neural Networks and Deep Learning by Michael Nielsen - a free book with associated Python source code on GitHub
Deep Learning
Deep Learning Made Easy with R: A Gentle Introduction For Data Science
Deep Learning: Methods and Applications
Autonomous Robotics and Deep Learning
Deep Learning with Python
Probabilistic Deep Learning with Python

Videos

Neural Networks Demystified - accompanied by a set of Jupyter Notebooks

Stack Exchange Sites

Other StackExchange sites with Deep Learning tag:

27406 questions

114

votes

4 answers

multi-layer perceptron (MLP) architecture: criteria for choosing number of hidden layers and size of the hidden layer?

If we have 10 eigenvectors, then we can have 10 neural nodes in the input layer. If we have 5 output classes, then we can have 5 nodes in the output layer. But what are the criteria for choosing the number of hidden layers in an MLP, and how many neural nodes in 1…

machine-learning neural-network deep-learning perceptron

asked May 12 '12 at 17:18

Abhishek kumar

2,586
5
32
38

110

votes

4 answers

What's the difference between torch.stack() and torch.cat() functions?

OpenAI's REINFORCE and actor-critic example for reinforcement learning has the following code: REINFORCE: policy_loss = torch.cat(policy_loss).sum() actor-critic: loss = torch.stack(policy_losses).sum() + torch.stack(value_losses).sum() One is…

python machine-learning deep-learning pytorch

asked Jan 22 '19 at 11:24

Gulzar

23,452
27
113
201

109

votes

5 answers

What's the difference between "hidden" and "output" in PyTorch LSTM?

I'm having trouble understanding the documentation for PyTorch's LSTM module (and also RNN and GRU, which are similar). Regarding the outputs, it says: Outputs: output, (h_n, c_n) output (seq_len, batch, hidden_size * num_directions): tensor…

deep-learning pytorch lstm recurrent-neural-network tensor

asked Jan 17 '18 at 13:54

N. Virgo

7,970
11
44
65

108

votes

6 answers

Keras, how do I predict after I trained a model?

I'm playing with the reuters-example dataset and it runs fine (my model is trained). I read about how to save a model, so I could load it later to use again. But how do I use this saved model to predict a new text? Do I use models.predict()? Do I…

python theano deep-learning keras

asked Jun 18 '16 at 00:00

bky

1,314
3
11
14

104

votes

4 answers

What does global_step mean in Tensorflow?

In this tutorial code from the TensorFlow website, could anyone help explain what global_step means? I found on the TensorFlow website that global step is used to count training steps, but I don't quite get what exactly it means. Also,…

tensorflow deep-learning

asked Dec 15 '16 at 14:32

GabrielChu

6,026
10
27
42

103

votes

4 answers

How to do gradient clipping in pytorch?

What is the correct way to perform gradient clipping in pytorch? I have an exploding gradients problem.

python machine-learning deep-learning pytorch gradient-descent

asked Feb 15 '19 at 20:09

Gulzar

23,452
27
113
201
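
For the gradient clipping question above, here is a minimal sketch of the idea behind clipping by global norm: if the combined L2 norm of all gradients exceeds a threshold, every gradient is scaled down by the same factor. It is a plain-Python illustration, not PyTorch's implementation; the function name clip_by_global_norm and the sample values are assumptions made for this example.

```python
import math

def clip_by_global_norm(grads, max_norm):
    """Rescale a flat list of gradient values so their L2 norm is at most max_norm.

    Hypothetical helper for illustration only; returns (clipped_grads, original_norm).
    """
    total_norm = math.sqrt(sum(g * g for g in grads))
    if total_norm <= max_norm:
        # Already within the limit: leave the gradients unchanged.
        return list(grads), total_norm
    scale = max_norm / total_norm
    return [g * scale for g in grads], total_norm

# Illustrative values: a gradient of norm 5.0 clipped down to norm 1.0.
clipped, norm_before = clip_by_global_norm([3.0, 4.0], max_norm=1.0)
print(norm_before, clipped)
```

In PyTorch itself, the built-in torch.nn.utils.clip_grad_norm_(model.parameters(), max_norm) follows the same idea and is typically called after loss.backward() and before optimizer.step().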

102

votes

2 answers

What is the intuition of using tanh in LSTM?

In an LSTM network (Understanding LSTMs), why do the input gate and output gate use tanh? What is the intuition behind this? Is it just a nonlinear transformation? If it is, can I change both to another activation function (e.g., ReLU)?

machine-learning deep-learning lstm recurrent-neural-network activation-function

asked Nov 23 '16 at 10:00

DNK

1,448
2
13
12

102

votes

8 answers

How big should batch size and number of epochs be when fitting a model?

My training set has 970 samples and validation set has 243 samples. How big should batch size and number of epochs be when fitting a model to optimize the val_acc? Is there any sort of rule of thumb to use based on data input size?

python machine-learning deep-learning

asked Jan 28 '16 at 00:21

pr338

8,730
19
52
71

101

votes

3 answers

What is the difference between sparse_categorical_crossentropy and categorical_crossentropy?

What is the difference between sparse_categorical_crossentropy and categorical_crossentropy? When should one loss be used as opposed to the other? For example, are these losses suitable for linear regression?

python tensorflow machine-learning keras deep-learning

asked Oct 25 '19 at 20:33

xpertdev

1,293
2
6
12
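
For the question above, a small sketch may help show how the two losses relate. It assumes the usual definitions: both compute the same cross-entropy over predicted class probabilities and differ only in how the target is encoded (one-hot vector versus integer class index). The two functions below are hand-written, single-sample re-implementations for illustration, not the Keras API, and the probabilities are made up.

```python
import math

def categorical_crossentropy(one_hot_target, probs):
    """Cross-entropy for a one-hot encoded target (illustrative re-implementation)."""
    return -sum(t * math.log(p) for t, p in zip(one_hot_target, probs))

def sparse_categorical_crossentropy(class_index, probs):
    """Cross-entropy for an integer class label (illustrative re-implementation)."""
    return -math.log(probs[class_index])

# Hypothetical softmax output for a 3-class problem; the true class is index 1.
probs = [0.1, 0.7, 0.2]
loss_dense = categorical_crossentropy([0, 1, 0], probs)
loss_sparse = sparse_categorical_crossentropy(1, probs)
print(loss_dense, loss_sparse)
```

Under these assumptions both calls yield the same value, so the choice mainly depends on how the labels are stored. Both are classification losses over class probabilities.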

101

votes

6 answers

Using a pre-trained word embedding (word2vec or Glove) in TensorFlow

I've recently reviewed an interesting implementation for convolutional text classification. However, all the TensorFlow code I've reviewed uses random (not pre-trained) embedding vectors like the following: with tf.device('/cpu:0'),…

python numpy tensorflow deep-learning

asked Feb 28 '16 at 20:11

user3147590

1,231
2
10
16

votes

10 answers

Does Any one got "AttributeError: 'str' object has no attribute 'decode' " , while Loading a Keras Saved Model

After training, I saved both the whole Keras model and only the weights, using model.save_weights(MODEL_WEIGHTS) and model.save(MODEL_NAME). The model and weights were saved successfully and there was no error. I can successfully load the weights simply using…

python machine-learning keras deep-learning

asked Dec 12 '18 at 10:07

Rizwan

1,210
2
9
21

votes

4 answers

How to stack multiple lstm in keras?

I am using the deep learning library Keras and trying to stack multiple LSTMs, with no luck. Below is my code: model = Sequential() model.add(LSTM(100,input_shape =(time_steps,vector_size))) model.add(LSTM(100)) The above code returns error in the third…

tensorflow deep-learning keras lstm keras-layer

asked Oct 30 '16 at 17:07

Tamim Addari

7,591
9
40
59

votes

10 answers

How to add regularizations in TensorFlow?

I have found that in much of the available neural network code implemented in TensorFlow, regularization terms are often added manually as an extra term in the loss value. My questions are: Is there a more elegant or recommended way of…

python neural-network tensorflow deep-learning

asked May 09 '16 at 03:04

Lifu Huang

11,930
14
55
77

votes

2 answers

how to format the image data for training/prediction when images are different in size?

I am trying to train a model that classifies images. The problem I have is that the images have different sizes. How should I format my images or my model architecture?

deep-learning

asked Jan 28 '17 at 07:58

Asif Mohammed

1,323
1
15
29

votes

5 answers

Calculate the output size in convolution layer

How do I calculate the output size in a convolution layer? For example, I have a 2D convolution layer that takes a 3x128x128 input and has 40 filters of size 5x5.

machine-learning deep-learning pytorch conv-neural-network

asked Dec 02 '18 at 12:09

Monk247uk

1,170
1
8
15
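
For the convolution question above, here is a minimal sketch of the standard output-size formula, out = floor((in + 2*padding - dilation*(kernel - 1) - 1) / stride) + 1, applied to the 3x128x128 input with 40 filters of size 5x5. The question does not state stride, padding, or dilation, so the defaults here (stride 1, padding 0, dilation 1) are assumptions, and the helper names are hypothetical.

```python
def conv_output_size(size, kernel, stride=1, padding=0, dilation=1):
    """Output length along one spatial dimension of a convolution."""
    return (size + 2 * padding - dilation * (kernel - 1) - 1) // stride + 1

def conv2d_output_shape(in_shape, out_channels, kernel, stride=1, padding=0):
    """Output shape (channels, height, width) for a square kernel.

    The input channel count does not affect the output shape; the number
    of filters becomes the number of output channels.
    """
    _, height, width = in_shape
    return (
        out_channels,
        conv_output_size(height, kernel, stride, padding),
        conv_output_size(width, kernel, stride, padding),
    )

# The example from the question, assuming stride 1 and no padding.
shape = conv2d_output_shape((3, 128, 128), out_channels=40, kernel=5)
print(shape)  # (40, 124, 124)
```

With padding=2 the same layer would keep the spatial size at 128x128, which is a common choice when the resolution needs to be preserved.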
