Questions tagged [deep-learning]

Deep Learning is an area of machine learning whose goal is to learn complex functions using special neural network architectures that are "deep" (consist of many layers). This tag should be used for questions about the implementation of deep learning architectures. General machine learning questions should be tagged "machine learning". Including a tag for the relevant software library (e.g., "keras", "tensorflow", "pytorch", "fast.ai") is helpful.

Deep Learning is a branch of machine learning aimed at learning complex functions using special neural network architectures with many layers (hence the term "deep").

Deep neural network architectures can learn more complex tasks because, beyond simply having more layers to perform transformations, their greater depth and more complex structure allow a hierarchical organization of functionality to emerge.

Deep Learning was introduced into machine learning research with the intention of moving machine learning closer to artificial intelligence. A significant impact of deep learning lies in feature learning, which removes much of the manual feature engineering effort required by shallow (non-deep) neural networks.

NOTE: If you want to use this tag for a question not directly concerning implementation, then consider posting on Cross Validated, Data Science, or Artificial Intelligence instead; otherwise your question is probably off-topic. Please choose one site only and do not cross-post to more than one; see "Is cross-posting a question on multiple Stack Exchange sites permitted if the question is on-topic for each site?" (tl;dr: no).

27406 questions
74 votes, 2 answers

What is a multi-headed model? And what exactly is a 'head' in a model?

What is a multi-headed model in deep learning? The only explanation I found so far is this: Every model might be thought of as a backbone plus a head, and if you pre-train backbone and put a random head, you can fine tune it and it is a good…
spacer.34
73 votes, 9 answers

Keras model.summary() object to string

I want to write a *.txt file with the neural network hyperparameters and the model architecture. Is it possible to write the object model.summary() to my output file? (...) summary = str(model.summary()) (...) out = open(filename +…
lmpeixoto
73 votes, 9 answers

Error when checking model input: expected convolution2d_input_1 to have 4 dimensions, but got array with shape (32, 32, 3)

I want to train a deep network starting with the following layer: model = Sequential() model.add(Conv2D(32, 3, 3, input_shape=(32, 32, 3))) using history = model.fit_generator(get_training_data(), samples_per_epoch=1,…
Oblomov
71 votes, 4 answers

Keras - Difference between categorical_accuracy and sparse_categorical_accuracy

What is the difference between categorical_accuracy and sparse_categorical_accuracy in Keras? There is no hint in the documentation for these metrics, and by asking Dr. Google, I did not find answers for that either. The source code can be found…
jcklie
71 votes, 4 answers

How to find Number of parameters of a keras model?

For a Feedforward Network (FFN), it is easy to compute the number of parameters. Given a CNN, LSTM etc is there a quick way to find the number of parameters in a keras model?
Anuj Gupta
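For the feedforward case the question calls easy, the count can be worked out by hand. The following is a minimal sketch, assuming plain fully connected layers with one bias per output unit; `dense_param_count` is a hypothetical helper written for illustration and is not part of Keras. It does not cover CNN or LSTM layers, which is what the question is actually asking about.

```python
def dense_param_count(layer_sizes, use_bias=True):
    """Count trainable parameters of a fully connected network.

    layer_sizes lists the width of every layer, input first,
    e.g. [784, 100, 10] for 784 inputs, 100 hidden units, 10 outputs.
    """
    total = 0
    for n_in, n_out in zip(layer_sizes, layer_sizes[1:]):
        total += n_in * n_out      # weight matrix between two layers
        if use_bias:
            total += n_out         # one bias per output unit
    return total


print(dense_param_count([784, 100, 10]))
```

For an actual Keras model, `model.count_params()` reports the total directly, which avoids per-layer formulas for convolutional or recurrent layers.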
70 votes, 3 answers

Evaluating pytorch models: `with torch.no_grad` vs `model.eval()`

When I want to evaluate the performance of my model on the validation set, is it preferred to use with torch.no_grad: or model.eval()?
Tom Hale
70 votes, 4 answers

Unbalanced data and weighted cross entropy

I'm trying to train a network with unbalanced data. I have A (198 samples), B (436 samples), C (710 samples), D (272 samples) and I have read about the "weighted_cross_entropy_with_logits" but all the examples I found are for binary…
Sergiodiaz53
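One way to picture a multi-class version of the idea is sketched below in plain Python, using the four class counts from the question. The weighting scheme (total samples divided by number of classes times class count) is an assumption chosen for illustration, one common heuristic among several; the function names are hypothetical and this is not the TensorFlow API the question mentions.

```python
import math

# Class counts taken from the question above.
counts = {"A": 198, "B": 436, "C": 710, "D": 272}


def inverse_frequency_weights(counts):
    """Weight each class by total / (n_classes * count), so rarer classes weigh more."""
    total = sum(counts.values())
    n_classes = len(counts)
    return {label: total / (n_classes * n) for label, n in counts.items()}


def weighted_cross_entropy(logits, true_label, labels, weights):
    """Softmax cross-entropy for one sample, scaled by the weight of its true class."""
    m = max(logits)  # subtract the max for numerical stability
    log_sum_exp = m + math.log(sum(math.exp(z - m) for z in logits))
    log_prob_true = logits[labels.index(true_label)] - log_sum_exp
    return -weights[true_label] * log_prob_true


labels = list(counts)  # ["A", "B", "C", "D"], matching the order of the logits
weights = inverse_frequency_weights(counts)
loss = weighted_cross_entropy([2.0, 0.5, 0.1, -1.0], "A", labels, weights)
print(weights)
print(loss)
```

With these weights, a mistake on the rarest class (A) costs roughly 3.6 times as much as the same mistake on the most frequent class (C), which is the effect class weighting is meant to achieve.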
68 votes, 5 answers

Dimension of shape in conv1D

I have tried to build a CNN with one layer, but I have a problem with it. The compiler tells me: ValueError: Error when checking model input: expected conv1d_1_input to have 3 dimensions, but got array with shape (569, 30) This is…
protti
67 votes, 2 answers

How to append data to one specific dataset in a hdf5 file with h5py

I am looking for a possibility to append data to an existing dataset inside a .h5 file using Python (h5py). A short intro to my project: I try to train a CNN using medical image data. Because of the huge amount of data and heavy memory usage during…
Midas.Inc
67 votes, 3 answers

TensorFlow - regularization with L2 loss, how to apply to all weights, not just last one?

I am playing with an ANN which is part of the Udacity DeepLearning course. I have an assignment which involves introducing generalization to the network with one hidden ReLU layer using L2 loss. I wonder how to properly introduce it so that ALL weights…
66 votes, 3 answers

Pytorch: nn.Dropout vs. F.dropout

There are two ways to perform dropout: torch.nn.Dropout and torch.nn.functional.dropout. I ask: Is there a difference between them? When should I use one over the other? I don't see any performance difference when I switched them around.
CutePoison
66 votes, 6 answers

Keras Text Preprocessing - Saving Tokenizer object to file for scoring

I've trained a sentiment classifier model using Keras library by following the below steps(broadly). Convert Text corpus into sequences using Tokenizer object/class Build a model using the model.fit() method Evaluate this model Now for scoring…
66 votes, 8 answers

OpenCL / AMD: Deep Learning

While googling and doing some research I was not able to find any serious/popular framework/SDK for scientific GPGPU computing and OpenCL on AMD hardware. Is there any literature and/or software I missed? Especially I am interested in deep…
daniel451
66 votes, 4 answers

2-D convolution as a matrix-matrix multiplication

I know that, in the 1D case, the convolution between two vectors, a and b, can be computed as conv(a, b), but also as the product between the T_a and b, where T_a is the corresponding Toeplitz matrix for a. Is it possible to extend this idea to…
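The 1-D statement in the question can be checked with a few lines of plain Python. This is a small sketch, assuming "full" convolution (output length len(a) + len(b) - 1); the helper names are hypothetical, and it illustrates only the 1-D case, not the 2-D extension the question asks about.

```python
def conv1d_full(a, b):
    """Direct 'full' 1-D convolution: out[k] = sum over j of a[k - j] * b[j]."""
    out = [0] * (len(a) + len(b) - 1)
    for i, a_i in enumerate(a):
        for j, b_j in enumerate(b):
            out[i + j] += a_i * b_j
    return out


def toeplitz_of(a, n_cols):
    """Build the (len(a) + n_cols - 1) x n_cols Toeplitz matrix with T[i][j] = a[i - j]."""
    n_rows = len(a) + n_cols - 1
    return [
        [a[i - j] if 0 <= i - j < len(a) else 0 for j in range(n_cols)]
        for i in range(n_rows)
    ]


def matvec(matrix, vector):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(m_ij * v_j for m_ij, v_j in zip(row, vector)) for row in matrix]


a = [1, 2, 3]
b = [4, 5]
T_a = toeplitz_of(a, len(b))
print(conv1d_full(a, b))  # direct convolution
print(matvec(T_a, b))     # same values via the Toeplitz product
```

Each column of T_a is a copy of a shifted down by one row, so multiplying by b sums shifted copies of a scaled by the entries of b, which is exactly what the convolution does.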
65 votes, 3 answers

Saving best model in keras

I use the following code when training a model in keras from keras.callbacks import EarlyStopping model = Sequential() model.add(Dense(100, activation='relu', input_shape = input_shape)) model.add(Dense(1)) model_2.compile(optimizer='adam',…
dJOKER_dUMMY