Questions tagged [deep-learning]

Deep Learning is an area of machine learning whose goal is to learn complex functions using special neural network architectures that are "deep" (consist of many layers). This tag should be used for questions about implementation of deep learning architectures. General machine learning questions should be tagged "machine learning". Including a tag for the relevant software library (e.g., "keras", "tensorflow","pytorch","fast.ai" etc) is helpful.

Deep Learning is a branch of aimed at building to learn complex functions using special neural network architectures with many layers (hence the term "deep").

Deep neural network architectures allow for more complex tasks to be learned because, in addition to these neural networks having more layers to perform transformations, the larger number of layers and more complex architectures of the neural network allow a hierarchical organization of functionality to emerge.

Deep Learning was introduced into machine learning research with the intention of moving machine learning closer to artificial intelligence. A significant impact of deep learning lies in feature learning, mitigating much of the effort going into manual feature engineering in non-deep learning neural networks.

NOTE: If you want to use this tag for a question not directly concerning implementation, then consider posting on Cross Validated, Data Science, or Artificial Intelligence instead; otherwise your question is probably off-topic. Please choose one site only and do not cross-post to more than one - see Is cross-posting a question on multiple Stack Exchange sites permitted if the question is on-topic for each site? (tl;dr: no).

Resources

Papers

Books

Videos

Stack Exchange Sites

Other StackExchange sites with Deep Learning tag:

27406 questions
173
votes
9 answers

What does tf.nn.embedding_lookup function do?

tf.nn.embedding_lookup(params, ids, partition_strategy='mod', name=None) I cannot understand the duty of this function. Is it like a lookup table? Which means to return the parameters corresponding to each id (in ids)? For instance, in the…
Poorya Pzm
  • 2,123
  • 3
  • 12
  • 9
172
votes
5 answers

Why do we "pack" the sequences in PyTorch?

I was trying to replicate How to use packing for variable-length sequence inputs for rnn but I guess I first need to understand why we need to "pack" the sequence. I understand why we "pad" them but why is "packing" (via pack_padded_sequence)…
aerin
  • 20,607
  • 28
  • 102
  • 140
170
votes
6 answers

What is the use of verbose in Keras while validating the model?

I'm running the LSTM model for the first time. Here is my model: opt = Adam(0.002) inp = Input(...) print(inp) x = Embedding(....)(inp) x = LSTM(...)(x) x = BatchNormalization()(x) pred = Dense(5,activation='softmax')(x) model =…
rakesh
  • 1,667
  • 2
  • 11
  • 12
162
votes
8 answers

RuntimeError: Input type (torch.FloatTensor) and weight type (torch.cuda.FloatTensor) should be the same

This: device = torch.device("cuda" if torch.cuda.is_available() else "cpu") model.to(device) for data in dataloader: inputs, labels = data outputs = model(inputs) Gives the error: RuntimeError: Input type (torch.FloatTensor) and weight…
Guillermina
  • 3,127
  • 3
  • 15
  • 24
159
votes
2 answers

Many to one and many to many LSTM examples in Keras

I try to understand LSTMs and how to build them with Keras. I found out, that there are principally the 4 modes to run a RNN (the 4 right ones in the picture) Image source: Andrej Karpathy Now I wonder how a minimalistic code snippet for each of…
145
votes
21 answers

How to avoid "CUDA out of memory" in PyTorch

I think it's a pretty common message for PyTorch users with low GPU memory: RuntimeError: CUDA out of memory. Tried to allocate X MiB (GPU X; X GiB total capacity; X GiB already allocated; X MiB free; X cached) I tried to process an image by…
voilalex
  • 2,041
  • 2
  • 13
  • 18
129
votes
13 answers

Keras split train test set when using ImageDataGenerator

I have a single directory which contains sub-folders (according to labels) of images. I want to split this data into train and test set while using ImageDataGenerator in Keras. Although model.fit() in keras has argument validation_split for…
Nitin
  • 2,572
  • 5
  • 21
  • 28
128
votes
4 answers

How to unpack pkl file?

I have a pkl file from MNIST dataset, which consists of handwritten digit images. I'd like to take a look at each of those digit images, so I need to unpack the pkl file, except I can't find out how. Is there a way to unpack/unzip pkl file?
ytrewq
  • 3,670
  • 9
  • 42
  • 71
123
votes
9 answers

How to fix RuntimeError "Expected object of scalar type Float but got scalar type Double for argument"?

I'm trying to train a classifier via PyTorch. However, I am experiencing problems with training when I feed the model with training data. I get this error on y_pred = model(X_trainTensor): RuntimeError: Expected object of scalar type Float but got…
Shawn Zhang
  • 1,719
  • 2
  • 14
  • 20
123
votes
6 answers

Common causes of nans during training of neural networks

I've noticed that a frequent occurrence during training is NANs being introduced. Often times it seems to be introduced by weights in inner-product/fully-connected or convolution layers blowing up. Is this occurring because the gradient computation…
117
votes
8 answers

How do I split a custom dataset into training and test datasets?

import pandas as pd import numpy as np import cv2 from torch.utils.data.dataset import Dataset class CustomDatasetFromCSV(Dataset): def __init__(self, csv_path, transform=None): self.data = pd.read_csv(csv_path) self.labels =…
nirvair
  • 4,001
  • 10
  • 51
  • 85
117
votes
2 answers

Which parameters should be used for early stopping?

I'm training a neural network for my project using Keras. Keras has provided a function for early stopping. May I know what parameters should be observed to avoid my neural network from overfitting by using early stopping?
AizuddinAzman
  • 1,307
  • 2
  • 9
  • 5
117
votes
10 answers

keras: how to save the training history attribute of the history object

In Keras, we can return the output of model.fit to a history as follows: history = model.fit(X_train, y_train, batch_size=batch_size, nb_epoch=nb_epoch, validation_data=(X_test,…
jwm
  • 4,832
  • 10
  • 46
  • 78
117
votes
8 answers

How to apply gradient clipping in TensorFlow?

Considering the example code. I would like to know How to apply gradient clipping on this network on the RNN where there is a possibility of exploding gradients. tf.clip_by_value(t, clip_value_min, clip_value_max, name=None) This is an example that…
Arsenal Fanatic
  • 3,663
  • 6
  • 38
  • 53
116
votes
2 answers

What is the role of TimeDistributed layer in Keras?

I am trying to grasp what TimeDistributed wrapper does in Keras. I get that TimeDistributed "applies a layer to every temporal slice of an input." But I did some experiment and got the results that I cannot understand. In short, in connection to…
Buomsoo Kim
  • 1,283
  • 2
  • 9
  • 5