Questions tagged [neural-network]

Network structure inspired by simplified models of biological neurons (brain cells). Neural networks are trained to "learn" by supervised and unsupervised techniques, and can be used to solve optimization and approximation problems, classify patterns, and combinations thereof.

NOTE: If you want to use this tag for a question not directly concerning implementation, then consider posting on Cross Validated, Data Science, or Artificial Intelligence instead; otherwise you're probably off-topic. Please choose one site only and do not cross-post to more than one - see Is cross-posting a question on multiple Stack Exchange sites permitted if the question is on-topic for each site?

Neural networks have many practical applications within the software realm.

An application of neural networks to supervised learning is training a network for optical character recognition or handwriting recognition. The network is trained on exemplars of characters, and given enough data forming a representative sample of the population, it can generalize to a wider spectrum of cases that were not encountered during training. Training a neural network in a supervised manner involves a learning algorithm that finds the weights of the neurons which minimize the network's error at performing the task. Gradient descent is an example of a learning algorithm commonly used to adjust the weights of a neural network. It is often paired with the backpropagation technique, which measures the contribution of each weight to the error signal and thereby determines the gradients that guide the learning algorithm in adjusting each weight.
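
To make the procedure concrete, here is a minimal NumPy sketch of gradient descent with backpropagation (the two-layer network, layer sizes, and learning rate are arbitrary choices for illustration, learning XOR rather than character recognition):

```python
import numpy as np

# Tiny 2-layer network trained on XOR with full-batch gradient descent.
rng = np.random.default_rng(0)
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
y = np.array([[0], [1], [1], [0]], dtype=float)

W1 = rng.normal(scale=0.5, size=(2, 4))   # input -> hidden weights
b1 = np.zeros((1, 4))
W2 = rng.normal(scale=0.5, size=(4, 1))   # hidden -> output weights
b2 = np.zeros((1, 1))

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

lr = 1.0
for _ in range(5000):
    # Forward pass: compute the network's output for every sample.
    h = sigmoid(X @ W1 + b1)
    out = sigmoid(h @ W2 + b2)

    # Backpropagation: measure each weight's contribution to the error.
    d_out = (out - y) * out * (1 - out)   # error signal at the output layer
    d_h = (d_out @ W2.T) * h * (1 - h)    # error propagated to the hidden layer

    # Gradient descent: adjust each weight against its gradient.
    W2 -= lr * h.T @ d_out
    b2 -= lr * d_out.sum(axis=0, keepdims=True)
    W1 -= lr * X.T @ d_h
    b1 -= lr * d_h.sum(axis=0, keepdims=True)

print(out.round(2))   # approaches [[0], [1], [1], [0]]
```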

For an example of a backpropagation network in action, see the source of GNU Backgammon.

A frequently used network topology in unsupervised learning is the Self-Organizing Map (SOM), introduced by Teuvo Kohonen. These networks can be used for clustering data and, more generally, for providing a lower-dimensional representation of a higher-dimensional space.

See this Code Project article for an application of the Self-Organizing Map to clustering images in order to find all of the unique faces.
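
For a sense of the mechanics, here is a minimal NumPy sketch of SOM training (not taken from that article; the grid size and the learning-rate and radius schedules are arbitrary illustrative choices):

```python
import numpy as np

# Fit a 10x10 grid of units to 3-dimensional data (e.g. RGB colors), so the
# 2-D grid becomes a lower-dimensional representation of the 3-D input space.
rng = np.random.default_rng(0)
data = rng.random((500, 3))
grid = rng.random((10, 10, 3))   # one weight vector per grid cell

# Grid coordinates, used by the neighborhood function below.
coords = np.stack(np.meshgrid(np.arange(10), np.arange(10), indexing="ij"), axis=-1)

n_steps = 2000
for t in range(n_steps):
    x = data[rng.integers(len(data))]

    # 1. Find the best-matching unit (BMU): the unit closest to the sample.
    dists = np.linalg.norm(grid - x, axis=-1)
    bmu = np.unravel_index(dists.argmin(), dists.shape)

    # 2. Decay the learning rate and neighborhood radius over time.
    lr = 0.5 * (1 - t / n_steps)
    radius = 1.0 + 5.0 * (1 - t / n_steps)

    # 3. Pull the BMU and its grid neighbors toward the sample; the pull
    #    weakens with distance from the BMU *on the grid*.
    grid_dist = np.linalg.norm(coords - np.array(bmu), axis=-1)
    influence = np.exp(-(grid_dist ** 2) / (2 * radius ** 2))
    grid += lr * influence[..., None] * (x - grid)
```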

Introductory Video

Neural Networks Demystified (Jupyter Notebooks)

Resources / Recommendations

Neural Networks and Deep Learning - Michael Nielsen

19,989 questions

957 votes, 18 answers

What is the role of the bias in neural networks?

I'm aware of the gradient descent and the back-propagation algorithm. What I don't get is: when is using a bias important and how do you use it? For example, when mapping the AND function with two inputs and one output, it does not give the…
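
A minimal sketch of the point usually made in the answers, using the AND function from the question (weights chosen by hand for illustration):

```python
import numpy as np

# A neuron computes step(w . x + b); the bias b shifts the decision boundary
# away from the origin. Without it, AND is unlearnable: (0,1) and (1,0) -> 0
# require w1 <= 0 and w2 <= 0, but (1,1) -> 1 requires w1 + w2 > 0.
step = lambda z: (z > 0).astype(int)
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]])

w, b = np.array([1.0, 1.0]), -1.5
print(step(X @ w + b))   # [0 0 0 1], i.e. AND; impossible with b fixed at 0
```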

476 votes, 14 answers

Epoch vs Iteration when training neural networks

What is the difference between epoch and iteration when training a multi-layer perceptron?
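
A short sketch of the usual definitions (the numbers are illustrative): one iteration is one weight update on one batch; one epoch is one full pass over the training set.

```python
n_samples = 10_000        # training set size
batch_size = 100          # samples per weight update

iterations_per_epoch = n_samples // batch_size    # 100 updates per epoch
epochs = 5
total_iterations = epochs * iterations_per_epoch  # 500 updates in total
print(iterations_per_epoch, total_iterations)
```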

429 votes, 10 answers

What is the meaning of the word logits in TensorFlow?

In the following TensorFlow function, we must feed the activation of artificial neurons in the final layer. That I understand. But I don't understand why it is called logits. Isn't that a mathematical function? loss_function =…
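
A brief sketch of the usual reading: in TensorFlow, "logits" are the raw, unnormalized scores of the final layer, before softmax turns them into probabilities; losses with "with_logits" in the name apply softmax internally for numerical stability. (The scores below are made up.)

```python
import tensorflow as tf

logits = tf.constant([[2.0, 1.0, 0.1]])   # raw final-layer outputs, any real numbers
labels = tf.constant([[1.0, 0.0, 0.0]])   # one-hot target

probs = tf.nn.softmax(logits)             # ~[[0.66, 0.24, 0.10]], a distribution
loss = tf.nn.softmax_cross_entropy_with_logits(labels=labels, logits=logits)
print(probs.numpy(), loss.numpy())
```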

414 votes, 3 answers

Keras input explanation: input_shape, units, batch_size, dim, etc

For any Keras layer (Layer class), can someone explain how to understand the difference between input_shape, units, dim, etc.? For example, the doc says units specifies the output shape of a layer. In the image of the neural net below hidden layer1…
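
A minimal sketch, assuming the tf.keras API: units is the number of neurons in a layer (hence its output size); input_shape describes a single sample, without the batch dimension, which Keras leaves flexible (None).

```python
import tensorflow as tf

model = tf.keras.Sequential([
    tf.keras.layers.Dense(units=32, activation="relu", input_shape=(784,)),
    tf.keras.layers.Dense(units=10, activation="softmax"),
])
model.summary()   # output shapes print as (None, 32) and (None, 10)
```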

395 votes, 6 answers

What are advantages of Artificial Neural Networks over Support Vector Machines?

ANN (Artificial Neural Networks) and SVM (Support Vector Machines) are two popular strategies for supervised machine learning and classification. It's not often clear which method is better for a particular project, and I'm certain the answer is…

331 votes, 7 answers

Why do we need to call zero_grad() in PyTorch?

Why does zero_grad() need to be called during training? zero_grad(self): Sets gradients of all model parameters to zero.
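
A minimal sketch of the reason: in PyTorch, backward() accumulates gradients into .grad instead of overwriting them, so they must be cleared before each update. (The model, data, and hyperparameters below are placeholders.)

```python
import torch

model = torch.nn.Linear(3, 1)
opt = torch.optim.SGD(model.parameters(), lr=0.1)
x, y = torch.randn(8, 3), torch.randn(8, 1)

for _ in range(10):
    opt.zero_grad()                                    # clear last step's gradients
    loss = torch.nn.functional.mse_loss(model(x), y)
    loss.backward()                                    # accumulates into .grad
    opt.step()                                         # update weights using .grad
```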

330 votes, 2 answers

Extremely small or NaN values appear in training neural network

I'm trying to implement a neural network architecture in Haskell, and use it on MNIST. I'm using the hmatrix package for linear algebra. My training framework is built using the pipes package. My code compiles and doesn't crash. But the problem is,…

271 votes, 3 answers

How to interpret loss and accuracy for a machine learning model

When I train my neural network with Theano or TensorFlow, it reports a variable called "loss" per epoch. How should I interpret this variable? Is higher loss better or worse, and what does it mean for the final performance (accuracy) of my…

254 votes, 10 answers

How do I initialize weights in PyTorch?

How do I initialize weights and biases of a network (via e.g. He or Xavier initialization)?
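
One common pattern, sketched with the torch.nn.init API (the model and the choice of Xavier here are illustrative; the kaiming_* functions are the "He" variants):

```python
import torch.nn as nn

def init_weights(m):
    # Called once per submodule by model.apply(); initialize by layer type.
    if isinstance(m, nn.Linear):
        nn.init.xavier_uniform_(m.weight)
        nn.init.zeros_(m.bias)

model = nn.Sequential(nn.Linear(784, 128), nn.ReLU(), nn.Linear(128, 10))
model.apply(init_weights)
```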

238 votes, 8 answers

Ordering of batch normalization and dropout?

The original question was in regard to TensorFlow implementations specifically. However, the answers are for implementations in general. This general answer is also the correct answer for TensorFlow. When using batch normalization and dropout in…
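
One commonly used ordering, sketched in PyTorch (layer sizes are arbitrary, and other orderings are defensible): the linear layer, then batch norm, then the activation, with dropout after the activation.

```python
import torch.nn as nn

model = nn.Sequential(
    nn.Linear(784, 128, bias=False),  # bias is redundant right before batch norm
    nn.BatchNorm1d(128),
    nn.ReLU(),
    nn.Dropout(p=0.5),
    nn.Linear(128, 10),
)
```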

227 votes, 10 answers

Why use softmax as opposed to standard normalization?

In the output layer of a neural network, it is typical to use the softmax function to approximate a probability distribution. This is expensive to compute because of the exponents. Why not simply perform a Z transform so that all outputs are…
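
A small sketch of the contrast usually drawn: softmax exponentiates before normalizing, so it stays valid for negative scores, is smooth everywhere, and sharpens the largest score; plain sum-to-one normalization does not.

```python
import numpy as np

z = np.array([2.0, 1.0, -1.0])                              # raw scores

softmax = np.exp(z - z.max()) / np.exp(z - z.max()).sum()   # numerically stable form
naive = z / z.sum()                                         # plain normalization

print(softmax)   # ~[0.705 0.259 0.035], a valid probability distribution
print(naive)     # [ 1.   0.5 -0.5], a "probability" below zero
```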

215 votes, 12 answers

Why binary_crossentropy and categorical_crossentropy give different performances for the same problem?

I'm trying to train a CNN to categorize text by topic. When I use binary cross-entropy I get ~80% accuracy, with categorical cross-entropy I get ~50% accuracy. I don't understand why this is. It's a multiclass problem, doesn't that mean that I have…
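
The usual resolution is that the loss must match the output activation and the label encoding; with a mismatched pairing, Keras can report a misleading accuracy. A sketch of the two standard pairings (shapes are illustrative):

```python
import tensorflow as tf

# Multi-class, one true class per sample: softmax + categorical_crossentropy
# (one-hot labels; use sparse_categorical_crossentropy for integer labels).
multiclass = tf.keras.Sequential(
    [tf.keras.layers.Dense(5, activation="softmax", input_shape=(100,))])
multiclass.compile(optimizer="adam", loss="categorical_crossentropy",
                   metrics=["accuracy"])

# Binary or multi-label, independent yes/no targets: sigmoid + binary_crossentropy.
multilabel = tf.keras.Sequential(
    [tf.keras.layers.Dense(5, activation="sigmoid", input_shape=(100,))])
multilabel.compile(optimizer="adam", loss="binary_crossentropy",
                   metrics=["accuracy"])
```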

209 votes, 8 answers

Where do I call the BatchNormalization function in Keras?

If I want to use the BatchNormalization function in Keras, then do I need to call it once only at the beginning? I read this documentation for it: http://keras.io/layers/normalization/ I don't see where I'm supposed to call it. Below is my code…
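
A minimal sketch of the usage: BatchNormalization is not called once globally; it is an ordinary layer, inserted at each point where activations should be normalized (the architecture below is a placeholder).

```python
import tensorflow as tf
from tensorflow.keras import layers

model = tf.keras.Sequential([
    layers.Dense(64, input_shape=(20,)),
    layers.BatchNormalization(),   # normalizes the first hidden layer's outputs
    layers.Activation("tanh"),
    layers.Dense(64),
    layers.BatchNormalization(),   # and the second hidden layer's
    layers.Activation("tanh"),
    layers.Dense(1, activation="sigmoid"),
])
```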

192 votes, 13 answers

Why must a nonlinear activation function be used in a backpropagation neural network?

I've been reading some things on neural networks and I understand the general principle of a single-layer neural network. I understand the need for additional layers, but why are nonlinear activation functions used? This question is followed by this…
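
A small sketch of the core argument: without a nonlinearity, stacked layers collapse into a single linear map, so depth adds no expressive power.

```python
import numpy as np

rng = np.random.default_rng(0)
W1, W2 = rng.normal(size=(4, 8)), rng.normal(size=(8, 3))
x = rng.normal(size=(1, 4))

two_linear_layers = (x @ W1) @ W2   # a "deep" network with identity activations
one_layer = x @ (W1 @ W2)           # an equivalent single layer

print(np.allclose(two_linear_layers, one_layer))   # True
```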

188 votes, 10 answers

Why do we have to normalize the input for an artificial neural network?

Why do we have to normalize the input for a neural network? I understand that sometimes, for example when the input values are non-numerical, a certain transformation must be performed, but what about when we have numerical input? Why must the numbers be in a…
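
A minimal sketch of one common scheme, standardization: rescale each feature to zero mean and unit variance using statistics from the training set only, so features on very different scales contribute comparably (the data below is made up).

```python
import numpy as np

X_train = np.array([[1.0, 200.0], [2.0, 400.0], [3.0, 600.0]])
mean, std = X_train.mean(axis=0), X_train.std(axis=0)

X_train_norm = (X_train - mean) / std                   # per-feature z-scores
X_new_norm = (np.array([[2.5, 500.0]]) - mean) / std    # reuse the same statistics
print(X_train_norm)
```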