
I am trying to learn deep learning.

In the Torch tutorials:

https://github.com/torch/tutorials/blob/master/2_supervised/2_model.lua

https://github.com/torch/tutorials/blob/master/3_unsupervised/2_models.lua

Supervised model

-- Simple 2-layer neural network, with tanh hidden units
model = nn.Sequential()
model:add(nn.Reshape(ninputs))
model:add(nn.Linear(ninputs,nhiddens))
model:add(nn.Tanh())
model:add(nn.Linear(nhiddens,noutputs))

Unsupervised model

-- encoder
encoder = nn.Sequential()
encoder:add(nn.Linear(inputSize,outputSize))
encoder:add(nn.Tanh())
encoder:add(nn.Diag(outputSize))
-- decoder
decoder = nn.Sequential()
decoder:add(nn.Linear(outputSize,inputSize))
-- complete model
module = unsup.AutoEncoder(encoder, decoder, params.beta)

Why does the unsupervised model need the nn.Diag layer?

Thanks in advance.

yutseho
  • You should actually ask this on the [torch mailing group](https://groups.google.com/forum/#!forum/torch7). You're more likely to get an answer there. (I'm also curious about this issue, so please do post there) – user8472 Sep 26 '15 at 21:16
  • OK~ Done https://groups.google.com/forum/#!topic/torch7/zRRpK9418qE – yutseho Sep 27 '15 at 02:05
  • Just scaling by some weight, maybe... – yutseho Oct 12 '15 at 10:02

1 Answer


It is in fact a scaling by a learnable vector (the diagonal of a matrix). This is mentioned in section 3.1 of the paper Learning Fast Approximations of Sparse Coding. The diagonal gain is multiplied with the tanh output, and together they form the non-linearity.
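Concretely, nn.Diag holds one learnable gain per feature and multiplies its input elementwise, which is the same as applying a diagonal matrix. Here is a rough NumPy sketch of the encoder's forward pass (variable names are my own, and this is an illustration of the math, not the actual Torch implementation):

```python
import numpy as np

rng = np.random.default_rng(0)
input_size, output_size = 8, 4

# nn.Linear parameters (randomly initialized for the sketch)
W = rng.standard_normal((output_size, input_size))
b = np.zeros(output_size)

# nn.Diag's learnable per-unit gains: the diagonal of a matrix.
# Training adjusts g alongside W and b.
g = np.full(output_size, 1.5)

def encode(x):
    h = np.tanh(W @ x + b)  # nn.Linear followed by nn.Tanh
    return g * h            # nn.Diag: elementwise scale, i.e. diag(g) @ h

x = rng.standard_normal(input_size)
code = encode(x)

# The elementwise scaling is exactly multiplication by a diagonal matrix:
assert np.allclose(code, np.diag(g) @ np.tanh(W @ x + b))
```

Since tanh saturates at ±1, the learned gains let each code unit reach magnitudes other than 1, which matters when the decoder has to reconstruct inputs of arbitrary scale.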

Roger Trullo