As I understand it, when using a softmax over K values for the visible units of an RBM, the hidden units stay binary.

If so, I'm not sure how to compute the contributions of the binary hidden units to the visible ones. Am I supposed to relate the 0 state of a hidden unit to one specific state out of the K softmax states, and the 1 state to the other K-1 states? Or does a 0 in the hidden unit correspond to 0 in all K possible states of the visible unit (but doesn't that contradict the fact that at least one of the K states must be on)?


1 Answer


I think I've figured out my misunderstanding: the softmax units behave as groups of binary subunits, and each subunit has its own weights to the hidden units. This means the weights between the hidden layer and the visible layer form a three-dimensional tensor instead of a two-dimensional matrix, and now it is obvious how to calculate the contributions.
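
For concreteness, here is a minimal NumPy sketch of that idea (the names and sizes are illustrative, not from the original post): each of the V visible softmax units is a group of K binary subunits, so the weights form a V × K × H tensor and each state carries its own bias.

```python
import numpy as np

rng = np.random.default_rng(0)

V, K, H = 4, 3, 5                          # hypothetical layer sizes
W = 0.01 * rng.standard_normal((V, K, H))  # one weight per (visible unit, state, hidden unit)
b_vis = np.zeros((V, K))                   # one bias per state of each softmax unit
b_hid = np.zeros(H)                        # one bias per binary hidden unit

def hidden_probabilities(v_onehot):
    """v_onehot: (V, K) array of one-hot rows; returns P(h_j = 1 | v)."""
    # only the active state of each visible unit contributes its weights
    activation = np.einsum('vk,vkh->h', v_onehot, W) + b_hid
    return 1.0 / (1.0 + np.exp(-activation))

def visible_probabilities(h):
    """h: (H,) binary vector; returns a (V, K) matrix of softmax probabilities."""
    scores = np.einsum('vkh,h->vk', W, h) + b_vis
    scores -= scores.max(axis=1, keepdims=True)    # numerical stability
    e = np.exp(scores)
    return e / e.sum(axis=1, keepdims=True)        # exactly one state per unit is on

# example: a random one-hot visible configuration and its hidden probabilities
v = np.zeros((V, K))
v[np.arange(V), rng.integers(0, K, size=V)] = 1.0
p_h = hidden_probabilities(v)                      # shape (H,)
```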

– Uri
  • Can you point me to a tutorial on what a softmax layer is and how to use it in an RBM? I understand how an RBM can be trained with binary visible and hidden units, but have no idea how to use it for non-binary values. So please point me to some tutorials that explain softmax in this context. – StrikeR Apr 22 '14 at 09:30
  • @Uri Hi, what happens to the bias term for such a softmax unit? Say I have a variable that can take 10 possible values; would there be only one bias term for it, or 10 different bias terms for the 10 states? – bytestorm Jan 11 '18 at 14:56
  • I'm not 100% sure I understand the question, but since 1 would not make much sense (which one are you biasing...), I would say 10. – Uri Jan 11 '18 at 15:23
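
To make the last comment exchange concrete, a tiny sketch under the same grouped-subunit assumption (names are illustrative): each of the K states of a softmax unit gets its own bias term, so a variable with 10 possible values carries 10 biases.

```python
import numpy as np

K = 10                  # hypothetical variable with 10 possible values
b = np.zeros(K)         # one bias per state of the softmax unit
scores = np.zeros(K)    # stand-in for the weighted input to each state
p = np.exp(scores + b) / np.exp(scores + b).sum()   # softmax over the K states
```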