While reading Colah's blog, I noticed that in the diagram z_t clearly seems to go into ~h_t, and not r_t, but the equations say otherwise. Isn't this supposed to be z_t * h_{t-1} and not r_t * h_{t-1}? Please correct me if I'm wrong.
1 Answer
I see this question is somewhat old, but in case you still haven't figured it out, or for anyone else who ends up here: the figure and the equations are consistent. The operator (x) in the diagram (the pink circle with an X in it) is the Hadamard product, i.e. element-wise multiplication between two tensors of the same size. In the equations this operator is written as * (it is usually represented by a circle with a dot at its center, ⊙). ~h_t is the output of the tanh operator. The tanh operator receives a linear combination of the input at time t, x_t, and the result of the Hadamard product between r_t and h_{t-1}. Note that r_t has already been computed at that point by passing a linear combination of x_t and h_{t-1} through a sigmoid. I hope the rest is clear.
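
If it helps to see the data flow as code, here is a minimal sketch of a single GRU step in plain Python. It assumes the equations as they appear in the blog post (concatenation of h_{t-1} and x_t, no bias terms). The function and weight names (`gru_step`, `hadamard`, `W_z`, `W_r`, `W_h`) are my own labels for illustration, not taken from any library.

```python
import math

def sigmoid(v):
    # Element-wise logistic function on a list of floats.
    return [1.0 / (1.0 + math.exp(-a)) for a in v]

def matvec(W, v):
    # Matrix-vector product; W is a list of rows.
    return [sum(w * a for w, a in zip(row, v)) for row in W]

def hadamard(a, b):
    # Element-wise product: the (x) circle in the diagram, * in the equations.
    return [p * q for p, q in zip(a, b)]

def gru_step(x_t, h_prev, W_z, W_r, W_h):
    # Assumed layout: each weight matrix acts on [h, x] concatenated.
    concat = h_prev + x_t
    z_t = sigmoid(matvec(W_z, concat))  # update gate
    r_t = sigmoid(matvec(W_r, concat))  # reset gate
    # r_t (not z_t) gates h_{t-1} before it enters the tanh.
    gated = hadamard(r_t, h_prev)
    h_tilde = [math.tanh(a) for a in matvec(W_h, gated + x_t)]
    # z_t only appears here, blending h_{t-1} with the candidate ~h_t.
    h_t = [(1.0 - z) * hp + z * ht
           for z, hp, ht in zip(z_t, h_prev, h_tilde)]
    return z_t, r_t, h_tilde, h_t

# Illustrative values only: hidden size 2, input size 1.
zeros = [[0.0, 0.0, 0.0], [0.0, 0.0, 0.0]]
pick_h = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]]
z_t, r_t, h_tilde, h_t = gru_step([0.5], [1.0, -2.0], zeros, zeros, pick_h)
print(z_t, r_t, h_tilde, h_t)
```

In this sketch z_t never touches the input of the tanh; it only mixes h_{t-1} and ~h_t in the last line, which matches the equations in the question.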