I would like to ask for some help with creating a custom layer. What I am trying to do is actually quite simple: generating an output layer with 'stateful' variables, i.e. tensors whose value is updated at each batch.
To make everything clearer, here is a snippet of what I would like to do:
    def call(self, inputs):
        c = self.constant
        m = self.extra_constant
        update = inputs * m + c        # update is a function of the layer inputs
        X_new = self.X_old + update
        outputs = X_new
        self.X_old = X_new             # carry the state over to the next batch
        return outputs
The idea here is quite simple:

- X_old is initialized to 0 in the def __init__(self, ...)
- update is computed as a function of the inputs to the layer
- the output of the layer is computed (i.e. X_new)
- the value of X_old is set equal to X_new, so that at the next batch X_old is no longer equal to zero but equal to X_new from the previous batch.
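For reference, here is roughly how I initialize X_old in the constructor (the class name, constructor arguments, and state_shape are placeholders, not my real code):

    import tensorflow.keras.backend as K
    from tensorflow.keras.layers import Layer

    class StatefulLayer(Layer):
        def __init__(self, constant, extra_constant, state_shape, **kwargs):
            super().__init__(**kwargs)
            self.constant = constant              # c in the snippet above
            self.extra_constant = extra_constant  # m in the snippet above
            # X_old starts at zero; it should be overwritten after every batch
            self.X_old = K.zeros(state_shape)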
I have found out that K.update does the job, as shown in this example:

    X_new = K.update(self.X_old, self.X_old + update)
The problem here is that, if I then try to define the outputs of the layer as:

    outputs = X_new
    return outputs
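Putting the pieces together, this is the version of call that raises the error (same placeholder setup as above):

    def call(self, inputs):
        c = self.constant
        m = self.extra_constant
        update = inputs * m + c
        # K.update writes self.X_old + update back into the X_old variable
        X_new = K.update(self.X_old, self.X_old + update)
        outputs = X_new
        return outputs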
I receive the following error when I call model.fit():
ValueError: An operation has `None` for gradient. Please make sure that all of your ops have
gradient defined (i.e. are differentiable). Common ops without gradient: K.argmax, K.round, K.eval.
And I keep getting this error even though I set layer.trainable = False and did not define any weights or biases for the layer. On the other hand, if I just do self.X_old = X_new, the value of X_old does not get updated.
Do you have a solution to implement this? I believe it should not be that hard, since stateful RNNs work in a 'similar' way.
Thanks in advance for your help!