I am building a custom Keras layer that is essentially the softmax function with a trainable base parameter. While the layer works on its own, when placed inside a Sequential model, model.summary() reports its output shape as None, and model.fit() raises a presumably related exception:
ValueError: as_list() is not defined on an unknown TensorShape.
In other custom layers (including, obviously, the Linear example from the Keras guide) the output shape can be determined after .build() is called. Looking at the source code of model.summary() as well as keras.layers.Layer, it is the @property Layer.output_shape that fails to determine the output shape automatically.
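For reference, here is (approximately) the Linear layer from the Keras guide on subclassing that I am comparing against; when it is placed in a model, its output shape is inferred without any extra work:

class Linear(keras.layers.Layer):
    # (Roughly) the example layer from the Keras guide on making new
    # layers via subclassing, shown here only for comparison.
    def __init__(self, units=32):
        super().__init__()
        self.units = units

    def build(self, input_shape):
        self.w = self.add_weight(
            shape=(input_shape[-1], self.units),
            initializer="random_normal",
            trainable=True,
        )
        self.b = self.add_weight(
            shape=(self.units,), initializer="zeros", trainable=True
        )

    def call(self, inputs):
        return tf.matmul(inputs, self.w) + self.b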
I then tried overriding the property and manually returning the input_shape argument passed to my layer's .build() method, which I save there (softmax does not change the shape of its input), but this didn't work either: if I call super().output_shape before returning my value, model.summary() reports the shape as ?, while if I don't, the value shown may look correct, but in both cases I get the exact same error during .fit().
Is there something special about the code inside call() that prevents Keras from inferring the shape of the output? Alternatively, is there a piece of documentation I have missed?
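(For completeness: keras.layers.Layer also exposes a compute_output_shape() method. A minimal override for a shape-preserving layer like mine would presumably be the few lines below, added to the class, though I have not verified whether it changes anything.)

    def compute_output_shape(self, input_shape):
        # Sketch only: softmax preserves the shape of its input.
        return input_shape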
My layer:
import tensorflow as tf
from tensorflow import keras


class B_Softmax(keras.layers.Layer):
    def __init__(self, b_init_mean=10, b_init_var=0.001):
        super(B_Softmax, self).__init__()
        self.b_init = tf.random_normal_initializer(b_init_mean, b_init_var)
        self._out_shape = None

    def build(self, input_shape):
        # A single trainable scalar b, used as the base factor in the exponents
        self.b = tf.Variable(
            initial_value=self.b_init(shape=(1,), dtype='float32'),
            trainable=True
        )
        self._out_shape = input_shape

    def call(self, inputs):
        # This is an implementation of Softmax for batched inputs
        # where the factor b is added to the exponents
        numerators = tf.math.exp(self.b * inputs)
        denominator = tf.reduce_sum(numerators, axis=1)
        denominator = tf.squeeze(denominator)
        denominator = tf.expand_dims(denominator, -1)
        s = tf.divide(numerators, denominator)
        return s

    @property
    def output_shape(self):       # If I comment out this property, summary prints 'None'
        super().output_shape      # If I leave this line in, summary prints '?'
        return self._out_shape    # If the above line is commented out, summary prints '10' (correctly),
                                  # but the same error is triggered in all three cases
The layer works on its own:
>>> A = tf.constant([[1,2,3], [7,5,6]], dtype="float32")
>>> layer = B_Softmax(1.0)
>>> layer(A)
<tf.Tensor: shape=(2, 3), dtype=float32, numpy=
array([[0.08991686, 0.24461554, 0.6654676 ],
[0.6654677 , 0.08991687, 0.24461551]], dtype=float32)>
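Each output row also sums to (numerically) one, as expected for a softmax:

>>> tf.reduce_sum(layer(A), axis=1)   # both entries are ~1.0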
But when I try to include it inside a model, the summary doesn't look right:
from tensorflow.keras.layers import Dense

input_dim = 5
num_classes = 10

model = keras.Sequential([
    Dense(32, activation='relu', input_shape=(input_dim,)),
    Dense(num_classes, activation="softmax"),
    B_Softmax(1.0)
])
model.summary()
Model: "sequential_1"
_________________________________________________________________
 Layer (type)                Output Shape              Param #
=================================================================
 dense_10 (Dense)            (None, 32)                192
 dense_11 (Dense)            (None, 10)                330
 b__softmax_18 (B_Softmax)   None                      1        <-- "None", "?", or "10" (in a hacky way) may be printed
=================================================================
Total params: 523
Trainable params: 523
Non-trainable params: 0
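(x_train and y_train are not shown; any data with matching shapes reproduces the error, e.g. this hypothetical random stand-in:)

import numpy as np

# Hypothetical stand-in data, only to make the training snippet below runnable;
# the real training data is not shown here.
x_train = np.random.rand(1000, input_dim).astype("float32")
y_train = keras.utils.to_categorical(
    np.random.randint(num_classes, size=1000), num_classes
)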
And training fails:
batch_size = 128
epochs = 15
model.compile(loss="categorical_crossentropy", optimizer="adam", metrics=["accuracy"])
model.fit(x_train, y_train, batch_size=batch_size, epochs=epochs, validation_split=0.1)
ValueError: in user code:

    File "/usr/local/lib/python3.7/dist-packages/keras/engine/training.py", line 1051, in train_function  *
        return step_function(self, iterator)
    File "/usr/local/lib/python3.7/dist-packages/keras/engine/training.py", line 1040, in step_function  **
        outputs = model.distribute_strategy.run(run_step, args=(data,))
    File "/usr/local/lib/python3.7/dist-packages/keras/engine/training.py", line 1030, in run_step  **
        outputs = model.train_step(data)
    File "/usr/local/lib/python3.7/dist-packages/keras/engine/training.py", line 894, in train_step
        return self.compute_metrics(x, y, y_pred, sample_weight)
    File "/usr/local/lib/python3.7/dist-packages/keras/engine/training.py", line 987, in compute_metrics
        self.compiled_metrics.update_state(y, y_pred, sample_weight)
    File "/usr/local/lib/python3.7/dist-packages/keras/engine/compile_utils.py", line 480, in update_state
        self.build(y_pred, y_true)
    File "/usr/local/lib/python3.7/dist-packages/keras/engine/compile_utils.py", line 398, in build
        y_pred)
    File "/usr/local/lib/python3.7/dist-packages/keras/engine/compile_utils.py", line 526, in _get_metric_objects
        return [self._get_metric_object(m, y_t, y_p) for m in metrics]
    File "/usr/local/lib/python3.7/dist-packages/keras/engine/compile_utils.py", line 526, in <listcomp>
        return [self._get_metric_object(m, y_t, y_p) for m in metrics]
    File "/usr/local/lib/python3.7/dist-packages/keras/engine/compile_utils.py", line 548, in _get_metric_object
        y_p_rank = len(y_p.shape.as_list())

    ValueError: as_list() is not defined on an unknown TensorShape.