
I am trying to build a Keras neural network where the activation function at the output layer (conditionally) depends on the inputs. The activation function is quite complicated, so as a simpler example, consider the following:

import tensorflow as tf

def myactivation(y, x1, x2):
    # If x1 exceeds the threshold, the activation is 0; otherwise it caps y at x2.
    if x1 > 1.2:
        return tf.constant(0.0)
    else:
        return tf.math.minimum(y, x2)

The network consists of an input layer (the inputs x1 and x2), a hidden layer whose output is y, and an output layer. The activation above corresponds to the activation of the output layer.

How would I implement something like this? Grateful for any help and guidance you might have!

  • I am not sure what `y` is here. Is it the output of the node? If so, how would you calculate the output of the node if it requires the output for calculation? Can you draw the graph? Also, is it important that `x1` and `x2` are separate variables, or could it be OK to just put them on a different feature-channel of the same tensor? – André Jul 11 '22 at 14:56
  • @André Thanks for your comment! The architecture of the network is as follows: 2 inputs (x1 and x2), a hidden layer of 25 neurons with a ReLU activation function, and one output (y*). y is the input to the final node from the hidden layer, before running it through the activation function. I.e., if I were to use the sigmoid function, we would simply have y* = 1/(1+exp(-y)). I hope that clarifies things! – Philip Winchester Jul 11 '22 at 15:55

1 Answer


You can write a custom layer as shown in the Keras docs. For your application, the key is to use tf.where with your condition x1 > 1.2 to switch between the two activation types. Regarding "organizing" the input to the layer, I've decided to stitch the three values together using tf.stack, so that the resulting tensor inp has a feature axis (you'll see this in the example).

import tensorflow as tf
import keras

# For reference: https://keras.io/guides/making_new_layers_and_models_via_subclassing/
class AwesomeLayer(keras.layers.Layer):
    def __init__(self, threshold):  # Here you could define the layer shape if needed
        super().__init__()
        # Here you could define (trainable) layer parameters

        self.threshold = tf.constant(threshold)

    def call(self, inputs):
        # Feature-axis layout: inputs[..., 0] = x_1, inputs[..., 1] = x_2, inputs[..., 2] = y.
        # tf.where returns 0.0 wherever x_1 > threshold, and min(y, x_2) everywhere else.
        return tf.where(
            inputs[..., 0] > self.threshold,
            tf.constant(0.0),
            tf.math.minimum(inputs[..., 2], inputs[..., 1])
        )

Let's test if it does what you want:

awesome_layer = AwesomeLayer(1.2)

# Input Example 1

x_1 = tf.constant(2.) # Case x_1 > 1.2
x_2 = tf.constant(-5.)
y = tf.constant(1.)

inp = tf.stack([x_1, x_2, y], axis=-1)
print(inp.shape) # (3,)

outp = awesome_layer(inp)
print(outp.numpy()) # 0.0

# Input Example 2

x_1 = tf.constant(-10.) # Case x_1 < 1.2
x_2 = tf.constant(-5.)
y = tf.constant(1.)

inp = tf.stack([x_1, x_2, y], axis=-1)
print(inp.shape) # (3,)

outp = awesome_layer(inp)

print(outp.numpy()) # -5.0

# Input Example 3

x_1 = tf.constant(-10.) # Case x_1 < 1.2
x_2 = tf.constant(25.)
y = tf.constant(1.)

inp = tf.stack([x_1, x_2, y], axis=-1)
print(inp.shape) # (3,)

outp = awesome_layer(inp)

print(outp.numpy()) # 1.0
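
For completeness, here is a minimal sketch (my addition, assuming the architecture you described in the comments: 2 inputs, a 25-neuron ReLU hidden layer, and one pre-activation output y) of how AwesomeLayer could be wired into a full model with the functional API. The layer sizes and the use of Concatenate to build the [x_1, x_2, y] feature axis are illustrative, not the only way to do it:

# Hypothetical wiring of AwesomeLayer into the network from the question.
inputs = keras.Input(shape=(2,))                            # features: [x_1, x_2]
hidden = keras.layers.Dense(25, activation="relu")(inputs)  # 25-neuron hidden layer
y = keras.layers.Dense(1)(hidden)                           # pre-activation output y

# Build the (batch, 3) tensor [x_1, x_2, y] that AwesomeLayer expects.
stacked = keras.layers.Concatenate(axis=-1)([inputs, y])
outputs = AwesomeLayer(1.2)(stacked)

model = keras.Model(inputs=inputs, outputs=outputs)
model.summary()

Since tf.where operates elementwise, the same layer handles batched inputs of shape (batch, 3) without modification.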
André