How to do custom Keras layer matrix multiplication

Question

Layers:

Input shape (None,75)
Hidden layer 1 - shape is (75,3)
Hidden layer 2 - shape is (3,1)

For the last layer, the output must be calculated as (H21*w1)*(H22*w2)*(H23*w3), where H21, H22, H23 are the outputs of hidden layer 2 and w1, w2, w3 are constant weights that are not trainable. How do I write a Lambda function that computes this?

def product(X):
    return X[0]*X[1]

keras_model = Sequential()
keras_model.add(Dense(75, 
input_dim=75,activation='tanh',name="layer1" ))
keras_model.add(Dense(3 ,activation='tanh',name="layer2" ))
keras_model.add(Dense(1,name="layer3"))
cross1=keras_model.add(Lambda(lambda x:product,output_shape=(1,1)))([layer2,layer3])
print(cross1)

NameError: name 'layer2' is not defined

score 2 · Accepted Answer · answered Dec 27 '19 at 18:23

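Use the functional API instead of Sequential, so that each layer's output is a named tensor you can pass to the Lambda layer:

from keras.layers import Input, Dense, Lambda
from keras.models import Model

inputs = Input((75,))                                         #shape (batch, 75)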
output1 = Dense(75, activation='tanh',name="layer1" )(inputs) #shape (batch, 75)
output2 = Dense(3 ,activation='tanh',name="layer2" )(output1) #shape (batch, 3)
output3 = Dense(1,name="layer3")(output2)                     #shape (batch, 1)

cross1 = Lambda(lambda x: x[0] * x[1])([output2, output3])    #shape (batch, 3)

model = Model(inputs, cross1)

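Please note that these shapes are quite different from what the question expects: each Dense layer outputs (batch, units), and multiplying a (batch, 3) tensor by a (batch, 1) tensor broadcasts to (batch, 3).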
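To make the shape difference concrete, here is a minimal NumPy sketch. NumPy arrays stand in for the Keras tensors, and the array values and the weights `w` are made up for illustration. It contrasts the element-wise multiply used in the Lambda above with the reduce-over-features product that the question describes.

```python
import numpy as np

batch = 4
output2 = np.ones((batch, 3))       # stands in for the layer2 output, shape (batch, 3)
output3 = np.full((batch, 1), 2.0)  # stands in for the layer3 output, shape (batch, 1)

# Element-wise multiply, as in the Lambda above: broadcasts to (batch, 3)
cross1 = output2 * output3
print(cross1.shape)

# What the question asks for: scale by constant weights, then multiply across features
w = np.array([[1.0, 1.0, 1.0]])     # hypothetical constant weights w1, w2, w3
wanted = np.prod(output2 * w, axis=-1, keepdims=True)
print(wanted.shape)
```

The first result keeps one value per feature, while the second collapses the three features into a single value per sample.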

answered Dec 27 '19 at 18:23

Daniel Möller


I am new to Keras. How do I write it so that it matches what I expect? Input shape (None,75); hidden layer 1 shape is (75,3); hidden layer 2 shape is (3,1). For the last layer, the output must be calculated as (H21*w1)*(H22*w2)*(H23*w3), where H21, H22, H23 are the outputs of hidden layer 2 and w1, w2, w3 are constant weights that are not trainable. How do I write a Lambda function that computes this? – karun reddy Dec 27 '19 at 18:35

score 1 · Answer 2 · answered Dec 27 '19 at 22:42

I suggest doing this with a custom layer instead of a Lambda layer. Why? A custom layer gives you more freedom, and it makes the weights you care about easier to inspect. More precisely, if you do it through a Lambda layer, the constant weights will not be saved as part of the model, but they will be if you use a custom layer.

Here is an example

from keras import backend as K
from keras.layers import *
from keras.models import *
import numpy as np 


class MyLayer(Layer) :
    # see https://keras.io/layers/writing-your-own-keras-layers/
    def __init__(self, 
                 w_vec=None, 
                 allow_training=False,
                 **kwargs) :
        self._w_vec = w_vec
        assert allow_training or (w_vec is not None), \
            "ERROR: non-trainable w_vec must be initialized"
        self.allow_training = allow_training
        super().__init__(**kwargs)
        return
    def build(self, input_shape) :
        batch_size, num_feats = input_shape
        self.w_vec = self.add_weight(shape=(1, num_feats),
                                     name='w_vec',
                                     initializer='uniform', # <- use your own preferred initializer
                                     trainable=self.allow_training,)
        if self._w_vec is not None :
            # predefined w_vec
            assert self._w_vec.shape[1] == num_feats, \
                "ERROR: initial w_vec shape mismatches the input shape"
            # set it to the weight
            self.set_weights([self._w_vec])  # <- set weights to the supplied one
        super().build(input_shape)
        return
    def call(self, x) :
        # Given:
        #   x = [H21, H22, H23]
        #   w_vec = [w1, w2, w3]
        # Step 1: output elem_prod
        #   elem_prod = [H21*w1, H22*w2, H23*w3]
        elem_prod = x * self.w_vec
        # Step 2: output ret
        #   ret = (H21*w1) * (H22*w2) * (H23*w3)
        ret = K.prod(elem_prod, axis=-1, keepdims=True)
        return ret
    def compute_output_shape(self, input_shape) :
        return (input_shape[0], 1)

def make_test_cases(w_vec=None, allow_training=False):
    x = Input(shape=(75,))
    y = Dense(75, activation='tanh', name='fc1')(x)
    y = Dense(3, activation='tanh', name='fc2')(y)
    y = MyLayer(w_vec, allow_training, name='core')(y)
    y = Dense(1, name='fc3')(y)
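    model_name = '{}-{}'.format('randomInit' if w_vec is None else 'assignInit',
                                'trainable' if allow_training else 'nontrainable')
    net = Model(inputs=x, outputs=y, name=model_name)
    print(net.name)
    print(net.layers[-2].get_weights()[0])  # current values of w_vec in the 'core' layer
    net.summary()  # summary() prints the table itself and returns None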
    return net
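
As a quick sanity check on the arithmetic in `call`, the same computation can be reproduced with plain NumPy. This is only a sketch: `my_layer_reference` is a hypothetical helper, not part of Keras, and the input values are made up.

```python
import numpy as np

def my_layer_reference(x, w_vec):
    """NumPy version of MyLayer.call: (H21*w1) * (H22*w2) * (H23*w3) per sample."""
    elem_prod = x * w_vec  # shape (batch, 3); w_vec broadcasts from (1, 3)
    return np.prod(elem_prod, axis=-1, keepdims=True)  # shape (batch, 1)

x = np.array([[0.5, -1.0, 2.0],
              [1.0,  1.0, 1.0]])
w_vec = np.array([[2.0, 3.0, 4.0]])
out = my_layer_reference(x, w_vec)
print(out)  # first row: (0.5*2) * (-1*3) * (2*4) = -24

# A zero weight forces the whole product to zero for every sample
zero_out = my_layer_reference(x, np.arange(3).reshape([1, 3]))
print(zero_out)
```

One consequence worth keeping in mind: because the output is a product, any zero entry in `w_vec` (for example the first entry of `np.arange(3)`) makes the layer output zero regardless of the input.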

You can run the following test cases to see the differences. Pay attention to the first line of the printout, which shows the initial values of w_vec, and to the last lines, which show the number of non-trainable parameters.

a. Constant weights, non-trainable

m1 = make_test_cases(w_vec=np.arange(3).reshape([1,3]), allow_training=False)

will give you

assignInit-nontrainable [[0. 1. 2.]]
_________________________________________________________________
Layer (type)                 Output Shape              Param #   
=================================================================
input_4 (InputLayer)         (None, 75)                0         
_________________________________________________________________
fc1 (Dense)                  (None, 75)                5700      
_________________________________________________________________
fc2 (Dense)                  (None, 3)                 228       
_________________________________________________________________
core (MyLayer)               (None, 1)                 3         
_________________________________________________________________
fc3 (Dense)                  (None, 1)                 2         
=================================================================
Total params: 5,933
Trainable params: 5,930
Non-trainable params: 3
_________________________________________________________________

b. Constant weights, trainable

m2 = make_test_cases(w_vec=np.arange(3).reshape([1,3]), allow_training=True)

will give you

assignInit-trainable    [[0. 1. 2.]]
_________________________________________________________________
Layer (type)                 Output Shape              Param #   
=================================================================
input_5 (InputLayer)         (None, 75)                0         
_________________________________________________________________
fc1 (Dense)                  (None, 75)                5700      
_________________________________________________________________
fc2 (Dense)                  (None, 3)                 228       
_________________________________________________________________
core (MyLayer)               (None, 1)                 3         
_________________________________________________________________
fc3 (Dense)                  (None, 1)                 2         
=================================================================
Total params: 5,933
Trainable params: 5,933
Non-trainable params: 0
_________________________________________________________________

c. Random weights, trainable

m3 = make_test_cases(w_vec=None, allow_training=True)

will give you

randomInit-trainable [[ 0.02650297 -0.02010062 -0.03771694]]
_________________________________________________________________
Layer (type)                 Output Shape              Param #   
=================================================================
input_6 (InputLayer)         (None, 75)                0         
_________________________________________________________________
fc1 (Dense)                  (None, 75)                5700      
_________________________________________________________________
fc2 (Dense)                  (None, 3)                 228       
_________________________________________________________________
core (MyLayer)               (None, 1)                 3         
_________________________________________________________________
fc3 (Dense)                  (None, 1)                 2         
=================================================================
Total params: 5,933
Trainable params: 5,933
Non-trainable params: 0
_________________________________________________________________

Final remark

It is hard to say in advance which case will work best for your problem, so trying all three sounds like a good plan.

Sir, can I fix the constant weights directly, e.g. w_vec = [1, 1, 1]? – karun reddy, Dec 29 '19 at 05:51
Of course you can, just set something like `m1 = make_test_cases(w_vec=np.array([[1,1,1]]), allow_training=False)` – pitfall, Dec 30 '19 at 08:51
def forward_prop(params):
    '''some code: number of layers and neurons, where the last layer gives result as output'''
    exp_scores = np.exp(result)
    probs = exp_scores / np.sum(exp_scores, axis=1, keepdims=True)
    N = 150  # Number of samples
    corect_logprobs = -np.log(probs[range(N), y])
    loss = np.sum(corect_logprobs) / N
    return loss
– karun reddy, Jan 05 '20 at 14:20
This computes the forward propagation of the neural network, as well as the loss. How do I write this in Keras? – karun reddy, Jan 05 '20 at 14:22
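
The snippet in the last comment is a softmax followed by the mean negative log-probability of the true class. A runnable NumPy sketch of that loss is below. `softmax_cross_entropy` is a hypothetical name, `result` and `y` are made-up inputs, and the max-subtraction is an added numerical-stability step that is not in the comment. In Keras, this combination is normally obtained by ending the model with a softmax activation and compiling with `loss='sparse_categorical_crossentropy'` rather than writing it by hand.

```python
import numpy as np

def softmax_cross_entropy(result, y):
    """Mean negative log-probability of the true class.

    result: (N, num_classes) raw scores from the last layer
    y:      (N,) integer class labels
    """
    # Subtract the row-wise max before exponentiating (added for numerical stability)
    shifted = result - np.max(result, axis=1, keepdims=True)
    exp_scores = np.exp(shifted)
    probs = exp_scores / np.sum(exp_scores, axis=1, keepdims=True)
    n = result.shape[0]
    correct_logprobs = -np.log(probs[np.arange(n), y])
    return np.sum(correct_logprobs) / n

result = np.zeros((4, 3))  # equal scores, so every class has probability 1/3
y = np.array([0, 1, 2, 0])
loss = softmax_cross_entropy(result, y)
print(loss)  # equals log(3) when the probabilities are uniform
```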
