
I am using Keras to generate a simple single layer feed forward network. I'd like to get a better handle on the values of the weights when they are initialized via the kernel_initializer argument.

Is there a way I can view the values of the weights just after initialisation (i.e. before any training has taken place)?

desertnaut
Agrippa
  • Related question with informative answers: [Why should weights of Neural Networks be initialized to random numbers?](https://stackoverflow.com/questions/20027598/why-should-weights-of-neural-networks-be-initialized-to-random-numbers) – charlesreid1 Oct 18 '17 at 21:06

3 Answers


Just use get_weights() on the model. For example:

from keras.layers import Input, Dense
from keras.models import Model

i = Input((2,))
x = Dense(5)(i)

model = Model(i, x)

print(model.get_weights())

This will print the 2x5 kernel matrix and the length-5 bias vector:

[array([[-0.46599612,  0.28759909,  0.48267472,  0.55951393,  0.3887372 ],
   [-0.56448901,  0.76363671,  0.88165808, -0.87762225, -0.2169953 ]], dtype=float32), 
 array([ 0.,  0.,  0.,  0.,  0.], dtype=float32)]

Biases are zero since the default bias initializer is zeros.
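
The kernel values fall in a predictable range: the default kernel_initializer for Dense is 'glorot_uniform', which draws from a uniform distribution whose limit depends on the layer's fan-in and fan-out. A minimal numpy-only sketch of that rule (mimicking the initializer, not calling Keras):

```python
import numpy as np

# Glorot (Xavier) uniform: samples drawn from U(-limit, limit) with
# limit = sqrt(6 / (fan_in + fan_out))
fan_in, fan_out = 2, 5          # matches Dense(5) on a 2-dim input
limit = np.sqrt(6.0 / (fan_in + fan_out))

weights = np.random.uniform(-limit, limit, size=(fan_in, fan_out))
print(weights.shape)                     # (2, 5)
print(np.abs(weights).max() <= limit)    # True
```

For fan_in=2 and fan_out=5 the limit is about 0.93, which is why every kernel value printed above sits inside roughly [-0.93, 0.93].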

Chris K
  • Just tested it on my code. You and @charlesreid1 are indeed right. The reason I got confused is that I thought you get the weights from get_weights()[1] due to another answer on stackoverflow, but if the weights are from get_weights()[0] it all makes sense. Would you mind telling me what each array represents? Thanks a lot. – Agrippa Oct 18 '17 at 17:11
  • The representation of each array depends on the type of the layer; however you can usually infer what the array represents from its shape. Ask another question if you want to know for a specific layer type. – Chris K Oct 18 '17 at 17:24
  • It's actually not that obvious to me. So I'll close this question and ask another for the specific structure of my FFN. Thanks for your help! – Agrippa Oct 18 '17 at 18:15
  • in case this is still of interest, get_weights()[1] may give you the bias while [0] will give you the conv weights. Bias are by default initialised to 0 so I'm guessing this is why you saw all 0s array. – Scratch Sep 25 '18 at 09:14

You need to specify the input dimensions of the first layer, otherwise get_weights() returns an empty list. Compare the results of the two prints below: the only difference is that the second model specifies the shape of its input.

from keras.models import Sequential
from keras.layers import Dense
import numpy as np

# first model without input_dim prints an empty list
model = Sequential()
model.add(Dense(5, weights=[np.ones((3, 5)), np.zeros(5)], activation='relu'))
print(model.get_weights())


# second model with input_dim prints the assigned weights
model1 = Sequential()
model1.add(Dense(5, weights=[np.ones((3, 5)), np.zeros(5)], input_dim=3, activation='relu'))
model1.add(Dense(1, activation='sigmoid'))

print(model1.get_weights())
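
In newer Keras versions, calling get_weights() on a model whose input shape is still unknown raises a ValueError rather than returning an empty list. As an alternative to input_dim, calling model.build() with an explicit input shape also creates the weights; a minimal sketch, assuming a Keras 2.x+ Sequential API:

```python
from keras.models import Sequential
from keras.layers import Dense

model = Sequential()
model.add(Dense(5, activation='relu'))

# the weights do not exist until the layer knows its input shape;
# build() with an explicit shape triggers their creation
model.build(input_shape=(None, 3))

kernel, bias = model.get_weights()
print(kernel.shape)  # (3, 5)
print(bias.shape)    # (5,)
```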
Stephen Rauch
  • As of the current Keras version 2.6.0, the first case of `model.get_weights()` without specifying `input_dim` will not return an empty list, but will throw a `ValueError`; see https://stackoverflow.com/a/69553344/4685471 – desertnaut Oct 13 '21 at 10:07

The answer given by @Chris_K should work - model.get_weights() returns the correct initialization weights before fit is called. Try running this code as a sanity check - it should print the weights of the first model (non-zero kernels with zero biases), then the weights of the second model (all zeros, since both kernel initializers are set to zero):

from keras.models import Sequential
from keras.layers import Dense
import keras
import numpy as np

# dummy data (not used below; we only inspect the initial weights)
X = np.random.randn(10,3)
Y = np.random.randn(10,)

# create model
model1 = Sequential()
model1.add(Dense(12, input_dim=3, activation='relu'))
model1.add(Dense(1, activation='sigmoid'))

print(model1.get_weights())

# create model
model2 = Sequential()
model2.add(Dense(12, input_dim=3, kernel_initializer='zero', activation='relu'))
model2.add(Dense(1, kernel_initializer='zero', activation='sigmoid'))

print(model2.get_weights())

Here's the output I'm seeing:

[
array([[-0.08758801, -0.20260376,  0.23681498, -0.59153044, -0.26144034,
         0.48446459, -0.02285194,  0.0874517 ,  0.0555284 , -0.14660612,
         0.05574059, -0.14752924],
       [ 0.20496374, -0.4272995 ,  0.07676286, -0.38965166,  0.47710329,
        -0.26640627, -0.33820981, -0.48640659,  0.11153179, -0.01180136,
        -0.52833426,  0.56279379],
       [-0.12849617,  0.2982074 ,  0.38974017, -0.58133346, -0.09883761,
         0.56037289,  0.57482034,  0.08853614,  0.14282584, -0.52498174,
        -0.35414279, -0.49750996]], dtype=float32), array([ 0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.], dtype=float32), array([[-0.65539688],
       [-0.58926439],
       [ 0.6232332 ],
       [-0.6493122 ],
       [ 0.57437611],
       [-0.42971158],
       [ 0.66621709],
       [-0.17393446],
       [ 0.57196724],
       [-0.01042461],
       [ 0.32426012],
       [-0.08326346]], dtype=float32), array([ 0.], dtype=float32)]
[array([[ 0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.],
       [ 0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.],
       [ 0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.]], dtype=float32), array([ 0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.], dtype=float32), array([[ 0.],
       [ 0.],
       [ 0.],
       [ 0.],
       [ 0.],
       [ 0.],
       [ 0.],
       [ 0.],
       [ 0.],
       [ 0.],
       [ 0.],
       [ 0.]], dtype=float32), array([ 0.], dtype=float32)]
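
The flat list returned by get_weights() alternates kernel and bias arrays, layer by layer, which is why each model above yields four arrays. A minimal numpy sketch of the expected layout for the Dense(12, input_dim=3) -> Dense(1) model (shapes only, not Keras itself):

```python
import numpy as np

# get_weights() layout for Dense(12, input_dim=3) followed by Dense(1):
# [kernel_1, bias_1, kernel_2, bias_2]
expected_shapes = [(3, 12), (12,), (12, 1), (1,)]

# mimic the returned list with zero-filled arrays
weights = [np.zeros(shape, dtype=np.float32) for shape in expected_shapes]
print([w.shape for w in weights])  # [(3, 12), (12,), (12, 1), (1,)]
```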
charlesreid1