I'm building a convolutional neural network to classify data into different categories. The input data has shape (30000, 6, 15, 1): 30000 samples, 15 predictors each, and 6 possible categories.
The weight and bias dictionaries I'm using are as follows:
weights = {
    'wc1': tf.get_variable('W0', shape=(3, 3, 1, 8), initializer=tf.contrib.layers.xavier_initializer()),
    'wc2': tf.get_variable('W1', shape=(3, 3, 8, 12), initializer=tf.contrib.layers.xavier_initializer()),
    'wc3': tf.get_variable('W2', shape=(3, 3, 12, 16), initializer=tf.contrib.layers.xavier_initializer()),
    'wc4': tf.get_variable('W3', shape=(3, 3, 16, 20), initializer=tf.contrib.layers.xavier_initializer()),
    'wd1': tf.get_variable('W4', shape=(4*4*20, 20), initializer=tf.contrib.layers.xavier_initializer()),
    'out': tf.get_variable('W6', shape=(20, n_classes), initializer=tf.contrib.layers.xavier_initializer()),
}
biases = {
    'bc1': tf.get_variable('B0', shape=(8), initializer=tf.contrib.layers.xavier_initializer()),
    'bc2': tf.get_variable('B1', shape=(12), initializer=tf.contrib.layers.xavier_initializer()),
    'bc3': tf.get_variable('B2', shape=(16), initializer=tf.contrib.layers.xavier_initializer()),
    'bc4': tf.get_variable('B3', shape=(20), initializer=tf.contrib.layers.xavier_initializer()),
    'bd1': tf.get_variable('B4', shape=(20), initializer=tf.contrib.layers.xavier_initializer()),
    'out': tf.get_variable('B5', shape=(6), initializer=tf.contrib.layers.xavier_initializer()),
}
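The `conv2d` and `maxpool2d` helpers used below are not shown in my post; they are essentially the usual TF 1.x tutorial helpers (reproduced here as an assumption about what I'm running): convolution + bias + ReLU with `'SAME'` padding, and k×k max-pooling with stride k and `'SAME'` padding.

```python
import tensorflow as tf  # written against the TF 1.x (tf.contrib-era) API


def conv2d(x, W, b, strides=1):
    """Convolution with 'SAME' padding, followed by bias add and ReLU."""
    x = tf.nn.conv2d(x, W, strides=[1, strides, strides, 1], padding='SAME')
    x = tf.nn.bias_add(x, b)
    return tf.nn.relu(x)


def maxpool2d(x, k=2):
    """k x k max-pooling with stride k and 'SAME' padding."""
    return tf.nn.max_pool(x, ksize=[1, k, k, 1],
                          strides=[1, k, k, 1], padding='SAME')
```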
The output tensor of:
def conv_net(x, weights, biases):
    conv1 = conv2d(x, weights['wc1'], biases['bc1'])
    conv1 = maxpool2d(conv1, k=2)
    conv2 = conv2d(conv1, weights['wc2'], biases['bc2'])
    conv2 = maxpool2d(conv2, k=2)
    conv3 = conv2d(conv2, weights['wc3'], biases['bc3'])
    conv3 = maxpool2d(conv3, k=2)
    conv4 = conv2d(conv3, weights['wc4'], biases['bc4'])
    conv4 = maxpool2d(conv4, k=2)
    # Fully connected layer
    # Reshape conv4 output to fit the fully connected layer input
    fc1 = tf.reshape(conv4, [-1, weights['wd1'].get_shape().as_list()[0]])
    fc1 = tf.add(tf.matmul(fc1, weights['wd1']), biases['bd1'])
    fc1 = tf.nn.relu(fc1)
    # Output, class prediction
    out = tf.add(tf.matmul(fc1, weights['out']), biases['out'])
    return out
turns out to have shape (4, 6) when x is a batch of 64 samples.
But the labels for a batch of 64 have shape [64, 6], where 6 is the number of categories, so the cost function defined as
cost = tf.reduce_mean(tf.nn.softmax_cross_entropy_with_logits(logits=pred, labels=y))
where pred = conv_net(x, weights, biases)
gives the error:
InvalidArgumentError (see above for traceback): logits and labels must be broadcastable: logits_size=[4,6] labels_size=[64,6]
[[Node: softmax_cross_entropy_with_logits_sg = SoftmaxCrossEntropyWithLogits[T=DT_FLOAT, _device="/job:localhost/replica:0/task:0/device:CPU:0"](Add_1, softmax_cross_entropy_with_logits_sg/Reshape_1)]]
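Tracing the spatial dimensions by hand reproduces the mysterious 4 (pure Python; this assumes `maxpool2d` uses `'SAME'` padding, so each k=2 pool outputs ceil(size/2)):

```python
import math


def same_pool(size, k=2):
    """Output spatial size of a stride-k max-pool with 'SAME' padding."""
    return math.ceil(size / k)


h, w = 6, 15              # spatial dims of the (30000, 6, 15, 1) input
for layer in range(4):    # four maxpool2d(..., k=2) calls in conv_net
    h, w = same_pool(h), same_pool(w)
    print(f"after pool {layer + 1}: {h} x {w}")
# after pool 1: 3 x 8
# after pool 2: 2 x 4
# after pool 3: 1 x 2
# after pool 4: 1 x 1

channels = 20                         # output channels of 'wc4'
flat_per_sample = h * w * channels    # 1 * 1 * 20 = 20 features per sample
fc_in = 4 * 4 * 20                    # 320, what 'wd1' expects

batch = 64
# tf.reshape(conv4, [-1, 320]) repacks 64 * 20 = 1280 values into
# 1280 / 320 = 4 rows, which is exactly the logits_size=[4,6] in the error
print("rows after reshape:", batch * flat_per_sample // fc_in)
```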
If my understanding is correct, this has to do with how the fully connected and output layer filter sizes are defined in the weights dictionary. Am I understanding this right? If so, what should the filter shapes be in the FC and output layers, and what is the underlying logic?