I'm training the following model:
model = tf.keras.Sequential([
    tf.keras.layers.Embedding(input_dim=30, output_dim=64, mask_zero=True),
    tf.keras.layers.Bidirectional(tf.keras.layers.LSTM(units=1024)),
    tf.keras.layers.Dense(128, activation="sigmoid"),
    tf.keras.layers.Dense(10, activation="linear")
])
This network deals with text, so I turned each string in my dataset into a NumPy array by mapping each character to an integer:
def converter(fen):
    # Map each character of the FEN string to an integer id.
    # Note: "b" doubles as the bishop piece and a board file letter;
    # the piece value (5) wins, so there is no separate file value for it.
    mapping = {
        "/": 0, " ": 0, "-": 0,
        "p": 1, "P": 2, "n": 3, "N": 4, "b": 5, "B": 6,
        "r": 7, "R": 8, "q": 9, "Q": 10, "k": 11, "K": 12,
        "a": 13, "c": 15, "d": 16, "e": 17, "f": 18, "g": 19, "h": 20,
        "1": 21, "2": 22, "3": 23, "4": 24, "5": 25,
        "6": 26, "7": 27, "8": 28, "9": 29,
    }
    # Any other character falls back to 0, as in the original elif chain.
    normal_list = [mapping.get(letter, 0) for letter in fen]
    return np.array(normal_list, ndmin=2).astype(np.float32)
# I used ndmin = 2 because the embedding layer turns it into ndmin = 3
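As a side check on that ndmin=2 point, here is a tiny numpy-only sketch (the dummy values are mine) showing the shape difference it makes:

```python
import numpy as np

# dummy ids, e.g. what "rnbq" maps to above
vals = [7, 3, 5, 9]

a1 = np.array(vals).astype(np.float32)           # shape (4,)
a2 = np.array(vals, ndmin=2).astype(np.float32)  # shape (1, 4)

print(a1.shape, a2.shape)  # (4,) (1, 4)
```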
Then I imported the dataset for training, converting each sample:
x_set = []
y_set = []
for position in df["position"]:
    x_set.append(cvt.converter(position))
len(x_set) is 950, and x_set[0].shape is (1, ?), where ? varies between 50 and 70.
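Because the lengths differ, these per-sample (1, ?) arrays cannot be stacked into a single array as-is; for illustration, a hypothetical numpy-only sketch of zero-padding two dummy samples to a common length (the names samples, max_len, and padded are mine):

```python
import numpy as np

# two dummy samples with different lengths, shaped (1, L) like the x_set entries
samples = [np.zeros((1, 50), dtype=np.float32),
           np.ones((1, 70), dtype=np.float32)]

# pad every sample with trailing zeros up to the longest length
max_len = max(s.shape[1] for s in samples)
padded = np.zeros((len(samples), max_len), dtype=np.float32)
for i, s in enumerate(samples):
    padded[i, :s.shape[1]] = s[0]

print(padded.shape)  # (2, 70)
```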
For y_set, I used:
for a in range(len(df["position"])):
    y_set.append(np.array([
        df["Pawns"][a], df["Knights"][a], df["Bishops"][a], df["Rooks"][a],
        df["Queens"][a], df["Mobility"][a], df["King"][a], df["Threats"][a],
        df["Passed"][a], df["Space"][a]
    ], ndmin=2))  # without ndmin=2 here I get "ValueError: Data cardinality is ambiguous"
Its len is also 950.
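Each of those y entries has shape (1, 10), so they at least share a common shape; for illustration, a minimal sketch with two dummy rows (the values and the name y_rows are mine) showing that such rows concatenate into one (N, 10) array:

```python
import numpy as np

# two dummy targets shaped (1, 10), like the y_set entries above
y_rows = [np.arange(10, dtype=np.float32).reshape(1, 10),
          np.ones((1, 10), dtype=np.float32)]

y = np.concatenate(y_rows, axis=0)
print(y.shape)  # (2, 10)
```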
When I call model.fit(x_set, y_set, epochs=10), the model only uses one sample to train the net:
Epoch 1/10
1/1 [==============================] - 19s 19s/step - loss: 0.2291 - mae: 0.4116
Epoch 2/10
1/1 [==============================] - 3s 3s/step - loss: 0.1645 - mae: 0.3302
Epoch 3/10
1/1 [==============================] - 3s 3s/step - loss: 0.0764 - mae: 0.1982
Epoch 4/10
1/1 [==============================] - 3s 3s/step - loss: 1.4347 - mae: 1.0087
Epoch 5/10
1/1 [==============================] - 3s 3s/step - loss: 0.0038 - mae: 0.0461
Epoch 6/10
1/1 [==============================] - 3s 3s/step - loss: 0.0532 - mae: 0.1780
Epoch 7/10
1/1 [==============================] - 3s 3s/step - loss: 0.0597 - mae: 0.1931
Epoch 8/10
1/1 [==============================] - 3s 3s/step - loss: 0.0522 - mae: 0.1814
Epoch 9/10
1/1 [==============================] - 3s 3s/step - loss: 0.0375 - mae: 0.1583
Epoch 10/10
1/1 [==============================] - 3s 3s/step - loss: 0.0252 - mae: 0.1432
Shouldn't it be using all 950 samples of x_set? What is wrong with this code?