I have a model with several dense layers that behaves normally in all respects.
Then I add weights to the training events (their values lie between 0 and 1):
import numpy as np
from sklearn.model_selection import GroupKFold

w = mydata.Weight  # per-event weights in [0, 1]
#...
kfold = GroupKFold(n_splits=num_folds)
for train, test in kfold.split(X, y, groups=groups):
    X_train, X_test = X.iloc[train], X.iloc[test]
    y_train, y_test = y.iloc[train], y.iloc[test]
    w_train = w.iloc[train]
    #...
    le_fit = model.fit(X_train, y_train, batch_size=200, epochs=10,
                       sample_weight=w_train, verbose=0)
    #...
    predictions = np.rint(model.predict(X_test))
and the predictions become useless:
InvalidArgumentError: `predictions` contains negative values
Condition x >= 0 did not hold element-wise:
x (confusion_matrix_1/Cast:0) =
[-9223372036854775808 .......
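That huge negative number looks like INT64_MIN, which is what NaN becomes when cast to a 64-bit integer, so the model's outputs are presumably NaN rather than merely wrong. A minimal, self-contained sketch of that mechanism (my assumption about where the value comes from):

import numpy as np

# np.rint leaves NaN as NaN...
print(np.rint(np.array([np.nan])))                    # [nan]
# ...and casting NaN to int64 gives INT64_MIN on most platforms,
# matching the value in the error message
print(np.rint(np.array([np.nan])).astype(np.int64))   # [-9223372036854775808]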
Just to be safe, I added constraints to the layers, e.g.:
layers.Dense(units=800, activation='relu', kernel_constraint=constraints.MinMaxNorm(min_value=0.0, max_value=1.0))
but nothing changed.
Can you suggest what is going wrong?
Edit: I have now realized that the training loss is also NaN.
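For completeness, here is the kind of NaN check one can run before blaming the weights (a sketch using the variable names from my snippet above; tf.keras.callbacks.TerminateOnNaN is a standard Keras callback):

import numpy as np
import tensorflow as tf

# A single NaN in the features, labels or weights is enough
# to turn the loss into NaN
print(np.isnan(X_train.to_numpy()).any())
print(np.isnan(y_train.to_numpy()).any())
print(np.isnan(w_train.to_numpy()).any())

# Abort training at the first NaN loss instead of continuing silently
model.fit(X_train, y_train, batch_size=200, epochs=10,
          sample_weight=w_train.to_numpy(), verbose=0,
          callbacks=[tf.keras.callbacks.TerminateOnNaN()])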
Edit: I made all weights equal to one. The results don't change.
Edit: I don't know why this question was closed as a request for debugging help. The answer makes it obvious that it wasn't about debugging: it concerns the correct usage of two very commonly used tools together (Keras with GroupKFold), which turns out to involve a counter-intuitive element, and it is not specific to my problem.
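For readers who land here with the same symptom, the counter-intuitive element, as I understand it, is the data type handed to fit: the Keras documentation describes sample_weight as an optional NumPy array, while .iloc slicing returns pandas objects that carry their own index. A minimal sketch of the conversion (an assumption about the fix, using the same names as above):

import numpy as np
from sklearn.model_selection import GroupKFold

kfold = GroupKFold(n_splits=num_folds)
for train, test in kfold.split(X, y, groups=groups):
    # Convert the pandas slices to plain NumPy arrays before fit
    X_train = X.iloc[train].to_numpy()
    y_train = y.iloc[train].to_numpy()
    w_train = w.iloc[train].to_numpy()
    model.fit(X_train, y_train, batch_size=200, epochs=10,
              sample_weight=w_train, verbose=0)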