My model produces unexpected output from the predict function:

from tensorflow import keras
from tensorflow.keras import layers
import pandas as pd

data = pd.read_csv("/kaggle/input/water-potability/water_potability.csv")

wat_train = data.sample(frac=0.7, random_state=0)
wat_valid = data.drop(wat_train.index)

max_ = wat_train.max(axis=0)
min_ = wat_train.min(axis=0)
wat_train = (wat_train - min_) / (max_ - min_)
wat_valid = (wat_valid - min_) / (max_ - min_)

# Split features and target
X_train = wat_train.drop('Potability', axis=1)
X_valid = wat_valid.drop('Potability', axis=1)
y_train = wat_train['Potability']
y_valid = wat_valid['Potability']

print(X_train.shape)


model = keras.Sequential([
    layers.BatchNormalization(input_shape=[9]),
    layers.Dense(4, activation="relu"),
    layers.BatchNormalization(),
    layers.Dropout(0.3),
    layers.Dense(4, activation="relu"),
    layers.BatchNormalization(),
    layers.Dropout(0.3),
    layers.Dense(1, activation="sigmoid")
])

model.compile(
    optimizer='adam',
    loss='binary_crossentropy',
    metrics=['binary_accuracy'],
)


early_stopping = keras.callbacks.EarlyStopping(
    patience=10,
    min_delta=0.001,
    restore_best_weights=True,
)

model.fit(X_train, y_train,
         validation_data=(X_valid, y_valid),
         batch_size=512,
         epochs=1000,
         callbacks=[early_stopping],
)


predictions = model.predict(X_valid)

print(predictions)

It prints:

[[nan]
 [nan]
 [nan]
 [nan]
 [nan]
 [nan]
 [nan]
 [nan]]

I am using the Water Quality dataset on Kaggle: https://www.kaggle.com/datasets/adityakadiwal/water-potability

The output layer uses a sigmoid activation, so I expect values between 0 and 1, but every prediction is [nan].

Ajeet Verma
JellyCZ

1 Answer

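This is probably because water_potability.csv contains empty cells, which pandas reads as NaN. A single NaN in the inputs propagates through the min-max scaling and the network, so the loss and every prediction become NaN. Drop or fill the missing values before training. This question may also help with replacing missing values: Replace None with NaN in pandas dataframe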
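
As a minimal sketch of that fix, the snippet below counts the missing values and fills them with training-set medians before scaling. The small DataFrame is a made-up stand-in for water_potability.csv (its values and the two feature columns are assumptions for illustration), and median imputation is just one reasonable choice, not necessarily the best one for this dataset.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for water_potability.csv: a few rows with empty cells.
data = pd.DataFrame({
    "ph": [7.0, np.nan, 8.1, 5.6, 6.9, 9.2],
    "Sulfate": [368.5, 310.1, np.nan, 356.9, 328.7, np.nan],
    "Potability": [0, 1, 0, 1, 0, 1],
})

# Count missing values per column; any non-zero count can turn the loss into NaN.
missing_per_column = data.isna().sum()
print(missing_per_column)

# Same split as in the question.
wat_train = data.sample(frac=0.7, random_state=0)
wat_valid = data.drop(wat_train.index)

# Fill gaps with medians computed on the training split only,
# so no information from the validation rows leaks into training.
train_medians = wat_train.median()
wat_train = wat_train.fillna(train_medians)
wat_valid = wat_valid.fillna(train_medians)

# Min-max scaling as in the question; the frames should now be NaN-free.
max_ = wat_train.max(axis=0)
min_ = wat_train.min(axis=0)
wat_train = (wat_train - min_) / (max_ - min_)
wat_valid = (wat_valid - min_) / (max_ - min_)

print(wat_train.isna().sum().sum(), wat_valid.isna().sum().sum())
```

If losing some rows is acceptable, calling data.dropna() right after read_csv is a simpler alternative to imputation.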

TomatPast