My colleagues, and this question on Cross Validated, say you should transform your data to zero mean and unit variance before feeding it to a neural network. However, my model's performance was slightly worse with scaling than without it.
I tried using:
from sklearn import preprocessing

# Fit the scaler on the training set only; reuse it for the test set
scaler = preprocessing.StandardScaler().fit(X_train)
X_train = scaler.transform(X_train)
X_test = scaler.transform(X_test)
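
As a quick sanity check (a minimal sketch, assuming X_train is a NumPy array), the scaled training set should now have roughly zero mean and unit variance in every feature:

import numpy as np

# Both should print True after StandardScaler has been applied
print(np.allclose(X_train.mean(axis=0), 0.0, atol=1e-6))
print(np.allclose(X_train.std(axis=0), 1.0, atol=1e-6))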
import tensorflow as tf

steps = 5000

def exp_decay(global_step):
    return tf.train.exponential_decay(
        learning_rate=0.1, global_step=global_step,
        decay_steps=steps, decay_rate=0.01)
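# With these settings tf.train.exponential_decay computes
#   lr = 0.1 * 0.01 ** (global_step / 5000)
# so the rate falls smoothly from 0.1 at step 0 to 0.001 at step 5000.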
import random
import skflow  # later merged into TensorFlow as tf.contrib.learn

random.seed(42)  # to sample data the same way on every run

classifier = skflow.TensorFlowDNNClassifier(
    hidden_units=[150, 150, 150],
    n_classes=2,
    batch_size=128,
    steps=steps,
    learning_rate=exp_decay)
classifier.fit(X_train, y_train)
y_pred = classifier.predict(X_test)
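
To quantify "slightly worse", I compare the two runs like this (a minimal sketch; y_test is the held-out labels and accuracy_score comes from scikit-learn):

from sklearn.metrics import accuracy_score

# Run once with the scaling block above and once without it,
# then compare the resulting test accuracies
print("test accuracy:", accuracy_score(y_test, y_pred))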
Did I do something wrong, or is scaling not necessary?