I wanted to build a simple machine learning model with TensorFlow so that I could understand the process of machine learning and do things myself.
The dataset I decided to use is from Kaggle:
https://www.kaggle.com/andonians/random-linear-regression/version/2
Since I'm a beginner, I didn't want to divide the dataset into two parts (training and validation).
My code is as follows:

import numpy as np
import pandas as pd
import tensorflow as tf

# train.csv is from the Kaggle dataset linked above
train = pd.read_csv('train.csv')
train = train.reindex(np.random.permutation(train.index))

# scale both columns into [0, 1]
train['x'] = train['x'] / train['x'].max()
train['y'] = train['y'] / train['y'].max()

train_features = np.array(train[['x']])   # shape (700, 1)
train_label = np.array(train[['y']])      # shape (700, 1)
train_features = tf.convert_to_tensor(train_features)
train_label = tf.convert_to_tensor(train_label)

w = tf.Variable(tf.truncated_normal([1, 700], mean=0.0, stddev=1.0, dtype=tf.float64))
b = tf.Variable(tf.zeros(1, dtype=tf.float64))

def cal(x, y):
    prediction = tf.add(tf.matmul(w, x), b)      # (1, 700) x (700, 1) -> (1, 1)
    error = tf.reduce_mean(tf.square(y - prediction))
    return [prediction, error]

y, cost = cal(train_features, train_label)

learning_rate = 0.05
epochs = 3000
optimize = tf.train.GradientDescentOptimizer(learning_rate).minimize(cost)
init = tf.global_variables_initializer()

with tf.Session() as sess:
    sess.run(init)
    for i in range(epochs):
        sess.run(optimize)
        print(sess.run(cost))
But the output that I get is nan, and it stays nan for all 3000 printed lines (epochs is 3000).
I can't figure out the reason. I even tried running it without normalization, i.e. without scaling the values to between 0 and 1, and I also decreased the learning rate to 0.0005, but neither change had any effect.
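One thing I haven't ruled out is the data itself. This is a sketch of the check I had in mind, with a small made-up DataFrame standing in for the real train.csv; would looking for NaNs like this be the right approach?

```python
import numpy as np
import pandas as pd

# Stand-in for pd.read_csv('train.csv'); the real file comes from the
# Kaggle link above. A single missing y value would poison the loss.
train = pd.DataFrame({"x": [1.0, 2.0, 3.0], "y": [2.0, np.nan, 6.0]})

nan_counts = train.isna().sum()   # NaN count per column
print(nan_counts["y"])            # prints 1 for this toy frame

# Any NaN in the features or labels makes every gradient step produce
# nan, so dropping (or imputing) such rows before training is one option:
clean = train.dropna()
print(len(clean))                 # 2 rows survive
```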
Thanks in advance.
P.S. I have not included the test set because I first want to train the model and see if it works; I will add it later.