Getting nan in tensorflow

Question

I wanted to build a simple machine learning model with TensorFlow so that I could understand the machine learning process and be able to build things myself.

The dataset I decided to use is from Kaggle:

https://www.kaggle.com/andonians/random-linear-regression/version/2

Since I'm a beginner, I didn't want to divide the dataset into two parts (training and validation).

My code is as follows

import numpy as np
import tensorflow as tf  # TensorFlow 1.x API

# train is a pandas DataFrame with columns 'x' and 'y' from the Kaggle dataset (loading not shown)

# Shuffle the rows, then scale both columns by their maximum
train = train.reindex(np.random.permutation(train.index))
train['x'] = train['x'] / train['x'].max()
train['y'] = train['y'] / train['y'].max()

train_features = np.array(train[['x']])
train_label = np.array(train[['y']])

train_features = tf.convert_to_tensor(train_features)
train_label = tf.convert_to_tensor(train_label)

w = tf.convert_to_tensor(tf.Variable(tf.truncated_normal([1, 700], mean=0.0, stddev=1.0, dtype=tf.float64)))
b = tf.convert_to_tensor(tf.Variable(tf.zeros(1, dtype=tf.float64)))

def cal(x, y):
    # Linear prediction and mean squared error
    prediction = tf.add(tf.matmul(w, x), b)
    error = tf.reduce_mean(tf.square(y - prediction))
    return [prediction, error]

y, cost = cal(train_features, train_label)

learning_rate = 0.05
epochs = 3000

init = tf.global_variables_initializer()
optimize = tf.train.GradientDescentOptimizer(learning_rate).minimize(cost)

with tf.Session() as sess:
    sess.run(init)
    for i in range(epochs):
        sess.run(optimize)
        print(sess.run(cost))

But the output I get is nan on every one of the 3000 printed lines (one per epoch).

I can't figure out the reason. I also tried running it without normalization, i.e. without scaling the values to between 0 and 1, and I decreased the learning rate to 0.0005, but neither had any effect.

Thanks in advance.

P.S. I have not included the test set because I first want to train and see if it works. I will add it later.

score 0 · Answer 1 · answered Dec 23 '18 at 14:14

Note: Since no one has answered my question and I figured it out myself, I decided to answer my question in case someone encounters the same issues.

1) The nan values I was getting when running the program were caused by nan values in the original dataset. To fix this, use

train.dropna(inplace=True)

This removes the rows that contain nan values.
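
To see why a single missing value is enough to turn the whole cost into nan, no matter what the learning rate is, here is a minimal sketch. The small DataFrame is a hypothetical stand-in for the Kaggle data, not the real file, and NumPy is used in place of TensorFlow for the mean squared error.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for the training frame; one row has a missing y.
train = pd.DataFrame({"x": [1.0, 2.0, 3.0, 4.0],
                      "y": [1.1, np.nan, 2.9, 4.2]})

# Count the missing values per column before training.
missing_before = train.isna().sum()
print(missing_before)

# One nan in the labels propagates through the mean squared error.
y = train["y"].to_numpy()
prediction = np.zeros_like(y)
cost_with_nan = np.mean(np.square(y - prediction))
print(cost_with_nan)  # nan

# Drop the incomplete rows and check again.
train.dropna(inplace=True)
missing_after = train.isna().sum()
print(len(train), missing_after.sum())
```

Checking train.isna().sum() before building the graph is a cheap way to catch this. Note that pandas methods such as max() skip nan by default, so the normalization step alone would not reveal the problem.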

2) The shape of w must be compatible with the shape of x. For a single feature, let x have shape (m,1). Then w has shape (1,m), so that the matrix multiplication tf.matmul(w,x) is defined.
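
As a rough illustration of the shape rule in point 2, the sketch below uses NumPy instead of TensorFlow, since matrix multiplication of 2-D arrays follows the same shape rules. The sample count m = 5 and the weight values are made up. The (1,1) weight variant is an assumption added for comparison, not part of the fix described above: it shows the layout that yields one prediction per sample.

```python
import numpy as np

m = 5  # hypothetical number of samples
x = np.linspace(0.0, 1.0, m).reshape(m, 1)  # features, shape (m, 1)

# Shapes from point 2: w is (1, m), so w @ x is defined.
w_wide = np.ones((1, m))
out_wide = w_wide @ x
print(out_wide.shape)  # (1, 1)

# Assumed alternative: a single weight for the single feature, w is (1, 1).
w_single = np.array([[2.0]])
b = np.zeros(1)
out_per_sample = x @ w_single + b
print(out_per_sample.shape)  # (5, 1)

# If the inner dimensions disagree (for example after rows were dropped),
# the multiplication fails.
try:
    np.ones((1, 700)) @ x
except ValueError as exc:
    shape_error = str(exc)
    print("shape mismatch:", shape_error)
```

With a hard-coded width such as 700 in w, removing rows from the data changes m and breaks the multiplication, so it may be safer to derive the shape from the data rather than type it in by hand.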
