
I want to build and train a neural network with TensorFlow (but without Keras; with Keras I got it working) on the Kaggle dataset 'House Prices'. I use Python, and apart from the actual training, my code runs fine. However, when training, I either get no error (but the model doesn't train), or I get a TypeError: 'DataFrame' objects are mutable, thus they cannot be hashed.

I run the script in a notebook on Google Colab, and I believe the main issue is in how I build the feed_dict. However, I don't know what is wrong here. batch_X contains 100x10 features, and batch_Y contains 100 labels. I guess this might be the critical snippet:

train_data = { X: batch_X, Y_:batch_Y }

The train_data is what I feed to sess.run(train_step, feed_dict=train_data).
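
For reference, X and Y_ are tf.placeholder tensors defined earlier in the notebook, roughly like this (simplified here; shapes based on the 100x10 batches mentioned above):

import tensorflow as tf

# The feed_dict keys have to be the placeholder tensors themselves,
# so X and Y_ must stay bound to these objects.
X = tf.placeholder(tf.float32, [None, 10])   # 10 normalized features per example
Y_ = tf.placeholder(tf.float32, [None, 1])   # 1 normalized sale price per example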

Here's my code: https://colab.research.google.com/drive/1qabmzzicZVu7v72Be8kljM1pUaglb1bY

import random

import numpy as np
import tensorflow as tf

# train and train_normalized are the training data set (DataFrame)
# train_labels_normalized are the labels only

# Start session:
with tf.Session() as sess:
  sess.run(init)

  possible_indeces = list(range(0, train.shape[0]))
  iterations = 1000
  batch_size = 100

  for step in range(0, iterations):
    # draw batch indices:
    batch_indeces = random.sample(possible_indeces, batch_size)
    # get features and their respective labels
    batch_X = np.array(train_normalized.iloc[batch_indeces])
    batch_Y = np.array(train_labels_normalized.iloc[batch_indeces])

    train_data = { X: batch_X, Y_: batch_Y}

    sess.run(train_step, feed_dict=train_data)
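
For context, earlier cells in the Colab define the model and the training op along these lines (a simplified sketch; the actual loss and optimizer are in the linked notebook):

# Two sigmoid hidden layers with 48 nodes each, then a single linear output node.
W1 = tf.Variable(tf.truncated_normal([10, 48], stddev=0.1))
W2 = tf.Variable(tf.truncated_normal([48, 48], stddev=0.1))
W3 = tf.Variable(tf.truncated_normal([48, 1], stddev=0.1))

Y1 = tf.nn.sigmoid(tf.matmul(X, W1))
Y2 = tf.nn.sigmoid(tf.matmul(Y1, W2))
Y3 = tf.matmul(Y2, W3)

# Squared-error loss and a plain gradient-descent step, as stand-ins for
# whatever the Colab actually uses.
loss = tf.reduce_mean(tf.square(Y3 - Y_))
train_step = tf.train.GradientDescentOptimizer(0.01).minimize(loss)
init = tf.global_variables_initializer()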

What I was hoping for is that it would run for a couple of minutes and return optimized weights (2 hidden layers with 48 nodes each), allowing me to make predictions. However, it either simply skips over the above code or throws the error below.

Does anyone have an idea what went wrong?

TypeError                                 Traceback (most recent call last)
<ipython-input-536-79506f90a868> in <module>()
     13     batch_Y = np.array(train_labels_normalized.iloc[batch_indeces])
     14 
---> 15     train_data = { X: batch_X, Y_: batch_Y}
     16 
     17     sess.run(train_step, feed_dict=train_data)

/usr/local/lib/python3.6/dist-packages/pandas/core/generic.py in __hash__(self)
   1814     def __hash__(self):
   1815         raise TypeError('{0!r} objects are mutable, thus they cannot be'
-> 1816                         ' hashed'.format(self.__class__.__name__))
   1817 
   1818     def __iter__(self):

TypeError: 'DataFrame' objects are mutable, thus they cannot be hashed
McCahen
  • Your code works on my Colab copy. Can you edit your stack trace? It is hard to read. – Sıddık Açıl Jul 14 '19 at 20:44
  • Thank you for looking at it, @SıddıkAçıl. For some reason it only sometimes has that error. I have reformatted the traceback and added some additional outputs in the Colab now. The additional output indicates clearly that the model doesn't learn as I hoped. However, it does indeed iterate. Maybe the optimizer and loss function don't work together? I have no idea... O_ô Any help is highly appreciated. – McCahen Jul 15 '19 at 10:33

1 Answer


The problem comes from your seventh (test) step.

# Set X to the test data
X = test_normalized.astype(np.float32)
print(type(X))  # <class 'pandas.core.frame.DataFrame'>
Y1 = tf.nn.sigmoid(tf.matmul(X, W1))
Y2 = tf.nn.sigmoid(tf.matmul(Y1, W2))
Y3 = tf.matmul(Y2, W3)

You are setting X to a DataFrame. On the first run this does not affect anything, but when you run the sixth step after the seventh, you run into this problem because you have overwritten the contents of X.
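
Building feed_dict then tries to use that DataFrame as a dictionary key, which pandas refuses. You can reproduce the error in isolation:

import pandas as pd

df = pd.DataFrame({'a': [1, 2]})
feed = {df: 'anything'}  # TypeError: 'DataFrame' objects are mutable, thus they cannot be hashed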

Try changing X to X_:

X_ = test_normalized.astype(np.float32)
Y1 = tf.nn.sigmoid(tf.matmul(X_, W1))

Also, your final eval does not work as written; run it inside a tf.Session.
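
Alternatively, instead of building a second graph from X_, you can reuse the training graph's output Y3 and feed the test features through the placeholder X, inside the same session in which the variables were trained. Something along these lines (names taken from your snippets; adjust to whatever your notebook actually defines):

with tf.Session() as sess:
  sess.run(init)

  # ... your training loop from the question goes here ...

  # Keep X as the placeholder and feed the test features through it,
  # instead of rebinding X to a DataFrame.
  test_X = np.array(test_normalized).astype(np.float32)
  predictions = sess.run(Y3, feed_dict={X: test_X})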

Sıddık Açıl
  • Thank you very much, @Sıddık Açıl! This indeed solved the error. (: I have also changed the code for the eval in the current version of the Colab. However, the outputs indicate that the model doesn't learn at all... Do you have an idea why that could be? – McCahen Jul 15 '19 at 16:49