Error processing Timeseries tensorflow notebook on TPU

Question

Here is a time-series notebook based on the good work of Magnus Erik Hvass Pedersen (thanks for that):

https://colab.research.google.com/drive/1F6CuGVWN5TNgIjqxdu5glFeGBEr71TgO

I have successfully run a version of this notebook on a GPU in Google Colab, but when I do the same on a TPU (after some modifications to make the code TPU-compatible) I get this error:

ValueError: Error when checking input: expected input to have shape (299776, 20) but got array with shape (33309, 20)

The full stack trace can be found in this cell: https://colab.research.google.com/drive/1F6CuGVWN5TNgIjqxdu5glFeGBEr71TgO#scrollTo=wdSmXdvDw5HL

It has been a bit of a tug-of-war to get the input/output data shapes in order: as we kept solving I/O shape issues, other shape-related issues started cropping up.

The notebook is available for sharing and commenting.

Any thoughts will be appreciated.

It appears that the initial query has been resolved by changing the test set size to a smaller number, i.e. 1344 (see https://colab.research.google.com/drive/1F6CuGVWN5TNgIjqxdu5glFeGBEr71TgO#scrollTo=6u0JTx3Zw5Fn&line=2&uniqifier=1), among other changes to the notebook. Now I get the error `AssertionError: batch_size must be divisible by strategy.num_towers (1 vs 8)`. `batch_size` has been set to 256, which is divisible by 8, so the error message isn't clear. See the section **Train the Recurrent Neural Network** in the ToC on the left-hand panel. This workaround isn't ideal, though: I'd like to use all the data. – Mani Sarkar, Oct 17 '18 at 13:51
Hi Mani! You should edit the question to reflect your updated issue if you feel it is similar enough to the original question. – d_kennetz, Oct 17 '18 at 14:12
@d_kennetz The title of the query does reflect the issue, the temporary resolution leads to a new problem, so we didn't really solve the original problem (if you see my comment above), I would like to drop the work-around and get the notebook to work. — Mani Sarkar, Oct 17 '18 at 15:57
batch_generator is yielding within the for loop, whereas in the original notebook it yields after the for loop. I think that might be the issue? – michaelb, Oct 17 '18 at 18:32
@michaelb Are you saying this after running the notebook and looking at the code, or just from a dry run of the code? I will take a look and see if I can use your hint. Thanks for that. – Mani Sarkar, Oct 17 '18 at 22:16
Actually it seems the batch_generator was not the issue, I was able to get the model to train by removing the validation_data option. This makes sense as the validation data for this model is just one example, not a batch. I'll see if there is a way around this and follow up. — michaelb, Oct 18 '18 at 17:04
Thanks for that insight. Did you undo all our steps? As you can see, we got around the first problem by adjusting the test set to 1344; see cell https://colab.research.google.com/drive/1F6CuGVWN5TNgIjqxdu5glFeGBEr71TgO#scrollTo=6u0JTx3Zw5Fn&line=2&uniqifier=1 and the next one. – Mani Sarkar, Oct 19 '18 at 20:54
When I reset the test dataset to its original code state, remove the validation set and re-run the notebook, I get *ValueError: Operation 'tpu_140099307695464/VarIsInitializedOp' has been marked as not fetchable.* — Mani Sarkar, Oct 19 '18 at 21:02
@michaelb Two questions for you: what code changes did you make to get the notebook to work on TPUs, and did you get a chance to find out how to make the original notebook run without any errors/exceptions? – Mani Sarkar, Nov 16 '18 at 22:17
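
Note on the `AssertionError` quoted in the comments above: the message ends in `(1 vs 8)`, which reads as a batch of 1 being checked against 8 TPU cores rather than the training `batch_size` of 256. That would be consistent with the validation data being a single example, as suggested in the comments. A minimal Python sketch of such a check, assuming it is a plain modulo test; `check_batch` is a hypothetical helper for illustration, not TensorFlow's actual code:

```python
def check_batch(batch_size, num_towers):
    """Hypothetical stand-in for the divisibility check; not TensorFlow's code."""
    if batch_size % num_towers != 0:
        raise AssertionError(
            "batch_size must be divisible by strategy.num_towers "
            "(%s vs %s)" % (batch_size, num_towers)
        )
    # Number of examples each core would receive.
    return batch_size // num_towers


# The training batch size from the question splits evenly over 8 cores.
per_core = check_batch(256, 8)

# A single validation example cannot be split over 8 cores.
try:
    check_batch(1, 8)
    message = None
except AssertionError as err:
    message = str(err)
```

Under this reading, the training batches pass the check and only the one-example validation input trips it, which would explain why removing `validation_data` let the model train.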

score 2 · Accepted Answer · answered Nov 15 '18 at 20:54

To solve the error `ValueError: Operation 'tpu_140099307695464/VarIsInitializedOp' has been marked as not fetchable`, try using `tf.train.RMSPropOptimizer` instead of `RMSprop` from `tensorflow.keras.optimizers`.

aman2930

Thanks again, it worked, the model is now training - no errors so far. – Mani Sarkar Nov 22 '18 at 01:03

Here's the link to the tweet and post talking about the notebooks: https://twitter.com/theNeomatrix369/status/1065939282265808896 Thanks to both @michaelb and Aman2930 for your help. You guys also helped to make this possible. – Mani Sarkar Nov 27 '18 at 18:52

Since this fixed your issue, would you mind choosing this as the correct answer for the benefit of future users? – liamdalton Nov 28 '18 at 18:06