Google Cloud TPUs (Tensor Processing Units) accelerate machine learning workloads developed using TensorFlow. This tag is for questions about using the Google Cloud TPU service. Topics include the service user experience, issues with trainer programs written in TensorFlow, project quotas, security, and authentication.
Questions tagged [google-cloud-tpu]
188 questions
3 votes, 2 answers
Mask R-CNN for TPU on Google Colab
We are trying to build an image segmentation deep learning model using a Google Colab TPU. Our model is Mask R-CNN.
TPU_WORKER = 'grpc://' + os.environ['COLAB_TPU_ADDR']
import tensorflow as tf
tpu_model =…

Jayanth Rasamsetti
- 31
- 2
3 votes, 2 answers
Panic error on google cloud TPU
I can open a ctpu session and get the code I need from my git repository, but when I run my TensorFlow code from the cloud shell, I get a message saying that there is no TPU, and my program crashes. Here is the error message I…

Adrien Doerig
- 86
- 10
3 votes, 0 answers
GCE VM can't connect to TPU
I've been following the instructions at https://cloud.google.com/tpu/docs/custom-setup
and now I'm trying to run a tiny example from https://cloud.google.com/tpu/docs/quickstart
But it hangs on sess.run(tpu.initialize_system())
I suspect that it…

Max Moroz
- 31
- 1
2 votes, 2 answers
Tensorboard profiling a predict call using Cloud TPU Node
I've been trying to profile a predict call of a custom NN model using a Cloud TPU v2-8 Node.
Note that my prediction call takes about 2 minutes to finish, and I run it on data divided into TFRecord batches.
I followed the official…

Mauricio Caetano
- 21
- 3
2 votes, 1 answer
TPU returning "failed call to cuInit: UNKNOWN ERROR (303)" on Google Cloud with Kubernetes Cluster
I am trying to use a TPU with Google Cloud's Kubernetes engine. My code returns several errors when I try to initialize the TPU, and any other operations only run on the CPU. To run this program, I am transferring a Python file from my Dockerhub…

Lexi2277
- 35
- 4
2 votes, 1 answer
Google Colab TPU Version
How do I print in Google Colab which TPU version I am using and how much memory the TPUs have?
With I get the following Output
tpu =…
user14588808
2 votes, 0 answers
Can't create Google Cloud TPU: an internal error has occurred code 13
I can't create a Google Cloud TPU:
gcloud compute tpus create bert-tpu --version=1.15 --preemptible --zone=europe-west4-a
Create request issued for: [bert-tpu]
Waiting for operation [projects//locations/europe-west4-a/operations/] to…

webb
- 4,180
- 1
- 17
- 26
2 votes, 1 answer
Error with TPUClusterResolver for Cloud TPU v3 Pod with TensorFlow 2.1
I'm trying to use my (pre-emptible) Cloud TPU v3-256 on my Google Cloud Compute Engine VM with TensorFlow 2.1, but it doesn't seem to work: the TPUClusterResolver throws a "Could not lookup TPU metadata" error.
Using individual…

Vineeth Narayanan
- 21
- 3
2 votes, 2 answers
GCP and TPU, experimental_connect_to_cluster give no response
I am trying to use a TPU on GCP with TensorFlow 2.1 and the Keras API.
Unfortunately, I am stuck after creating the TPU node.
In fact, it seems that my VM can see the TPU but cannot connect to it.
The code I am using:
resolver =…

Shiro
- 795
- 1
- 7
- 23
2 votes, 1 answer
No OpKernel was registered to support Op 'TPUReplicateMetadata' used by node TPUReplicateMetadata
When I run the following .ipynb:
https://colab.research.google.com/drive/1DpUCBm58fruGNRtQL_DiSVbT90spdZgm
I get:
No OpKernel was registered to support Op 'TPUReplicateMetadata' used by node TPUReplicateMetadata (defined at…
user7862197
2 votes, 1 answer
Colab TPU: TensorFlow '2.0.0-beta0' LinearClassifier .train Bug
Attempting to get LinearClassifier running with Colab TPU.
https://www.tensorflow.org/versions/r2.0/api_docs/python/tf/estimator/LinearClassifier
TPUStrategy is supported in TensorFlow 2.0…

Machine Learning
- 485
- 6
- 15
2 votes, 1 answer
How does BERT utilize TPU memories?
The README in Google's BERT repo says that even a single sentence of length 512 cannot fit in a 12 GB Titan X for the BERT-Large model.
But the BERT paper says that 64 TPU chips were used to train BERT-Large
with a maximum length 512 and batch size…

soloice
- 980
- 8
- 17
2 votes, 1 answer
Converting code from keras to tf.keras causes problems
I am learning machine translation in Keras using the code from this article. The article's code works fine on GPU and CPU as-is.
Now I want to take advantage of Google Colab TPUs. The code doesn't TPU-ify as-is; I need to move in a TF direction. …

Lars Ericson
- 1,952
- 4
- 32
- 45
2 votes, 2 answers
Simple model can't run on tpu (on colab)
I have problems running a very simple model on a TPU in Google Colab. I have distilled it to a very simple program. I suspect it doesn't like the nested models (input_2?), but I have no idea how to solve this:
import numpy as np
import os
import…

Moshel
- 400
- 3
- 13
2 votes, 2 answers
How to save Keras model trained on TPU?
I'm using the Colab environment to experiment with an LSTM model, but I cannot save the trained model.
sess = tf.keras.backend.get_session()
training_model = lstm_model(seq_len=100, batch_size=128, stateful=False)
tpu_model =…

psu
- 111
- 1
- 10