Google Cloud TPUs (Tensor Processing Units) accelerate machine learning workloads developed using TensorFlow. Use this tag for questions about the Google Cloud TPU service. Topics include the service user experience, issues with trainer programs written in TensorFlow, project quota issues, security, authentication, etc.
Questions tagged [google-cloud-tpu]
188 questions
2
votes
2 answers
InvalidArgumentError: Unsuccessful TensorSliceReader constructor: Failed to get matching files ... File system scheme '[local]' not implemented
I get the following error when running a notebook:
InvalidArgumentError Traceback (most recent call last)
in ()
----> 1 tpu_ops = tf.contrib.tpu.batch_parallel(run_find_closest_latent_vector, [],…

Mani Sarkar
- 115
- 2
- 9
2
votes
3 answers
Google Colab KeyError: 'COLAB_TPU_ADDR'
I'm trying to run a simple MNIST classifier on Google Colab using the TPU option. After creating the model using Keras, I am trying to convert it into a TPU model with:
import tensorflow as tf
import os
tpu_model = tf.contrib.tpu.keras_to_tpu_model(
…

Nick Ben
- 53
- 2
- 6
2
votes
2 answers
Can't access TPU from VM in GCP
Trying to run this code
import os
import tensorflow as tf
from tensorflow.contrib import tpu
from tensorflow.contrib.cluster_resolver import TPUClusterResolver
def axy_computation(a, x, y):
    return a * x + y
inputs = [
3.0,
tf.ones([3,…

wadhwasahil
- 468
- 7
- 28
2
votes
2 answers
TPU local Filesystem doesn't exist?
I wrote an NN model that analyzes an image and extracts 8 floating-point numbers at the end. The model works fine (but slowly) on my computer, so I tried it on a Cloud TPU and, BAM, I get an error:
I1008 12:58:47.077905 140221679261440…

user1273813
- 23
- 7
2
votes
3 answers
Predict value of single image after training model on TPU
I still want to know how I can predict the value of an image after training the network, but it seems like it is not supported yet. Any idea for a workaround (taken from the mnist_tpu.py)?
if mode == tf.estimator.ModeKeys.PREDICT:
raise…

craft
- 495
- 5
- 16
2
votes
3 answers
TPU terminology confusion
So I know how epochs, train steps, batch sizes and this kind of stuff are defined, but it is really hard for me to wrap my head around the TPU terminology, like train loops, iterations per loop, and so on. I read this but I'm still confused.
Also…

craft
- 495
- 5
- 16
2
votes
1 answer
Rewrite tf.Session into tf.Estimator API
I have some code that was written with the low-level tf.Session API. Since I want to use it on a TPU, it would be best to rewrite it with the tf.Estimator API, because there is a TPUEstimator class for TPU acceleration.
Is there a standard way to do this…

craft
- 495
- 5
- 16
2
votes
1 answer
TPUEstimator does not work with use_tpu=False
I’m trying to run a model using TPUEstimator locally on a CPU first to validate that it works by setting use_tpu=False on the estimator initialization. When running train I get this error.
InternalError: failed to synchronously memcpy…

liamdalton
- 239
- 1
- 7
2
votes
1 answer
Op type not registered 'BatchDatasetV2'
I’m trying to train a model and am using tf.contrib.data.batch_and_drop_remainder to prepare my dataset. When I run estimator.train I get the following error:
NotFoundError: Op type not registered 'BatchDatasetV2' in binary
running on…

Auberon López
- 268
- 1
- 9
2
votes
1 answer
Canned models on GCP TPUs
Google's TPUs require you to port your TensorFlow Estimators to TPUEstimators, but what I can't seem to figure out is what kind of changes are necessary for the "canned" estimators (like the DNNClassifier) - it seems that only input function…

Igor Rivin
- 4,632
- 2
- 23
- 35
2
votes
0 answers
how to reduce GPU/TPU memory usage for reusing encoder (e.g. RNN/Tensor2Tensor/etc.)?
I have a list of phrases (e.g. 1000) to encode, where each phrase contains one or more words.
I reuse the same encoder (e.g. RNN/Tensor2Tensor/etc.) for each phrase, which means they share the learned parameters in the encoder.
As a result, the…

Hypnoz
- 1,115
- 4
- 15
- 27
1
vote
0 answers
Unable to open interactive shell for Vertex AI custom training job
This happens on a custom training job with tpu_v2 in us-central1. I followed the "launch web terminal" link under training debugging in the custom training job UI, but got the following message.
I should have the necessary permissions as I started the custom…

bill
- 650
- 8
- 17
1
vote
1 answer
Malformed entry 11 in list file /etc/apt/sources.list on fresh TPU VM on GCP
I created a TPU VM on GCP.
I logged in via SSH and wanted to install some software, but I get the following error:
$ sudo apt-get update
E: Malformed entry 11 in list file /etc/apt/sources.list (URI parse)
E: The list of sources could not be…

BioGeek
- 21,897
- 23
- 83
- 145
1
vote
1 answer
TPU VM vs. VM instance - usage
I just started learning to use Google TPUs and am confused about the difference between a TPU instance (or TPU resources/TPU VM) and a VM instance.
I followed the Google Cloud guide and created a TPU VM, where I cloned my GitHub repo, created a conda environment, and installed…

JXuan
- 15
- 1
- 3
1
vote
1 answer
How to delete a temp folder in Google Cloud (TPU) VM?
So I'm following the mesh-transformer fine-tuning repo to fine-tune GPT-J. I've fine-tuned a model on a Google Cloud TPU VM before, but then deleted the fine-tuned model. Now, I'm trying to fine-tune a new model in the same VM, but the code is…

Jacques Thibodeau
- 859
- 1
- 8
- 21