Use this tag for Tensor Processing Unit (TPU). TPU is an application-specific integrated circuit developed by Google.
Questions tagged [tpu]
421 questions
2
votes
0 answers
Tokenize dataset using map on tf.data.Dataset.from_tensor_slices(....)
Note: I am using the free TPU provided on Kaggle.
I want to tokenize the text using transformers such that I tokenize only the batch while training the model instead of first tokenizing the whole dataset and then creating batches from the tokenized…

Abhishek Prajapat
2
votes
2 answers
How to clear Colab Tensorflow TPU memory
I am training a model over several folds. After each fold I want to clear the TPU memory so that I don't get an OOM error.
Full trace of the current error.
ResourceExhaustedError Traceback (most recent call…

Abhishek Prajapat
2
votes
1 answer
TPU returning "failed call to cuInit: UNKNOWN ERROR (303)" on Google Cloud with Kubernetes Cluster
I am trying to use a TPU with Google Cloud's Kubernetes engine. My code returns several errors when I try to initialize the TPU, and any other operations only run on the CPU. To run this program, I am transferring a Python file from my Dockerhub…

Lexi2277
2
votes
0 answers
Using a TPU with spaCy
Is it possible to use a TPU in spaCy? I know that you can use a GPU with spacy.prefer_gpu(). Is there something similar for TPUs? Thanks in advance!

pineapps
2
votes
1 answer
UnimplementedError: File system scheme '[local]' not implemented
I am getting an error while running TensorFlow on a TPU:
UnimplementedError: File system scheme '[local]' not implemented (file: '1.png')
I know this question has been answered before but my issue is different,
I am getting this error when I…

Talha Anwar
2
votes
0 answers
How to gather prediction results on TPU (PyTorch)?
I'm trying to fine-tune my BERT-based QA model (PyTorch) with the TPU v3-8 provided by Kaggle. In the validation process I used a ParallelLoader to make predictions on 8 cores at the same time. But after that I don't know what I should do to gather all…

佩特微
2
votes
1 answer
Tensorflow load saved model, Predict and Evaluate. Too low accuracy on test?
I have trained my model on a TPU and the result seems good for testing. The dataset has 5 classes and the result shows:
accuracy: 0.9867 - sparse_categorical_accuracy: 0.9867 - loss: 0.0412 - val_accuracy: 0.9859 - val_sparse_categorical_accuracy: 0.9859 -…

Nobat
2
votes
1 answer
Kaggle TPU: failed to connect to all addresses
I'm facing some problems while trying to fit my model using a TPU on Kaggle.
The TPU is already initialized:
try:
    tpu = tf.distribute.cluster_resolver.TPUClusterResolver()
    print(f'Running on TPU {tpu.master()}')
except ValueError:
    tpu = None
if…

rdn
2
votes
0 answers
Training a Keras model using TPU pods?
I was wondering if anyone has an example of using a Keras model on a TPU pod?
I have a model-creating method which returns a Keras model that is compiled within a TPU strategy scope, as recommended by many examples on using TPUs with Keras. This…

st0ne
2
votes
1 answer
Train model on Colab TPU with distributed strategy
I'm trying to train and run an image classification model on Colab, using a TPU. No PyTorch.
I know that the TPU works only with files from GCS buckets, so I load the dataset from a bucket, and I also commented out the checkpoint and logging functions, to not…

AndreiV6
2
votes
1 answer
Use TPU in Google Colab
I am currently training a neural network with the help of a TPU.
I changed the runtime type and initialized the TPU.
I have the feeling that it is still not faster. I used https://www.tensorflow.org/guide/tpu.
Did I do something wrong?
# TPU…
user14576365
2
votes
1 answer
TF 2.3: using experimental_steps_per_execution in model.compile causes a drop in model performance
Using a TPU, I tried passing experimental_steps_per_execution to model.compile(...). I do see a big speedup, but for the exact same learning rate schedule, I noticed a 2-3% drop in accuracy when training is done. In summary, the only thing I changed…

kawingkelvin
2
votes
0 answers
How to reduce TPU idle time?
I'm getting about 99.7% TPU idle time with my training code (https://github.com/ksjae/KoGPT2-train). What general methods are used to reduce idle time?
How can I (or any user in general) reduce it to a sane amount?
How can I find the culprit of…

efe23eds
2
votes
1 answer
RPC failed with status = "Unavailable: Socket closed" Error when training FairSeq RoBERTa on Cloud TPU using PyTorch
I followed the tutorial "Pre-training FairSeq RoBERTa on Cloud TPU using Pytorch" to set up a preemptible (v2-8) TPU environment and train my RoBERTa model. The PyTorch environment is based on torch-xla-1.6, as instructed by the document. However, it does not output…

user3786340
2
votes
0 answers
Google Colab doesn't get the file from GCS bucket
I am trying to train a model from this repo with a TPU, which requires that all input files and the model directory use a Cloud Storage bucket.
I created a bucket and uploaded all the files of the model.
But Google Colab cannot read the path of my…

huy