
I am trying to deploy BERT and Transformer models on a Mali GPU using TensorFlow Lite. The problem is that the TensorFlow Lite GPU delegate does not support some of the operations in these models, namely CAST, GATHER, MUL, RESHAPE, and UNPACK (see the log below). Does anyone know how I can delegate those operations to the GPU? Are there any other TensorFlow Lite libraries that support embedded GPUs, and specifically Mali GPUs? I just want to measure the models' latency on the GPU.

STARTING!
Log parameter values verbosely: [0]
Min num runs: [1]
**Graph**: [mobilebert_float_384_gpu.tflite]
**Use gpu**: [1]
Loaded model mobilebert_float_384_gpu.tflite
**INFO**: Created TensorFlow Lite delegate for GPU.
**ERROR**: Following operations are not supported by GPU delegate:
CAST: Not supported cast case
GATHER: Operation is not supported.
MUL: MUL requires one tensor that not less than second in all dimensions.
RESHAPE: OP is supported, but tensor type isn't matched!
UNPACK: Operation is not supported.
**2661 operations will run on the GPU, and the remaining 81 operations will run on the CPU.**
INFO: Initialized OpenCL-based API.
INFO: Created 1 GPU delegate kernels.
Explicitly applied GPU delegate, and the model graph will be partially executed by the delegate w/ 1 delegate kernels.
The input model file size (MB): 100.239
Initialized session in 2491.9ms.
Running benchmark for at least 1 iterations and at least 0.5 seconds but terminate if exceeding 150 seconds.
count=2 first=434022 curr=247839 min=247839 max=434022 avg=340930 std=93091
Neo97

1 Answer


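The GPU delegate doesn't support all operations, only those on its list of supported operations. Anything else, like the five operations in your log, stays on the CPU.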
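To see how much of the model actually falls back to the CPU, the benchmark log from the question can be summarized with a few lines of Python. This is only a rough sketch: the `summarize` helper is hypothetical, the regular expressions are tailored to the exact log lines quoted in the question, and it assumes the benchmark tool reports `avg` in microseconds.

```python
import re

# Excerpt of the log from the question (bold markers removed).
LOG = """\
ERROR: Following operations are not supported by GPU delegate:
CAST: Not supported cast case
GATHER: Operation is not supported.
MUL: MUL requires one tensor that not less than second in all dimensions.
RESHAPE: OP is supported, but tensor type isn't matched!
UNPACK: Operation is not supported.
2661 operations will run on the GPU, and the remaining 81 operations will run on the CPU.
count=2 first=434022 curr=247839 min=247839 max=434022 avg=340930 std=93091
"""

LOG_LEVELS = {"ERROR", "INFO", "WARNING"}


def summarize(log):
    """Hypothetical helper: pull the op split and average latency out of the log."""
    # Rejected ops are printed as "OPNAME: reason"; skip log-level prefixes.
    names = re.findall(r"^([A-Z][A-Z0-9_]+): ", log, flags=re.MULTILINE)
    unsupported = [n for n in names if n not in LOG_LEVELS]

    split = re.search(
        r"(\d+) operations will run on the GPU, "
        r"and the remaining (\d+) operations will run on the CPU",
        log,
    )
    gpu_ops, cpu_ops = int(split.group(1)), int(split.group(2))

    avg = re.search(r"\bavg=(\d+(?:\.\d+)?)", log)
    # Assumption: the tool reports latencies in microseconds.
    avg_ms = float(avg.group(1)) / 1000.0

    return {
        "unsupported": unsupported,
        "gpu_ops": gpu_ops,
        "cpu_ops": cpu_ops,
        "gpu_fraction": gpu_ops / (gpu_ops + cpu_ops),
        "avg_ms": avg_ms,
    }


summary = summarize(LOG)
print(summary["unsupported"])
print(f"{summary['gpu_fraction']:.1%} of ops on the GPU, "
      f"average latency about {summary['avg_ms']:.1f} ms")
```

For this log, 2661 of 2742 operations are delegated, so the measured latency is already dominated by GPU execution. Only two runs were recorded, though, so the average is noisy.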

You may wish to try the NNAPI delegate if it's available to you. It can dispatch work to whichever processor it judges best, not just the GPU. Its supported operations are listed in its documentation.

Worst case, if neither of those works for you, you can either fall back to the CPU or write your own delegate, but that's a chunk of work.
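To compare the three options on the same model, one approach is to run the TensorFlow Lite benchmark tool once per backend. The sketch below only builds the command lines. The `benchmark_cmd` helper is hypothetical, the binary location and model path under `/data/local/tmp` are assumptions about how the files were pushed to the device, and the flags (`--graph`, `--use_gpu`, `--use_nnapi`, `--num_threads`) should be checked against the version of the tool you built.

```python
import shlex

# Assumed on-device locations; adjust to wherever you pushed the files.
BINARY = "/data/local/tmp/benchmark_model"
GRAPH = "/data/local/tmp/mobilebert_float_384_gpu.tflite"

# Backend-specific flags of the benchmark tool.
BACKEND_FLAGS = {
    "gpu": ["--use_gpu=true"],
    "nnapi": ["--use_nnapi=true"],
    "cpu": ["--num_threads=4"],  # plain CPU run; thread count is an arbitrary choice
}


def benchmark_cmd(graph, backend, binary=BINARY):
    """Hypothetical helper: build an `adb shell` command for one backend."""
    if backend not in BACKEND_FLAGS:
        raise ValueError(f"unknown backend: {backend!r}")
    return ["adb", "shell", binary, f"--graph={graph}", *BACKEND_FLAGS[backend]]


for backend in ("gpu", "nnapi", "cpu"):
    # Print the commands; pass the list to subprocess.run(...) to execute them.
    print(shlex.join(benchmark_cmd(GRAPH, backend)))
```

Running all three gives a like-for-like latency comparison. The GPU number still includes the operations that fell back to the CPU.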

BenClark