I am trying to deploy BERT and Transformer models on a Mali GPU using TensorFlow Lite. The problem is that the TensorFlow Lite GPU delegate does not support some of the operations in these models, namely CAST, GATHER, MUL, RESHAPE, and UNPACK, so those operations fall back to the CPU. Does anyone know how I can delegate these operations to the GPU? Alternatively, are there other TensorFlow Lite libraries that support embedded GPUs, and Mali GPUs in particular? I only want to measure the models' latency on the GPU. This is the benchmark output for MobileBERT:

STARTING!
Log parameter values verbosely: [0]
Min num runs: [1]
**Graph**: [mobilebert_float_384_gpu.tflite]
**Use gpu**: [1]
Loaded model mobilebert_float_384_gpu.tflite
**INFO**: Created TensorFlow Lite delegate for GPU.
**ERROR**: Following operations are not supported by GPU delegate:
CAST: Not supported cast case
GATHER: Operation is not supported.
MUL: MUL requires one tensor that not less than second in all dimensions.
RESHAPE: OP is supported, but tensor type isn't matched!
UNPACK: Operation is not supported.
**2661 operations will run on the GPU, and the remaining 81 operations will run on the CPU.**
INFO: Initialized OpenCL-based API.
INFO: Created 1 GPU delegate kernels.
Explicitly applied GPU delegate, and the model graph will be partially executed by the delegate w/ 1 delegate kernels.
The input model file size (MB): 100.239
Initialized session in 2491.9ms.
Running benchmark for at least 1 iterations and at least 0.5 seconds but terminate if exceeding 150 seconds.
count=2 first=434022 curr=247839 min=247839 max=434022 avg=340930 std=93091
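
For anyone reading the numbers in the log, here is a minimal Python sketch that pulls the fields out of the final summary line and works out what share of the operations ended up on the GPU. It assumes the benchmark tool reports these timings in microseconds, which is my understanding but worth checking against the tool's documentation. The names `parse_summary` and `to_ms` are my own helpers, not part of TensorFlow Lite.

```python
import re

# Summary line copied verbatim from the benchmark output above.
SUMMARY = "count=2 first=434022 curr=247839 min=247839 max=434022 avg=340930 std=93091"


def parse_summary(line):
    """Return the key=value fields of a benchmark summary line as floats."""
    return {key: float(value) for key, value in re.findall(r"(\w+)=([\d.]+)", line)}


def to_ms(microseconds):
    # Assumption: the benchmark tool reports these timings in microseconds.
    return microseconds / 1000.0


stats = parse_summary(SUMMARY)

# Operation counts taken from the "operations will run on the GPU" log line.
gpu_ops, cpu_ops = 2661, 81
gpu_share = gpu_ops / (gpu_ops + cpu_ops)

print(f"avg latency: {to_ms(stats['avg']):.1f} ms over {int(stats['count'])} runs")
print(f"ops on GPU: {gpu_share:.1%}")
```

Under that assumption the average comes out at roughly 341 ms per inference, with about 97% of the operations running on the GPU. The run count is only 2 and the first run is much slower than the second, so the average is a rough figure at best.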