
I’ve got a simple model consisting only of convolutions (no activations in between), and I want to benchmark it in Caffe2 on an ARM Android device using multiple cores.

When I run

./speed_benchmark --init_net=model_for_inference-simplified-init-net.pb --net=model_for_inference-simplified-predict-net.pb --iter=1

it runs on a single core.

The speed benchmark was built using:

scripts/build_android.sh -DANDROID_ABI=arm64-v8a -DANDROID_TOOLCHAIN=clang -DBUILD_BINARY=ON

On x86 it was built via

mkdir build
cd build
cmake .. -DBUILD_BINARY=ON

On x86, setting OMP_NUM_THREADS=8 helps, but it has no effect on ARM.
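
For example, on x86 the following invocation uses multiple cores (OMP_NUM_THREADS is the standard OpenMP thread-count variable; the file names are the ones from the benchmark command above):

OMP_NUM_THREADS=8 ./speed_benchmark --init_net=model_for_inference-simplified-init-net.pb --net=model_for_inference-simplified-predict-net.pb --iter=1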

Do I need to change the build command for ARM, set some environment variables, pass some binary arguments, or something else?

1 Answer


I didn't know that I needed to set the engine information in the model, as described in https://caffe2.ai/docs/mobile-integration.html

After updating the predict net with:

# predict_net is the caffe2_pb2.NetDef parsed from the predict net .pb file
for op in predict_net.op:
  if op.type == 'Conv':
    op.engine = 'NNPACK'  # NNPACK's convolution implementation is multi-threaded on ARM

more cores started being used.
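
For completeness, here is a minimal sketch of applying that change to the serialized net and writing it back, assuming the file name from the benchmark command in the question (caffe2_pb2.NetDef is the Caffe2 protobuf message, with the standard ParseFromString/SerializeToString methods):

from caffe2.proto import caffe2_pb2

# Load the predict net (file name assumed from the benchmark command above).
predict_net = caffe2_pb2.NetDef()
with open('model_for_inference-simplified-predict-net.pb', 'rb') as f:
    predict_net.ParseFromString(f.read())

# Route every convolution through the NNPACK engine.
for op in predict_net.op:
    if op.type == 'Conv':
        op.engine = 'NNPACK'

# Write the updated net back out for the benchmark to consume.
with open('model_for_inference-simplified-predict-net.pb', 'wb') as f:
    f.write(predict_net.SerializeToString())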