
I've been learning about PoseNet so that I can use it in my health-related research.
I was impressed by how MobileNet keeps accuracy high while reducing the CPU (or GPU/NPU) load by adjusting just a few parameters, which is where my questions come from.
I noticed that the MobileNet paper introduces two multipliers, alpha and rho; I'll skip explaining them here. I'd like to know the values of alpha and rho for the MobileNet used in the newest PoseNet model. I'm also wondering whether there is a guideline for tuning these parameters (especially alpha and rho), and how their values are chosen and validated before training the model.
For example, if the selected alpha is 0.5, why is that value better than 0.75 or 0.25?
My questions are:

  1. What are the values of alpha and rho for the MobileNet version used to train PoseNet?
  2. Why and how were those values selected and validated?
desertnaut
Joyccino
  • According to 30 seconds of Google searching: "Two parameters are introduced so that MobileNet can be tuned easily: Width Multiplier α and Resolution Multiplier ρ." It's for the sizes of your images. See https://towardsdatascience.com/review-mobilenetv1-depthwise-separable-convolution-light-weight-model-a382df364b69 for further reading. I guess it's so it can tell the difference between a toy car and a person, and a car and GIANT MAN! – Dan Rayson Feb 05 '21 at 15:09
  • I know what they are. My question was: what are the values of alpha and rho when mobilenet (especially mobilenet version which is used to train PoseNet) is trained? – Joyccino Feb 08 '21 at 06:34

1 Answer

The model described at https://www.tensorflow.org/lite/models/pose_estimation/overview uses alpha=1.0. Alpha multiplies the number of input and output channels of each convolution, so with alpha=1.0 the first convolution layer has 32 channels. There are also PoseNet models with other backbones, which you can easily try with the TF.js example: https://github.com/tensorflow/tfjs-models/tree/master/posenet
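
To make the effect of alpha concrete, here is a minimal sketch of how a width multiplier scales channel counts. The helper name `scale_channels` and the plain truncation are my own assumptions for illustration; actual implementations may round differently or enforce a minimum channel count.

```python
def scale_channels(base_channels: int, alpha: float) -> int:
    """Scale a layer's channel count by the width multiplier alpha.

    Hypothetical helper: uses plain truncation, which is an assumption
    and not necessarily what a given MobileNet implementation does.
    """
    return int(base_channels * alpha)


# The first convolution has 32 channels at alpha=1.0 (see above).
for alpha in (1.0, 0.75, 0.5, 0.25):
    print(f"alpha={alpha}: first conv has {scale_channels(32, alpha)} channels")
```

With alpha=1.0 the first layer keeps its 32 channels; smaller values thin every layer of the network by the same factor.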

The rho value is more of a theoretical quantity. The original paper says:

"In practice we implicitly set ρ by setting the input resolution."
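
As a rough illustration of why both multipliers matter, the sketch below evaluates the cost formula for one depthwise separable layer from the MobileNet paper, D_K·D_K·αM·ρD_F·ρD_F + αM·αN·ρD_F·ρD_F. The function name `separable_conv_cost` and the use of `round` for the scaled sizes are my assumptions, not code from any library.

```python
def separable_conv_cost(dk: int, m: int, n: int, df: int,
                        alpha: float = 1.0, rho: float = 1.0) -> int:
    """Mult-adds of one depthwise separable convolution layer.

    dk: kernel size, m: input channels, n: output channels,
    df: feature map width/height. alpha scales the channels and
    rho scales the feature map size. Rounding is an assumption.
    """
    m_s = round(alpha * m)    # thinned input channels
    n_s = round(alpha * n)    # thinned output channels
    df_s = round(rho * df)    # reduced feature map size
    depthwise = dk * dk * m_s * df_s * df_s
    pointwise = m_s * n_s * df_s * df_s
    return depthwise + pointwise


# Example layer: 3x3 kernel, 512 -> 512 channels, 14x14 feature map.
for alpha, rho in [(1.0, 1.0), (0.75, 1.0), (0.75, 0.714)]:
    cost = separable_conv_cost(3, 512, 512, 14, alpha, rho)
    print(f"alpha={alpha}, rho={rho}: {cost / 1e6:.1f}M mult-adds")
```

The cost falls roughly with alpha squared and rho squared, so choosing either value is a trade-off between accuracy and compute rather than a search for a single best number.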

Taehee Jeong