I am reading this paper in an attempt to recreate the salient region detection and segmentation model employed. I have the following questions pertaining to section 3 of the paper and I would highly appreciate it if someone could provide clarity on them.
The word "scales" is used at multiple points in the section, for example, line 4 of the section states "saliency maps are created at different scales". I do not exactly understand what the authors mean by the word scales. Moreover, is there a mathematical way to think about it?
I understand that a saliency value
is computed for each pixel at (
) using the equation
However, there is no mention of in the equation. Hence, I am confused as to what pixel the saliency value is being computed for. Is it
?
- I did not understand what the authors meant by the term "bin" in section 3.2 line 5 where it is stated, "The hill-climbing algorithm can be seen as a search window being run across the space of the d-dimensional histogram to find the largest bin within that window."
Lastly, any other tips or clarifications are most welcome and much appreciated!