I don't know the definitions used by any particular codec or how specific encoders are implemented, but I am familiar with the rationale and motivation behind VBR (more as it concerns audio, but I believe the concept is the same).
There are two main categories in play here: single pass and multi-pass. Single pass (on-the-fly) encodes much faster. It just passes through the video once and encodes as it goes. It can be done in real time for broadcasts and other situations where the whole video isn't available for prior analysis. Your question seems to mainly concern multi-pass. Though it is called multi-pass, it usually means just two passes. Moreover, you seem to be asking about multi-pass VBR encoding in which an average bit rate (ABR) is specified and must be adhered to.
VBR allows higher bit rates for sections that demand them due to greater color depth, more edges, etc. (or in audio: lots of polyphony, mixed frequencies, etc.) and lower rates for "plainer" sections with less of those qualities (audio: a single voice, sections with only rhythm, etc.). The extreme case is entire frames of a solid color or close to it (in audio: silence). Basically, these are the same criteria that affect the compression of still images.
As such, it seems to me that the most effective way for an encoder to stick to a specified average would be to sample individual frames at a regular interval throughout the entire file. Say, twice a second for the whole video. (I don't know if this is even in the ballpark of a realistic estimate, but you get the idea.) This hopefully gives a good estimate of the video's character (for lack of a better word) and allows the most efficient distribution of those precious bits.
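To make the idea concrete, here is a minimal sketch of the second half of that process: once a first pass has produced some "complexity" score per sampled segment, the bits can be handed out in proportion to those scores so that the mean still lands on the requested average. The function name `allocate_bitrates` and the complexity scores are made up for illustration; I am not claiming any real encoder does it exactly this way.

```python
def allocate_bitrates(complexities, target_avg_kbps):
    """Give each segment a bit rate proportional to its complexity score,
    scaled so that the mean over all segments equals target_avg_kbps.

    `complexities` is a hypothetical list of non-negative scores, one per
    sampled segment, as a first analysis pass might produce.
    """
    n = len(complexities)
    if n == 0:
        return []
    total = sum(complexities)
    if total == 0:
        # Nothing distinguishes the segments, so fall back to a flat rate.
        return [float(target_avg_kbps)] * n
    budget = target_avg_kbps * n  # total "rate budget" across all segments
    return [budget * c / total for c in complexities]


# Three segments: plain, busy, very busy. Average must stay at 1000 kbps.
rates = allocate_bitrates([1, 3, 4], 1000)
print(rates)  # [375.0, 1125.0, 1500.0]
```

The plain segment gets well under the average and the busy ones get more, but the three values still average out to 1000.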
It should also be noted that encoders sometimes let you set a range of minimum and maximum bit rates, so that at no time can the bit rate be less than X or more than Y. Well-chosen ranges obviously depend on the resolution.
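As a rough sketch of how such a range could interact with a fixed average: clamp every segment's rate into [X, Y], then spread whatever was cut off (or added) over the segments that still have room, so the overall average is unchanged. The name `clamp_to_range` and its parameters are hypothetical, and this is only an illustration of the constraint, not how any particular encoder implements it.

```python
def clamp_to_range(rates, min_kbps, max_kbps, max_rounds=50):
    """Clamp per-segment rates into [min_kbps, max_kbps] while keeping
    their sum (and therefore their average) the same as the input."""
    n = len(rates)
    if n == 0:
        return []
    target_total = sum(rates)
    if not (min_kbps * n <= target_total <= max_kbps * n):
        raise ValueError("average is not reachable within the given range")

    out = [min(max(r, min_kbps), max_kbps) for r in rates]
    for _ in range(max_rounds):
        diff = target_total - sum(out)  # bits still to add (+) or remove (-)
        if abs(diff) < 1e-9:
            break
        # Only segments that can still move in the needed direction take part.
        if diff > 0:
            free = [i for i, r in enumerate(out) if r < max_kbps]
        else:
            free = [i for i, r in enumerate(out) if r > min_kbps]
        if not free:
            break
        share = diff / len(free)
        for i in free:
            out[i] = min(max(out[i] + share, min_kbps), max_kbps)
    return out


# Unconstrained rates averaging 1000 kbps, limited to 500..1400 kbps.
print(clamp_to_range([375.0, 1125.0, 1500.0], 500, 1400))
# [500, 1112.5, 1387.5]
```

The quiet segment is lifted to the floor, the busiest one is capped, and the difference is taken from the segments that were above the minimum, so the average is still 1000.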
As for terms to google, try "multi-pass encoding" and "ABR". And as usual, Wikipedia sketches a pretty good rough picture, enough that you'd know where to go for further reading: http://en.wikipedia.org/wiki/Variable_bitrate#Multi-pass_encoding_and_single-pass_encoding