From what I can see in a Wireshark dump, they buffer content by making multiple requests to the server, each with a "range" parameter. The client only requests the next part when it needs it.
GET /videoplayback?
sver=3&
key=yt1&
sparams=algorithm%2Cburst%2Ccp%2Cfactor%2Cid%2Cip%2Cipbits%2Citag%2Csource%2Cupn%2Cexpire&
algorithm=throttle-factor&
upn=L4m-ID0n0V0&
expire=1334882299&
factor=1.25&
ipbits=8&
ip=77.0.0.0&
fexp=912300%2C919303%2C911623&
source=youtube&
range=8908800-10690559&
cp=U0hSSVhTUF9OU0NOMl9QTVRDOjhGTXRjbEpBNzls&
burst=40&
signature=20F9219AACD9249B3517F56ECFE8B12C6B001D2F.BDDD25B61745E0F6E0BBAC7E792C121AA67A4C7C&
keepalive=yes&
itag=34&
cm2=0&
id=9cc8ae37c50b77f7 HTTP/1.1
Otherwise, this kind of bandwidth throttling is only doable if the client 'has control' over the server, i.e. it either requests only what it knows it needs (e.g. YouTube progressive download) or regularly tells the server where it is in the stream (e.g. RTMP or RTP/RTCP streaming).
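To make the client-driven part concrete, here is a minimal Python sketch of how a player might decide when to ask for the next byte range. It is not YouTube's actual logic: the function names, the low-watermark rule, and the fixed chunk size are all assumptions for illustration. The only values taken from the capture are the range itself, which spans 1,781,760 bytes and starts at exactly five times that size.

```python
from typing import Optional, Tuple

# Size of the range seen in the capture: 10690559 - 8908800 + 1 bytes.
# Treating it as a fixed chunk size is an assumption.
CHUNK_SIZE = 1_781_760


def next_range(buffered_end: int, playhead: int, total_size: int,
               low_watermark: int,
               chunk_size: int = CHUNK_SIZE) -> Optional[Tuple[int, int]]:
    """Return the inclusive byte range to fetch next, or None to stay idle.

    buffered_end  -- offset of the first byte not yet downloaded
    playhead      -- byte offset currently being played
    low_watermark -- hypothetical threshold: fetch more only when fewer
                     than this many bytes are buffered ahead of playback
    """
    if buffered_end >= total_size:
        return None  # the whole file is already downloaded
    if buffered_end - playhead > low_watermark:
        return None  # enough data is buffered; no request needed yet
    start = buffered_end
    end = min(start + chunk_size, total_size) - 1  # clamp to end of file
    return start, end


def range_param(byte_range: Tuple[int, int]) -> str:
    """Format a range the way it appears in the captured query string."""
    start, end = byte_range
    return f"range={start}-{end}"
```

With five chunks already buffered and the playhead close to the end of the buffer, `next_range(8908800, 8000000, 50_000_000, 1_000_000)` yields the same range as in the capture. With the playhead far behind, it returns `None` and the client sends nothing, which is what throttles the download.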