The most generic, cross-language way is to sleep between requests. Something like 10 seconds sleep should mimic how fast a real human goes through web pages. To avoid robot identifying algorithms some people sleep a random amount of time: sleep(ten_seconds + rand())
.
You can make it fancier by keeping track of different sleep timeouts for each domain so that you can fetch something from another server while waiting for the sleep timeout.
The second method is to actually try to reduce the bandwidth for your request. You may need to write your own http client with this feature to do it. Or on linux you can use the network stack to do it for you - google qdisc
.
You can of course combine both methods.
Note that reducing bandwidth is not very friendly to sites with lots of small resources. That's because you're increasing the amount of time you're connected for each resource hence occupying one network socket and probably one web server thread while you're at it.
On the other hand not reducing bandwidth is not very friendly to sites with lots of large resources like mp3 files or videos. That's because you're saturating their network - switches, routers, ISP connection - by downloading as fast as they can serve.
A smart implementation will download small files at full speed, sleeping between downloads, but reduce bandwidth for large files.