How do you store the HTTP code? Is it stored locally on each machine
or on shared storage? If locally, how do I update the code if the
web site changes?
There are a number of ways. You can store the code on shared storage such as an NFS or S3 mount, which makes it very easy to update centrally. Obviously you then introduce a single point of failure, so people often keep two copies of the code on different storage so that a storage failure can only take out half of their nodes; you can use the same split for blue/green testing and deployment too. Another option is to store it on a distributed file system such as Ceph or similar; the same caveats apply.
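As a rough illustration of the "two copies, lose at most half the nodes" idea, here is a minimal sketch. The mount paths, node names and the helper functions are all made up for the example; a real setup would more likely do this in configuration management than in application code.

```python
# Hypothetical mount points for the two independent copies of the code.
CODE_ROOTS = {"blue": "/mnt/code-blue", "green": "/mnt/code-green"}


def assign_copies(nodes):
    """Alternate nodes between the two copies so each copy serves about half."""
    colours = ("blue", "green")
    return {node: colours[i % 2] for i, node in enumerate(sorted(nodes))}


def code_root_for(node, assignment):
    """Return the path a given node should serve its code from."""
    return CODE_ROOTS[assignment[node]]


def surviving_nodes(assignment, failed_colour):
    """Nodes that still have working code if one copy's storage fails."""
    return sorted(n for n, c in assignment.items() if c != failed_colour)


assignment = assign_copies(["web1", "web2", "web3", "web4"])
print(code_root_for("web1", assignment))
print(surviving_nodes(assignment, "blue"))
```

The same split doubles as a blue/green switch: update the "green" copy, test against the nodes reading from it, then update "blue".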
Same goes for static content; I'm assuming it resides on shared
storage?
Generally this is true. A lot of people use cloud-based storage for static content, as it's often 'near', from a network perspective, to their CDNs. It's rare to see content stored directly on web servers these days.
If I'm using a CDN, does it simply cache all accessed static data?
And for how long?
That's certainly the base functionality, but they can usually do a great deal more than that, and the TTL is almost always configurable on an individual object/file basis.
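One common way to express a per-object TTL is the standard HTTP Cache-Control header sent by the origin, which many CDNs can be set up to honour (exact behaviour depends on the provider). A minimal sketch, where the extension-to-TTL table and the function name are purely illustrative assumptions:

```python
import posixpath

# Illustrative TTLs in seconds; these values are assumptions, not recommendations.
TTL_BY_EXTENSION = {
    ".css": 86400,     # 1 day
    ".js": 86400,      # 1 day
    ".png": 604800,    # 7 days
    ".jpg": 604800,    # 7 days
    ".mp4": 2592000,   # 30 days
}
DEFAULT_TTL = 300      # 5 minutes for anything not listed


def cache_control_for(path):
    """Build a Cache-Control header value based on the file extension."""
    ext = posixpath.splitext(path)[1].lower()
    ttl = TTL_BY_EXTENSION.get(ext, DEFAULT_TTL)
    return f"public, max-age={ttl}"


print(cache_control_for("/static/site.css"))
print(cache_control_for("/index.html"))
```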
LBs - How can one see if a machine is overloaded (load avg), if at all?
Lots of different ways: open connections, response times, shared resource utilisation stats. LBs can be very 'tuneable'; I have a huge amount of respect for good LB managers.
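To make that concrete, here is a minimal sketch of a "least connections" choice that also skips slow or unhealthy backends. The Backend fields, the 500 ms threshold and the function name are assumptions for illustration; real load balancers expose this as configuration rather than code.

```python
from dataclasses import dataclass


@dataclass
class Backend:
    name: str
    open_connections: int
    avg_response_ms: float
    healthy: bool = True


def pick_backend(backends, max_response_ms=500.0):
    """Pick the healthy backend with the fewest open connections.

    Backends whose average response time exceeds the threshold are treated
    as overloaded and skipped; ties are broken by response time.
    """
    candidates = [
        b for b in backends
        if b.healthy and b.avg_response_ms <= max_response_ms
    ]
    if not candidates:
        raise RuntimeError("no healthy backends available")
    return min(candidates, key=lambda b: (b.open_connections, b.avg_response_ms))


pool = [
    Backend("web1", open_connections=10, avg_response_ms=50.0),
    Backend("web2", open_connections=3, avg_response_ms=700.0),  # too slow
    Backend("web3", open_connections=5, avg_response_ms=80.0),
]
print(pick_backend(pool).name)
```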
LBs - Can I have a cluster of LBs? If so, how does it work?
Yes, literally tiers of them, any way you like. As an example, we use global LB'ing to send traffic to a specific datacenter based on a number of factors; once it hits that site it gets split into different service groups (green/blue for instance) and then goes to the actual service LBs.
What DB would you pick for a YouTube/streaming-like site, and why?
There's no one best-in-class DB, sorry; there are too many factors. Cost is a major one (my word, MSSQL and Oracle can get super spendy these days!), but the main thing to consider is whether your DB NEEDS referential integrity. If it does, then you need a SQL-based DB (there are free ones though; MySQL and PostgreSQL are very popular). If you can design your data right, you can get away with 'NoSQL' databases such as Couchbase/Mongo/Cassandra, and they absolutely FLY, so much quicker than SQL for basic queries, but obviously they're less feature-rich. The other thing is that you can do your DB work entirely in the cloud now: AWS in particular has a strong portfolio of DB types, and Azure obviously has MSSQL as part of its portfolio.
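For anyone unsure what referential integrity buys you, here is a small sketch using SQLite (from the Python standard library) purely as a stand-in for a SQL database; it is not one of the engines named above, and the users/videos schema is invented for the example. The database itself refuses a video whose owner does not exist.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# SQLite only enforces foreign keys when this pragma is switched on.
conn.execute("PRAGMA foreign_keys = ON")
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT NOT NULL)")
conn.execute(
    "CREATE TABLE videos ("
    " id INTEGER PRIMARY KEY,"
    " title TEXT NOT NULL,"
    " owner_id INTEGER NOT NULL REFERENCES users(id))"
)
conn.execute("INSERT INTO users (id, name) VALUES (1, 'alice')")
conn.commit()


def insert_video(title, owner_id):
    """Try to insert a video; return False if the owner does not exist."""
    try:
        with conn:  # commits on success, rolls back on error
            conn.execute(
                "INSERT INTO videos (title, owner_id) VALUES (?, ?)",
                (title, owner_id),
            )
        return True
    except sqlite3.IntegrityError:
        return False


print(insert_video("intro", 1))    # owner exists
print(insert_video("orphan", 99))  # no such user, so the insert is rejected
```

With a typical NoSQL store you would have to make that check yourself in application code, which is the trade-off behind "if you can design your data right".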