I am preparing for a system design interview, and since I have little experience with this topic, I bought the "Grokking the system design interview" course from educative.io, which was recommended by several websites. Although I have read it, there are several things I did not manage to understand, so I would be grateful if someone could answer my questions.
- Since I have no experience with NoSQL, I find it difficult to choose the proper database system. In several places the course gives no reasoning for why it chose one database over another. For example, in the chapter "Designing Youtube or Netflix" the authors chose MySQL as the database with no explanation. The same chapter includes the following among its non-functional requirements:
"The system should be highly available. Consistency can take a hit (in the interest of availability); if a user doesn’t see a video for a while, it should be fine."
Following the above hint, taking into account the size of the system, and applying the material in the "CAP theorem" chapter, it seems to me that Cassandra or CouchDB would be a better choice. What am I missing here?
The same question applies to "Designing Facebook’s Newsfeed".
- Is the CAP theorem still applicable?
What I mean is: according to the "CAP theorem" chapter, HBase is good at consistency and partition tolerance, but according to the HBase documentation, it has also supported high availability since version 2.X. So it seems to me that it is a one-size-fits-all solution for database storage, which contradicts the CAP theorem, unless something was sacrificed for HA. What am I missing here?
- The numbers for how much RAM, storage, and bandwidth a single machine can handle are somewhat inconsistent across the course; I guess they are outdated. What are the current numbers for (a) regular computers and (b) modern servers?
- Almost every chapter has a part called "Capacity Estimation and Constraints", but what is calculated there changes from chapter to chapter. Sometimes only storage is calculated, often bandwidth too, sometimes QPS is added, and sometimes there are task-specific metrics. How do I know what I should calculate for a specific task?
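To make the last question concrete, here is a minimal sketch in Python of the kind of back-of-envelope calculation I mean. The function name `estimate` and all of the input numbers are my own made-up assumptions for illustration; none of them come from the course.

```python
# Back-of-envelope capacity estimate. Every input used below is a
# hypothetical assumption chosen for illustration, not a course figure.
SECONDS_PER_DAY = 24 * 60 * 60  # 86,400


def estimate(daily_active_users, reads_per_user, writes_per_user,
             bytes_per_write, retention_days):
    """Return average QPS, total storage, and ingress bandwidth."""
    reads_per_day = daily_active_users * reads_per_user
    writes_per_day = daily_active_users * writes_per_user
    bytes_per_day = writes_per_day * bytes_per_write
    return {
        # Average (not peak) queries per second.
        "read_qps": reads_per_day / SECONDS_PER_DAY,
        "write_qps": writes_per_day / SECONDS_PER_DAY,
        # Total bytes kept over the retention window, before replication.
        "storage_bytes": bytes_per_day * retention_days,
        # Average incoming bytes per second from writes.
        "ingress_bytes_per_s": bytes_per_day / SECONDS_PER_DAY,
    }


if __name__ == "__main__":
    result = estimate(
        daily_active_users=1_000_000,  # assumed
        reads_per_user=20,             # assumed
        writes_per_user=2,             # assumed
        bytes_per_write=1_000,         # assumed
        retention_days=365,            # assumed
    )
    for name, value in result.items():
        print(f"{name}: {value:,.1f}")
```

With these assumed inputs, storage, bandwidth, and QPS all fall out of the same few numbers, so my question is really which of these outputs matters for a given task and when extra task-specific metrics are expected.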
Thanks in advance!