
I understand horizontal scaling, vertical scaling, and sharding. What I want is a better understanding of what happens to an application that is *not* scaled (i.e. the effects of not scaling), rather than of how scaling solves the problem.

Here are my questions:

  1. What are all the possible things that can happen to the application if I don't scale? For example: the application slows down, requests stop being served, or the application goes down entirely.
  2. Let's say the system slows down as load increases; why does that happen? If requests stop being served, why does that happen? Do threads come into the picture?
  3. If threads come into the picture, how?

1 Answer


Generally, all requests have a timeout, and these timeouts occur at most layer boundaries (browser -> HTTP server, HTTP server -> application server / microservices layer, application -> database). When your load increases to the point where some layer cannot service the request before that timeout fires, the user will not get a response, and the application is effectively broken.
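A minimal sketch of this failure mode, using Python's standard `socket` module: the server stands in for an overloaded layer that takes too long, and the client's timeout stands in for the boundary timeout. The port number and delays are arbitrary choices for illustration.

```python
import socket
import threading
import time

def slow_server(port, delay):
    """A server that takes `delay` seconds to answer, simulating an overloaded layer."""
    srv = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    srv.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
    srv.bind(("127.0.0.1", port))
    srv.listen(1)
    conn, _ = srv.accept()
    time.sleep(delay)            # overloaded: cannot respond before the client gives up
    conn.sendall(b"response")
    conn.close()
    srv.close()

threading.Thread(target=slow_server, args=(8099, 2.0), daemon=True).start()
time.sleep(0.1)                  # let the server start listening

# The client will wait at most 0.5 s, but the server needs 2 s.
client = socket.create_connection(("127.0.0.1", 8099), timeout=0.5)
try:
    client.recv(1024)
    outcome = "got response"
except socket.timeout:
    outcome = "request timed out"  # what the end user experiences
finally:
    client.close()
print(outcome)
```

The same pattern repeats at every boundary: whichever layer's timeout is shortest relative to its backlog fails first.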

Depending on where the timeout occurs, you may be able to send a useful error, or it may surface as a generic "hang" where the application appears to be frozen or broken in some way.

If enough requests are awaiting servicing, and you have turned all the timeouts up to unreasonably high levels, you allow more and more threads to queue. Each thread consumes memory, so ultimately you will run out of memory and be unable to create additional threads, at which point the application will once again hang and become unresponsive.
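One common mitigation is to bound the backlog instead of letting it grow until memory runs out. A sketch using Python's standard `queue` module (the `MAX_BACKLOG` value and request names are illustrative, not from the answer):

```python
import queue

# An unbounded backlog of waiting requests eventually exhausts memory.
# A bounded queue fails fast instead: excess requests are rejected
# immediately (e.g. with an HTTP 503) rather than piling up.

MAX_BACKLOG = 4                     # illustrative; real servers tune this carefully
backlog = queue.Queue(maxsize=MAX_BACKLOG)

def submit(request):
    """Try to enqueue a request; reject it if the backlog is already full."""
    try:
        backlog.put_nowait(request)
        return "accepted"
    except queue.Full:
        return "rejected"           # fail fast instead of hanging the client

# Six requests arrive while nothing is draining the queue:
results = [submit(f"req-{i}") for i in range(6)]
print(results)
```

Rejecting early is usually preferable to the hang described above, because the client gets a definite error it can retry against, and the server's memory stays bounded.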

Rob Conklin
  • I read that async requests help reduce the load; how does that happen? I understand that an async request is non-blocking, but if multiple threads are already waiting to be executed, how does async help? – noob codes Dec 24 '22 at 12:04
  • The async request reduces the need to rely on an open socket awaiting the response. It can allow the queue to grow much larger; it doesn't necessarily reduce the queue server-side, but it does reduce the need for open sockets awaiting a response, which gives you more room before things go sideways. It can shift the timeout from a network-layer timeout to a human-layer timeout (they get tired of waiting and go away). – Rob Conklin Dec 24 '22 at 22:38
  • Does that mean that the server processes an async request similar to any sync request apart from not waiting for a response? If the threads of an async request get processed in the same manner as any other request would, the latency will remain the same right? Then why go for async? How does that work? I'm trying to understand how threads within servers work while serving requests, Can you share the resources you used to learn this? – noob codes Dec 25 '22 at 03:56
  • Like I said, the reason to go for async is to eliminate the timeout between the end user and the server. That timeout is usually the shortest (by far) and will be the first to fail. Async is not a panacea; it's just another tool in the toolbox to help with scaling issues. It *does* work well when combined with other scaling solutions like off-server worker pools (think AWS Lambda or Azure Functions) that scale much higher. – Rob Conklin Dec 28 '22 at 01:44
  • As far as learning resources, unfortunately it (for me) has come down to 30 years experience building applications at scale. Much earlier in my career books around enterprise software patterns like Fowler's were useful: https://www.amazon.com/Patterns-Enterprise-Application-Architecture-Martin/dp/0321127420 although I haven't read that book in about 15 years, so I'm not sure how well it's aged... Some bits are likely still really relevant, but it predated the cloud so some things have definitely changed. – Rob Conklin Dec 28 '22 at 01:46
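To make the thread-vs-async distinction in the comments above concrete, here is a sketch using Python's `asyncio`. The request count and the 0.2 s downstream delay are made-up values for illustration; the point is that an in-flight request waiting on I/O does not pin a blocked thread.

```python
import asyncio
import time

async def handle_request(i):
    # Simulates waiting on a downstream call (database, microservice, etc.).
    # `await` hands the single thread back to the event loop, so other
    # requests make progress instead of each one occupying a blocked thread.
    await asyncio.sleep(0.2)
    return f"response-{i}"

async def main():
    # 100 in-flight requests handled by one thread. With blocking I/O this
    # would need 100 threads, or take 100 * 0.2 s served one at a time.
    return await asyncio.gather(*(handle_request(i) for i in range(100)))

start = time.monotonic()
results = asyncio.run(main())
elapsed = time.monotonic() - start
print(f"{len(results)} responses in about {elapsed:.1f}s")
```

All 100 responses complete in roughly the time of one downstream call, because the waits overlap. Note this matches the caveat in the comments: the downstream layer still receives the same load, so async buys headroom on the front end rather than reducing total work.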