While the real time transcoding approach is possible, the more common practice is to pre-transcode the video into a number of different bit rate renditions and allow the client to choose which one to request the next 'chunk' of the video from, based on its current network conditions.
This approach is referred to as Adaptive Bit Rate streaming and uses streaming formats like HLS and MPEG-DASH.
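For illustration, an HLS client learns about the available renditions from a master playlist that lists one entry per bit rate. A minimal sketch (the bandwidths, resolutions and paths here are made-up examples, not required values) might look like:

```
#EXTM3U
#EXT-X-STREAM-INF:BANDWIDTH=800000,RESOLUTION=640x360
360p/playlist.m3u8
#EXT-X-STREAM-INF:BANDWIDTH=2500000,RESOLUTION=1280x720
720p/playlist.m3u8
#EXT-X-STREAM-INF:BANDWIDTH=5000000,RESOLUTION=1920x1080
1080p/playlist.m3u8
```

The client can then switch between the variant playlists chunk by chunk as its measured throughput changes.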
Ultimately, it is a tradeoff between processing overhead and storage overhead:
- the real time transcoding approach requires you to store only one copy of the video, but it requires you to transcode the video each time a user wants to view it (unless the single version you do store happens to suit that user).
- the ABR approach requires you to store a copy of the video for each bit rate you want to serve, but you only have to do the transcoding once (for each bit rate).
It gets further complicated by the expected viewing profile of the video - storage is the bigger issue if the video will only be watched once or twice a year, and processing the bigger issue if 100,000 users a day will view the same video.
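A rough back-of-envelope comparison makes the crossover concrete. All the prices, sizes and ladder size below are hypothetical placeholders purely to show the shape of the tradeoff, not real figures:

```python
# Illustrative cost model: pre-transcoded ABR pays for storage of every
# rendition; real-time transcoding pays compute on every view.
STORAGE_COST_PER_GB_MONTH = 0.02   # hypothetical storage price
TRANSCODE_COST_PER_GB = 0.05       # hypothetical compute price per transcode
VIDEO_SIZE_GB = 2.0                # hypothetical source size
RENDITIONS = 5                     # hypothetical bit rate ladder size

def monthly_cost_abr(views_per_month):
    # Store all renditions (crudely assume each is ~ the source size);
    # the one-off transcode cost amortises away, so it is ignored here.
    return RENDITIONS * VIDEO_SIZE_GB * STORAGE_COST_PER_GB_MONTH

def monthly_cost_realtime(views_per_month):
    # Store one copy, but transcode on every view.
    storage = VIDEO_SIZE_GB * STORAGE_COST_PER_GB_MONTH
    compute = views_per_month * VIDEO_SIZE_GB * TRANSCODE_COST_PER_GB
    return storage + compute
```

With these made-up numbers, real-time transcoding is cheaper at one view a month, while pre-transcoded ABR wins comfortably at hundreds of views a month - exactly the viewing-profile effect described above.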
As a note, the approach you describe is quite similar to a live ABR stream - the live stream will be transcoded as close to real time as possible into, for example, 5 different bit rate streams, and the clients will request the video in, for example, 10 second chunks. The client decides which bit rate stream to take the next chunk from based on the network conditions.
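The client-side decision above can be sketched very simply: pick the highest bit rate that fits under the measured throughput, with some safety headroom. The ladder values and the 0.8 headroom factor here are illustrative assumptions, and real players use considerably more sophisticated heuristics (buffer level, throughput history, etc.):

```python
# Hypothetical bit rate ladder, in bits per second.
LADDER_BPS = [800_000, 1_600_000, 2_500_000, 5_000_000]

def choose_bitrate(measured_throughput_bps, headroom=0.8):
    """Return the highest rendition at or below throughput * headroom,
    falling back to the lowest rendition when nothing fits."""
    budget = measured_throughput_bps * headroom
    candidates = [b for b in LADDER_BPS if b <= budget]
    return max(candidates) if candidates else min(LADDER_BPS)
```

For example, with 3 Mbps of measured throughput the 0.8 headroom leaves a 2.4 Mbps budget, so the 1.6 Mbps rendition is chosen rather than the 2.5 Mbps one.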
The main difference from your proposal is that all bit rates are being produced all the time, not just the ones being requested by a client at that moment.
In practice, with a large enough client base and spread of network conditions, the two approaches converge - i.e. if you have enough clients across enough different network conditions, you will be producing the full range of low and high bit rate streams anyway.