Encapsulation is a conscious decision to give your code - your implementation of some functionality - a particular interface, such that the consumers of your implementation assume only what they have to.
I'll start with a non-JS example, but will show a JS one at the end.
Hostnames are an interface. IP addresses are implementation details.
When you go to Stack Overflow, you go to stackoverflow.com; you don't go to 151.101.193.69. And if you did go to 151.101.193.69, you'd notice that it's a Fastly CDN address, not a Stack Overflow address.
It's likely that when Stack Overflow first started, it served its website from its own server, on a different IP address, for example 198.51.100.253.
If everyone bookmarked 198.51.100.253, then when Stack Overflow started using Fastly CDN, suddenly everyone who bookmarked it - millions of people - would have to adjust.
That is a case of broken compatibility, because those millions of people would have been coupled to the IP address 198.51.100.253.
By encapsulating the IP address 198.51.100.253 (the actual detail of the implementation of web access) behind the only thing that users need to know, the name of the website, stackoverflow.com (the public interface), Stack Overflow was able to migrate to Fastly CDN, and all those millions of users were none the wiser.
This was possible because all these users were not coupled to the IP address 198.51.100.253, so when it changed to 151.101.193.69, nobody was affected.
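Here is the same idea as a rough JavaScript sketch. The dns map and the visit function are hypothetical stand-ins I made up for real name resolution, and the addresses are just the illustrative ones from above:

```javascript
// Hypothetical stand-in for DNS: maps the public name to the current address.
const dns = new Map([["stackoverflow.com", "198.51.100.253"]]);

// Callers depend only on the hostname (the interface).
function visit(hostname) {
  const ip = dns.get(hostname); // implementation detail, looked up at call time
  return `connecting to ${ip}`;
}

const before = visit("stackoverflow.com");

// The site moves behind a CDN: only the mapping changes.
dns.set("stackoverflow.com", "151.101.193.69");

// Same call as before, no changes needed on the caller's side.
const after = visit("stackoverflow.com");
```

The caller's line of code is identical before and after the migration; only the value hidden behind the name changed.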
This principle applies in many areas. Here are some examples:
- Energy: You pay for electricity. The supplier can provide it using coal, gas, diesel, nuclear energy, or hydro, and can switch from one to another without you being any the wiser. You're not coupled to hydro, because your interface is the electric socket, not a generator.
- Business: When an office building hires a cleaning company to keep the building clean, it only has a contract with the company. The cleaners get hired and fired and their salaries change, but that's all encapsulated by the cleaning company and does not affect the building.
- Money: You don't need money, you need food and shelter and clothes. But those are implementation details. The interface you export to your employer is money, so they don't have to pay you in food, and if you change your diet or style, they don't have to adjust what food or clothes they buy you.
- Engineering: When an office building gets HVAC, and it breaks, the owner just calls the HVAC company; they don't try to fix it themselves. If they did, they would void the warranty, because the HVAC company can't guarantee a good product if someone else touches the HVAC. Their public interface is the maintenance contract and the HVAC user-facing controls - you're not allowed to access the implementation directly.
And of course, software: Let's say you have a distributed key-value store which has the following client API:
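// Hypothetical client API: the caller works out the bucket and the node itself
const client = kv.connect("endpoint.my.db");
const bucket = crc(myKey);
const nodeId = bucket % client.nodeCount();
const myValue = client.get(nodeId, bucket, myKey);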
This interface:
- allows the caller to directly and easily find the node which will store the key.
- allows the caller to cache bucket information to further save calls.
- allows the caller to avoid extra calls to map a key to a bucket.
However, it leaks a ton of implementation details into the interface:
- the existence of buckets
- the usage of CRC to map keys to buckets
- the bucket distribution and rebalancing strategy, i.e. the usage of bucket % nodeCount as the logic to map buckets to nodes
- the fact that buckets are owned by individual nodes
And now the caller is coupled with all these implementation details. If the maintainer of the DB wants to make certain changes, they will break all existing users. Examples:
- Use CRC32 instead of CRC, presumably because it's faster. This would cause existing code to use the wrong bucket and/or node, failing the queries.
- Instead of round-robin buckets, allocate buckets based on storage nodes' free space, free CPU, free memory, etc. - that breaks bucket % client.nodeCount(), which likewise leads to the wrong bucket/node and fails queries.
- Allow multiple nodes to own a bucket - requests will still go to a single node.
- Change the rebalancing strategy - if a node goes down, then nodeCount goes from e.g. 3 to 2, so all the buckets have to be rebalanced such that bucket % client.nodeCount() finds the right node for that bucket.
- Allow reading from any node instead of the bucket owner - requests will still go to a single node.
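To make the breakage concrete, here is a toy in-memory sketch. Everything in it is an assumption for illustration: toyHash is not a real CRC, makeCluster is a made-up stand-in for the DB, and the "shifted" placement just represents any new strategy the maintainer might pick.

```javascript
// Toy hash standing in for crc(); NOT a real CRC implementation.
function toyHash(key) {
  let h = 0;
  for (const ch of key) h = (h * 31 + ch.charCodeAt(0)) >>> 0;
  return h;
}

// Hypothetical cluster: each node is a Map holding only the keys it owns.
function makeCluster(nodeCount, placeBucket) {
  const nodes = Array.from({ length: nodeCount }, () => new Map());
  return {
    nodeCount: () => nodeCount,
    put(key, value) {
      nodes[placeBucket(toyHash(key), nodeCount)].set(key, value);
    },
    // Leaky API: the caller has to name the node (bucket is unused in this toy).
    get(nodeId, bucket, key) {
      return nodes[nodeId].get(key);
    },
  };
}

// Caller code, written against the original modulo placement.
function callerGet(cluster, key) {
  const bucket = toyHash(key);
  const nodeId = bucket % cluster.nodeCount();
  return cluster.get(nodeId, bucket, key);
}

const modulo = (bucket, n) => bucket % n;
const shifted = (bucket, n) => (bucket + 1) % n; // stand-in for any new strategy

const v1 = makeCluster(3, modulo);
v1.put("user:42", "alice");
const okBefore = callerGet(v1, "user:42"); // finds the value

const v2 = makeCluster(3, shifted);
v2.put("user:42", "alice");
const brokenAfter = callerGet(v2, "user:42"); // asks the wrong node, gets undefined
```

Nothing in callerGet changed between the two runs, yet it stopped working, because the placement rule it copied from the DB was no longer true.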
To decouple the caller from the implementation, you don't allow them to cache anything, save calls, or assume anything:
// Hypothetical client API: the caller passes only the key
const client = kv.connect("endpoint.my.db");
const myValue = client.get(myKey);
The caller doesn't know about CRC, buckets, or even nodes.
Here, the client has to do extra work to figure out which node to send the request to, perhaps with ZooKeeper or using a gossip protocol.
With this interface:
- Hashing logic, e.g. CRC, isn't hard-coded in the caller; it's on the server side, and changing it won't break the caller.
- Any bucket distribution strategy is likewise only on the server side.
- Any rebalancing logic is likewise not in the client.
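A minimal sketch of what such an encapsulated client could look like, assuming an in-memory stand-in for the cluster. KvClient, toyHash and the modulo routing are all hypothetical names and choices of mine, not a real library; the point is only that they sit behind private members.

```javascript
// Toy hash standing in for whatever the store uses internally; not a real CRC.
function toyHash(key) {
  let h = 0;
  for (const ch of key) h = (h * 31 + ch.charCodeAt(0)) >>> 0;
  return h;
}

class KvClient {
  // Private field: callers cannot see or depend on the node layout.
  #nodes;

  constructor(nodeCount) {
    // In-memory Maps simulate the storage nodes for this sketch.
    this.#nodes = Array.from({ length: nodeCount }, () => new Map());
  }

  // Private routing: hashing and placement can change without touching callers.
  #nodeFor(key) {
    return this.#nodes[toyHash(key) % this.#nodes.length];
  }

  put(key, value) {
    this.#nodeFor(key).set(key, value);
  }

  get(key) {
    return this.#nodeFor(key).get(key);
  }
}

// Caller side: no hash, no bucket, no node id.
const kvClient = new KvClient(3);
kvClient.put("user:42", "alice");
const value = kvClient.get("user:42");
```

Because the routing lives in a private method, trying to reach it from outside the class is a syntax error, so callers can't accidentally couple to it.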
Other changes are even possible by just upgrading the client, without changing any code in the caller:
- Allow multiple nodes to own a bucket.
- Read from any node (e.g. choosing the one with the lowest latency).
- Switch from a ZooKeeper-based node-finding infrastructure to a gossip-based one.
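As a rough illustration of upgrading the client without touching the caller, here are two made-up client versions behind the same get/put interface. The placement rule in v1 and the pickNode callback in v2 are hypothetical stubs (a real client would consult something like ZooKeeper or gossip state instead).

```javascript
// Caller code: written once, knows only put and get.
function roundTrip(client) {
  client.put("user:42", "alice");
  return client.get("user:42");
}

// Client v1 (hypothetical): every key has a single owner node.
function makeClientV1(nodeCount) {
  const nodes = Array.from({ length: nodeCount }, () => new Map());
  const owner = (key) => nodes[key.length % nodeCount]; // toy placement rule
  return {
    put: (key, value) => { owner(key).set(key, value); },
    get: (key) => owner(key).get(key),
  };
}

// Client v2 (hypothetical): every node holds a replica, reads can go anywhere.
function makeClientV2(nodeCount, pickNode) {
  const nodes = Array.from({ length: nodeCount }, () => new Map());
  return {
    put: (key, value) => { nodes.forEach((node) => node.set(key, value)); },
    // pickNode stands in for e.g. "choose the node with the lowest latency".
    get: (key) => nodes[pickNode(nodeCount)].get(key),
  };
}

const fromV1 = roundTrip(makeClientV1(3));
const fromV2 = roundTrip(makeClientV2(3, (n) => n - 1)); // stub: always the last node
```

roundTrip is the same function in both cases, which is the whole point: the replication and read-routing changes stay inside the client.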