A distributed database doesn't mean that the data is replicated to every server machine. Replication is a separate topology decision.
What a distributed database does mean is that multiple machines work together to store and serve data. How they do it is another story. From Wikipedia:
A distributed database is a database in which storage devices are not all attached to a common processing unit such as the CPU and which is controlled by a distributed database management system
What you're really asking is how data is distributed in a distributed database.
The most common distribution method is hash-based distribution. It calculates a hash of the key or ID of every value and uses that hash to pick the one node that stores the value. So yes, the data is not stored on all of the servers (hence the distribution). Hash distribution ensures that the data is more or less uniformly distributed among the database server machines in the cluster.
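To make the idea concrete, here is a minimal sketch of hash-based placement. The node names, the choice of SHA-256, and the functions node_for_key and distribute are all assumptions made for illustration, not the API of any particular database. It uses a plain modulo over the node count, which is the simplest possible scheme.

```python
import hashlib

# Hypothetical node names, used only for this illustration.
NODES = ["node-0", "node-1", "node-2"]


def node_for_key(key, nodes=NODES):
    """Pick the single node that owns `key` by hashing the key."""
    # hashlib gives a stable hash; the built-in hash() of a str changes
    # between Python processes, so it is unsuitable for placement.
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    index = int.from_bytes(digest[:8], "big") % len(nodes)
    return nodes[index]


def distribute(records, nodes=NODES):
    """Split a dict of key -> value into one local data set per node."""
    placement = {node: {} for node in nodes}
    for key, value in records.items():
        placement[node_for_key(key, nodes)][key] = value
    return placement


if __name__ == "__main__":
    records = {f"user:{i}": {"id": i} for i in range(3000)}
    for node, local in distribute(records).items():
        print(node, len(local))
```

Each key lands on exactly one node, and the per-node counts come out roughly equal. Note that with plain modulo, changing the number of nodes moves most keys; many real systems use consistent hashing or fixed hash slots to reduce that movement.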
Your other question, how a query is served, can be answered by first understanding that all the nodes of the cluster work together to find the result of the query. Every database server basically performs the query on its own local data set (remember, the data is distributed, not replicated) and replies to the client. The client API should be intelligent enough to accumulate the partial results, or to connect to all the servers, so that the returned reader behaves correctly.
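The query flow described above can be sketched as a scatter-gather on the client side. Everything here is a hypothetical stand-in: the Node class is an in-memory substitute for a real database server, and scatter_gather is an illustrative name, not a function from any real client library.

```python
from concurrent.futures import ThreadPoolExecutor


class Node:
    """Hypothetical in-memory stand-in for one database server."""

    def __init__(self, name, rows):
        self.name = name
        self.rows = rows  # this node's local partition only

    def query(self, predicate):
        # Each node evaluates the query against its own local data set.
        return [row for row in self.rows if predicate(row)]


def scatter_gather(nodes, predicate):
    """Send the query to every node and accumulate the partial results."""
    if not nodes:
        return []
    with ThreadPoolExecutor(max_workers=len(nodes)) as pool:
        # pool.map returns results in the same order as `nodes`.
        partials = list(pool.map(lambda node: node.query(predicate), nodes))
    merged = []
    for partial in partials:
        merged.extend(partial)
    return merged


if __name__ == "__main__":
    cluster = [
        Node("node-0", [{"id": 1, "age": 30}, {"id": 4, "age": 17}]),
        Node("node-1", [{"id": 2, "age": 45}]),
        Node("node-2", [{"id": 3, "age": 12}, {"id": 5, "age": 61}]),
    ]
    adults = scatter_gather(cluster, lambda row: row["age"] >= 18)
    print(sorted(row["id"] for row in adults))
```

No single node can answer the query alone, so the client has to ask all of them and merge what comes back. A lookup by key is the exception: if the client knows the hashing scheme, it can go straight to the one node that owns the key.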