Distributed Databases questions

Question

When we talk about distributed databases, must all of them hold the same information? For example, imagine the table customers:

select * from customers

Must this query return the same result in every database? In other words, must all users exist in all databases? For example, must "user 1" exist in all databases?

Now imagine a master-detail pair of tables, for example sale and sale_detail. If you are using the system and insert a new sale (the sale and its details), must this new sale be inserted in all databases?

And how does a transaction work here? Or do sale and sale_detail not need to be in all databases?

How does a distributed transaction work?

Wikipedia has good introductions. – philipxy, Jun 07 '16 at 03:29

score 0 · Accepted Answer · answered Jun 08 '16 at 07:37

A distributed database doesn't mean that the data is replicated to all the server machines. Whether and how the data is replicated is a matter of the cluster's topology.

What a distributed database means is that multiple machines work together to store and serve data. How they do it is another story. From Wikipedia:

A distributed database is a database in which storage devices are not all attached to a common processing unit such as the CPU and which is controlled by a distributed database management system

What you're asking is how data is distributed in a distributed database.

One of the most common distribution methods is hash-based distribution. It calculates a hash value from the key or ID of each record and uses that value to pick the one node that stores the record. So yes, the data is not stored on all of the servers (hence the distribution).

With a good hash function, hash distribution spreads the data more or less uniformly across the database server machines in the cluster.
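To make the hashing idea concrete, here is a minimal Python sketch. It is not how any particular database does it: the node names, the choice of SHA-256, and the `node_for_key` helper are all assumptions made for illustration.

```python
import hashlib
from collections import Counter

# Hypothetical cluster of three database servers.
NODES = ["node-a", "node-b", "node-c"]


def node_for_key(key: str, nodes=NODES) -> str:
    """Pick the single node that owns `key` by hashing the key."""
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    # Interpret the first 8 bytes as an integer and map it onto a node.
    index = int.from_bytes(digest[:8], "big") % len(nodes)
    return nodes[index]


# Each customer ID lands on exactly one node, not on all of them.
placement = Counter(node_for_key(f"customer-{i}") for i in range(3000))
print(placement)  # counts per node should be roughly equal
```

The same key always maps to the same node, so a lookup by ID only has to contact one server. Note that plain modulo hashing like this moves most keys when a node is added or removed, which is why real systems often use variations such as consistent hashing.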

Your other question, how a query is served, can be answered by first understanding that all the nodes of the cluster work together to find the result of the query. Every database server basically performs the query on its own local data set (remember, the data is distributed, not replicated) and replies to the client. The client API should be intelligent enough to connect to all the servers and accumulate their partial results, so that the reader it returns is accurate.
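As a rough illustration of that flow for `select * from customers`, here is a small in-memory Python sketch. The `Node` and `Client` classes are hypothetical stand-ins, not a real driver API, and the hash-based routing in `_owner` is an assumed placement rule.

```python
import hashlib


class Node:
    """One database server holding only its own slice of `customers`."""

    def __init__(self, name):
        self.name = name
        self.customers = {}  # customer id -> row

    def select_all(self):
        # The node answers the query from its local data only.
        return list(self.customers.values())


class Client:
    """Hypothetical client API that routes writes and merges reads."""

    def __init__(self, nodes):
        self.nodes = nodes

    def _owner(self, customer_id):
        # Assumed placement rule: hash the ID, then pick one node.
        digest = hashlib.sha256(str(customer_id).encode("utf-8")).digest()
        return self.nodes[int.from_bytes(digest[:8], "big") % len(self.nodes)]

    def insert(self, customer_id, name):
        row = {"id": customer_id, "name": name}
        self._owner(customer_id).customers[customer_id] = row

    def select_all(self):
        # Scatter: ask every node. Gather: merge the partial results.
        rows = [row for node in self.nodes for row in node.select_all()]
        return sorted(rows, key=lambda row: row["id"])


nodes = [Node("node-a"), Node("node-b"), Node("node-c")]
client = Client(nodes)
for i in range(1, 7):
    client.insert(i, f"user {i}")

all_rows = client.select_all()
print([row["name"] for row in all_rows])
```

So "user 1" exists on exactly one node, yet the client still sees every user, because the merged result covers all nodes.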
