Processing logic : processing logic or processing elements are distributed
- This lets you optimise the query itself when you know that the data the query retrieves is 'scattered' across different places (usually across the network).
- Consider a situation where you want to get, say, a particular set of employees from a DB that has been broken down horizontally into two fragments, and the employees you want only exist in one of the two fragments. If you don't distribute the processing logic, you have to put the two fragments together with a union, only to make use of half of the data. The cost of transferring the other half, which isn't really required, is then just wasted work in the form of longer response time, overall wait time, etc.
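To make the employee example concrete, here is a minimal sketch in Python. Everything in it is made up for illustration: the site names, the `dept` column used to split the table, and the row counts. It only compares how many rows get shipped when you union both fragments first versus when you only contact the fragment that can contain the answer.

```python
# Hypothetical setup: an EMPLOYEE table split horizontally on `dept`,
# one fragment per site. All names and rows are illustrative assumptions.
SITES = {
    "site_a": {"dept": "Sales",
               "rows": [{"id": 1, "name": "Ann", "dept": "Sales"},
                        {"id": 2, "name": "Bob", "dept": "Sales"}]},
    "site_b": {"dept": "Engineering",
               "rows": [{"id": 3, "name": "Cy", "dept": "Engineering"},
                        {"id": 4, "name": "Di", "dept": "Engineering"}]},
}


def query_with_union(dept):
    """Naive plan: ship every fragment to one place, union them, then filter."""
    shipped = [row for site in SITES.values() for row in site["rows"]]
    result = [row for row in shipped if row["dept"] == dept]
    return result, len(shipped)


def query_with_pruning(dept):
    """Distributed plan: only contact sites whose fragment can hold matches."""
    shipped = [row
               for site in SITES.values() if site["dept"] == dept
               for row in site["rows"]]
    return shipped, len(shipped)


naive_rows, naive_cost = query_with_union("Sales")
pruned_rows, pruned_cost = query_with_pruning("Sales")
print(naive_cost, pruned_cost)
```

Both plans return the same employees; the difference is only in how many rows had to travel to get there.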
Data : data used by a number of applications may be distributed to a number of processing sites
- We mentioned fragments just before, but fragmentation is really just a formal way of defining how the data should be 'broken down'.
Usually, the fragments will be either horizontal fragments or vertical fragments.
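A rough sketch of the two kinds of fragment, in Python. The table, the column names, and the two helper functions are all hypothetical; the only point is that a horizontal fragment keeps whole rows, while a vertical fragment keeps some columns of every row (plus the key, so the pieces can be joined back later).

```python
# Illustrative table; the columns and values are assumptions for this sketch.
EMPLOYEES = [
    {"id": 1, "name": "Ann", "dept": "Sales", "salary": 50000},
    {"id": 2, "name": "Bob", "dept": "Engineering", "salary": 60000},
    {"id": 3, "name": "Cy", "dept": "Sales", "salary": 55000},
]


def horizontal_fragment(rows, predicate):
    """Keep whole rows that satisfy the predicate (a subset of the rows)."""
    return [row for row in rows if predicate(row)]


def vertical_fragment(rows, columns, key="id"):
    """Keep only some columns of every row; the key column is always kept."""
    keep = [key] + [c for c in columns if c != key]
    return [{c: row[c] for c in keep} for row in rows]


sales_rows = horizontal_fragment(EMPLOYEES, lambda r: r["dept"] == "Sales")
pay_columns = vertical_fragment(EMPLOYEES, ["salary"])
print(sales_rows)
print(pay_columns)
```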
A set of fragments should have a property known as the correctness property. The correctness property demands that three conditions hold for the fragments; a somewhat 'simplified' interpretation of these conditions is:
Reconstruction
When you put the fragments back together, you get the original table.
Completeness
Every record from the original table should be present in some fragment, otherwise data will be lost.
Disjointness
Each record only shows up in one fragment (for vertical fragments the key column is the exception, since it is repeated so the fragments can be joined back).
A trashy analogy: think of a piece of paper that you tear up out of anger, only to suddenly realise you had something important written on it, so you really want to put it back into its original state. If all the pieces are disjoint (no scrap of the page shows up on two pieces), all the information is completely written on those pieces (nothing is missing), and you have all the pieces you just tore in front of you, then you can reconstruct the original thing.
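The three conditions can also be checked mechanically. Below is a minimal sketch for horizontal fragments only, using the same 'simplified' reading as above. The function names and the sample rows are made up, and it assumes the table has no duplicate rows, so that rows can be compared as sets.

```python
def row_key(row):
    """Hashable form of a row (a dict) so rows can be stored in sets."""
    return tuple(sorted(row.items()))


def check_horizontal_fragments(table, fragments):
    """Check the three conditions for a list of horizontal fragments.

    Assumes no duplicate rows in `table`, so sets lose no information.
    """
    original = {row_key(r) for r in table}
    pieces = [{row_key(r) for r in frag} for frag in fragments]
    union = set().union(*pieces)
    return {
        # Every original row appears in at least one fragment.
        "completeness": original <= union,
        # Putting the fragments back together gives exactly the original.
        "reconstruction": union == original,
        # No row appears in more than one fragment.
        "disjointness": sum(len(p) for p in pieces) == len(union),
    }


table = [{"id": 1, "dept": "Sales"},
         {"id": 2, "dept": "Engineering"},
         {"id": 3, "dept": "Sales"}]
good = [[table[0], table[2]], [table[1]]]
print(check_horizontal_fragments(table, good))
```

Tearing the paper so that one piece is lost fails completeness (and therefore reconstruction); photocopying a piece so it appears twice fails disjointness.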
Control : The control of the execution of various tasks might be distributed instead of being performed by one computer system.
I think this mostly ties into the performance aspect of a DDB and some aspects of access control. So instead of running queries in one place, the work of executing them is spread across several computer systems.