Questions tagged [distributed-system]

A distributed system consists of a collection of autonomous computers, connected through a network and distribution middleware, which enables computers to coordinate their activities and to share the resources of the system, so that users perceive the system as a single, integrated computing facility.

A distributed system is a software system in which components located on networked computers communicate and coordinate their actions by passing messages.

1253 questions
8 votes
2 answers

How does HBase guarantee row level atomicity?

Given that HBase stores each column family in a separate HFile and that a row can span many column families, how does HBase ensure that a put/delete operation on a row spanning multiple column families is indeed atomic?
arun_suresh
8 votes
3 answers

What are the use cases for a Vector Clock versus a Version Vector?

I have been having trouble finding an example of what use cases are suitable for Vector Clocks and Version Vectors, and how they might differ. I understand that they largely work in the same way, with Vector Clocks using receive and send functions,…
8 votes
4 answers

Synchronize Data From Multiple Data Sources

Our team is trying to build a predictive maintenance system whose task is to look at a set of events and predict whether these events depict a set of known anomalies or not. We are at the design phase and the current system design is as…
8 votes
4 answers

PBFT: Why can't the replicas perform the request after 2/3 have prepared? Why do we need the commit phase?

I know there are some questions on this website that ask the same thing. However, the answer is never clear: in PBFT, why can't the replicas execute the requests after 2/3 have prepared? Why is the commit phase needed? If 2/3 + 1 replicas have agreed…
user2584960
8 votes
4 answers

Logical Clocks: Lamport Timestamps

I am currently trying to understand Lamport timestamps. Consider two processes P1 (producing events a1, a2,...) and P2 (producing events b1, b2,...). Let C(e) denote the Lamport timestamp associated with an event e. I created timestamps for each…
typeduke
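
For readers unfamiliar with the notation in the excerpt above, here is a minimal sketch of the standard Lamport clock rules: increment on every local or send event, and on receive take the maximum of the local and received timestamps plus one. The class name `LamportClock` and the particular message pattern between P1 and P2 are assumptions made for illustration; they are not taken from the question.

```python
class LamportClock:
    """Logical clock for a single process (hypothetical helper class)."""

    def __init__(self):
        self.time = 0

    def local_event(self):
        # Rule 1: every local event advances the clock by one.
        self.time += 1
        return self.time

    def send(self):
        # Sending is itself an event; the returned timestamp travels with the message.
        self.time += 1
        return self.time

    def receive(self, msg_time):
        # Rule 2: jump past both the local clock and the sender's timestamp.
        self.time = max(self.time, msg_time) + 1
        return self.time


# Assumed scenario: P1 sends to P2 at a2, and P2 replies at b3.
p1, p2 = LamportClock(), LamportClock()
a1 = p1.local_event()   # C(a1) = 1
a2 = p1.send()          # C(a2) = 2
b1 = p2.local_event()   # C(b1) = 1
b2 = p2.receive(a2)     # C(b2) = max(1, 2) + 1 = 3
b3 = p2.send()          # C(b3) = 4
a3 = p1.receive(b3)     # C(a3) = max(2, 4) + 1 = 5
```

Note that C(a1) = C(b1) = 1 even though the two events are unrelated: Lamport timestamps guarantee that causally ordered events have increasing timestamps, but equal or ordered timestamps alone do not imply causality.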
8 votes
2 answers

Google File System Consistency Model

I was reading about GFS and its consistency model but I'm failing to grasp some of it. In particular, can someone provide me with a specific example scenario (or an explanation of why it cannot happen) of: concurrent record append that could…
Simone
8 votes
5 answers

How to make my Java Swing application a Client-Server application?

I have made a Java Swing application. Now I would like to make it a Client-Server application. All clients should be notified when data on the server is changed, so I'm not looking for a Web Service. The Client-Server application will be run on a…
Jonas
8 votes
2 answers

Architecture for a globally distributed Neo4j?

I am doing some work for an organisation that has offices in 48 countries of the world. Essentially the way they work now is that they all store data in a local copy of the database and that is replicated out to all the regions/offices in the world.…
gremwell
8 votes
2 answers

How do message queue systems work?

I have studied message queue systems in my class, but I still don't get how these systems work in real-time scenarios. Is there any tutorial that can help me get the complete picture? Can someone explain to me how these systems…
Haider Ali
7 votes
2 answers

How is etcd a highly available system, even though it uses Raft, which is a CP algorithm?

This is from the Kubernetes documentation: "Consistent and highly-available key value store used as Kubernetes' backing store for all cluster data." Does Kubernetes have a separate mechanism internally to make etcd more available? Or does etcd use,…
7 votes
0 answers

Performance difference in Redis vs etcdv3

I was going through the benchmark documentation pages of Redis and etcd. From the benchmark data it seems etcd is as efficient as…
Rahul
7 votes
0 answers

Idempotency and Race Conditions on a REST API in a Distributed System

What are possible alternative solutions for implementing idempotency while also handling race conditions? For example, consider a request to add a customer to the system of record. The customer details will have the email ID as the key attribute. And suppose there is an API…
7 votes
6 answers

Best data store solution for small mathematical data but fast and with aggregate functions

I'm looking for a data storage solution for a project with these requirements: The application dynamically creates a container/table in the store. For a short period of time (two weeks, for example) that table/container gets a huge amount of…
vtortola
7 votes
1 answer

Bloom filters in a distributed environment

I have a system consisting of a few application instances, written in Java. Requests to them are load balanced for high availability. Every second, hundreds of small chunks of data (each consisting of a few simple strings) are received by this…
zgguy
7 votes
1 answer

Difference between atomic broadcast and consensus

Consensus is about all the machines coming to an agreement over a value. Atomic broadcast also says that a message emitted by a process should be agreed on by either all or none. So what is the difference?
ffff