Questions tagged [distributed-system]

A distributed system consists of a collection of autonomous computers, connected through a network and distribution middleware, which enables computers to coordinate their activities and to share the resources of the system, so that users perceive the system as a single, integrated computing facility.

A distributed system is a software system in which components located on networked computers communicate and coordinate their actions by passing messages.

1253 questions
12
votes
2 answers

TensorFlow on shared GPUs: how to automatically select the one that is unused

I have access through SSH to a cluster of n GPUs. TensorFlow automatically gave them names gpu:0,...,gpu:(n-1). Others have access too, and sometimes they take random GPUs. I did not place any tf.device() explicitly because that is cumbersome and…
jeandut
  • 2,471
  • 4
  • 29
  • 56
11
votes
2 answers

How would you program a strong read-after-write consistency in a distributed system?

Recently, S3 announced strong read-after-write consistency. I'm curious how one could implement that. Doesn't it violate the CAP theorem? In my mind, the simplest way is to wait for the replication to happen and then return, but that would result…
oky_sabeni
  • 7,672
  • 15
  • 65
  • 89
11
votes
3 answers

What library can I use to do simple, lightweight message passing?

I will be starting a project which requires communication between distributed nodes (the project is in C++). I need a lightweight message-passing library to pass very simple messages (basically just strings of text) between nodes. The library must…
Mike
  • 23,892
  • 18
  • 70
  • 90
11
votes
1 answer

Why is the word "entropy" present in anti-entropy protocols?

Anti-entropy protocols are a form of gossip protocol (http://en.wikipedia.org/wiki/Gossip_protocol). I was wondering if someone could explain the significance of the word "entropy" here.
10
votes
1 answer

How can I serve my Django website from multiple machines, that is, how can I make it distributed?

I have a Django website that I want to make distributed. I know the concepts of system design and distributed systems, but I still cannot figure out how to serve it from multiple servers. I am trying to make my system distributed so that I…
user10798329
10
votes
1 answer

What is a crashloop?

I'm reading Google's Site Reliability Engineering book and ran across the word "crashloop", which I had never heard before and have not been able to find a definition for: "If a task tries to use more resources than it requested, Borg kills the task and…
10
votes
2 answers

Difference between ZooKeeper and a managed replicated database service

I just came across ZooKeeper and am wondering what the difference is between ZooKeeper and an available, consistent, durable, distributed, replicated database service like AWS DynamoDB, or even AWS S3 (a storage service) for that matter. The key…
10
votes
4 answers

What is the difference between a distributed system and distributed computing?

I found the following definitions of distributed system and distributed computing, respectively. Distributed system: a collection of independent computers that are connected with an interconnection network. Distributed computing: a method of…
10
votes
2 answers

Framework or tool for "distributed unit testing"?

Is there any tool or framework that makes it easier to test distributed software written in Java? My system under test is peer-to-peer software, and I'd like to perform testing using something like PNUnit, but with Java instead of .NET. The…
msugar
  • 151
  • 1
  • 6
10
votes
5 answers

Are PHP sessions hard to scale across a distributed system?

At work we do almost everything in Java and Perl, but I wanted to build out a feature using PHP and sessions. Some people thought it was a bad idea to use PHP sessions on our system because it's distributed across many servers. What would the…
Ken
  • 103
  • 1
  • 6
9
votes
4 answers

What is vertical and horizontal distribution?

Vertical distribution: Distributed processing is equivalent to organizing a client-server application as a multitiered architecture. Place logically different components on different machines. Horizontal distribution: Distribution of the…
wasim
  • 103
  • 1
  • 1
  • 7
9
votes
3 answers

Microservices - Is event store technology (in event sourcing solutions) shared between all microservices?

As far as my limited experience allows me to understand, one of the core concepts of a "microservice" is that it relies on its own database, which is independent of other microservices. Diving into how to handle distributed transactions in…
9
votes
2 answers

How to connect MetaTrader with Node.js?

I'm building a system based on Node.js to connect with MetaTrader and process actions such as linking an account and opening or closing trade orders. But I still have not found a way to connect to MetaTrader from Node.js. Can you give me a solution…
9
votes
3 answers

Is the operation in a Raft log entry supposed to be idempotent?

In Raft, when a node restarts, it tries to replay all the log entries to catch up on the state. But if the node goes down again during the recovery phase, it would apply some operations twice. Applying an operation twice will violate the state machine if operations are not idempotent. According…
smxxqjl
  • 91
  • 4
9
votes
2 answers

Find the top 10 most frequently visited URLs when data is stored across a network

Source: Google interview question. Given a large network of computers, each keeping log files of visited URLs, find the top ten most visited URLs. Have many large <string (url) -> int (visits)> maps. Calculate <string (url) -> int (sum of visits…
Spandan
  • 2,128
  • 5
  • 25
  • 37