Fault tolerance refers to a system's capability to isolate, compensate for, and recover from failure with minimal impact on the end user. When using this tag, also include tags for the system and/or technology you are working with, as supporting metadata.
Questions tagged [fault-tolerance]
305 questions
0
votes
1 answer
Does Elasticsearch stop indexing data when some nodes go down?
I have read that when a new indexing request is sent to an ES cluster, ES determines which shard the document should be stored in based on routing. Then the node that hosts that primary shard (aka coordinating node) will broadcast the indexing…

Simple Code
- 2,354
- 2
- 27
- 56
0
votes
1 answer
Can PubSub be reasonably used to imitate two-way binding?
One of the most convenient things about AngularJS was the two-way binding. Would it be advisable to replicate that with a much smaller library like PubSubJS and not use AngularJS? Or, would that create many more events than PubSubJS is intended to…

Rozgonyi
- 1,027
- 13
- 27
0
votes
0 answers
Enterprise Wide Cluster Streaming System
I'm interested in deploying an enterprise service bus on a fault-tolerant system with a variety of use cases that include tracking network traffic and analyzing social media data. I'd prefer to use a streaming application, but I am open to the idea of…

Rookie
- 7
- 3
0
votes
1 answer
Tibco EMS Server Fault tolerance with distributed setup not working for java app
I set up fault tolerance with 2 instances on separate VMs (host1 & host2). If I stop the primary instance, the secondary instance is activated successfully, but the current connections are closed with the error
reconnect failed:…

sachin
- 23
- 4
0
votes
1 answer
Ignite - full sync configuration
I have two Ignite server nodes in the cluster (each node is started in a Spring Boot application).
And I have two caches:
// Persistence cache
configuration.setReadThrough(true);
configuration.setWriteThrough(true);
…

Wi-Al
- 225
- 1
- 13
0
votes
1 answer
Fault tolerant Jenkins on DCOS
I am running a Jenkins server on DCOS as documented here https://docs.mesosphere.com/1.7/usage/tutorials/jenkins/.
The Jenkins server is able to spawn new mesos slaves when new jobs are scheduled and kill them when the job is completed.
But if a…

justcodeit
- 3
- 3
0
votes
0 answers
AWS region wise Fault tolerance
Motive: I tried Multi-AZ fault tolerance for a WordPress website, and it worked fine. Now I am trying to achieve region-wise fault tolerance for the same website.
Problem faced: The new WordPress website which I am trying to build is not getting…

amazon tam
- 35
- 1
- 5
0
votes
1 answer
How do notaries provide proof to nodes that they are honest?
For example, how can it be proven to a node that a notary hasn't colluded with a counterparty to double spend an output?
How is trust and consensus achieved within the Corda system?
Edit:
In a scenario in which a regulatory body is the operator of a…

Matthew Stannard
- 145
- 1
- 6
0
votes
0 answers
How to find out whether a message (e.g., a configuration change) or the confirmation of this message got lost?
This is kind of an abstract information theory question:
A sends a piece of information, e.g., a configuration change, to B. B responds with an ACK to tell A that the configuration change was received.
Now imagine that A never gets the response from…

Thomas Giesel
- 163
- 2
- 7
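The exchange described in this question (A sends a change, B replies with an ACK) can be sketched as follows. From A's side a lost request and a lost ACK look the same: no reply arrives. One common way to cope, shown here as a minimal sketch and not as an answer from the thread, is to resend under the same message ID and have B ignore duplicates. The names `Receiver`, `send_with_retry`, and `loss_rate` are hypothetical, and the lossy channel is simulated with a random number generator.

```python
import random


class Receiver:
    """B: applies each configuration change at most once, keyed by message ID."""

    def __init__(self):
        self.applied = {}      # msg_id -> config that was applied
        self.apply_count = 0   # how many times a change was actually applied

    def handle(self, msg_id, config):
        # A duplicate (same msg_id) is acknowledged again but not re-applied.
        if msg_id not in self.applied:
            self.applied[msg_id] = config
            self.apply_count += 1
        return ("ACK", msg_id)


def send_with_retry(receiver, msg_id, config, rng, loss_rate=0.5, max_attempts=50):
    """A: resend until an ACK arrives; return the attempt number, or None on give-up.

    A cannot tell whether the request or the ACK was lost; in both cases it
    simply sees no reply, so it retries with the same msg_id.
    """
    for attempt in range(1, max_attempts + 1):
        if rng.random() < loss_rate:
            continue  # simulated: request lost on the way to B
        receiver.handle(msg_id, config)
        if rng.random() < loss_rate:
            continue  # simulated: ACK lost on the way back to A
        return attempt
    return None


rng = random.Random(0)
b = Receiver()
attempts = send_with_retry(b, "cfg-1", {"timeout": 30}, rng)
```

Because B deduplicates by ID, retrying is safe even when only the ACK was lost: the change is applied once regardless of how many copies arrive.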
0
votes
1 answer
Specific Linux distros for running high-reliability spaceflight software?
Edited for clarity:
With reliability and fault-tolerance being extremely important, are there any specific Linux distros (or perhaps types of Linux distros) recommended for running high-reliability C++ software?
I am developing C++ software to…
user8782808
0
votes
1 answer
Apache Camel - Recovering from a JVM crash
I am considering using Apache Camel for implementing EIP patterns in our solution. Our requirement is to build a fault-tolerant system which can recover from failures.
I understand the native error handling capabilities available with Apache Camel…

Marquis
- 9
- 1
0
votes
0 answers
Implementing Active/Passive Topology in Java
I need to implement an active/passive topology for a Java instance.
The requirement is that if the active instance becomes unavailable for some reason, the passive instance takes over as active and the former active instance becomes passive.
It's just a Java application which just puts data into…

bittu
- 786
- 7
- 14
0
votes
1 answer
Which FTA Install package do I need to DL?
We are running an End-2-End configuration with TWSzOS as the MDM. The MDM and all of the current FTAs are 8.5.1 and I'm attempting to upgrade everything, but I have a current request to install a new FTA and I don't want to install another out of…

LFultz
- 1
- 2
0
votes
0 answers
Is there a way to make an Apache Flume Source resilient?
I've found some information in the docs and online threads that describe how to define a failover node for a Flume Sink, but what about a Flume Source? Is there a way to define a failover for a Source, or have a Source operate over an array of…

josiah
- 1,314
- 1
- 13
- 33
0
votes
4 answers
Making a simple program: how do I make it so blank/space-only input doesn't get added?
I am making a simple program that creates a grocery list. Right now, I am having trouble with blank input being added to my list: when I hit enter with or without spaces, it adds the blank input as an item. Is there a simple way to prevent…

Tye Chamberlain
- 79
- 1
- 1
- 2
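The excerpt does not show the asker's code or name a language, so the following is only a minimal sketch, assuming Python and a list-based grocery list. It strips whitespace from each entry and skips anything that ends up empty. The function name `build_list` is hypothetical.

```python
def build_list(entries):
    """Return the grocery list, skipping blank or whitespace-only entries."""
    items = []
    for raw in entries:
        item = raw.strip()   # drop leading and trailing whitespace
        if not item:         # empty after stripping: Enter alone, or only spaces
            continue
        items.append(item)
    return items


groceries = build_list(["milk", "", "   ", " eggs ", "\t"])
```

In an interactive program the same check would go right after each `input()` call, before the item is appended.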