Questions tagged [fault-tolerance]

89 questions
2
votes
3 answers

When to employ server failover in a virtual environment

This question got me thinking about fault tolerance in DHCP, so I did a little digging in my current environment and discovered that we have only one DHCP server per major site in our company, with no redundancy. All of our DHCP servers are…
2
votes
2 answers

Detect HP ML370 G5 Bad disk in RAID0

I have one HP ML370 G5 server, which has four 146GB SAS disks and two 300GB SAS disks configured as RAID 0. I have encountered a problem with one of my disks. The Windows Server Event Viewer shows the error below: Logical drive 2…
Pooya Yazdani
  • 277
  • 5
  • 12
2
votes
1 answer

What is the importance of "RpcClientAccessServer" on a database in an Active/Active stretched DAG?

I have two sites with a stretched DAG between them. By "sites" I mean two AD sites that correspond to different physical and network circuits. When I take an individual mailbox database and move it to a second site, I want to make sure all the…
makerofthings7
  • 8,911
  • 34
  • 121
  • 197
2
votes
1 answer

ClearOS - how to avoid getting stuck at a fsck message at boot?

This has happened to me a couple of times: I have a ClearOS Enterprise 5.2 box and, after a power outage or similar, it shows an error at boot saying that fsck needs to be run (I think it said with, or possibly without, the -a parameter). …
Scott Szretter
  • 1,882
  • 11
  • 43
  • 66
2
votes
2 answers

keepalived questions (requirements, abilities, limitations)

1) What are keepalived's (physical/network) requirements? Do the two (or more) keepalived nodes need to be connected to the same switch (something related to broadcasting, maybe)? 2) Can keepalived nodes run on different networks, "internet"…
Poni
  • 315
  • 3
  • 14
1
vote
0 answers

VMware Fault Tolerance vLockStep

I would like to understand how vLockStep works, but I could only find very high-level descriptions. The current documentation says that disk reads, but not disk writes, performed by the primary are replicated to the secondary. What is the reason for…
skyde
  • 11
  • 5
1
vote
2 answers

What is the recommended way to set up Docker on a multi-drive server in a fault-tolerant mode?

I have a server with four 250GB SATA drives on regular SATA controllers. I would like to set up Docker on some sort of fault-tolerant file system so that if one of the drives fails, the whole thing doesn't collapse. I'm pretty sure it's probably not…
Frank Barcenas
  • 605
  • 6
  • 18
1
vote
1 answer

How to restart a systemd service when a file has not been modified for a period of time?

My setup has a systemd service that should periodically write to a file. I would like to monitor the file for changes, so when it hasn't been modified for a while, I know the service has bugged out. I would like to be able to automatically restart…
pinealan
  • 13
  • 4
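
One way to approach the stale-file check described in this question is a small script run periodically (for example from a timer or cron job). The following is a minimal sketch, not the asker's setup: the function names, the heartbeat path, and the unit name are hypothetical, and it assumes the script runs with enough privileges for `systemctl restart` to succeed.

```python
import os
import subprocess
import time
from typing import Optional


def is_stale(path: str, max_age_seconds: float, now: Optional[float] = None) -> bool:
    """Return True if `path` is missing or was last modified more than
    `max_age_seconds` ago. `now` can be injected to make testing easier."""
    current = time.time() if now is None else now
    try:
        mtime = os.stat(path).st_mtime
    except FileNotFoundError:
        # A missing file is treated as stale: the service never wrote it.
        return True
    return (current - mtime) > max_age_seconds


def restart_if_stale(path: str, unit: str, max_age_seconds: float) -> bool:
    """Restart the systemd unit `unit` when `path` is stale.
    Returns True if a restart was issued, False otherwise."""
    if not is_stale(path, max_age_seconds):
        return False
    # Raises CalledProcessError if systemctl exits with a non-zero status.
    subprocess.run(["systemctl", "restart", unit], check=True)
    return True
```

A call such as `restart_if_stale("/run/myservice/heartbeat", "myservice.service", 300)` (both names are placeholders) would restart the unit if the file has not been touched for five minutes. Keeping the age check separate from the restart makes the logic easy to test without touching systemd.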
1
vote
2 answers

Backup strategy for dedicated LEMP stack server

We currently have an application running on a dedicated server which utilizes a LEMP (Linux, Nginx, MariaDB, PHP) stack. Right now we are only doing backups at a set interval (every x hours). I have been researching how we should go about having a…
1
vote
1 answer

HAProxy - clear stick-table from the command line

I'm using HAProxy to do automatic failover for an LDAP server, but I don't want automatic fail-back. The scenario is that I have 2 nodes, s1 and s2. I want all traffic to go to s1 unless it fails; when it fails, I want all traffic to go to s2. …
RikSaunderson
  • 207
  • 4
  • 13
1
vote
1 answer

Fault-tolerant S3 website hosting

Due to the recent S3 downtime episode on the East Coast, I want to ask the community what is the best way to implement a fault-tolerant S3 website hosting solution? From my understanding, you need to name a bucket after your domain (e.g.,…
Justin
  • 113
  • 4
1
vote
1 answer

rsyslog and elasticsearch: How to configure multiple servers?

We are currently setting up some hosts to forward their logs via rsyslog and omelasticsearch to an elasticsearch cluster. The manual for omelasticsearch seems to allow only one server name of the ES cluster to be configured, which would be a single…
Martin Schröder
  • 315
  • 1
  • 5
  • 24
1
vote
0 answers

Can I use a reverse-proxy on a hosting service to redirect connections to multiple IPs while taking care of maintaining sessions?

I am trying to host my own web server because I need some custom software, including telephony, WebRTC, node.js, etc. However, my net connection in the town and country where it will be hosted, over a fiber optic line, is not…
Sunny
  • 381
  • 1
  • 6
  • 16
1
vote
0 answers

VMware ESXi - what happens when unable to create a secondary instance of a VM for fault tolerance

Does anyone know what happens if VMware ESXi is unable to restart a secondary instance of a VM configured for fault tolerance? Is there a fallback plan to try to get the VM up? Or is the VM considered failed at this point? Initially, when…
O_O
  • 635
  • 3
  • 15
  • 25
1
vote
1 answer

PHP - Memcached - Libmemcached - Handle cache server outage

I am working on ensuring our app degrades gracefully in case of a complete cache outage, which is highly unlikely, as we have a minimum of 3 cache nodes added to the cache pool by way of PHP's memcached addServer API call. However, it is…
Mike Purcell
  • 1,708
  • 7
  • 32
  • 54
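
The graceful-degradation idea in this question can be illustrated independently of PHP. Below is a minimal Python sketch of a cache-aside read that falls back to the source of truth when the cache is unreachable. It is not the PHP Memcached API: the `Cache` protocol and `get_with_fallback` are hypothetical names introduced for illustration, and the sketch assumes the client signals an outage by raising an exception and a miss by returning `None`.

```python
from typing import Any, Callable, Optional, Protocol


class Cache(Protocol):
    """Hypothetical minimal cache interface assumed by this sketch."""

    def get(self, key: str) -> Optional[Any]: ...

    def set(self, key: str, value: Any) -> None: ...


def get_with_fallback(cache: Cache, key: str, load: Callable[[], Any]) -> Any:
    """Return the cached value for `key`, or call `load()` on a miss.
    Any cache error is treated as an outage and bypassed, so the caller
    still gets a value as long as `load()` succeeds."""
    try:
        cached = cache.get(key)
    except Exception:
        # Cache unreachable: serve directly from the source of truth.
        return load()
    if cached is not None:
        return cached
    value = load()
    try:
        cache.set(key, value)
    except Exception:
        # Failing to repopulate the cache must not fail the request.
        pass
    return value
```

In a real deployment the broad `except Exception` would likely be narrowed to the client's connection and timeout errors, and failures would be logged so an outage does not go unnoticed.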