I am having huge problems with my cluster. Servers keep disconnecting and crashing for no apparent reason (there is nothing in the logs). I think I might have set up the cluster wrong.
First things first: I understand sharding, and it is a great feature, but what is
"n replicas per shard"?
What does that mean?
Second thing: how do I configure a cluster with "n" servers? I have 6 servers because of sharding (I have a few docs with more than 10 million records), but I am not sure that I configured my cluster correctly.
On every server I wrote the following.
For example (srv1.conf):
join=srv2:port
join=srv3:port
join=srv4:port
join=srv5:port
join=srv6:port
Is this the correct way to add a server to the cluster?
There is nothing in the docs, and it would be great if you could post some "recommended" cluster configuration.
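To make the question concrete, here is the kind of file I mean, written out as a sketch. The port numbers are RethinkDB's defaults and stand in for "port" above, and the server-name and bind lines are illustrative assumptions, not copied from my servers. As far as I can tell, join should name the other server's cluster port (29015 by default), not the driver port (28015) that clients connect to.

```
# srv1.conf (sketch; every value here is an assumption)
server-name=srv1
bind=all
# Port that clients such as rethinkdbdash connect to (default 28015).
driver-port=28015
# Port that the other servers in the cluster connect to (default 29015).
cluster-port=29015
# join takes the peer's cluster port, not its driver port.
join=srv2:29015
join=srv3:29015
```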
And the third thing is about failover. In my 6-server cluster, all tables have 6 shards with three replicas. Once I shut down, for example, server 1, the app goes down and some crazy writes begin on the cluster. What is the point of a cluster if I do not have any redundancy when a server goes down?
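This is the arithmetic I would expect to apply here, as a sketch. It assumes that a shard stays fully available as long as a majority of its replicas is reachable, which is my reading of how replication is supposed to work and not something I have confirmed. The function names are made up for illustration.

```javascript
// Smallest number of replicas that forms a majority of `replicas`.
function majority(replicas) {
  return Math.floor(replicas / 2) + 1;
}

// True if a shard still has a majority of its replicas
// after `lost` of them go offline.
function shardHasMajority(replicas, lost) {
  return replicas - lost >= majority(replicas);
}

console.log(majority(3));            // 2
console.log(shardHasMajority(3, 1)); // true
console.log(shardHasMajority(3, 2)); // false
```

By this reasoning, with three replicas per shard, shutting down a single server should leave every shard with at least two of its three replicas, which is not what I am seeing.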
I really hope that someone can help me with this, because when I had just one server my app was working all the time. Now, every time some servers get disconnected, everything crashes. I am using Node.js with rethinkdbdash.
UPDATE
I know what a shard is. For example, I have 2 million records in one table, and they are distributed among 6 servers (this is important for me because of read speed). What I do not understand is what a "replica" is. Every table is configured like this: 6 shards and 3 replicas per shard. From what you said, that means that if some server goes down the table will still be available for reads, but it is not (the error says something like "set read_mode=outdated" and the app crashes). There is no way I am going to change every part of the app that does a read to say read_mode=outdated. That is just poor programming.
There is nothing in the logs. On every server I have this in dmesg:
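For reference, the change that the error seems to ask for would look roughly like this with rethinkdbdash. This is only a sketch: readTable is a hypothetical helper name, and it assumes the driver's readMode option on r.table ('single', 'majority' or 'outdated', with 'single' as the default). Putting it in one helper would avoid touching every query, but it would still mean serving reads that may be stale.

```javascript
// Hypothetical helper: one place that decides the read mode for table reads.
// `r` is expected to be a rethinkdbdash instance; `readMode` is passed
// straight through to r.table as the driver's readMode option.
function readTable(r, name, readMode = 'single') {
  return r.table(name, { readMode });
}

// Usage sketch (needs a running cluster, so it is left commented out):
// const r = require('rethinkdbdash')();
// readTable(r, 'users', 'outdated').run().then(console.log);
```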
TCP: TCP: Possible SYN flooding on port 28015. Sending cookies. Check SNMP counters.