
I built a Riak cluster on three Raspberry Pi computers. The vm.args and app.config files have been double-checked on each node (the static IP addresses are correct).
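By double-checked I mean, in particular, the node name in vm.args and the HTTP bind address in app.config; on the .59 node these look roughly like this (the exact section layout depends on the Riak version):

vm.args:

-name riak@192.168.8.59

app.config (riak_core section):

{http, [ {"192.168.8.59", 8098} ]},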

All nodes are valid:

# ./riak-admin member-status 
================================= Membership ==================================
Status     Ring    Pending    Node
-------------------------------------------------------------------------------
valid       0.0%     32.8%    'riak@192.168.8.214'
valid       0.0%     32.8%    'riak@192.168.8.215'
valid     100.0%     34.4%    'riak@192.168.8.59'
-------------------------------------------------------------------------------
Valid:3 / Leaving:0 / Exiting:0 / Joining:0 / Down:0

And they are all up:

# ./riak-admin ring_status  
================================== Claimant ===================================
Claimant:  'riak@192.168.8.59'
Status:     up
Ring Ready: true

============================== Ownership Handoff ==============================
Owner:      riak@192.168.8.59
Next Owner: riak@192.168.8.214

Index: 0
  Waiting on: [riak_kv_vnode,riak_pipe_vnode]

(... skipping indexes)

-------------------------------------------------------------------------------
Owner:      riak@192.168.8.59
Next Owner: riak@192.168.8.215

(... skipping indexes)

-------------------------------------------------------------------------------

============================== Unreachable Nodes ==============================
All nodes are up and reachable

I can ping each node on Riak's port and it returns OK. The problem is the following: if I add a simple key/value pair, it returns an all_nodes_down error.

For example, here I'm trying to assign the value Allo to the key fr in the hello bucket:

# curl -XPUT http://192.168.8.59:8098/riak/hello/fr -d 'Allo'
Error:
all_nodes_down
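
For reference, this is what I mean by pinging each node on Riak's port (the default HTTP interface):

# curl http://192.168.8.59:8098/ping
OK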

Before building the cluster, when I had only one node, I added this key using localhost and I could retrieve it without any problems or errors.
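
That single-node test was essentially the same request against localhost, roughly:

# curl -XPUT http://localhost:8098/riak/hello/fr -d 'Allo'
# curl http://localhost:8098/riak/hello/fr
Allo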

I've searched the mailing lists at basho.com, and it seems this error occurs when the ring is wrong, for example if an administrator changes the claimant node's name without cleaning the ring. That is not my case: the ring was purged on each node before configuration and before starting Riak. I'm not an experienced system administrator, and I'm completely new to distributed systems, so if anyone has an idea or suggestion, please share.
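
By "purged" I mean that on each node I stopped Riak and deleted the ring files before reconfiguring, roughly like this (assuming a packaged install with the default data directory):

# riak stop
# rm -rf /var/lib/riak/ring/*
# riak start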

Edit:

Suggestion from official Riak documentation:

http://docs.basho.com/riak/latest/ops/running/recovery/errors/

Check riak-admin member-status and ensure that all expected nodes in the cluster are of valid Status

As you can see in the riak-admin member-status output above, all nodes are listed as valid.

  • have you checked the console.log for errors? It seems odd that one node would still have 100% after a couple of hours on a new cluster – Joe Jun 25 '14 at 16:49

1 Answer


Along with what Joe said about checking the logs in /var/log/riak, check these commands:

  • riak-admin transfer-limit - make sure this isn't 0.
  • riak-admin transfers - re-run this command using GNU watch every 5 seconds or so to ensure that transfers are happening. If not, check your log files for errors.
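
For example (assuming riak-admin is on the PATH and the default packaged log location):

# riak-admin transfer-limit
# watch -n 5 riak-admin transfers
# tail -n 50 /var/log/riak/console.log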