I need help getting Riak to work with Chef.

Currently, every time I chef an Amazon box with Riak 1.4.8 using the default Basho Riak cookbook, I have to manually ssh into the machine, kill -9 the beam.smp process, and rm -rf /var/lib/riak/ring. Only then can I run sudo riak start and have it work.

Prior to that I get:

Node 'riak@' not responding to pings.

I have even created a shell script:

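#!/bin/bash
# Generated by Chef for <%= @node[:fqdn] %>
#<%= @node[:ec2][:local_ipv4] %>
# This script should be run by root.

# Try a clean shutdown first.
riak stop
riakPid="/var/run/riak/riak.pid"
# If a pid file is still present, force-kill the process it names.
if [ -e "$riakPid" ]; then
   kill -9 "$(<"$riakPid")"
fi
# Remove stale runtime files and the old ring, then start fresh.
rm -f /var/run/riak/*
rm -f /var/lib/riak/ring/*
riak start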

And Chef says:

bash[/etc/riak/clearOldRiakInfo.sh] ran successfully

For the above script.

If I manually run that script, everything works fine. Why is this not cheffing properly?

UPDATE: This has been solved by creating a script to delete the ring directory when the machine gets cheffed.

This only happened when I created a new machine from scratch, because the fqdn was set correctly only after Riak had already started and created the ring. If I manually went onto the box and deleted the ring, it would rechef perfectly fine. So I had to create the script so that the very first Chef run on the machine cleans out the ring info.
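Since the cleanup only matters on the very first Chef run, one way to keep later runs from wiping a healthy ring is a sentinel file. The following is a minimal sketch, not the script actually used here: the function name reset_ring_once and the marker path are hypothetical, and it assumes Riak has already been stopped before the ring is cleared.

```shell
#!/bin/bash
# Sketch: clear the Riak ring directory only once per machine.
# ring_dir is the ring path; marker is a hypothetical sentinel file
# that records that the cleanup has already happened.
reset_ring_once() {
  local ring_dir="$1"
  local marker="$2"
  # A previous run already cleaned the ring; leave it alone.
  if [ -e "$marker" ]; then
    return 1
  fi
  # Remove the ring files but keep the directory itself. The :? guard
  # aborts if ring_dir is empty, so this can never expand to "/*".
  rm -f "${ring_dir:?}"/*
  touch "$marker"
}

# Usage (as root); the marker path is an assumption:
#   riak stop
#   reset_ring_once /var/lib/riak/ring /var/lib/riak/.ring_reset_done
#   riak start
```

The function returns non-zero when the marker already exists, so a caller can skip the stop/start cycle entirely on later runs.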

twreid
  • When you execute the bash script, it's going to be tied to the parent Chef process. So you need to detach or it will die when the Chef Client run finishes. – sethvargo Jun 11 '14 at 16:52
  • How do I detach it from the chef process? – twreid Jun 11 '14 at 16:56
  • Use something like init.d or upstart. – sethvargo Jun 11 '14 at 16:59
  • Can you upload the generated `vm.args` and `app.config` files in `/etc/riak/` to a gist? – Alex Moore Jun 13 '14 at 17:57
  • Sorry I forgot to comment back. I ended up getting it to work by having chef run a script that deletes the ring directory then restarts riak. It wasn't cheffing properly because the node name was riak@#{node['fqdn']}, but when the ring was first created the node name was different. – twreid Jun 13 '14 at 18:21

1 Answer

Given the error message you provided, Riak is not starting because the Erlang node name is not being generated correctly. The Erlang node name configuration exists within vm.args and is produced by the node['riak']['args']['-name'] attribute.

The default for node['riak']['args']['-name'] is riak@#{node['fqdn']}. Please check the value Ohai is reporting for node['fqdn']. Alternatively, if you are overriding this attribute somewhere else, ensure that it produces a valid value for -name.
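A quick way to confirm this on the box is to read the -name line out of the generated vm.args and check that something follows the "@". This is a rough sketch under the assumption that vm.args contains a line of the form "-name riak@host"; the helper names extract_node_name and node_name_is_valid are made up for illustration.

```shell
#!/bin/bash
# Sketch: print the Erlang node name configured in a vm.args file.
# Assumes a line of the form "-name riak@host.example.com".
extract_node_name() {
  local vm_args="$1"
  # Take the second field of the first line whose first field is "-name".
  awk '$1 == "-name" { print $2; exit }' "$vm_args"
}

# Return 0 if the name contains "@" followed by a non-empty host part.
node_name_is_valid() {
  local name="$1"
  local host="${name#*@}"
  [[ "$name" == *@* && -n "$host" ]]
}

# Usage:
#   name=$(extract_node_name /etc/riak/vm.args)
#   node_name_is_valid "$name" || echo "bad node name: '$name'"
```

A name such as riak@ (as in the error message above) fails the check, which would point to node['fqdn'] being empty when the template was rendered.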

A more detailed description of -name within vm.args can be found here.

Hector Castro
  • You are correct that was the reason it wasn't starting. I ended up having to delete the ring directory and restart riak. – twreid Jun 16 '14 at 13:39