
Context:

  • I'm testing an Elasticsearch 1.7.1 configuration that is set up by Chef and tested in Kitchen
  • The Chef script and configuration evidently work, because they are running in production
  • Running `service elasticsearch start` as the elasticsearch user fails, but the underlying command it delegates to does not.

From what I've learned, Chef scripts are run as root. The test checks whether Elasticsearch is running by calling `service elasticsearch status`, so when it fails I log into the Vagrant machine. As root, if I run `service elasticsearch start`, I get an OK (which is incorrect, but that's another issue). If I then run `service elasticsearch status`, I'm met with the error: elasticsearch dead but pid file exists
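For context, that message generally means the pid file is present but no live process matches the pid stored in it. Below is a minimal sketch of that kind of check, so the state can be inspected by hand. The helper name `pid_state` is hypothetical and is not part of init.d/functions. The `kill -0` probe is an assumption about how to test liveness and is meant to be run as root, since it can fail for processes owned by another user.

```shell
#!/bin/sh
# pid_state is a hypothetical helper, NOT part of /etc/init.d/functions.
# It prints "running", "dead but pid file exists", or "stopped", and
# returns the LSB-style status codes 0, 1 and 3 respectively.
pid_state() {
    pidfile=$1

    # No pid file at all: the service was never started or stopped cleanly.
    if [ ! -f "$pidfile" ]; then
        echo "stopped"
        return 3
    fi

    pid=$(cat "$pidfile" 2>/dev/null)

    # kill -0 sends no signal; it only reports whether the pid can be signalled.
    # Assumption: run as root, otherwise it may fail for another user's process.
    if [ -n "$pid" ] && kill -0 "$pid" 2>/dev/null; then
        echo "running"
        return 0
    fi

    # The pid file exists but nothing is running under that pid.
    echo "dead but pid file exists"
    return 1
}
```

Calling `pid_state /var/run/elasticsearch/elasticsearch.pid` right after `service elasticsearch start` should show whether the process died after writing its pid file or was never alive under that pid.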

Digging further, I added debug statements to the init.d script that `service` runs and saw that the actual command is basically a call to the `daemon` function from init.d/functions, which just calls:

runuser -s /bin/bash elasticsearch -c 'ulimit -S -c 0 >/dev/null 2>&1 ; /usr/share/elasticsearch/bin/elasticsearch -p /var/run/elasticsearch/elasticsearch.pid -d -Des.default.path.home=/usr/share/elasticsearch -Des.default.path.logs=/var/log/elasticsearch/ -Des.default.path.data=/data/elasticsearch/data/ -Des.default.path.work=/tmp/elasticsearch -Des.default.path.conf=/etc/elasticsearch/'

So I ran `sudo su - elasticsearch` and then executed the part in quotes directly:

[elasticsearch@default-centos ~]$ ulimit -S -c 0 >/dev/null 2>&1 ; 
/usr/share/elasticsearch/bin/elasticsearch 
-p /var/run/elasticsearch/elasticsearch.pid -d 
-Des.default.path.home=/usr/share/elasticsearch 
-Des.default.path.logs=/var/log/elasticsearch/ 
-Des.default.path.data=/data/elasticsearch/data/ 
-Des.default.path.work=/tmp/elasticsearch 
-Des.default.path.conf=/etc/elasticsearch/

A subsequent service elasticsearch status shows that elasticsearch is running just fine! I've even set the logging to TRACE, and there's no indication that elasticsearch has crashed.

BrDaHa
  • Any log from Chef? Are you sure it is actually run as root? The problem sounds more like a system administration problem than a Chef-related one in the end... – Tensibai Oct 22 '15 at 09:17
  • Are you using the community cookbook? I ask because it has a very old init script that was put together a while ago, and we're working on switching the cookbook to the same init scripts used by the deb/rpm packages shipped by elasticsearch.org. There's a `2.0.0_wip` branch where that's been done already, if you'd like to try it. – Martin Oct 22 '15 at 09:18
  • @Tensibai I saw a difference in the way chef exited but it doesn't seem related. Chef had something in its chef-client.log about how a validation.pem file was missing, and errored out. But this doesn't seem like it would be causing something like my issue above? – BrDaHa Oct 22 '15 at 21:40
  • @Martin AFAIK this is the init script that Elasticsearch 1.7.1 comes with – BrDaHa Oct 22 '15 at 21:40
  • If you're not using the community cookbook, it sounds like a problem with the packaged init script. I've noticed something similar when recently testing 1.7.2 and 1.7.3. Probably worth reporting to the project upstream. – Martin Oct 23 '15 at 02:15
