I have configured a Riak cluster and all is well. I configured ejabberd.yml to talk to it but the last few lines I see in the logs are as below and I can't figure out how to debug it any further.
Anyone able to help?
Change ejabberd install user to root...
Missing ssl cert for your host. Generating snakeoil ssl cert for xmpp.alchemy.local...
Generating a 4096 bit RSA private key
..............................................++
..............................................++
writing new private key to '/tmp/selfsigned.key'
-----
Writing ssl cert and private key to '/opt/ejabberd/ssl/host.pem'...
Missing ssl cert for your xmpp domain. Generating snakeoil ssl cert for alchemy.local...
Generating a 4096 bit RSA private key
................................................++
........................++
writing new private key to '/tmp/selfsigned.key'
-----
Writing ssl cert and private key to '/opt/ejabberd/ssl/alchemy.local.pem'...
Generating ejabberd config file...
Generating ejabberdctl config file...
Starting ejabberd...
tail: cannot open ‘/var/log/ejabberd/crash.log’ for reading: No such file or directory
tail: cannot open ‘/var/log/ejabberd/error.log’ for reading: No such file or directory
tail: cannot open ‘/var/log/ejabberd/erlang.log’ for reading: No such file or directory
tail: ‘/var/log/ejabberd/crash.log’ has been replaced with a remote file. giving up on this name
tail: ‘/var/log/ejabberd/error.log’ has been replaced with a remote file. giving up on this name
09:57:16.424 [info] Application lager started on node ejabberd@xmpp
09:57:16.431 [info] Application crypto started on node ejabberd@xmpp
09:57:16.442 [info] Application sasl started on node ejabberd@xmpp
09:57:16.452 [info] Application asn1 started on node ejabberd@xmpp
09:57:16.452 [info] Application public_key started on node ejabberd@xmpp
09:57:16.466 [info] Application ssl started on node ejabberd@xmpp
09:57:16.474 [info] Application p1_yaml started on node ejabberd@xmpp
09:57:16.485 [info] Application p1_tls started on node ejabberd@xmpp
09:57:16.493 [info] Application p1_xml started on node ejabberd@xmpp
09:57:16.507 [info] Application p1_stringprep started on node ejabberd@xmpp
09:57:16.511 [info] Application p1_zlib started on node ejabberd@xmpp
09:57:16.519 [info] Application cache_tab started on node ejabberd@xmpp
09:57:16.677 [info] Application mnesia started on node ejabberd@xmpp
09:57:17.179 [info] Application inets started on node ejabberd@xmpp
09:57:17.558 [notice] Changed loglevel of //var/log/ejabberd/ejabberd.log to debug
09:57:17.567 [debug] Supervisor ejabberd_sup started ejabberd_hooks:start_link() at pid <0.239.0>
09:57:17.575 [debug] Supervisor kernel_safe_sup started pg2:start_link() at pid <0.241.0>
09:57:17.575 [debug] Supervisor ejabberd_sup started ejabberd_node_groups:start_link() at pid <0.240.0>
09:57:17.575 [debug] Supervisor ejabberd_sup started ejabberd_system_monitor:start_link() at pid <0.242.0>
09:57:17.577 [debug] Supervisor ejabberd_sup started ejabberd_router:start_link() at pid <0.243.0>
09:57:17.579 [debug] Supervisor ejabberd_sup started ejabberd_router_multicast:start_link() at pid <0.249.0>
09:57:17.582 [debug] Supervisor ejabberd_sup started ejabberd_s2s:start_link() at pid <0.255.0>
09:57:17.584 [debug] Supervisor ejabberd_sup started ejabberd_local:start_link() at pid <0.265.0>
09:57:17.585 [debug] The option captcha_cmd is not configured, but some module wants to use the CAPTCHA feature.
09:57:17.587 [debug] Supervisor ejabberd_sup started ejabberd_captcha:start_link() at pid <0.271.0>
09:57:17.587 [debug] Supervisor ejabberd_sup started ejabberd_tmp_sup:start_link(ejabberd_s2s_in_sup, ejabberd_s2s_in) at pid <0.273.0>
09:57:17.587 [debug] Supervisor ejabberd_sup started ejabberd_tmp_sup:start_link(ejabberd_s2s_out_sup, ejabberd_s2s_out) at pid <0.274.0>
09:57:17.587 [debug] Supervisor ejabberd_sup started ejabberd_tmp_sup:start_link(ejabberd_service_sup, ejabberd_service) at pid <0.275.0>
09:57:17.587 [debug] Supervisor ejabberd_sup started ejabberd_tmp_sup:start_link(ejabberd_iq_sup, gen_iq_handler) at pid <0.276.0>
09:57:17.588 [debug] Supervisor ejabberd_sup started ejabberd_tmp_sup:start_link(ejabberd_frontend_socket_sup, ejabberd_frontend_socket) at pid <0.277.0>
09:57:17.588 [debug] Supervisor ejabberd_sup started ejabberd_listener:start_link() at pid <0.278.0>
09:57:17.591 [info] Application protobuffs started on node ejabberd@xmpp
09:57:17.591 [info] Application riak_pb started on node ejabberd@xmpp
09:57:17.591 [info] Application riakc started on node ejabberd@xmpp
09:57:17.600 [debug] Supervisor inet_gethost_native_sup started undefined at pid <0.289.0>
09:57:17.600 [debug] Supervisor kernel_safe_sup started inet_gethost_native:start_link() at pid <0.288.0>
I managed to get a bit more detail by ejabberdctl debug. It seems to be having an issue with the node name, but the exact same config works fine if i don't add Riak to the ejabberd.yml file.
tail: cannot open ‘/var/log/ejabberd/crash.log’ for reading: No such file or directory
tail: cannot open ‘/var/log/ejabberd/error.log’ for reading: No such file or directory
tail: cannot open ‘/var/log/ejabberd/erlang.log’ for reading: No such file or directory
{error_logger,{{2016,2,4},{19,30,38}},"Invalid node name: ~p~n",['debug-not a tty-ejabberd@xmpp']}
{error_logger,{{2016,2,4},{19,30,38}},crash_report,[[{initial_call,{net_kernel,init,['Argument__1']}},{pid,<0.20.0>},{registered_name,[]},{error_info,{exit,{error,badarg},[{gen_server,init_it,6,[{file,"gen_server.erl"},{line,322}]},{proc_lib,init_p_do_apply,3,[{file,"proc_lib.erl"},{line,237}]}]}},{ancestors,[net_sup,kernel_sup,<0.10.0>]},{messages,[]},{links,[<0.17.0>]},{dictionary,[{longnames,false}]},{trap_exit,true},{status,running},{heap_size,987},{stack_size,27},{reductions,902}],[]]}
{error_logger,{{2016,2,4},{19,30,38}},supervisor_report,[{supervisor,{local,net_sup}},{errorContext,start_error},{reason,{'EXIT',nodistribution}},{offender,[{pid,undefined},{name,net_kernel},{mfargs,{net_kernel,start_link,[['debug-not a tty-ejabberd',shortnames]]}},{restart_type,permanent},{shutdown,2000},{child_type,worker}]}]}
{error_logger,{{2016,2,4},{19,30,38}},supervisor_report,[{supervisor,{local,kernel_sup}},{errorContext,start_error},{reason,{shutdown,{failed_to_start_child,net_kernel,{'EXIT',nodistribution}}}},{offender,[{pid,undefined},{name,net_sup},{mfargs,{erl_distribution,start_link,[]}},{restart_type,permanent},{shutdown,infinity},{child_type,supervisor}]}]}
{error_logger,{{2016,2,4},{19,30,38}},crash_report,[[{initial_call,{application_master,init,['Argument__1','Argument__2','Argument__3','Argument__4']}},{pid,<0.9.0>},{registered_name,[]},{error_info,{exit,{{shutdown,{failed_to_start_child,net_sup,{shutdown,{failed_to_start_child,net_kernel,{'EXIT',nodistribution}}}}},{kernel,start,[normal,[]]}},[{application_master,init,4,[{file,"application_master.erl"},{line,133}]},{proc_lib,init_p_do_apply,3,[{file,"proc_lib.erl"},{line,237}]}]}},{ancestors,[<0.8.0>]},{messages,[{'EXIT',<0.10.0>,normal}]},{links,[<0.8.0>,<0.7.0>]},{dictionary,[]},{trap_exit,true},{status,running},{heap_size,376},{stack_size,27},{reductions,117}],[]]}
{error_logger,{{2016,2,4},{19,30,38}},std_info,[{application,kernel},{exited,{{shutdown,{failed_to_start_child,net_sup,{shutdown,{failed_to_start_child,net_kernel,{'EXIT',nodistribution}}}}},{kernel,start,[normal,[]]}}},{type,permanent}]}
{"Kernel pid terminated",application_controller,"{application_start_failure,kernel,{{shutdown,{failed_to_start_child,net_sup,{shutdown,{failed_to_start_child,net_kernel,{'EXIT',nodistribution}}}}},{kernel,start,[normal,[]]}}}"}
Crash dump was written to: //var/log/ejabberd/erl_crash_20160204-193037.dump
Kernel pid terminated (application_controller) ({application_start_failure,kernel,{{shutdown,{failed_to_start_child,net_sup,{shutdown,{failed_to_start_child,net_kernel,{'EXIT',nodistribution}}}}},{k
**** Update ****
The reason that this was happening was because I hadn't specified the --net network connection in my Docker run command and so it couldn't find the network. Unfortunately there is no debug to indicate this it just hangs forever (this would happen in the general case where the network that hosts the Riak cluster cannot be found and is a nightmare to debug).
I still don't know why the "debug" flag causes tty to be added to the node name but i suspect it is somehow related as when i fixed it that also stopped appearing.
Anyone interested in knowing the detail can view it here https://github.com/rroemhild/docker-ejabberd/issues/81#issuecomment-181133336