I have created a very basic Elixir supervisor and worker to test the hot code reloading feature of the Erlang VM. Here is the supervisor:

defmodule RelTest do
  use Application

  # See http://elixir-lang.org/docs/stable/elixir/Application.html
  # for more information on OTP Applications
  def start(_type, _args) do
    import Supervisor.Spec, warn: false

    port = Application.get_env(:APP_NAME, :listen_port, 9000)
    {:ok, socket} = :gen_tcp.listen(port, [:binary, active: false, reuseaddr: true])

    # Define workers and child supervisors to be supervised
    children = [
      # Starts a worker by calling: RelTest.Worker.start_link(arg1, arg2, arg3)
      # worker(RelTest.Worker, [arg1, arg2, arg3]),
      worker(Task, [fn -> TestListener.start(socket) end])
    ]

    # See http://elixir-lang.org/docs/stable/elixir/Supervisor.html
    # for other strategies and supported options
    opts = [strategy: :one_for_one, name: RelTest.Supervisor]
    Supervisor.start_link(children, opts)
  end
end

Basically, the supervisor starts a single Task worker that runs this listener:

defmodule TestListener do
  require Logger

  def start(socket) do
    {:ok, client} = :gen_tcp.accept(socket)
    Logger.info "A client connected"
    Task.async(fn -> loop(client) end)
    start(socket)
  end

  def loop(socket) do
    case :gen_tcp.recv(socket, 0) do
      {:ok, _} ->
        say_hello(socket)
        Logger.info "Said hello to client ;)"
        loop(socket)
      {:error, _} ->
        Logger.info "Oops, client had error :("
        :gen_tcp.close(socket)
    end
  end

  def say_hello(socket) do
    :ok = :gen_tcp.send(socket, <<"Hey there!\n">>)
  end

end

This is version 0.1.0. So I run these:

MIX_ENV=prod mix compile
MIX_ENV=prod mix release

and I get a working release. I run it with ./rel/rel_test/bin/rel_test console and everything works. Next I bump the code and the version, so here is version 0.1.1 of the listener (only the message in say_hello changes):

defmodule TestListener do
  require Logger

  def start(socket) do
    {:ok, client} = :gen_tcp.accept(socket)
    Logger.info "A client connected"
    Task.async(fn -> loop(client) end)
    start(socket)
  end

  def loop(socket) do
    case :gen_tcp.recv(socket, 0) do
      {:ok, _} ->
        say_hello(socket)
        Logger.info "Said hello to client ;)"
        loop(socket)
      {:error, _} ->
        Logger.info "Oops, client had error :("
        :gen_tcp.close(socket)
    end
  end

  def say_hello(socket) do
    :ok = :gen_tcp.send(socket, <<"Hey there, next version!\n">>)
  end

end

Now I run

MIX_ENV=prod mix compile
MIX_ENV=prod mix release

and the appup is created successfully. Then, to do the hot upgrade, I run:

./rel/rel_test/bin/rel_test upgrade "0.1.1"

The upgrade itself works, but my listener is killed during it.

I tested this by connecting with nc localhost 9000 (9000 is the listener's port), staying connected, and running the upgrade command. The connection gets killed and I get this message in the console:

=SUPERVISOR REPORT==== 31-Aug-2016::23:40:09 ===
     Supervisor: {local,'Elixir.RelTest.Supervisor'}
     Context:    child_terminated
     Reason:     killed
     Offender:   [{pid,<0.601.0>},
                  {id,'Elixir.Task'},
                  {mfargs,
                      {'Elixir.Task',start_link,
                          [#Fun<Elixir.RelTest.0.117418367>]}},
                  {restart_type,permanent},
                  {shutdown,5000},
                  {child_type,worker}]

So why does this happen? Am I missing something, or is this the expected behavior? Is this not a use case for hot code reloading?

I have read LYSE (Learn You Some Erlang), where the author says that running code should keep running, and that only external (fully qualified) calls made after the upgrade are served by the new version.
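To make that rule concrete, here is a minimal sketch of it outside of any release. Everything in it is hypothetical and not part of the rel_test project: the HotSwapSketch and HotSwapSketch.Looper modules, the :version message, and the use of Code.compile_string to stand in for loading a new module version. It assumes a standard BEAM, which keeps at most two versions of a module (current and old). It only illustrates local versus fully qualified calls and what an explicit purge does; it does not establish what the release handler did to the Task above.

```elixir
defmodule HotSwapSketch do
  # Source of a tiny looping module. VERSION is substituted before compiling,
  # so each call to load/1 produces a new version of the same module.
  @looper_source """
  defmodule HotSwapSketch.Looper do
    # Local recursion: the process stays in the version it started in.
    def local_loop do
      receive do
        {:version, from} ->
          send(from, {:version, version()})
          local_loop()
      end
    end

    # Fully qualified recursion: each iteration jumps to the current version.
    def qualified_loop do
      receive do
        {:version, from} ->
          send(from, {:version, version()})
          __MODULE__.qualified_loop()
      end
    end

    def version, do: VERSION
  end
  """

  # Compile and load the looper with the given version number.
  def load(version) do
    @looper_source
    |> String.replace("VERSION", Integer.to_string(version))
    |> Code.compile_string()
  end

  # Ask a looper process which version of the code answered.
  def ask(pid) do
    send(pid, {:version, self()})

    receive do
      {:version, v} -> v
    after
      1_000 -> :timeout
    end
  end

  def run do
    # Silence the "redefining module" warning for this demonstration.
    Code.compiler_options(ignore_module_conflict: true)

    load(1)
    local = spawn(HotSwapSketch.Looper, :local_loop, [])
    qualified = spawn(HotSwapSketch.Looper, :qualified_loop, [])
    before_upgrade = {ask(local), ask(qualified)}

    # Version 1 becomes "old" code; both processes are still running it.
    load(2)

    # Give the qualified loop one iteration to make its fully qualified call.
    ask(qualified)
    after_upgrade = {ask(local), ask(qualified)}

    # Purging old code kills any process that is still executing it.
    ref = Process.monitor(local)
    :code.purge(HotSwapSketch.Looper)

    local_exit =
      receive do
        {:DOWN, ^ref, :process, ^local, reason} -> reason
      after
        1_000 -> :timeout
      end

    qualified_alive = Process.alive?(qualified)
    Process.exit(qualified, :kill)

    %{
      before_upgrade: before_upgrade,
      after_upgrade: after_upgrade,
      local_exit: local_exit,
      qualified_alive: qualified_alive
    }
  end
end
```

Calling HotSwapSketch.run() once in a fresh VM should show the locally recursing process still answering from version 1 after the new version is loaded, the fully qualified one answering from version 2, and the local one exiting with reason :killed once the old code is purged. That is the same exit reason as in the supervisor report, although the sketch alone does not prove the cause is the same.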

Then why kill the worker?

vfsoraki