I'm looking for help with testing the terminate/2 callback in my Channel.

The test and setup look like this:

setup do
  :ok = Ecto.Adapters.SQL.Sandbox.checkout(MyApp.Repo)
  Ecto.Adapters.SQL.Sandbox.mode(MyApp.Repo, {:shared, self()})

  {:ok, socket} = connect(UserSocket, %{token: "some_token"})
  {:ok, %{}, socket} = subscribe_and_join(socket, "some_channel", %{})

  %{socket: socket}
end

test "terminate/2", %{socket: socket} do
  # unlink to avoid "** (EXIT from #PID<...>) {:shutdown, :closed}"
  Process.unlink(socket.channel_pid)

  assert close(socket) == :ok
  # some additional asserts go here
end

In the terminate/2 callback I just call a helper module; let's call it TerminationHandler.

def terminate(_reason, _socket) do
  TerminationHandler.call()
end

The call/0 function in TerminationHandler runs a DB query. For example:

def call() do
  users = User |> where([u], u.type == "super") |> Repo.all # line where error appears
  # some extra logic goes here
end

This is the error I get intermittently (roughly once in 10 runs):

14:31:29.312 [error] GenServer #PID<0.1041.0> terminating
** (stop) exited in: GenServer.call(#PID<0.1040.0>, {:checkout, #Reference<0.3713952378.42205187.247763>, true, 60000}, 5000)
    ** (EXIT) shutdown: "owner #PID<0.1039.0> exited with: shutdown"
    (db_connection) lib/db_connection/ownership/proxy.ex:32: DBConnection.Ownership.Proxy.checkout/2
    (db_connection) lib/db_connection.ex:928: DBConnection.checkout/2
    (db_connection) lib/db_connection.ex:750: DBConnection.run/3
    (db_connection) lib/db_connection.ex:644: DBConnection.execute/4
    (ecto) lib/ecto/adapters/postgres/connection.ex:98: Ecto.Adapters.Postgres.Connection.execute/4
    (ecto) lib/ecto/adapters/sql.ex:256: Ecto.Adapters.SQL.sql_call/6
    (ecto) lib/ecto/adapters/sql.ex:436: Ecto.Adapters.SQL.execute_or_reset/7
    (ecto) lib/ecto/repo/queryable.ex:133: Ecto.Repo.Queryable.execute/5
    (ecto) lib/ecto/repo/queryable.ex:37: Ecto.Repo.Queryable.all/4
    (my_app) lib/my_app/helpers/termination_handler.ex:4: MyApp.Helpers.TerminationHandler.call/0
    (stdlib) gen_server.erl:673: :gen_server.try_terminate/3
    (stdlib) gen_server.erl:858: :gen_server.terminate/10
    (stdlib) proc_lib.erl:249: :proc_lib.init_p_do_apply/3
Last message: {:join, Phoenix.Channel.Server}

I would appreciate any explanation of what causes this error and how to avoid it.

Adam Millerchip

2 Answers

As stated in the documentation for GenServer.terminate/2:

[...] the supervisor will send the exit signal :shutdown and the GenServer will have the duration of the timeout to terminate. If after duration of this timeout the process is still alive, it will be killed immediately.

That seems to be your case. DBConnection.checkout/2 appears to be waiting for a connection to become available, and the wait lasts beyond the timeout. Hence the owner is brutally killed.
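To make the quoted behaviour concrete, here is a minimal sketch in plain Elixir (no Phoenix or Ecto involved). The module names SlowWorker and ShutdownDemo and all timings are hypothetical, chosen only for illustration. A supervised GenServer whose terminate/2 outlasts its `:shutdown` timeout exits with reason `:killed`, while one that finishes in time exits with `:shutdown`.

```elixir
defmodule SlowWorker do
  use GenServer

  def start_link(sleep_ms), do: GenServer.start_link(__MODULE__, sleep_ms)

  @impl true
  def init(sleep_ms) do
    # Trap exits so terminate/2 is invoked when the supervisor sends :shutdown.
    Process.flag(:trap_exit, true)
    {:ok, sleep_ms}
  end

  @impl true
  def terminate(_reason, sleep_ms) do
    # Stand-in for slow cleanup work, e.g. waiting on a database connection.
    Process.sleep(sleep_ms)
  end
end

defmodule ShutdownDemo do
  # Starts SlowWorker under a supervisor with the given shutdown timeout,
  # stops the supervisor, and returns the exit reason of the worker.
  def run(shutdown_ms, terminate_ms) do
    child = %{
      id: SlowWorker,
      start: {SlowWorker, :start_link, [terminate_ms]},
      shutdown: shutdown_ms
    }

    {:ok, sup} = Supervisor.start_link([child], strategy: :one_for_one)
    [{_id, pid, _type, _modules}] = Supervisor.which_children(sup)

    mref = Process.monitor(pid)
    :ok = Supervisor.stop(sup)

    receive do
      {:DOWN, ^mref, :process, ^pid, reason} -> reason
    after
      5_000 -> :timeout
    end
  end
end
```

Under these assumptions, `ShutdownDemo.run(50, 500)` should return `:killed` (cleanup takes longer than the timeout), and `ShutdownDemo.run(500, 10)` should return `:shutdown`.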

There could be two possible solutions:

  • increase the shutdown timeout (I would avoid that)
  • increase the number of allowed simultaneous database connections.

The latter is likely needed in any case, since your pool seems to be full. With a larger pool the connection would be checked out immediately, and the call should return successfully within the timeout interval.
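As a rough sketch of the second option, the pool size can be raised in the repo configuration for the test environment. The value 20 is an arbitrary illustration, not a recommendation, and the `:my_app` / `MyApp.Repo` names are taken from the question; older projects would start the file with `use Mix.Config` instead of `import Config`.

```elixir
# config/test.exs
import Config

config :my_app, MyApp.Repo,
  pool: Ecto.Adapters.SQL.Sandbox,
  # Illustrative value only; pick one that fits your database limits.
  pool_size: 20
```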

Aleksei Matiushkin
  • Thanks for the reply! As I understand it, the default timeout for a [web socket](https://hexdocs.pm/phoenix/Phoenix.Transports.WebSocket.html) is 60_000 ms, and I didn't change it. That is definitely enough for my terminate function to complete, so it looks like there is more to this than a timeout, or I'm still missing something. Also, the terminate function exists to do some work, and the WebSocket should not kill anything until it finishes; otherwise there is not much point in it, since you can't be sure everything gets done during the close phase. – Aleksandr Mikhaylenko Oct 24 '19 at 07:19
  • It’s a timeout on one of your genservers/supervisors, not on the websocket itself, I believe. – Aleksei Matiushkin Oct 24 '19 at 07:57
  • OK, thanks. It seems that because the socket process is unlinked, it loses its DB connection and the error occurs. How to resolve that is the next question. – Aleksandr Mikhaylenko Oct 24 '19 at 13:52

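This might help. Both macros monitor the channel process and wait for its :DOWN message, so the test only continues once the channel, including its terminate/2 callback, has fully shut down.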

  defmacro leave_channel(socket) do
    quote do
      Process.unlink(unquote(socket).channel_pid)
      mref = Process.monitor(unquote(socket).channel_pid)
      ref = leave(unquote(socket))
      assert_reply ref, :ok
      assert_receive {:DOWN, ^mref, :process, _pid, _reason}
    end
  end


  defmacro close_socket(socket) do
    quote do
      Process.unlink(unquote(socket).channel_pid)
      mref = Process.monitor(unquote(socket).channel_pid)
      close(unquote(socket))
      assert_receive {:DOWN, ^mref, :process, _pid, _reason}
    end
  end
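The unlink, monitor, and wait-for-:DOWN pattern used by these macros can also be tried outside Phoenix. The following is a minimal sketch in plain Elixir; SlowTerminate and WaitForDown are hypothetical names, and the `:close` cast only imitates a channel being closed. It shows that once the :DOWN message has arrived, the work done in terminate/2 is guaranteed to be complete.

```elixir
defmodule SlowTerminate do
  use GenServer

  def start_link(report_to), do: GenServer.start_link(__MODULE__, report_to)

  @impl true
  def init(report_to), do: {:ok, report_to}

  @impl true
  def handle_cast(:close, report_to) do
    # Stop with a shutdown reason, similar to the one seen in the question.
    {:stop, {:shutdown, :closed}, report_to}
  end

  @impl true
  def terminate(_reason, report_to) do
    # Stand-in for slow cleanup such as a database query.
    Process.sleep(50)
    send(report_to, :cleanup_done)
    :ok
  end
end

defmodule WaitForDown do
  # Unlinks from the process, asks it to close, and blocks until it is down.
  def stop_and_wait(pid, timeout \\ 1_000) do
    Process.unlink(pid)
    mref = Process.monitor(pid)
    GenServer.cast(pid, :close)

    receive do
      {:DOWN, ^mref, :process, _pid, reason} -> {:ok, reason}
    after
      timeout -> {:error, :timeout}
    end
  end
end
```

In the test from the question, the equivalent would be to call `close_socket(socket)` in place of the manual `Process.unlink/1` and `close/1` calls, and to put any asserts on the side effects of terminate/2 after it.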