9

I have a supervisor with a one_for_one restart strategy. Is it possible to set a time interval between child process restarts?

For example, the remote DB crashed and I want to wait 10 seconds between attempts to restore the connection.

stvar
kolchanov

2 Answers

6

Actually, you could let the supervisor restart its children immediately and implement what is called lazy initialization:

  1. The supervisor (re)starts the child (say, a gen_server) immediately
  2. The gen_server returns a timeout of 0 from its init function
  3. In handle_info, you do an active wait (your 10 seconds) to ensure the DB is properly initialized

This way, you ensure that all requests to the gen_server are processed after the DB is properly initialized.
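The steps above could be sketched roughly as follows. This is only an illustration under assumptions: the module name `db_worker`, the `connection/1` call, and the `ConnectFun` and `RetryMs` parameters are hypothetical and stand in for your real DB client and the 10 second interval.

```erlang
-module(db_worker).
-behaviour(gen_server).

-export([start_link/2, connection/1]).
-export([init/1, handle_call/3, handle_cast/2, handle_info/2]).

%% ConnectFun: hypothetical fun() -> {ok, Conn} | {error, Reason}.
%% RetryMs: pause between attempts (10000 for the 10 seconds in the question).
start_link(ConnectFun, RetryMs) ->
    gen_server:start_link(?MODULE, {ConnectFun, RetryMs}, []).

connection(Pid) ->
    gen_server:call(Pid, connection).

%% No connection work here, so the supervisor sees a started child at once.
%% The 0 timeout delivers 'timeout' to handle_info/2 if the mailbox is empty.
init({ConnectFun, RetryMs}) ->
    {ok, #{connect => ConnectFun, retry_ms => RetryMs, conn => undefined}, 0}.

%% A request that arrives before the timeout fires re-arms the 0 timeout.
handle_call(connection, _From, #{conn := undefined} = State) ->
    {reply, {error, not_ready}, State, 0};
handle_call(connection, _From, #{conn := Conn} = State) ->
    {reply, {ok, Conn}, State}.

handle_cast(_Msg, #{conn := undefined} = State) ->
    {noreply, State, 0};
handle_cast(_Msg, State) ->
    {noreply, State}.

%% The "real" initialization: block here until the DB answers.
handle_info(timeout, #{conn := undefined} = State) ->
    {noreply, State#{conn := wait_for_db(State)}};
handle_info(_Msg, #{conn := undefined} = State) ->
    {noreply, State, 0};
handle_info(_Msg, State) ->
    {noreply, State}.

%% Try to connect; on failure sleep RetryMs and try again.
wait_for_db(#{connect := ConnectFun, retry_ms := RetryMs} = State) ->
    case ConnectFun() of
        {ok, Conn} ->
            Conn;
        {error, _Reason} ->
            timer:sleep(RetryMs),
            wait_for_db(State)
    end.
```

Note that while the server is blocked in `wait_for_db/1`, callers using `gen_server:call/2` with its default 5000 ms timeout will time out, so you may want a longer call timeout or a non-blocking retry instead.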

Roberto Aloi
  • Thank you, but I don't have a problem with gen_server init; I want a time interval (sleep) between restart attempts. – kolchanov Oct 10 '12 at 17:06
  • I think what @Roberto Aloi suggests will work. The gen_server's init would not try to reconnect; instead it returns the 0 timeout, which satisfies the supervisor that the child has started. Then, in the handle_info function, you do the "real" initialization, after sleeping for 10 seconds. – Jr0 Oct 11 '12 at 01:42
  • Or, even better, after ensuring that the DB is alive. Waiting a fixed amount of time is the root of all evil. – Roberto Aloi Oct 11 '12 at 07:04
  • The trouble is that you **CANNOT** specify a time interval between restart attempts. There is no way to do it in the supervisor. @RobertoAloi is giving you a workaround where the supervisor immediately restarts the server, as is its way, and you wait inside the server. – rvirding Oct 11 '12 at 20:21
0

You can't do that with the standard supervisor behaviour. You need to implement your own supervisor as a gen_server that traps exits from processes other than its parent and restarts its children manually. Before restarting a child, it checks that the 10 seconds have elapsed, for example by setting a timeout.
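A minimal sketch of that idea, assuming a single child, might look like the following. The module name `delayed_sup`, the `child/1` call, and the `StartFun` and `DelayMs` parameters are hypothetical. It uses `erlang:send_after/3` for the delay and leaves out things a real supervisor handles, such as restart intensity limits and orderly shutdown of the child.

```erlang
-module(delayed_sup).
-behaviour(gen_server).

-export([start_link/2, child/1]).
-export([init/1, handle_call/3, handle_cast/2, handle_info/2]).

%% StartFun: hypothetical fun() -> {ok, Pid} | {error, Reason}; it must
%% link the child to the calling process (e.g. via a start_link function).
%% DelayMs: pause before each restart attempt (10000 for 10 seconds).
start_link(StartFun, DelayMs) ->
    gen_server:start_link(?MODULE, {StartFun, DelayMs}, []).

%% Returns the current child pid, or undefined while waiting to restart.
child(Sup) ->
    gen_server:call(Sup, child).

init({StartFun, DelayMs}) ->
    %% Child exits now arrive as {'EXIT', Pid, Reason} messages.
    %% gen_server itself still handles the exit of its parent.
    process_flag(trap_exit, true),
    self() ! start_child,
    {ok, #{start => StartFun, delay => DelayMs, child => undefined}}.

handle_call(child, _From, #{child := Pid} = State) ->
    {reply, Pid, State}.

handle_cast(_Msg, State) ->
    {noreply, State}.

%% The child died: schedule a restart instead of restarting immediately.
handle_info({'EXIT', Pid, _Reason}, #{child := Pid, delay := DelayMs} = State) ->
    erlang:send_after(DelayMs, self(), start_child),
    {noreply, State#{child := undefined}};
%% Time to (re)start; if starting fails, wait again before the next attempt.
handle_info(start_child, #{start := StartFun, delay := DelayMs} = State) ->
    case StartFun() of
        {ok, Pid} ->
            {noreply, State#{child := Pid}};
        {error, _Reason} ->
            erlang:send_after(DelayMs, self(), start_child),
            {noreply, State}
    end;
handle_info(_Msg, State) ->
    {noreply, State}.
```

For a gen_server child, `StartFun` could be something like `fun() -> my_db_server:start_link() end`, where `my_db_server` is a placeholder for your own module.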

Abdelghani