
I have a chain of Akka actors K -> L -> M, where K sends messages to L, L sends messages to M, and each can reply to its predecessor using the sender of a received message.

My question is this: how can we safely unlink L from the chain, such that after the operation K sends messages directly to M, and M sees K as its sender?

If each actor was stable and long-lived, I could work out how to tell K and M to talk to each other. L could hang around long enough to forward messages already in her mailbox until getting a signal that each of K and M was no longer talking to L.

However, K and M might also be thinking about unlinking themselves, and that's where everything gets hairy.

Is there a well-known protocol for making this work safely? So far I haven't found the right search terms.

I know that one alternative would be to freeze the system and copy the entire thing, minus L, into a new chain. However, I would like to implement a more real-time, less interrupted solution.

I've been thinking about some kind of lock exchange, in which K and M promise to not unlink themselves until L has finished unlinking and forwarding all messages. But any solution with the word "lock" seems awkward in an asynchronous solution, even though the actors wouldn't be fully locked (just delaying their own unlink operations until a convenient time).

Daniel Ashton
  • What are you trying to achieve? What's the bigger picture here that you are seeking to address by removing a link from an asynchronous communication chain? – Robin Green Dec 15 '13 at 21:53
  • Would it work if you replaced L with a stateful proxy which first delegates to your current L for processing, but at some point forgets about that delegate and simply passes messages directly between K and M? You still have 3 actors, but you are effectively turning one of them into a no-op at some point. – erickson Dec 16 '13 at 00:35
  • If you want to replace L with M, your actor K can send a message Z to L, then stop sending messages and wait for a confirmation message X. When L receives Z (if you're using a priority-based mailbox, don't forget to give Z a lower than regular priority), it can send the confirmation X to K (and optionally stop itself). After receiving X, K can safely send messages directly to M. You can use [FSM](http://doc.akka.io/docs/akka/snapshot/scala/fsm.html) to implement the state logic. – Sergiy Prydatchenko Dec 16 '13 at 12:57
  • @RobinGreen, I'm trying to learn how to build a living web of nodes that can safely reroute around themselves when no longer needed for business reasons. The bigger picture is that over time the chain will gather a growing number of nodes that have done their duty and reached the end of life. They could remain as forwarding-only nodes, but eventually they will far outweigh the living nodes, causing a degradation in message-passing speed (as messages have to be routed through more and more forwarding-only nodes). Rather than freezing and copying the chain, I want a way to prune it on-the-fly. – Daniel Ashton Dec 16 '13 at 20:41
  • @erickson, I think the idea of a replacement proxy/delegate would be most delightful if K, L and M might be of different types of actors. If they are all the same type, the replacement/no-op logic might just as well be built directly into that type of actor. But your approach is interesting because it begins to differentiate between message handling and the actual work of an actor. Nice. – Daniel Ashton Dec 16 '13 at 20:51
  • @SergiyPrydatchenko, you touch on something that looks necessary here: having the upstream K agree that it will refrain from sending messages to L (or M) until it gets confirmation X. I'm thinking about implementing this using Stashes. In my chain the natural Z might be addressed to L or L's downstream, so I'll need to create an artificial Z, probably on request from L. When you say "set lower than regular priority" did you mean that Z should be processed sooner? or later than other messages in L's mailbox? What should happen if M wants a Z from L during the Z and X exchange between K and L? – Daniel Ashton Dec 16 '13 at 21:06
  • @DanielAshton, when I say "set lower than regular priority" I mean "later" (this matters only for priority mailboxes). The Z message in this case means "The mailbox of L should be empty, now we can terminate L, but we should send a confirmation to K first. This confirmation should allow K to safely switch to M while fully preserving the order of messages from K to M". If your application doesn't rely on message order, then you can simply switch K to M and send a PoisonPill to L. In this case M may for some time receive messages from K and L simultaneously, and their order might be broken. – Sergiy Prydatchenko Dec 17 '13 at 11:55
  • @DanielAshton, if "In my chain the natural Z might be addressed to L or L's downstream", then you should probably add the same Z->X scheme to your L and L's downstreams. In this case it will look like recursive processing of Zs. L should then send a confirmation X to K (and optionally terminate itself) only after it receives Xs from all of its downstreams. – Sergiy Prydatchenko Dec 17 '13 at 11:59
  • @DanielAshton, I don't understand your question "What should happen if M wants a Z from L during the Z and X exchange between K and L?". In this scheme M may not know about Z at all, and L can process Z among other kinds of messages. So no actor needs to "wait for Z". Only K (and L in a K-L-L-L-M recursive scheme) should wait for X. – Sergiy Prydatchenko Dec 17 '13 at 12:03
  • @SergiyPrydatchenko, in this case K, L and M are all of the same type, as are their upstream and downstream neighbours. It might happen that both L and M want to remove themselves from the chain at almost the same time, and that's where I'm getting confused. I had been thinking that when L requests a Z message from K, we could get into a race condition if M also requests a Z message from L. But that might not be a problem: perhaps L's behaviour should not change until it receives the actual Z message from K. This might simplify things. – Daniel Ashton Dec 17 '13 at 15:47
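
The Z/X exchange discussed in the comments above could be sketched roughly as follows, using Stash as suggested there. This is a minimal sketch rather than a tested protocol. The message names Link, Retire and RequestBypass and the class HandshakeNode are made up for illustration. It assumes all actors are local to one JVM, so that whatever L forwards to M is already enqueued before K resumes sending. It does not handle two neighbours retiring at the same time, and it does not update M's notion of its predecessor.

```scala
import akka.actor.{Actor, ActorRef, Stash}

// Hypothetical message names; Z and X follow the comments above.
object Handshake {
  final case class Link(prev: Option[ActorRef], next: Option[ActorRef])
  case object Retire                                 // tells a node to remove itself
  final case class RequestBypass(newNext: ActorRef)  // L -> K: "route around me"
  case object Z                                      // K -> L: the last message K sends to L
  case object X                                      // L -> K: everything before Z was forwarded
}

class HandshakeNode extends Actor with Stash {
  import Handshake._

  def receive: Receive = linked(None, None)

  def linked(prev: Option[ActorRef], next: Option[ActorRef]): Receive = {
    case Link(p, n) =>
      context.become(linked(p, n))

    case Retire =>
      // Acting as L: ask upstream to bypass this node, but keep forwarding
      // as before until Z actually arrives. Does nothing at either end of the chain.
      for (p <- prev; n <- next) p ! RequestBypass(n)

    case RequestBypass(newNext) =>
      // Acting as K: mark the end of our traffic to L, then hold everything back.
      sender() ! Z
      context.become(awaitingX(prev, newNext))

    case Z =>
      // Acting as L: every earlier message from K has already been forwarded,
      // because messages from one sender to one receiver arrive in order.
      sender() ! X
      context.stop(self)

    case msg =>
      // Ordinary traffic: pass it downstream with this node as the sender.
      next.foreach(_ ! msg)
  }

  def awaitingX(prev: Option[ActorRef], newNext: ActorRef): Receive = {
    case X =>
      // L has drained: replay the held messages, now routed to the new next node.
      unstashAll()
      context.become(linked(prev, Some(newNext)))
    case _ =>
      stash()
  }
}
```

Because L stays in its normal behaviour until Z arrives, a RequestBypass that reaches L from M in the meantime is handled like any other message. Whether that is enough to make simultaneous retirements safe is exactly the open question in this thread, and the sketch does not settle it.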

1 Answer

A rough idea for a solution would be to maintain state within each node for previous and next, and then support messages for unlinking a node from the chain and for telling a node that its next node has changed. When a node's next node changes, it sends a poison pill to the node that informed it of the change (assuming that's the one being unlinked) to gracefully stop it. Once unlinked and until stopped, a node acts as a pure passthrough for any further messages that come in. Putting all that together, the code would look something like this:

import akka.actor.{Actor, ActorRef, PoisonPill}

object ChainNode {
  case object Unlink
  case class Link(prev:Option[ActorRef], next:Option[ActorRef])
  case class ChangeNextNode(node:Option[ActorRef])
}

trait ChainNode extends Actor{
  import ChainNode._
  import context._

  override def postStop(): Unit = {
    println(s"${self.path} has been stopped")
  }

  def receive = chainReceive()

  def chainReceive(prevNode:Option[ActorRef] = None, nextNode:Option[ActorRef] = None):Receive = {
    case Unlink =>
      prevNode foreach{ node =>
        println(s"unlinking node ${self.path} from sender ${node.path}") 
        node ! ChangeNextNode(nextNode)
      } 
      become(unlinked(nextNode))

    case Link(prev, next) =>
      println(s"${self.path} is linking to $prev and $next")
      become(chainReceive(prev, next))

    case ChangeNextNode(newNext) =>
      println(s"${self.path} is changing next node to $newNext")
      become(chainReceive(prevNode, newNext))
      // Stop the node that announced the change (assumed to be the one being unlinked)
      sender() ! PoisonPill

    case other =>    
      println(s"${self.path} received message $other")
      val msg = processOther(other)
      nextNode foreach{ node =>
        println(s"${self.path} forwarding on to ${node.path}")
        node ! msg
      }
  }

  def unlinked(nextNode:Option[ActorRef]):Receive = {
    case any => 
      println(s"${self.path} has been unlinked, just forwarding w/o processing...")
      nextNode foreach (_ ! any)      
  }

  def processOther(msg:Any):Any
}

class NodeA extends ChainNode{
  def processOther(msg:Any) = "foo"
}

class NodeB extends ChainNode{
  def processOther(msg:Any) = "bar"
}

class NodeC  extends ChainNode{
  def processOther(msg:Any) = "baz"
}

Then, a simple test scenario where the linking is changed mid way through:

import akka.actor.{ActorSystem, Props}
import scala.concurrent.Future

object ChainTest{
  import ChainNode._  

  def main(args: Array[String]): Unit = {
    val system = ActorSystem("chain")
    val a = system.actorOf(Props[NodeA])
    val b = system.actorOf(Props[NodeB])
    val c = system.actorOf(Props[NodeC])

    a ! Link(None, Some(b))
    b ! Link(Some(a), Some(c))
    c ! Link(Some(b), None)

    import system.dispatcher
    Future{
      for(i <- 1 until 10){
        a ! "hello"
        Thread.sleep(200)
      }
    }

    Future{
      Thread.sleep(300)
      b ! Unlink
    }
  }
}

It's not perfect, but it could serve as a good starting point for you. One downside is that messages like Unlink and ChangeNextNode are still processed in the order they were received. If a bunch of messages sit in front of them in the mailbox, those are processed first, so the change (unlinking, for example) may take effect later than you want. If that's an issue, you might want to look into a priority-based mailbox where messages like Unlink and ChangeNextNode have a higher priority than the rest of the messages being handled.
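
As a rough illustration of the priority-based mailbox idea, something like the following might work. It is only a sketch: the names ChainPriority and ChainControlMailbox are hypothetical, the two message types are stand-ins for the ones in ChainNode (repeated so the block compiles on its own), and it assumes an Akka version that ships UnboundedStablePriorityMailbox, which keeps FIFO order among messages of equal priority.

```scala
import akka.actor.{ActorRef, ActorSystem}
import akka.dispatch.{PriorityGenerator, UnboundedStablePriorityMailbox}
import com.typesafe.config.Config

// Stand-ins for ChainNode.Unlink and ChainNode.ChangeNextNode.
case object Unlink
final case class ChangeNextNode(node: Option[ActorRef])

object ChainPriority {
  // Lower number = handled sooner. Control messages overtake ordinary traffic.
  val priorityOf: Any => Int = {
    case Unlink            => 0
    case _: ChangeNextNode => 0
    case _                 => 1
  }
}

// Akka instantiates mailbox types reflectively, so this constructor shape is required.
class ChainControlMailbox(settings: ActorSystem.Settings, config: Config)
  extends UnboundedStablePriorityMailbox(PriorityGenerator(ChainPriority.priorityOf))

// Wiring (the fully qualified class name depends on your package):
//   application.conf:  chain-control-mailbox { mailbox-type = "ChainControlMailbox" }
//   actor creation:    system.actorOf(Props[NodeA].withMailbox("chain-control-mailbox"))
```

Letting control messages jump the queue trades ordering for latency: as noted in the comments under the question, the downstream node may then briefly receive messages from both the old and the new upstream node.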

cmbaxter
  • You described pretty much the algorithm I had derived when I asked the question. The issue I'm hitting is that when tested under load of hundreds to thousands of messages, the inactive/dead nodes sometimes receive a message or two after they have gotten the PoisonPill. (I made them become inactive instead of dead so I can fail an assertion if they ever get a message.) I suspect that the problem happens when two consecutive nodes both try to unlink themselves at the same time, and send conflicting messages to their upstream and downstream nodes. Perhaps I have over-engineered it. – Daniel Ashton Dec 16 '13 at 21:17
  • Thanks, btw, for writing up the code. I learned a few things just from reading it. I hadn't paid attention to postStop, for example, and your technique of becoming the same state but with different parameters was a nice demonstration of something I had just learned elsewhere. Thanks! – Daniel Ashton Dec 16 '13 at 21:34
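
One way to observe the late messages described in the comment above, without keeping the unlinked nodes alive, might be to listen for dead letters. Akka publishes messages addressed to stopped actors on the system event stream as DeadLetter events, on a best-effort basis. The sketch below is hypothetical: LateMessageListener and its onLate callback are names invented here, not part of the answer.

```scala
import akka.actor.{Actor, ActorRef, ActorSystem, DeadLetter, Props}

// Hypothetical listener: hands every dead letter to a callback, for example
// to log it or to fail a test assertion.
class LateMessageListener(onLate: DeadLetter => Unit) extends Actor {
  def receive: Receive = {
    case dl: DeadLetter => onLate(dl)
  }
}

object LateMessageListener {
  // Creates the listener and subscribes it to DeadLetter events system-wide.
  def install(system: ActorSystem, onLate: DeadLetter => Unit): ActorRef = {
    val listener = system.actorOf(Props(new LateMessageListener(onLate)))
    system.eventStream.subscribe(listener, classOf[DeadLetter])
    listener
  }
}
```

A typical use would be `LateMessageListener.install(system, dl => println(s"late: ${dl.message} for ${dl.recipient.path}"))` before starting the chain.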