
I have an Actor system that is processing a continuous stream of messages from an external system. I have the following actors in my system.

  1. SubscribeActor - this actor subscribes to a Redis channel, creates a new InferenceActor for each message, and passes the JSON payload to it (a simplified sketch follows this list).
  2. InferenceActor - this actor is responsible for 2a. parsing the payload and extracting some text values from the JSON, and 2b. calling an external REST service, passing it the values extracted in 2a. The REST service is deployed on a different node in the LAN and does a fair bit of heavy lifting in terms of computation.
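
For context, the hand-off in 1. looks roughly like this (a simplified sketch: the Redis subscription wiring is omitted, and JsonMsg is the message class used by InferenceActor below):

import akka.actor.{Actor, Props}

// Simplified sketch: spawn one InferenceActor per incoming payload and forward the JSON to it.
class SubscribeActor extends Actor {
  def receive = {
    case payload: String =>
      val worker = context.actorOf(Props[InferenceActor])
      worker ! JsonMsg(payload)
  }
}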

The external REST service in 2b is invoked using a Spray client. I tested the system and it works fine up to 2a. However, as soon as I introduce 2b, I start to get OutOfMemory errors and the system eventually comes to a halt.

Currently, I have two primary suspects:

  1. Design flaw - the way I'm using the Spray client inside my actor system is not correct (I'm new to Spray).
  2. Performance issues due to the latency caused by the slow REST service.

Before I move on to #2, I want to make sure that I'm using the Spray client correctly, especially when calling it from other actors. My question: is the usage below correct, incorrect, or suboptimal?

Here is the code of the REST client that invokes the service.

import scala.concurrent.Future
import akka.actor.ActorSystem
import spray.client.pipelining._
import spray.http._

trait GeoWebClient {
  def get(url: String, params: Map[String, String]): Future[String]
}

class GeoSprayWebClient(implicit system: ActorSystem) extends GeoWebClient {

  import system.dispatcher

  // create a function from HttpRequest to a Future of HttpResponse
  val pipeline: HttpRequest => Future[HttpResponse] = sendReceive

  // send a GET request and map the response entity to a String
  def get(url: String, params: Map[String, String]): Future[String] = {
    val uri = Uri(url) withQuery params
    val request = Get(uri)
    val futureResponse = pipeline(request)
    futureResponse.map(_.entity.asString)
  }
}

And here is the code for InferenceActor that invokes the service above.

import akka.actor.{Actor, ActorLogging}
import scala.concurrent.Future
import scala.util.{Failure, Success}

class InferenceActor extends Actor with ActorLogging with ParseUtils {

  val system = context.system
  import system.dispatcher
  val restServiceClient = new GeoSprayWebClient()(system)

  def receive = {

    case JsonMsg(s) => {

      // first parse the message and extract the text to send to the REST service
      val text: Option[String] = parseAndExtractText(s) // defined in ParseUtils trait
      log.info(s"extracted text $text")

      def sendReq(text: String) = {
        val params = Map("text" -> text)
        // send GET request with absolute URI
        restServiceClient.get("http://myhost:9191/infer", params)
      }

      val f: Option[Future[String]] = text.map(x => sendReq(x))

      // log the result when the Future completes. NOTE: I commented this block out without any other change.
      /* f.foreach { r => r.onComplete {
        case Success(response) => log.debug("*********************" + response)
        case Failure(error) => log.info("An error has occurred: " + error.getMessage)
      }
      }
      */
      context stop self
    }
  }
}
Soumya Simanta
  • Have you looked at a memory dump to find out which kind of objects are causing the OOM situation? Try `jmap -histo:live <pid>` for a quick overview while your server is still running, or use Eclipse MAT for an after-the-fact analysis. – jrudolph Aug 15 '14 at 16:29

1 Answer


If your second piece of code is blocking, as you say it is, try wrapping that future in another future running on a dedicated dispatcher, as described in the Akka documentation section "Blocking Needs Careful Management".

That should limit the amount of resources those requests can consume.
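
Something like this as a rough sketch (the dispatcher name "blocking-io-dispatcher" and its configuration below are assumptions, not something from your setup):

import akka.actor.ActorSystem
import scala.concurrent.Future

object BlockingWrapExample {

  def callService(text: String, client: GeoWebClient)(implicit system: ActorSystem): Future[String] = {
    // dedicated dispatcher, assumed to be configured in application.conf as:
    //   blocking-io-dispatcher {
    //     type = Dispatcher
    //     executor = "thread-pool-executor"
    //     thread-pool-executor { fixed-pool-size = 16 }
    //   }
    implicit val blockingDispatcher = system.dispatchers.lookup("blocking-io-dispatcher")

    // wrap the call in another Future so any blocking happens on the dedicated pool,
    // then flatten the nested Future[Future[String]] back to Future[String]
    Future(client.get("http://myhost:9191/infer", Map("text" -> text)))
      .flatMap(identity)
  }
}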

Although it looks like it would be easier to move the `text.map` call into a different actor.

raam86