
We’re working to take our software to Azure cloud and looking at Orleans and Service Fabric (SF) as potential frameworks. We need to:

  1. Populate our analysis engines with lots of data (e.g., 100MB to 2GB) per engine instance.
  2. Maintain that state, and if an engine instance goes idle for, say, 20 minutes or more, we’d like to unload it (i.e., and stop paying for the engine instance’s resources).
  3. Each engine instance will support one to several end users with a specific data set.
  4. Each engine instance can be highly interactive, generating lots of plot data in near real time. We’re maintaining state because we don’t want to pay the price of populating an engine instance for every interaction.
  5. An engine instance action can take a few seconds, a few minutes, or even tens of minutes. We’ll want some feedback along the way.
  6. Users may access an engine instance every few seconds (e.g., to steer the engine towards a result based on feedback) and will want live plot data.
  7. Each user will want to talk to a specific engine instance.
  8. When a user expresses interest in running a simulation (i.e., standing up an engine instance), we would ideally like them to choose a small/medium/large computing resource to run it on (i.e., based on the problem they’re trying to solve, they may want more or less compute/memory).

We’re considering Orleans and SF, but we’re having difficulty specifying an architecture based on the above requirements. We’ve considered:

  1. Treating an SF partition, or an Orleans silo, as an ‘engine instance’ as described above.
  2. Leveraging both Orleans and SF notion of fault tolerance through replication.
  3. Leveraging local (i.e., to partition or silo) storage to store results and maintain state (i.e., for long periods or until idle for 20 minutes).

We’ve not understood how to:

  1. Limit a silo or a partition to a single engine instance so that we can control resourcing of the engine instance.
  2. Keep a user’s engine instance data separate from another user’s engine instance data.
  3. Direct a request from a user (e.g., through a web API) to a particular engine instance.

Does this make sense for Orleans, or does it make more sense for SF? Any pointers on how to implement the above would be helpful.

1 Answer


When you say SF, I assume you mean SF Actors, right?

You can use them the way you want, but in both cases they do not look like the right solution for your problem, because:

  1. Actors are single-threaded: if you plan to share the same instance with multiple clients, each one has to wait for the previous one to finish before anything else is processed. If you need to monitor the status of a running actor, you have to make the actor publish its updates to external subscribers (see the sketch after this list).
  2. Actor state is isolated, so you can't access the state of other actors directly; the usual way is to provide a method that returns it, but if the actor is busy running a command you have to wait for it to complete, unless you keep a separate state service to hold the processed data.
  3. You can't limit the resources required for an actor. In Service Fabric you specify the resources needed for a service, but you can't do it per actor, and you can't cap the resources they use; when a node hits its limits, Service Fabric will try to rebalance the services for you, but nothing prevents the process from consuming more memory than requested.
  4. Both actor frameworks communicate using the "ask" approach, so they "block" the caller while waiting for an answer; the call is asynchronous, but you still have to keep the caller waiting. (It blocks and waits because there is no fire-and-forget notion like Akka's "tell" approach, which delivers the message and forgets about it.)
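To make the single-threaded point concrete, here is a minimal Orleans-style sketch (the IEngineGrain interface, EngineStatus/SimulationRequest types and helper methods are hypothetical, not part of either framework). While RunAsync is executing, a GetStatusAsync call from the same or another client queues behind it, because a grain processes one request at a time by default:

    using System.Threading.Tasks;

    public class SimulationRequest { public string DataSetId { get; set; } }
    public class EngineStatus { public string Phase { get; set; } = "Idle"; }

    public interface IEngineGrain : Orleans.IGrainWithStringKey
    {
        Task RunAsync(SimulationRequest request);   // long-running "ask" call
        Task<EngineStatus> GetStatusAsync();        // queued behind RunAsync by default
    }

    public class EngineGrain : Orleans.Grain, IEngineGrain
    {
        private readonly EngineStatus status = new EngineStatus();

        public async Task RunAsync(SimulationRequest request)
        {
            // Everything here runs in a single-threaded turn; other calls to this
            // grain wait until the method completes.
            status.Phase = "Running";
            await LoadDataSetAsync(request);   // hypothetical helper: the 100 MB - 2 GB load
            await ComputeAsync(request);       // hypothetical helper: seconds to tens of minutes
            status.Phase = "Done";
        }

        public Task<EngineStatus> GetStatusAsync() => Task.FromResult(status);

        private Task LoadDataSetAsync(SimulationRequest r) => Task.CompletedTask;
        private Task ComputeAsync(SimulationRequest r) => Task.CompletedTask;
    }

A web API would reach a specific engine with client.GetGrain<IEngineGrain>(engineId), which is how Orleans routes a request to a particular instance; the limitation described above is that the frequent GetStatusAsync calls would stall behind a long RunAsync turn unless you publish progress out of the grain (or make it reentrant, which brings its own trade-offs).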

Based on some of your requirements, I think containers would be a better approach, because:

  1. You can limit the resource consumption of each container (a manifest sketch follows this list).
  2. The data is isolated inside the container and not visible to others.
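For example (a sketch only; the service and package names are placeholders), Service Fabric can cap a container's CPU and memory with a resource governance policy in ApplicationManifest.xml, which is roughly what the small/medium/large engine sizes could map onto:

    <!-- Fragment of ApplicationManifest.xml; names are placeholders -->
    <ServiceManifestImport>
      <ServiceManifestRef ServiceManifestName="EnginePkg" ServiceManifestVersion="1.0.0" />
      <Policies>
        <!-- A "large" engine instance: cap the container's code package at 2 GB of memory -->
        <ResourceGovernancePolicy CodePackageRef="Code" CpuShares="1024" MemoryInMB="2048" />
      </Policies>
    </ServiceManifestImport>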

But with containers you have to manage replication and partitioning yourself, so in this case I would recommend the best of both worlds:

  • SF services to host the data sets shared between users.
  • An SF service + actors to store only the results of users' simulations.
  • Containers to run the simulations and send updates to the actors.

This is just an example; it will all depend on your requirements, your architecture, and how the data sets need to be isolated from each other.
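As a rough illustration of the "actor that only stores results" piece (all type and method names here are hypothetical, not framework-provided), a Service Fabric Reliable Actor keyed by simulation id could accept plot updates pushed from the container and serve them back to the web API:

    using System.Collections.Generic;
    using System.Threading;
    using System.Threading.Tasks;
    using Microsoft.ServiceFabric.Actors;
    using Microsoft.ServiceFabric.Actors.Runtime;

    public interface ISimulationResultsActor : IActor
    {
        Task AppendPlotPointsAsync(IReadOnlyList<double[]> points, CancellationToken ct);
        Task<IReadOnlyList<double[]>> GetPlotPointsAsync(CancellationToken ct);
    }

    [StatePersistence(StatePersistence.Persisted)]
    internal class SimulationResultsActor : Actor, ISimulationResultsActor
    {
        public SimulationResultsActor(ActorService actorService, ActorId actorId)
            : base(actorService, actorId) { }

        public async Task AppendPlotPointsAsync(IReadOnlyList<double[]> points, CancellationToken ct)
        {
            // The container running the simulation calls this as results are produced.
            var existing = await StateManager.GetOrAddStateAsync("plot", new List<double[]>(), ct);
            existing.AddRange(points);
            await StateManager.SetStateAsync("plot", existing, ct);
        }

        public async Task<IReadOnlyList<double[]>> GetPlotPointsAsync(CancellationToken ct)
        {
            // The web API polls this to stream near-real-time plot data to the user.
            return await StateManager.GetOrAddStateAsync("plot", new List<double[]>(), ct);
        }
    }

The web API (or the container) would get a proxy with ActorProxy.Create<ISimulationResultsActor>(new ActorId(simulationId), ...), so each user's results stay under their own actor id while the heavy, resource-governed work stays in the container.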

Diego Mendes
  • Thanks very much for your feedback. You make the statement, "in service fabric you specify the resources needed for a service, but you can't do it for actors" – I'm still trying to get my head around whether to use the actor pattern or not. Why use the actor pattern at all then? Can I not use it such that I just create an instance of the service for each user, and they then communicate with the targeted partition? Again, I may be missing something, but threading aside, this is where I was headed. Again, thanks for the feedback and education. – MattWorkWeb Mar 28 '18 at 11:24
  • The actor pattern is a good approach for scenarios where the computation, storage, and communication can be defined in isolation as small units, giving each unit control of its own data and concurrency. Because of these characteristics they are loosely coupled: they can run in parallel without competing for each other's resources, and they can run on the same or another machine without any dependency on each other's state/data. – Diego Mendes Mar 28 '18 at 21:35
  • Scenarios where it is not recommended: long-running operations, because given the single-threaded design these operations block the actor until completion; very big entities (state), where the time to load the state and save it back is very high; and highly concurrent operations, where one operation (+ data) might be requested by many other services/actors but only one can be processed at a time. – Diego Mendes Mar 28 '18 at 21:37