Would storing a rich object as an actor with persistence be a good idea?

Question

If you are familiar with Trello, would storing an entire Trello board as an actor (with Akka Persistence) be a good use case?

A Trello board consists of:

lists
tasks in a list
each task can have comments and other properties

What are the general best practices or considerations when deciding whether Akka Persistence is a good fit for a given problem?

Levi Ramsey · Answer 1 · 2020-09-29T23:01:24.167

1

Any context where event sourcing is a good fit is a good fit for Akka Persistence.

Event sourcing, in turn, is generally applicable. Note that nearly any DB you're using is effectively doing event sourcing internally, just with exceptionally frequent snapshotting, truncation of the event log, and purging of old snapshots.

Event sourcing works really well when you want to explicitly model how entities in your domain change over time: you're effectively defining an algebra of changes. The richer (i.e. the further from just create/update) this model of change is, the more it's a fit for event sourcing. This modeling of change in turn facilitates letting other components of a system only update their state when needed.
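To make the "algebra of changes" idea concrete, here is a minimal sketch in plain Python (not Akka's API). The event names `ListAdded`, `CardAdded`, and `CardMoved`, the `Board` state, and the `apply`/`replay` helpers are all made up for illustration. The point is only that current state is a fold over a log of domain-specific events rather than a row that gets overwritten.

```python
from dataclasses import dataclass, field
from functools import reduce

# Hypothetical events for a Trello-like board; each names one kind of change.
@dataclass(frozen=True)
class ListAdded:
    list_id: str
    title: str

@dataclass(frozen=True)
class CardAdded:
    list_id: str
    card_id: str
    text: str

@dataclass(frozen=True)
class CardMoved:
    card_id: str
    from_list: str
    to_list: str

@dataclass(frozen=True)
class Board:
    # list_id -> tuple of card_ids
    lists: dict = field(default_factory=dict)

def apply(board, event):
    """Fold one event into the state, returning a new Board."""
    lists = dict(board.lists)
    if isinstance(event, ListAdded):
        lists[event.list_id] = ()
    elif isinstance(event, CardAdded):
        lists[event.list_id] = lists[event.list_id] + (event.card_id,)
    elif isinstance(event, CardMoved):
        lists[event.from_list] = tuple(
            c for c in lists[event.from_list] if c != event.card_id
        )
        lists[event.to_list] = lists[event.to_list] + (event.card_id,)
    return Board(lists)

def replay(events):
    """Current state is the left fold of apply over the event log."""
    return reduce(apply, events, Board())

log = [
    ListAdded("todo", "To do"),
    ListAdded("done", "Done"),
    CardAdded("todo", "c1", "Write docs"),
    CardMoved("c1", "todo", "done"),
]
board = replay(log)
```

Replaying a prefix of the log (for example `replay(log[:3])`) gives the board as it was before the move, which is what lets other components react to individual changes instead of diffing whole snapshots.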

Akka Persistence, especially when used with cluster sharding, lets you handle commands/requests without having to read from a DB on every one. You'll read from the DB when bringing back an already persisted actor, but subsequent commands/requests don't require such reads until the actor passivates or dies. The model of parent and child actors in Akka also tends to lead to a natural encoding of many-to-one relationships.
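The read pattern described above can be sketched without any Akka code. Everything here is hypothetical: `Journal` stands in for the event store and counts its reads, `CounterEntity` is a toy persistent entity, and `Registry` is a very rough stand-in for what sharding does when it starts and passivates entities.

```python
class Journal:
    """Stand-in for the event store; counts how often it is read."""
    def __init__(self):
        self.events = {}   # entity_id -> list of events
        self.reads = 0

    def load(self, entity_id):
        self.reads += 1
        return list(self.events.get(entity_id, []))

    def append(self, entity_id, event):
        self.events.setdefault(entity_id, []).append(event)


class CounterEntity:
    """Toy persistent entity: state is rebuilt from the journal once, on start."""
    def __init__(self, entity_id, journal):
        self.entity_id = entity_id
        self.journal = journal
        self.total = 0
        for event in journal.load(entity_id):   # recovery: the only read
            self._apply(event)

    def _apply(self, event):
        self.total += event

    def handle_add(self, amount):
        # Persist the event, then update in-memory state; no read is needed.
        self.journal.append(self.entity_id, amount)
        self._apply(amount)
        return self.total


class Registry:
    """Keeps live entities in memory and can passivate (drop) them."""
    def __init__(self, journal):
        self.journal = journal
        self.live = {}

    def entity(self, entity_id):
        if entity_id not in self.live:
            self.live[entity_id] = CounterEntity(entity_id, self.journal)
        return self.live[entity_id]

    def passivate(self, entity_id):
        self.live.pop(entity_id, None)


journal = Journal()
registry = Registry(journal)
registry.entity("board-1").handle_add(2)
registry.entity("board-1").handle_add(3)
reads_while_live = journal.reads          # only the initial start read the journal
registry.passivate("board-1")
total_after_restart = registry.entity("board-1").handle_add(1)
```

In this sketch the journal is read once per start of the entity, no matter how many commands arrive while it stays in memory.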

In the example of a Trello board, I would probably have:

each board be a persistent actor, which is parent to
lists, which are persistent actors and are each parents to
list items, which are also persistent actors

Depending on how much was under a list item, they might in turn have child persistent actors (for comments, etc.).
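As a rough picture of that hierarchy (plain Python objects, not actors; all class and method names are assumptions for illustration), each child is created through its parent and keeps a fixed reference to it, much like an actor's parent never changes:

```python
class Item:
    def __init__(self, parent, item_id, text):
        self.parent = parent      # fixed at creation, like an actor's parent
        self.item_id = item_id
        self.text = text
        self.comments = []


class BoardList:
    def __init__(self, parent, list_id):
        self.parent = parent
        self.list_id = list_id
        self.items = {}

    def add_item(self, item_id, text):
        # Items are only created through a list, so "every item belongs
        # to exactly one list" holds by construction.
        item = Item(self, item_id, text)
        self.items[item_id] = item
        return item


class Board:
    def __init__(self, board_id):
        self.board_id = board_id
        self.lists = {}

    def add_list(self, list_id):
        board_list = BoardList(self, list_id)
        self.lists[list_id] = board_list
        return board_list


def path(item):
    """Address of an item, loosely analogous to an actor path."""
    board_list = item.parent
    return f"/{board_list.parent.board_id}/{board_list.list_id}/{item.item_id}"


board = Board("board-1")
todo = board.add_list("todo")
card = todo.add_item("c1", "Write docs")
```

In a real system each of these would be its own persistent entity with its own event log; the sketch only shows how the many-to-one relationships fall out of the parent links.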

It's probably worth reading up on domain-driven design. While DDD doesn't require the actor model (nor vice versa), and neither of them requires event sourcing (nor vice versa), I and many others have found that they reinforce each other.

edited Sep 29 '20 at 23:01

answered Sep 29 '20 at 20:27

Levi Ramsey


Thanks for the overview. The only issue with your suggested design is to render a board, it would require so much communication between actors i.e. 1 to get the board, say 5 to get 5 lists, and then if each list has 10 items that's 50 calls. And anything under each list item means 50 x n-things. This really adds up quickly... – Blankman Sep 29 '20 at 21:18

You don't necessarily have to make a separate REST call for each actor (the latency for inter-actor communication is a lot lower than the latency for a REST call). Conversely, though, with a single call to get everything, there's a pretty big chance that you're going to end up requesting far more than you need (the serialization/deserialization time adds up quickly). Another benefit of the child actor approach is the parallelism (this is true whether you directly route REST requests to the children or have the board do a scatter-gather). – Levi Ramsey Sep 29 '20 at 22:59
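A scatter-gather over the children could look roughly like this. It is a sketch using `asyncio` in place of actor messaging; `fetch_item`, `fetch_list`, and `render_board` are hypothetical names, and the `sleep` is only a placeholder for the cost of one ask. The sizes match the example above: 5 lists of 10 items.

```python
import asyncio

async def fetch_item(list_id, n):
    # Stand-in for asking one list-item actor for its state.
    await asyncio.sleep(0.01)
    return f"{list_id}-item{n}"

async def fetch_list(list_id, item_count):
    # Each list asks all of its items concurrently and collects the replies.
    items = await asyncio.gather(
        *(fetch_item(list_id, n) for n in range(item_count))
    )
    return {"list": list_id, "items": list(items)}

async def render_board(list_ids, item_count):
    # The board scatters one request per list and gathers the results.
    return await asyncio.gather(
        *(fetch_list(list_id, item_count) for list_id in list_ids)
    )

list_ids = [f"list{i}" for i in range(5)]
board_view = asyncio.run(render_board(list_ids, 10))
```

There are still 50 item requests, but they overlap instead of running one after another, and the caller makes a single request to the board.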
Hi Levi! Out of curiosity, with the model you proposed, how do you handle moving an item from one list to another? I'm not an expert in the actor model, but changing the parent of an actor looks complex to me. It seems to me that you will need to execute two commands to do it: "remove from the first list" and "add to the second list". Wouldn't it be better to have the list item as a direct child of the board, and the list it is contained in as part of the state (like a property with the id of the list)? – rascio Oct 12 '20 at 15:03
Normally, the parent of an actor doesn't change, so a list item belonging to list A would not strictly be the same list item if it belonged to list B. So command-wise, "move from list A to list B" would be interpreted as something like "propose to seal this list item", "create list item in list B with these properties (one of which is that it was once this item under A)", "seal this list item with a note that it's superseded by the item in list B" (sealing basically being understood as refusing further writes, but still being available for reads). – Levi Ramsey Oct 13 '20 at 11:46
On one level, it's more complex than having the board be composed of items and items refer to lists, but on the other hand, the benefit of leveraging parent-child like that is that a constraint that every item belongs to a list is trivially enforced. There are also the questions of how often you expect a "get me the items in this list" query to be executed relative to "move this item to another list" and how consistent you want that query to be: if it will be executed often and you want strong consistency, the complexity of the move operation gets back to basically what I outlined above. – Levi Ramsey Oct 13 '20 at 11:54
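The three-step move described above could be sketched like this. It is plain Python with hypothetical names (`Item`, `Sealed`, `move`) and it ignores failure handling: a real implementation would need to undo the proposed seal if creating the successor failed.

```python
class Sealed(Exception):
    """Raised when a write reaches an item that has been sealed."""


class Item:
    def __init__(self, list_id, item_id, props, moved_from=None):
        self.list_id = list_id
        self.item_id = item_id
        self.props = dict(props)
        self.moved_from = moved_from    # (list_id, item_id) of the predecessor
        self.superseded_by = None       # (list_id, item_id) of the successor
        self.sealed = False

    def update(self, key, value):
        if self.sealed:
            raise Sealed(f"{self.item_id} refuses further writes")
        self.props[key] = value


def move(item, target_items, target_list_id, new_item_id):
    # Step 1: propose to seal; here that simply blocks further writes.
    item.sealed = True
    # Step 2: create the successor in the target list, recording its origin.
    successor = Item(target_list_id, new_item_id, item.props,
                     moved_from=(item.list_id, item.item_id))
    target_items[new_item_id] = successor
    # Step 3: finalize the seal with a note pointing at the successor.
    item.superseded_by = (target_list_id, new_item_id)
    return successor


list_b_items = {}
old_item = Item("A", "i1", {"title": "Write docs"})
new_item = move(old_item, list_b_items, "B", "i1b")
```

After the move, `old_item` can still be read but rejects `update`, while `new_item` lives under list B and remembers where it came from.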

score 0 · Answer 2 · answered Sep 29 '20 at 16:43

0

It mostly depends on how many writes the app needs to perform.

Akka persistence is an approach to achieve very high write throughput while ensuring the persistence of the data, i.e., if the actor dies and data in memory is lost, it is fine because the write logs are persisted to disk.

If the persistence of the data is necessary but very high write throughput is not required (imagine the app updates the Trello board once per second), then it is totally fine to simply write the data to external storage.
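The "write log persisted to disk" idea can be illustrated with an append-only file. This is only a sketch: `FileJournal`, `recover`, and the `CardRenamed` event are made up, and Akka Persistence itself writes through a journal plugin rather than a local file like this.

```python
import json
import os
import tempfile


class FileJournal:
    """Append-only log of JSON events, one per line."""
    def __init__(self, path):
        self.path = path

    def append(self, event):
        with open(self.path, "a", encoding="utf-8") as f:
            f.write(json.dumps(event) + "\n")
            f.flush()
            os.fsync(f.fileno())   # ask the OS to push the write to disk

    def replay(self):
        if not os.path.exists(self.path):
            return []
        with open(self.path, encoding="utf-8") as f:
            return [json.loads(line) for line in f if line.strip()]


def recover(journal):
    """Rebuild in-memory state (card id -> title) from the log."""
    titles = {}
    for event in journal.replay():
        if event["type"] == "CardRenamed":
            titles[event["card"]] = event["title"]
    return titles


path = os.path.join(tempfile.mkdtemp(), "board-1.journal")
journal = FileJournal(path)
journal.append({"type": "CardRenamed", "card": "c1", "title": "Draft"})
journal.append({"type": "CardRenamed", "card": "c1", "title": "Final"})

# A fresh journal object on the same file simulates a restart after a crash.
state = recover(FileJournal(path))
```

Writes are cheap appends, and losing the in-memory state costs only a replay of the file.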

answered Sep 29 '20 at 16:43

yiksanchan


If there isn't high throughput, is it always a bad idea to use a persistent actor then? What if I just like the simplicity of it and how saving to disk etc. is faster to develop in? – Blankman Sep 29 '20 at 17:05
@Blankman you definitely can, just note that it brings in an extra layer of complexity. If you're generally familiar with Cassandra and event sourcing, feel free to go that route. Just that writing to a SQL database is dead simple and well understood – yiksanchan Sep 29 '20 at 17:20
Can you give me a few real-world examples of where using Akka Persistence would be ideal? – Blankman Sep 29 '20 at 18:27

score 0 · Answer 3 · answered Oct 11 '20 at 15:10

0

would storing an entire Trello board as an actor (with akka persistence) be a good use case

I would say the size of the actor should match the size of an Aggregate Root. Making an entire board an Aggregate Root seems like a very bad choice. It means that all actions on that board are now serialized and none can happen concurrently. Why should changing the description of card #1 conflict with moving card #2 to a different category? Why should creating a new board category conflict with assigning card #3 to someone?

I mean, you could make an entire system a single actor and you wouldn't ever have to care about race conditions, but you'd also kill your scalability...
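A small sketch of the contention argument, framed as optimistic concurrency on an aggregate version. The `Aggregate` class and the keys are hypothetical. With an actor the second command would queue behind the first rather than fail, but it would still have to wait on an unrelated edit.

```python
class ConflictError(Exception):
    pass


class Aggregate:
    """State guarded by a version; a write names the version it was based on."""
    def __init__(self):
        self.version = 0
        self.data = {}

    def write(self, expected_version, key, value):
        if expected_version != self.version:
            raise ConflictError("aggregate changed since it was read")
        self.data[key] = value
        self.version += 1


# One big aggregate: two users read the board at version 0, then both write.
board = Aggregate()
v_alice = board.version
v_bob = board.version
board.write(v_alice, "card1.description", "new text")
try:
    board.write(v_bob, "card2.list", "done")
    big_aggregate_conflict = False
except ConflictError:
    big_aggregate_conflict = True

# One aggregate per card: the same two edits touch different aggregates.
cards = {"card1": Aggregate(), "card2": Aggregate()}
v1 = cards["card1"].version
v2 = cards["card2"].version
cards["card1"].write(v1, "description", "new text")
cards["card2"].write(v2, "list", "done")
```

With the board as a single aggregate, the second edit is rejected even though it touches a different card; with one aggregate per card, both edits go through.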

answered Oct 11 '20 at 15:10

plalx


"Why should changing the description of card #1 conflict with moving card #2 to a different category?" If commands are serialized they will not conflict, they will be executed one after another. Keeping a big data structure under a single "actor" is what Redis does at its core to handle commands (it is single-threaded). – rascio Oct 12 '20 at 14:55
@rascio I meant it in the sense that why would you serialize them if they wouldn't be conflicting operations? Furthermore, all databases have a form of serialized log for all operations (e.g. write ahead log) and that doesn't mean it's wise to replicate that model for business processes... – plalx Oct 12 '20 at 15:10
Ok, now I get what you meant, and yes, it is true. What pushed me to comment is that :) "Making an entire board an Aggregate Root seems like a very bad choice" is not strictly related to it. It is true that each command has to wait, but it gives you better invariants. If you need to constrain the maximum number of tickets for the board you can easily do it with a single aggregate, but splitting it into an aggregate per ticket opens you up to eventual consistency issues when you are creating a new ticket. It is a matter of trading off what you are losing against what you're gaining. – rascio Oct 12 '20 at 15:27
I mean, without any stats on the usage, and without knowing the invariants, I don't think one can say A is better than B, nor B is better than A – rascio Oct 12 '20 at 15:29
@rascio I agree, which is why I used "seems" and not "is" :) Large aggregates usually indicate wrong boundaries, but that doesn't mean they always do. As for strong consistency vs eventual consistency: many rules are artificially forced to be strongly consistent where eventual consistency wouldn't have been a problem for the business. Furthermore, clustering data together to enforce 1% of the rules would also seem like a bad choice. I'd prefer to modify many ARs in a single transaction for that specific use case. Could you justify a large `Board` AR just to enforce a maximum ticket quota? – plalx Oct 12 '20 at 16:58
I would rather have `transaction { ticket = board.addTicket(...) //increment ticket count; save(ticket); save(board); }` when adding/removing tickets and have all the other use cases work with a single AR. Obviously, if most use cases need cross-ticket rules then it's another story. Not sure how that translates to the Actor Model world though hehe. – plalx Oct 12 '20 at 17:01
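For what it's worth, the "modify two ARs in one transaction" approach might look like this with a relational store. This is a sketch using Python's built-in `sqlite3`; the table layout, `MAX_TICKETS`, and `add_ticket` are assumptions, not anything from the thread.

```python
import sqlite3

MAX_TICKETS = 2   # hypothetical quota per board

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE board (id TEXT PRIMARY KEY, ticket_count INTEGER NOT NULL)"
)
conn.execute(
    "CREATE TABLE ticket (id TEXT PRIMARY KEY, board_id TEXT NOT NULL, title TEXT)"
)
conn.execute("INSERT INTO board VALUES ('b1', 0)")
conn.commit()


def add_ticket(conn, board_id, ticket_id, title):
    """Bump the board's counter and insert the ticket in one transaction."""
    with conn:   # commits on success, rolls back if an exception escapes
        cur = conn.execute(
            "UPDATE board SET ticket_count = ticket_count + 1 "
            "WHERE id = ? AND ticket_count < ?",
            (board_id, MAX_TICKETS),
        )
        if cur.rowcount != 1:
            raise ValueError("board missing or ticket quota reached")
        conn.execute(
            "INSERT INTO ticket VALUES (?, ?, ?)", (ticket_id, board_id, title)
        )


add_ticket(conn, "b1", "t1", "first")
add_ticket(conn, "b1", "t2", "second")
try:
    add_ticket(conn, "b1", "t3", "third")
    third_accepted = True
except ValueError:
    third_accepted = False

ticket_total = conn.execute("SELECT COUNT(*) FROM ticket").fetchone()[0]
board_count = conn.execute(
    "SELECT ticket_count FROM board WHERE id = 'b1'"
).fetchone()[0]
```

Only ticket creation and removal touch the board row; every other use case loads and saves a single ticket.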
