
I'm trying to implement a client-server application via websockets and I have several doubts about how to do it correctly to maintain the state of every connected client.

one global machine + an object for each connection?
or a machine + object pair for each connection?

So I began with several tests to check how it works concurrently.

The base machine:

from transitions.extensions import AsyncMachine
import asyncio


class AsyncModel:
    def __init__(self, id_):
        self.req_id = id_

    # callbacks receive an EventData argument because send_event=True
    async def prepare_model(self, _):
        print("prepare_model", self.req_id)

    async def before_change(self, _):
        print("before_change", self.req_id)

    async def after_change(self, _):
        print("After change", self.req_id)


transition = dict(trigger="start", source="Start", dest="Done",
                  prepare="prepare_model",
                  before=["before_change"],
                  after="after_change")

and I tried several ways of running it.

I want all models to change their state at the same time.


async def main():
    tasks = []
    machine = AsyncMachine(model=None,
                           states=["Start", "Done"],
                           transitions=[transition],
                           initial='Start',
                           send_event=True,
                           queued=True)
    for i in range(3):
        model = AsyncModel(id_=i)
        machine.add_model(model)

        tasks.append(model.start())

    await asyncio.gather(*tasks)

    for m in list(machine.models):  # copy the list: removing while iterating skips items
        machine.remove_model(m)

asyncio.run(main())

but the output is:


prepare_model 0
before_change 0
After change 0
prepare_model 1
before_change 1
After change 1
prepare_model 2
before_change 2
After change 2

If I create a machine + model pair for each connection:


async def main():
    tasks = []

    for i in range(3):
        model = AsyncModel(id_=i)
        machine = AsyncMachine(model=model,
                               states=["Start", "Done"],
                               transitions=[transition],
                               initial='Start',
                               send_event=True,
                               queued=True)


        tasks.append(model.start())

    await asyncio.gather(*tasks)

the output is:

prepare_model 0
prepare_model 1
prepare_model 2
before_change 0
before_change 1
before_change 2
After change 0
After change 1
After change 2

What's the correct way?

UPDATE

I want to have a contextvar available for each running model, to be able to correctly log all activity from other modules which the model calls, without explicitly passing some identifier to each external function call (outside the model class).
See some kind of example: https://pastebin.com/qMfh0kNb. It does not work as expected, the assert fires.
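The logging goal described above can be sketched with the standard library alone. This is a hypothetical illustration, not the pastebin code: a contextvar carries a per-model request id, and a logging filter injects it into every record, so external helpers never receive the id as an argument (the names `request_id`, `RequestIdFilter` and `external_helper` are made up for the sketch):

```python
import contextvars
import logging

# contextvar holding the id of the model currently "owning" this context
request_id = contextvars.ContextVar("request_id", default="-")

class RequestIdFilter(logging.Filter):
    def filter(self, record):
        # inject the current context's id into every log record
        record.request_id = request_id.get()
        return True

logger = logging.getLogger("app")
logger.addFilter(RequestIdFilter())

def external_helper():
    # no id parameter needed; the filter pulls it from the context
    logger.warning("doing work")
```

With a formatter such as `"%(request_id)s %(message)s"`, every line logged from any module called inside the model's context is automatically tagged.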

Eugene
    Welcome to StackOverflow! It might not be obvious to you, but your question is very hard to follow. It's unclear what "machine" refers to, what the code does (you never show the definition of `add_model`, for example) or what kind of output you **expect** for each snippet. I suggest that you edit the answer to: 1) trim unnecessary code but make examples runnable, 2) clearly denote what output you expect (and why), and 3) provide a clear and answerable question. Your existing question, "what's the correct way?" is vague and unclear unless one already understands what you've set out to do. – user4815162342 Nov 07 '20 at 13:57

1 Answer


A common answer to the question "What is the right way?" is "Well, it depends...". Without a clear idea of what you want to achieve, I can only answer the generic questions I can identify in your post.

With transitions, should I use ONE machine for EACH model or ONE machine for ALL models?

When using transitions, it's the model that is stateful AND contains the transition callbacks. The machine acts as a kind of 'rulebook'. Thus, when machines have an identical configuration, I would recommend using ONE machine for ALL models for most use cases. Using multiple machines with the same configuration just increases the memory footprint and code complexity. Off the top of my head, I can think of one use case where having multiple machines with identical configurations might be useful. But first you might wonder why both versions behave differently even though I just said it should make no difference.
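The split can be pictured without the library at all: the shared machine is essentially a read-only transition table, while each connection only carries its own state field. A minimal stdlib sketch of that idea (all names here, `RULES`, `Connection`, `trigger`, are hypothetical and only illustrate the pattern):

```python
# One shared, immutable 'rulebook'; many lightweight stateful objects.
RULES = {("Start", "start"): "Done"}  # (source state, trigger) -> dest

class Connection:
    def __init__(self, conn_id):
        self.conn_id = conn_id
        self.state = "Start"  # the per-object state is all that varies

def trigger(conn, event):
    dest = RULES.get((conn.state, event))
    if dest is None:
        return False  # no matching transition from the current state
    conn.state = dest
    return True

conns = [Connection(i) for i in range(3)]
for c in conns:
    trigger(c, "start")
```

Adding a connection is cheap because only the small state holder is created; the rulebook is shared, which is exactly why duplicating identically configured machines mostly wastes memory.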

Why are callbacks called in a different order when using one AsyncMachine vs many AsyncMachines?

Without custom parameters, using one AsyncMachine or many AsyncMachines makes no difference. However, you passed queued=True in the constructor, which according to the documentation does this:

If queued processing is enabled, a transition will be finished before the next transition is triggered

This is why your single machine will consider one transition at a time, processing all callbacks of ONE model before shifting to the next event/transition. Since every machine has its own event/transition queue, events are processed instantly when using multiple machines: passing queued=True has no effect in your example with multiple machines. You can get the same behaviour with one machine by not passing the queued parameter or by passing queued=False (the default value). I adapted your example a bit for illustration:

from transitions.extensions import AsyncMachine
import asyncio


class AsyncModel:
    def __init__(self, id_):
        self.req_id = id_

    async def prepare_model(self):
        print("prepare_model", self.req_id)

    async def before_change(self):
        print("before_change", self.req_id)

    async def after_change(self):
        print("after change", self.req_id)


transition = dict(trigger="start", source="Start", dest="Done",
                  prepare="prepare_model",
                  before="before_change",
                  after="after_change")

models = [AsyncModel(i) for i in range(3)]


async def main(queued):
    machine = AsyncMachine(model=models,
                           states=["Start", "Done"],
                           transitions=[transition],
                           initial='Start',
                           queued=queued)

    await asyncio.gather(*[model.start() for model in models])
    # alternatively you can dispatch an event to all models of a machine by name
    # await machine.dispatch("start")

print(">>> Queued=True")
asyncio.run(main(queued=True))
print(">>> Queued=False")
asyncio.run(main(queued=False))

So it depends on what you need. With ONE machine, you can have both -- sequential processing of events with queued=True or parallel processing with queued=False.

You mentioned there is one use case where multiple machines might be necessary...

In the documentation there is this passage:

You should consider passing queued=True to the TimeoutMachine constructor. This will make sure that events are processed sequentially and avoid asynchronous racing conditions that may appear when timeout and event happen in close proximity.

When using timeout events or other events that occur in close succession, there can be racing conditions when multiple transitions on the same model are processed simultaneously. So if this issue affects your use case AND you need parallel processing of transitions on separate models, having multiple machines with identical configurations could be a solution.
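Such a race can be reproduced with the standard library alone. In this made-up sketch (the `Model` class and its `transition` method are not transitions API, just stand-ins), a regular event and a 'timeout' both check the state, yield control as awaiting callbacks would, and then write:

```python
import asyncio

class Model:
    def __init__(self):
        self.state = "Start"
        self.log = []

    async def transition(self, source, dest, name):
        if self.state == source:
            self.log.append(f"{name} saw {source}")
            await asyncio.sleep(0)  # callback work yields control here
            self.state = dest
            self.log.append(f"{name} set {dest}")

async def main():
    m = Model()
    # a 'timeout' and a regular event fire in close proximity on one model
    await asyncio.gather(m.transition("Start", "Done", "event"),
                         m.transition("Start", "Timeout", "timeout"))
    return m

m = asyncio.run(main())
# both transitions observed 'Start' before either wrote, so the final
# state depends on scheduling order; a queue (queued=True) prevents this
```

Both transitions pass their source-state check before either writes, which is precisely the kind of interleaving queued processing rules out.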

How to work with contexts in AsyncMachine?

This is thin ice for me and I might be incorrect. I can try to give a brief summary of my current understanding of why things behave a certain way. Consider this example:

from transitions.extensions import AsyncMachine
import asyncio
import contextvars

context_model = contextvars.ContextVar('model', default=None)
context_message = contextvars.ContextVar('message', default="unset")

def process():
    model = context_model.get()
    print(f"Processing {model.id} Request {model.count} => '{context_message.get()}'")


class Model:

    def __init__(self, id):
        self.id = id
        self.count = 0

    def request(self):
        self.count += 1
        context_message.set(f"Super secret request from {self.id}")

    def nested(self):
        context_message.set(f"Not so secret message from {self.id}")
        process()


models = [Model(i) for i in range(3)]


async def model_loop(model):
    context_model.set(model)
    context_message.set(f"Hello from the model loop of {model.id}")
    while model.count < 3:
        await model.loop()


async def main():
    machine = AsyncMachine(model=models, initial='Start', transitions=[['loop', 'Start', '=']],
                           before_state_change='request',
                           after_state_change=[process, 'nested'])
    await asyncio.gather(*[model_loop(model) for model in models])

asyncio.run(main())

Output:

# Processing 0 Request 1 => 'Hello from the model loop of 0'
# Processing 0 Request 1 => 'Not so secret message from 0'
# Processing 1 Request 1 => 'Hello from the model loop of 1'
# Processing 1 Request 1 => 'Not so secret message from 1'
# Processing 2 Request 1 => 'Hello from the model loop of 2'
# Processing 2 Request 1 => 'Not so secret message from 2'
# Processing 0 Request 2 => 'Hello from the model loop of 0'
# Processing 0 Request 2 => 'Not so secret message from 0'
# Processing 1 Request 2 => 'Hello from the model loop of 1'
# Processing 1 Request 2 => 'Not so secret message from 1'
# Processing 2 Request 2 => 'Hello from the model loop of 2'
# Processing 2 Request 2 => 'Not so secret message from 2'
# Processing 0 Request 3 => 'Hello from the model loop of 0'
# Processing 0 Request 3 => 'Not so secret message from 0'
# Processing 1 Request 3 => 'Hello from the model loop of 1'
# Processing 1 Request 3 => 'Not so secret message from 1'
# Processing 2 Request 3 => 'Hello from the model loop of 2'
# Processing 2 Request 3 => 'Not so secret message from 2'

Event triggering has been moved into per-model loops, which set two context variables. Both are used by process, a global function that relies on context variables. When a transition is triggered, Model.request is called right before the transition and increases Model.count. After Model.state has been changed, the global function process and Model.nested are called.

process is called twice: once from the model loop's context and once from the Model.nested callback. The context_message altered in Model.request is not accessible, but changes made in Model.nested are available to process. How's that? process and Model.request share the same parent context (Model can retrieve the current value of context_message), but when Model sets the variable, the new value only exists in its local context, which is not accessible by the later call to process made from another callback. If you want local changes to be visible to process, you need to call it FROM the callback, as done in Model.nested.

Long story short: callbacks for AsyncMachine share the same parent context but cannot access each other's local contexts, so setting a context variable in one callback has no effect on another. However, when the context variable holds a reference (like context_model), mutations of the referenced object are visible in other callbacks.
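The same isolation can be demonstrated with plain asyncio and no state machine at all: a task created by gather() runs in a COPY of the current context, so it sees the parent's value, but its own set() never propagates back:

```python
import asyncio
import contextvars

ctx_msg = contextvars.ContextVar("msg", default="unset")
seen = []

async def child():
    seen.append(ctx_msg.get())      # inherits the parent's value
    ctx_msg.set("set in child")     # only visible inside this task's copy

async def parent():
    ctx_msg.set("set in parent")
    await asyncio.gather(child())   # child runs in a context snapshot
    seen.append(ctx_msg.get())      # the child's set() is invisible here

asyncio.run(parent())
```

Note that awaiting the coroutine directly (await child() instead of gather) would keep it in the same context, and the set() WOULD be visible afterwards; the copy happens at task creation.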

Working with transitions' event queues (queued=True) while relying on contextvars needs some extra consideration since, as the documentation states, "when processing events in a queue, the trigger call will always return True, since there is no way to determine at queuing time whether a transition involving queued calls will ultimately complete successfully. This is true even when only a single event is processed." A triggered event might only be added to the queue; right after that, its context is left before the event has actually been processed. If you need queued processing AND contextvars AND also cannot call functions from INSIDE model callbacks, you should have a look at asyncio.Lock and wrap your call to loop, but leave queued=False so that trigger calls do not return before they are done.

aleneum
  • Thanks for the answer! First of all, my idea was to use contextvars with AsyncMachine, to have each model running in its own context with its own contextvars available. But I could not achieve it. I have tried different ways of creating models/machine/setting values to contextvars; nothing helped. I think there should be some way to do this.. maybe like LockedMachine does? – Eugene Nov 08 '20 at 09:42
  • @Eugene: I added a paragraph about `contextvar` and `AsyncMachine`. Whether async (`AsyncMachine`) or threads (`LockedMachine`) fits your needs better, I cannot tell. In both scenarios it should be possible to share context variables. When it comes to parallel processing -- async or threaded -- it's a good idea to get a deep understanding of what's going on. Throwing code at a problem until it works (for now) might cause hard to debug issues in the future. – aleneum Nov 09 '20 at 09:06
  • I have added some kind of example of the functionality I'm trying to build: https://pastebin.com/qMfh0kNb but it does not work as expected. Btw, I have also updated the main question with the main purpose. – Eugene Nov 09 '20 at 10:13
  • In the documentation it says: "Important note: when processing events in a queue, the trigger call will always return True, since there is no way to determine at queuing time whether a transition involving queued calls will ultimately complete successfully. This is true even when only a single event is processed." When you call loop with `queued=True` in another context, the trigger will be added to the event queue and the call returns/exits the context before it is actually executed. In other words: A queued function might not be executed in the context it was triggered. – aleneum Nov 09 '20 at 13:18
  • I have another question about queued state processing: is it possible to make a separate queue for each model added to the state machine, so that states are processed in parallel for each running model and not sequentially for all? – Eugene Dec 09 '20 at 11:35