14

I'm currently writing a Haskell program that involves simulating an abstract machine, which has internal state, takes input and gives output. I know how to implement this using the state monad, which results in much cleaner and more manageable code.

My problem is that I don't know how to pull the same trick when I have two (or more) stateful objects interacting with one another. Below I give a highly simplified version of the problem and sketch out what I have so far.

For the sake of this question, let's assume a machine's internal state consists only of a single integer register, so that its data type is

data Machine = Register Int
        deriving (Show)

(The actual machine might have multiple registers, a program pointer, a call stack etc. etc., but let's not worry about that for now.) After a previous question I know how to implement the machine using the state monad, so that I don't have to explicitly pass its internal state around. In this simplified example the implementation looks like this, after importing Control.Monad.State.Lazy:

addToState :: Int -> State Machine ()
addToState i = do
        (Register x) <- get
        put $ Register (x + i)

getValue :: State Machine Int
getValue = do
        (Register i) <- get
        return i

This allows me to write things like

program :: State Machine Int
program = do
        addToState 6
        addToState (-4)
        getValue

runProgram = evalState program (Register 0)

This adds 6 to the register, then subtracts 4, then returns the result. The state monad keeps track of the machine's internal state so that the "program" code doesn't have to explicitly track it.

In object oriented style in an imperative language, this "program" code might look like

def runProgram(machine):
    machine.addToState(6)
    machine.addToState(-4)
    return machine.getValue()

In that case, if I want to simulate two machines interacting with each other I might write

def doInteraction(machine1, machine2):
    a = machine1.getValue()
    machine1.addToState(-a)
    machine2.addToState(a)
    return machine2.getValue()

which sets machine1's state to 0, adding its value onto machine2's state and returning the result.

My question is simply, what is the paradigmatic way to write this kind of imperative code in Haskell? Originally I thought I needed to chain two state monads, but after a hint by Benjamin Hodgson in the comments I realised I should be able to do it with a single state monad where the state is a tuple containing both machines.

The problem is that I don't know how to implement this in a nice clean imperative style. Currently I have the following, which works but is inelegant and fragile:

interaction :: State (Machine, Machine) Int
interaction = do
        (m1, m2) <- get
        let a = evalState (getValue) m1
        let m1' = execState (addToState (-a)) m1
        let m2' = execState (addToState a) m2
        let result = evalState (getValue) m2'
        put $ (m1',m2')
        return result

doInteraction = runState interaction (Register 3, Register 5)

The type signature interaction :: State (Machine, Machine) Int is a nice direct translation of the Python function declaration def doInteraction(machine1, machine2):, but the code is fragile because I resorted to threading state through the functions using explicit let bindings. This requires me to introduce a new name every time I want to change the state of one of the machines, which in turn means I have to manually keep track of which variable represents the most up-to-date state. For longer interactions this is likely to make the code error-prone and hard to edit.

I expect that the answer will have something to do with lenses. The problem is that I don't know how to run a monadic action on only one of the two machines. The lens library has an operator <<~ whose documentation says "Run a monadic action, and set the target of Lens to its result", but this action gets run in the current monad, where the state has type (Machine, Machine) rather than Machine.

So at this point my question is, how can I implement the interaction function above in a more imperative / object-oriented style, using state monads (or some other trick) to implicitly keep track of the internal states of the two machines, without having to pass the states around explicitly?

Finally, I realise that wanting to write object oriented code in a pure functional language might be a sign that I'm doing something wrong, so I'm very open to being shown another way to think about the problem of simulating multiple stateful things interacting with each other. Basically I just want to know the "right way" to approach this sort of problem in Haskell.

N. Virgo
  • 4
    I found [this page by Gabriel Gonzales](http://www.haskellforall.com/2013/05/program-imperatively-using-haskell.html) which takes a different approach from what I'd imagined. It uses a single state monad where the state is a "universe" containing all the objects, and uses lenses to pick out and modify the objects from within that universe. That idea *seems* to make sense for my application, so I guess I'll work through that if I don't get any answers here. – N. Virgo Oct 02 '16 at 04:29
  • 3
    Yep, the method Gonzales outlined is the way to do it. It's also the most direct translation of your Python. `def doInteraction(machine1, machine2)` becomes `doInteraction :: State (Machine, Machine) Int` – Benjamin Hodgson Oct 02 '16 at 07:22
  • @BenjaminHodgson thank you for your very useful comment - for some reason I hadn't thought of just putting them in a tuple and was imagining something much more complicated. That little nudge sets me in the right direction. – N. Virgo Oct 02 '16 at 15:47
  • @BenjaminHodgson having said that, I've realised I don't know how to implement the body of that function in an imperative style. If you're interested in helping further, please see the edit at the end of my question. – N. Virgo Oct 03 '16 at 01:38
  • Using *lazy* `State` to simulate an abstract machine seems very strange to me. Are you sure you don't want to use the usual "strict" `State`? I put scare quotes around "strict" there because its strictness just comes from performing case analysis on pairs, rather than doing anything weird. Lazy `State` is the weird one. – dfeuer Oct 05 '16 at 02:40
  • @dfeuer I guess I hadn't paid much attention to the difference and assumed lazy would be the default choice. (As I understand it, it means I could give the machine an infinite input stream, and have it poll that stream until it decides to halt, which might be quite useful later on. But maybe I haven't understood that correctly.) If changing to strict will help it's no problem for me to do that. – N. Virgo Oct 05 '16 at 04:11
  • You don't need lazy state to allow the machine to consume input lazily, but rather to produce results lazily (only running the machine long enough to produce the portion of the result that is demanded). Your machine can stop whenever it wants in either case. I'd generally advise you to stick with strict state unless and until you find you actually need lazy state. – dfeuer Oct 05 '16 at 13:06
  • @dfeuer I see, thank you, that's helpful. I might end up needing that one day, e.g. if I want a machine to produce an infinite stream of output to be consumed by another machine, with the second machine deciding when to halt. But I don't need that today, so I'll switch to the strict version. – N. Virgo Oct 05 '16 at 13:38
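
The lazy/strict distinction discussed in these comments can be seen in a small sketch. It assumes only mtl; the names `step`, `runningSums` and `firstFive` are made up for illustration. With the lazy `State`, a finite prefix of an infinite result stream can be consumed:

```haskell
import Control.Monad.State.Lazy (State, evalState, state)

-- Hypothetical one-register machine step: add the input to the register
-- and emit the new register value.
step :: Int -> State Int Int
step x = state (\s -> let s' = s + x in (s', s'))

-- Feed the machine an infinite input stream, starting from register 0.
runningSums :: [Int]
runningSums = evalState (mapM step [1 ..]) 0

-- Only as much of the machine is run as is needed to produce five outputs.
firstFive :: [Int]
firstFive = take 5 runningSums
```

Here `firstFive` evaluates to `[1,3,6,10,15]`. Swapping the import to `Control.Monad.State.Strict` should make the same expression loop forever, because the strict version has to finish running the whole `mapM` before any output is available.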

3 Answers

16

I think good practice would dictate that you should actually make a System data type to wrap your two machines, and then you might as well use lens.

{-# LANGUAGE TemplateHaskell, FlexibleContexts #-}

import Control.Lens
import Control.Monad.State.Lazy

-- With these records, it will be very easy to add extra machines or registers
-- without having to refactor any of the code that follows
data Machine = Machine { _register :: Int } deriving (Show)
data System = System { _machine1, _machine2 :: Machine } deriving (Show)

-- This is some TemplateHaskell magic that makes special `register`, `machine1`,
-- and `machine2` functions.
makeLenses ''Machine
makeLenses ''System


doInteraction :: MonadState System m => m Int
doInteraction = do
    a <- use (machine1.register)
    machine1.register -= a
    machine2.register += a
    use (machine2.register)

Also, just to test this code, we can check at GHCi that it does what we want:

ghci> runState doInteraction (System (Machine 3) (Machine 4))
(7,System {_machine1 = Machine {_register = 0}, _machine2 = Machine {_register = 7}})

Advantages:

  • By using records and lens, there will be no refactoring if I decide to add extra fields. For example, say I want a third machine, then all I do is change System:

    data System = System
      { _machine1, _machine2, _machine3 :: Machine } deriving (Show)
    

    But nothing else in my existing code will change - just now I will be able to use machine3 like I use machine1 and machine2.

  • By using lens, I can scale more easily to nested structures. Note that I just avoided the very simple addToState and getValue functions completely. Since a Lens is actually just a function, machine1.register is just regular function composition. For example, let's say I want a machine to now have an array of registers, then getting or setting particular registers is still simple. We just modify Machine and doInteraction:

    import Data.Array.Unboxed (UArray)
    data Machine = Machine { _registers :: UArray Int Int } deriving (Show)
    
    -- code snipped
    
    doInteraction2 :: MonadState System m => m Int
    doInteraction2 = do
        -- The lazy patterns (~) keep these binds from needing a MonadFail
        -- instance on newer GHCs; a missing index errors when the value is used.
        ~(Just a) <- preuse (machine1.registers.ix 2) -- get 3rd reg on machine1
        machine1.registers.ix 2 -= a                  -- modify 3rd reg on machine1
        machine2.registers.ix 1 += a                  -- modify 2nd reg on machine2
        ~(Just b) <- preuse (machine2.registers.ix 1) -- get 2nd reg on machine2
        return b
    

    Note that this is equivalent to having a function like the following in Python:

    def doInteraction2(machine1,machine2):
      a = machine1.registers[2]
      machine1.registers[2] -= a
      machine2.registers[1] += a
      b = machine2.registers[1]
      return b
    

    You can again test this out on GHCi:

    ghci> import Data.Array.IArray (listArray)
    ghci> let regs1 = listArray (0,3) [0,0,6,0]
    ghci> let regs2 = listArray (0,3) [0,7,3,0]
    ghci> runState doInteraction2 (System (Machine regs1) (Machine regs2))
    (13,System {_machine1 = Machine {_registers = array (0,3) [(0,0),(1,0),(2,0),(3,0)]}, _machine2 = Machine {_registers = array (0,3) [(0,0),(1,13),(2,3),(3,0)]}})
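
To illustrate the remark that a lens is just a function and that `machine1.register` is plain function composition, here is a minimal sketch that needs only base. The `Lens` synonym is a simplified stand-in for the lens library's type, the three lenses are hand-written versions of what `makeLenses` would generate, and `viewL`, `overL` and `moveRegister` are hypothetical names chosen for this example:

```haskell
{-# LANGUAGE RankNTypes #-}

import Data.Functor.Const (Const (..))
import Data.Functor.Identity (Identity (..))

-- Simplified lens type (an assumption for illustration; the real library
-- is more general).
type Lens s a = forall f. Functor f => (a -> f a) -> s -> f s

data Machine = Machine { _register :: Int } deriving (Show, Eq)
data System = System { _machine1, _machine2 :: Machine } deriving (Show, Eq)

-- Hand-written lenses for each field.
register :: Lens Machine Int
register f (Machine r) = Machine <$> f r

machine1 :: Lens System Machine
machine1 f (System a b) = (\a' -> System a' b) <$> f a

machine2 :: Lens System Machine
machine2 f (System a b) = System a <$> f b

-- Read the focused value by choosing the Const functor.
viewL :: Lens s a -> s -> a
viewL l = getConst . l Const

-- Modify the focused value by choosing the Identity functor.
overL :: Lens s a -> (a -> a) -> s -> s
overL l g = runIdentity . l (Identity . g)

-- Ordinary (.) composes lenses to reach a nested field.
moveRegister :: System -> System
moveRegister sys =
  let a = viewL (machine1 . register) sys
      sys' = overL (machine1 . register) (subtract a) sys
  in overL (machine2 . register) (+ a) sys'
```

With this sketch, `moveRegister (System (Machine 3) (Machine 4))` gives a system whose registers are 0 and 7, matching the `doInteraction` result shown earlier.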
    

EDIT

The OP has specified that he would like to have a way of embedding a State Machine a into a State System a. lens, as always, has such a function if you go digging deep enough. zoom (and its sibling magnify) provide facilities for "zooming" out/in of State/Reader (it only makes sense to zoom out of State and magnify into Reader).

Then, if we want to implement doInteraction while keeping as black boxes getValue and addToState, we get

getValue :: State Machine Int
addToState :: Int -> State Machine ()

doInteraction3 :: State System Int
doInteraction3 = do
  a <- zoom machine1 getValue     -- call `getValue` with state `machine1`
  zoom machine1 (addToState (-a)) -- call `addToState (-a)` with state `machine1` 
  zoom machine2 (addToState a)    -- call `addToState a` with state `machine2`
  zoom machine2 getValue          -- call `getValue` with state `machine2`

Notice however that if we do this we really must commit to a particular state monad transformer (as opposed to the generic MonadState), since not all ways of storing state are going to be necessarily "zoomable" in this way. That said, RWST is another state monad transformer supported by zoom.
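
To make concrete what `zoom` does for plain `State`, here is a minimal sketch that uses only mtl and no lens. The helpers `zoomWith`, `onFirst` and `onSecond` are hypothetical names (they are not part of lens or mtl), and the state is the tuple from the question rather than the `System` record:

```haskell
import Control.Monad.State.Strict (State, gets, modify, runState, state)

data Machine = Register Int deriving (Show, Eq)

-- The single-machine actions from the question, treated as black boxes.
addToState :: Int -> State Machine ()
addToState i = modify (\(Register x) -> Register (x + i))

getValue :: State Machine Int
getValue = gets (\(Register x) -> x)

-- Hypothetical helper: run an action on a part `t` of a larger state `s`,
-- given a getter and a setter for that part, then write the part back.
zoomWith :: (s -> t) -> (t -> s -> s) -> State t a -> State s a
zoomWith getter setter action = state $ \s ->
  let (a, t') = runState action (getter s)
  in (a, setter t' s)

onFirst :: State a r -> State (a, b) r
onFirst = zoomWith fst (\a (_, b) -> (a, b))

onSecond :: State b r -> State (a, b) r
onSecond = zoomWith snd (\b (a, _) -> (a, b))

interaction :: State (Machine, Machine) Int
interaction = do
  a <- onFirst getValue
  onFirst (addToState (-a))
  onSecond (addToState a)
  onSecond getValue
```

Running `runState interaction (Register 3, Register 5)` yields `(8,(Register 0,Register 8))`. A lens packages the getter/setter pair into one value, which is roughly why `zoom` can take a single lens argument.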

Alec
  • Thanks, this is close to what I want to do. I've been learning lenses and I've got almost as far as this. The place I'm stuck is that where you write `machine2.register += a`, I want to write something like `machine2 ~. addToState a`, where `~.` would basically do `runState addToState [value pointed at by the lens]`, then replace the value pointed at by the lens with the new state, and return the resulting value. The reason is that in reality I don't just want to add and subtract from registers, but to perform complicated actions that both change the state of a Machine and return a value. – N. Virgo Oct 05 '16 at 02:35
  • @Nathaniel Does [`<~`](http://hackage.haskell.org/package/lens-4.14/docs/Control-Lens-Setter.html#v:-60--126-) do what you want by any chance? You could then do something like `machine1.register <~ randomIO`.... If this is not what you mean, could you give a more precise example? – Alec Oct 05 '16 at 02:52
  • No, because `<~` runs the action in the current monad, and thus expects `addToState` to have type `State System Int` instead of `State Machine Int`. I want to be able to call `addToState`, implemented as in my question, from inside the `doInteraction` function, and have it update only one of the two Machines. I will think about how to update the question to explain this more clearly, and I'll let you know when I've done that. – N. Virgo Oct 05 '16 at 03:59
  • Maybe this is a less confusing way to put it: imagine I wanted to call `runProgram` on just one of the Machines in the System, and that I wanted not only for this to update that Machine's state, but that I also wanted to get the result that the program returns. How would I do that? – N. Virgo Oct 05 '16 at 04:07
  • @Nathaniel I've edited my answer for what I think you are looking for. I am curious to see if anyone else has a simpler way... – Alec Oct 05 '16 at 07:15
  • Great, that seems to be exactly what I wanted. I'm not sure I understand the implications of not being able to use MonadState but for now it doesn't seem to be a problem. (I will accept after some time.) – N. Virgo Oct 05 '16 at 08:06
  • @Nathaniel The implications are that your code is less generic. For example, say that in another function you realize that you need to be in a monad that can write too, then now you need to go back and change this to `RWST` (as opposed to just adding a `MonadReader` constraint on the new function and switching to `RWST` only when you run the monad). Also, I'd prefer if you _not_ accept sooner. I'm actually very curious about this problem and would like to see what other more experienced Haskellers have to say. – Alec Oct 05 '16 at 16:06
  • Ok, I'll hold off on accepting for a while in that case, and we'll see if anyone else chimes in. I'm very grateful for your help! – N. Virgo Oct 05 '16 at 16:20
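
The genericity point made in these comments can be sketched with mtl alone. The function `transfer` below is a hypothetical stand-in for an interaction, working on a pair of `Int` counters instead of the `System` record. Because it only asks for a `MonadState` constraint, the same definition runs in plain `State` and in `RWS`:

```haskell
{-# LANGUAGE FlexibleContexts #-}

import Control.Monad.RWS.Strict (RWS, runRWS)
import Control.Monad.State.Strict (MonadState, State, get, put, runState)

-- Hypothetical interaction: move the first counter onto the second and
-- return the new total. It names no concrete monad, only a constraint.
transfer :: MonadState (Int, Int) m => m Int
transfer = do
  (a, b) <- get
  put (0, a + b)
  return (a + b)

-- Run it in plain State...
inState :: (Int, (Int, Int))
inState = runState transfer (3, 5)

-- ...or in RWS (unit environment, list-of-strings log) without changing
-- the definition of transfer.
inRWS :: (Int, (Int, Int), [String])
inRWS = runRWS transfer () (3, 5)
```

A function written directly against `State System` (as the `zoom` version is) would need its type changed before it could be used in the second setting.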
5

One option is to make your state transformations into pure functions operating on Machine values:

getValue :: Machine -> Int
getValue (Register x) = x

addToState :: Int -> Machine -> Machine
addToState i (Register x) = Register (x + i)

Then you can lift them into State as needed, writing State actions on multiple machines like so:

doInteraction :: State (Machine, Machine) Int
doInteraction = do
  a <- gets $ getValue . fst
  modify $ first $ addToState (-a)
  modify $ second $ addToState a
  gets $ getValue . snd

Where first (resp. second) is a function from Control.Arrow, used here with the type:

(a -> b) -> (a, c) -> (b, c)

That is, it modifies the first element of a tuple.

Then runState doInteraction (Register 3, Register 5) produces (8, (Register 0, Register 8)) as expected.

(In general I think you could do this sort of “zooming in” on subvalues with lenses, but I’m not really familiar enough to offer an example.)
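
For convenience, the pieces of this answer can be assembled into one self-contained file, as a sketch. The imports, the `Eq` instance and the name `result` are additions made here, and the strict `State` is an arbitrary choice (the lazy one should behave the same for this example):

```haskell
import Control.Arrow (first, second)
import Control.Monad.State.Strict (State, gets, modify, runState)

data Machine = Register Int deriving (Show, Eq)

-- Pure operations on a single machine.
getValue :: Machine -> Int
getValue (Register x) = x

addToState :: Int -> Machine -> Machine
addToState i (Register x) = Register (x + i)

-- first/second apply a pure function to one component of the pair;
-- gets/modify lift the pure functions into State.
doInteraction :: State (Machine, Machine) Int
doInteraction = do
  a <- gets (getValue . fst)
  modify (first (addToState (-a)))
  modify (second (addToState a))
  gets (getValue . snd)

result :: (Int, (Machine, Machine))
result = runState doInteraction (Register 3, Register 5)
```

Evaluating `result` in GHCi prints `(8,(Register 0,Register 8))`.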

Jon Purdy
4

You could also use Gabriel Gonzales' Pipes library for the case you've illustrated. The tutorial for the library is one of the best pieces of Haskell documentation in existence.

Below is a simple example (untested).

import Control.Monad (forever)
import Control.Monad.State.Strict (evalStateT, get, put)
import Pipes
import qualified Pipes.Prelude as P

-- machine 1 adds its input to current state
machine1 :: Monad m => Pipe Int Int m ()
machine1 = flip evalStateT 0 $ forever $ do
               -- gets pipe input (await runs in the underlying Pipe, so lift it)
               a <- lift await
               -- get current local state
               s <- get
               -- <whatever>
               let r = a + s
               -- update state
               put r
               -- fire down pipeline
               lift (yield r)

-- machine 2 multiplies its input by current state
-- (the state starts at 1 here; starting from 0 would make every product 0)
machine2 :: Monad m => Pipe Int Int m ()
machine2 = flip evalStateT 1 $ forever $ do
               -- gets pipe input
               a <- lift await
               -- get current local state
               s <- get
               -- <whatever>
               let r = a * s
               -- update state
               put r
               -- fire down pipeline
               lift (yield r)

You can then combine using the >-> operator. An example would be to run

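-- stdinLn yields Strings, so parse on the way in and show on the way out
-- (assumes `import qualified Pipes.Prelude as P`)
run :: IO ()
run = runEffect $ P.stdinLn >-> P.map read >-> machine1 >-> machine2 >-> P.map show >-> P.stdoutLn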

Note that it is possible, although a little more involved, to have bi-directional pipes, which give you communication in both directions between the machines. Using other libraries in the pipes ecosystem, you can also have asynchronous pipes to model non-deterministic or parallel operation of machines.

I believe the same can be achieved with the conduit library, but I don't have much experience with it.

OllieB
  • This looks very nice. It seems to correspond more to coroutines than to the more object-based model I had in mind. (Roughly speaking, the difference is that in a coroutine it's the stateful object's choice what to do next, whereas in OOP that choice is made by the calling function.) Coroutines are a super-useful feature that's lacking in most imperative languages, so I'm very happy to know this exists in Haskell. – N. Virgo Oct 11 '16 at 04:58
  • Yes, its a different model. It allows you to write the machines behaviour independently, and model their interactions by combining them, rather than writing the program at the top level. This may not be what you were after of course, but its nice it exists. I find it useful for doing stateful DSP pipelines, because it allows you to focus on only 1 algorithm at a time. – OllieB Oct 11 '16 at 11:16