Clean and type-safe state machine implementation in a statically typed language?

Question

I implemented a simple state machine in Python:

import time

def a():
    print "a()"
    return b

def b():
    print "b()"
    return c

def c():
    print "c()"
    return a


if __name__ == "__main__":
    state = a
    while True:
        state = state()
        time.sleep(1)

I wanted to port it to C, because it wasn't fast enough. But C doesn't let me make a function that returns a function of the same type. I tried making the function of this type: typedef *fn(fn)(), but it doesn't work, so I had to use a structure instead. Now the code is very ugly!

#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>

typedef struct fn {
    struct fn (*f)(void);
} fn_t;

fn_t a(void);
fn_t b(void);
fn_t c(void);

fn_t a(void)
{
    fn_t f = {b};

    (void)printf("a()\n");

    return f;
}

fn_t b(void)
{
    fn_t f = {c};

    (void)printf("b()\n");

    return f;
}

fn_t c(void)
{
    fn_t f = {a};

    (void)printf("c()\n");

    return f;
}

int main(void)
{
    fn_t state = {a};

    for(;; (void)sleep(1)) state = state.f();

    return EXIT_SUCCESS;
}

So I figured it's a problem with C's broken type system. So I used a language with a real type system (Haskell), but the same problem happens. I can't just do something like:

type Fn = IO Fn
a :: Fn
a = print "a()" >> return b
b :: Fn
b = print "b()" >> return c
c :: Fn
c = print "c()" >> return a

I get the error, Cycle in type synonym declarations.

So I have to make some wrapper the same way I did for the C code like this:

import Control.Monad
import System.Posix

data Fn = Fn (IO Fn)

a :: IO Fn
a = print "a()" >> return (Fn b)

b :: IO Fn
b = print "b()" >> return (Fn c)

c :: IO Fn
c = print "c()" >> return (Fn a)

run = foldM (\(Fn f) () -> sleep 1 >> f) (Fn a) (repeat ())

Why is it so hard to make a state machine in a statically typed language? I have to make unnecessary overhead in statically typed languages as well. Dynamically typed languages don't have this problem. Is there an easier way to do it in a statically typed language?

This is a simplified version for exposition of the idiom I am using to implement my state machine. — megazord, Oct 08 '11 at 21:43
You can use `newtype Fn = Fn (IO Fn)` instead. Unlike the `data` definition, this has zero overhead. The reason why cycles are reported as errors is because (1) they usually *are* errors, (2) they make it difficult to print good error messages, (3) they make type checking much more difficult (4) I'm sure there are more reasons. — Dietrich Epp, Oct 08 '11 at 22:03
By the way, you don’t need to explicitly cast function calls to `void` to discard their return values; that happens automatically. — Jon Purdy, Oct 08 '11 at 22:05
I am aware of that I do it because I want to show my intent to discard the value. — megazord, Oct 08 '11 at 22:11

score 22 · Answer 1 · answered Oct 08 '11 at 22:26

In Haskell, the idiom for this is just to go ahead and execute the next state:

type StateMachine = IO ()
a, b, c :: StateMachine
a = print "a()" >> b
b = print "b()" >> c
c = print "c()" >> a

You need not worry that this will overflow a stack or anything like that. If you insist on having states, then you should make the data type more explicit:

data PossibleStates = A | B | C
type StateMachine = PossibleStates -> IO PossibleStates
machine A = print "a()" >> return B
machine B = print "b()" >> return C
machine C = print "c()" >> return A

You can then get compiler warnings about any StateMachine that forgot some states.

pat · Accepted Answer · 2011-10-09T19:31:30.090

15

If you use newtype instead of data, you don't incur any overhead. Also, you can wrap each state's function at the point of definition, so the expressions that use them don't have to:

import Control.Monad

newtype State = State { runState :: IO State }

a :: State
a = State $ print "a()" >> return b

b :: State
b = State $ print "b()" >> return c

c :: State
c = State $ print "c()" >> return a

runMachine :: State -> IO ()
runMachine s = runMachine =<< runState s

main = runMachine a

Edit: it struck me that runMachine has a more general form; a monadic version of iterate:

iterateM :: Monad m => (a -> m a) -> a -> m [a]
iterateM f a = do { b <- f a
                  ; as <- iterateM f b
                  ; return (a:as)
                  }

main = iterateM runState a

Edit: Hmm, iterateM causes a space-leak. Maybe iterateM_ would be better.

iterateM_ :: Monad m => (a -> m a) -> a -> m ()
iterateM_ f a = f a >>= iterateM_ f

main = iterateM_ runState a

Edit: If you want to thread some state through the state machine, you can use the same definition for State, but change the state functions to:

a :: Int -> State
a i = State $ do{ print $ "a(" ++ show i ++ ")"
                ; return $ b (i+1)
                }

b :: Int -> State
b i = State $ do{ print $ "b(" ++ show i ++ ")"
                ; return $ c (i+1)
                }

c :: Int -> State
c i = State $ do{ print $ "c(" ++ show i ++ ")"
                ; return $ a (i+1)
                }

main = iterateM_ runState $ a 1

edited Oct 09 '11 at 19:31

answered Oct 08 '11 at 22:30

pat

12,587
1
23
52

interesting, though you still have O(n) wrapping – megazord Oct 08 '11 at 22:51
1

The wrapping is there to keep the type-checker happy, but there is no runtime overhead. In general, there may be a many-to-many relationship between states. Wrapping at the point of definition is O(n), wrapping at the point of use could be O(N^2). – pat Oct 08 '11 at 22:59
touche ----------------------------------- – megazord Oct 08 '11 at 23:08
1

"`iterateM` is not tail-recursive, so it causes a space-leak". No, tail recursion has nothing to do with it. If I'm not mistaken, it causes a space leak because you are collecting a huge list full of `()`. – Dan Burton Oct 09 '11 at 03:07
1

@Dan it's collecting a huge list of the visited `State`s, but I was hoping that it would be recognized that the result is not used, and would be garbage collected as it was created. I guess I'm confused about tail recursion inside the `Monad`. Doesn't `IO` sequence the recursive call to `iterateM`, and only perform the `(:)` after it produces a result (which is never)? – pat Oct 09 '11 at 05:01
1

What would this look like if the State of each subsequent state depended on the previous state? Such as if you wanted it to print a(1) b(2) c(3) etc... – zmanian Oct 09 '11 at 19:01
1

@zmanian see update to see how to thread additional state through the state machine – pat Oct 09 '11 at 19:32

score 11 · Answer 3 · edited Oct 09 '11 at 16:48

11

In the C-like type systems functions are not first order citizens. There are certain restrictions on handling them. That was a decision for simplicity and speed of implementation/execution that stuck. To have functions behave like objects, one generally requires support for closures. Those however are not naturally supported by mosts processors' instruction sets. As C was designed to be close to the metal, there was no support for them.

When declaring recursive structures in C, the type must be fully expandable. A consequence of this is, that you can only have pointers as self-references in struct declarations:

struct rec;
struct rec {
    struct rec *next;
};

Also every identifier we use has to be declared. One of the restrictions of function-types is, that one can not forward declare them.

A state machine in C usually works by making a mapping from integers to functions, either in a switch statement or in a jump table:

typedef int (*func_t)();

void run() {
    func_t table[] = {a, b, c};

    int state = 0;

    while(True) {
        state = table[state]();
    }
}

Alternatively you could profile your Python code and try to find out why your code is slow. You can port the critical parts to C/C++ and keep using Python for the state machine.

edited Oct 09 '11 at 16:48

Peter Mortensen

30,738
21
105
131

answered Oct 08 '11 at 21:57

ebo

8,985
3
31
37

3

I think that was the problem, trying to do something in a way idiomatic in one language and applying it in exactly the same way to another language where it isn't. They are usually not 1-1. – Jeff Mercado Oct 08 '11 at 22:01
This is even worse than my solution. Functions are "first class" citizens in Haskell, and my Haskell implementation has the exact problem as the C implementation. So I don't think your first sentence is relevant. – megazord Oct 08 '11 at 22:03
1

Can't talk for Haskell. But that's why it does not work the same way in Python and C. – ebo Oct 08 '11 at 22:05
It has nothing to do with whether functions are "first class". This is an issue with the type system restricting the kind of code I can write. – megazord Oct 08 '11 at 22:09
If functions behaved like structs, you could forward declare a function-type and use it for the final declaration... – ebo Oct 08 '11 at 22:13
I don't think you understand the problem. – megazord Oct 08 '11 at 22:18
4

You are trying to force a duck-typing solution on a statically typed language, which obviously does not work? C does not allow that, for historic reasons. I gave an example of the standard way to implement a lean and fast state machine in C? – ebo Oct 08 '11 at 22:23
"C does not allow that, for historic reasons." ? I understand it could be because I'm trying to do sometihng that works in dynamic languages, but your reason why it doesn't work is obviously false. – megazord Oct 08 '11 at 22:27
@ebo: It's not a duck-typing solution megazord is using. In the original Python version, all of the functions just have an intuitive type something like "function -> function". Whereas in C and Haskell, the type of a function *contains* the types of its arguments and return value, which makes a function that returns a function of its own type inexpressible in the type system. Duck typing is all about objects *informally* conforming to undeclared interfaces; nothing at all related to this problem. – Ben Oct 09 '11 at 00:15

score 11 · Answer 4 · answered Oct 08 '11 at 22:07

11

The problem with your Haskell code is, that type only introduces a synonym, which is quite similar to what typedef in C does. One important restriction is, that the expansion of the type must be finite, you can't give a finite expansion of your state machine. A solution is using a newtype: A newtype is a wrapper that does only exist for the type checker; there is absolutely zero overhead (excluded stuff that occurs because of generalization that can't be removed). Here is your signature; it typechecks:

newtype FN = FN { unFM :: (IO FN) }

Please note, that whenever you want to use an FN, you have to unpack it first using unFN. Whenever you return a new function, use FN to pack it.

answered Oct 08 '11 at 22:07

fuz

88,405
25
200
352

I know about newtype (though I didn't think of using it to eliminate the overhead here). But then I still have to go around wrapping/unwrapping in *every state*, unlike in a dynamically typed language. – megazord Oct 08 '11 at 22:13
1

@megazord: You can just define a helper `next = return . Fn` to eliminate the repetition. – hammar Oct 08 '11 at 22:17
12

@megazord: Because Haskell isn't Python. Type safety comes at _some_ cost, but then again it's a compile time error if you accidentally return an integer or a string as the next state, or forget to return a next state at all. Your Python code will not detect that until run time. – hammar Oct 08 '11 at 22:23

score 6 · Answer 5 · answered Oct 09 '11 at 04:45

As usual, despite the great answers already present, I couldn't resist trying it out for myself. One thing that bothered me about what is presented is that it ignores input. State machines--the ones that I am familiar with--choose between various possible transitions based on input.

data State vocab = State { stateId :: String
                         , possibleInputs :: [vocab]
                         , _runTrans :: (vocab -> State vocab)
                         }
                      | GoalState { stateId :: String }

instance Show (State a) where
  show = stateId

runTransition :: Eq vocab => State vocab -> vocab -> Maybe (State vocab)
runTransition (GoalState id) _                   = Nothing
runTransition s x | x `notElem` possibleInputs s = Nothing
                  | otherwise                    = Just (_runTrans s x)

Here I define a type State, which is parameterized by a vocabulary type vocab. Now let's define a way that we can trace the execution of a state machine by feeding it inputs.

traceMachine :: (Show vocab, Eq vocab) => State vocab -> [vocab] -> IO ()
traceMachine _ [] = putStrLn "End of input"
traceMachine s (x:xs) = do
  putStrLn "Current state: "
  print s
  putStrLn "Current input: "
  print x
  putStrLn "-----------------------"
  case runTransition s x of
    Nothing -> putStrLn "Invalid transition"
    Just s' -> case s' of
      goal@(GoalState _) -> do
        putStrLn "Goal state reached:"
        print s'
        putStrLn "Input remaining:"
        print xs
      _ -> traceMachine s' xs

Now let's try it out on a simple machine that ignores its inputs. Be warned: the format I have chosen is rather verbose. However, each function that follows can be viewed as a node in a state machine diagram, and I think you'll find the verbosity to be completely relevant to describing such a node. I've used stateId to encode in string format some visual information about how that state behaves.

data SimpleVocab = A | B | C deriving (Eq, Ord, Show, Enum)

simpleMachine :: State SimpleVocab
simpleMachine = stateA

stateA :: State SimpleVocab
stateA = State { stateId = "A state. * -> B"
               , possibleInputs = [A,B,C]
               , _runTrans = \_ -> stateB
               }

stateB :: State SimpleVocab
stateB = State { stateId = "B state. * -> C"
               , possibleInputs = [A,B,C]
               , _runTrans = \_ -> stateC
               }

stateC :: State SimpleVocab
stateC = State { stateId = "C state. * -> A"
               , possibleInputs = [A,B,C]
               , _runTrans = \_ -> stateA
               }

Since the inputs don't matter for this state machine, you can feed it anything.

ghci> traceMachine simpleMachine [A,A,A,A]

I won't include the output, which is also very verbose, but you can see it clearly moves from stateA to stateB to stateC and back to stateA again. Now let's make a slightly more complicated machine:

lessSimpleMachine :: State SimpleVocab
lessSimpleMachine = startNode

startNode :: State SimpleVocab
startNode = State { stateId = "Start node. A -> 1, C -> 2"
                  , possibleInputs = [A,C]
                  , _runTrans = startNodeTrans
                  }
  where startNodeTrans C = node2
        startNodeTrans A = node1

node1 :: State SimpleVocab
node1 = State { stateId = "node1. B -> start, A -> goal"
              , possibleInputs = [B, A]
              , _runTrans = node1trans
              }
  where node1trans B = startNode
        node1trans A = goalNode

node2 :: State SimpleVocab
node2 = State { stateId = "node2. C -> goal, A -> 1, B -> 2"
              , possibleInputs = [A,B,C]
              , _runTrans = node2trans
              }
  where node2trans A = node1
        node2trans B = node2
        node2trans C = goalNode

goalNode :: State SimpleVocab
goalNode = GoalState "Goal. :)"

The possible inputs and transitions for each node should require no further explanation, as they are verbosely described in the code. I'll let you play with traceMachine lessSipmleMachine inputs for yourself. See what happens when inputs is invalid (does not adhere to the "possible inputs" restrictions), or when you hit a goal node before the end of input.

I suppose the verbosity of my solution sort of fails what you were basically asking, which was to cut down on the cruft. But I think it also illustrates how descriptive Haskell code can be. Even though it is very verbose, it is also very straightforward in how it represents nodes of a state machine diagram.

I can't remember why I modelled my state machine differently, I'm a bit drunk at the moment (Thanksgiving), I'll get back to you. loooks nice for something with more features tohugh — megazord, Oct 09 '11 at 18:45
"I think it also illustrates how descriptive Haskell code can be" Indeed, I could tell what this does just by looking at the type `State`. I believe this could be represented as an `Arrow`. — djenga49, Oct 09 '11 at 19:02

score 4 · Answer 6 · edited Oct 12 '11 at 22:46

Iit's not hard to make state machines in Haskell, once you realize that they are not monads! A state machine like the one you want is an arrow, an automaton arrow to be exact:

newtype State a b = State (a -> (b, State a b))

This is a function, which takes an input value and produces an output value along with a new version of itself. This is not a monad, because you cannot write join or (>>=) for it. Equivalently once you have turned this into an arrow you will realize that it's impossible to write an ArrowApply instance for it.

Here are the instances:

import Control.Arrow
import Control.Category
import Prelude hiding ((.), id)

instance Category State where
    id = State $ \x -> (x, id)

    State f . State g =
        State $ \x ->
            let (y, s2) = g x
                (z, s1) = f y
            in (z, s1 . s2)

instance Arrow State where
    arr f = let s = State $ \x -> (f x, s) in s
    first (State f) =
        State $ \(x1, x2) ->
            let (y1, s) = f x1
            in ((y1, x2), first s)

Have fun.

score 3 · Answer 7 · edited May 13 '12 at 13:14

3

You can get the same effect in C as in Python code,- just declare that functions return (void*):

#include "stdio.h"

typedef void* (*myFunc)(void);

void* a(void);
void* b(void);
void* c(void);

void* a(void) {
    printf("a()\n");
    return b;
}

void* b(void) {
    printf("b()\n");
    return c;
}

void* c(void) {
    printf("c()\n");
    return a;
}

void main() {
    void* state = a;
    while (1) {
        state = ((myFunc)state)();
        sleep(1);
    }
}

edited May 13 '12 at 13:14

Peter Mortensen

30,738
21
105
131

answered Oct 08 '11 at 23:56

Agnius Vasiliauskas

10,935
5
50
70

2

I came up with the same thing, but I thought the whole point was to not subvert the type checker by using casts! – pat Oct 09 '11 at 00:01
@pat That was my reasoning also. – megazord Oct 09 '11 at 00:03
It seems to me that you @megazord simply don't know what you want... If you really want some solution most similar to your Python code - then `void*` is the only way to go. Otherwise - change formulation of your question to reflect your real problem. – Agnius Vasiliauskas Oct 09 '11 at 00:16
4

@0x69, megazord has several C solutions, but the point of the question was whether there is a clean type-safe solution. The struct wrapper solution is type-safe, but not clean. Your solution is clean but not type-safe. – pat Oct 09 '11 at 00:19
4

It's not like the Python version is type-safe. Isn't the requirement here speed, not necessarily type-safety? – Zach Oct 09 '11 at 17:56

score 3 · Answer 8 · answered Oct 09 '11 at 13:20

What you want is a recursive type. Different languages have different ways of doing this.

For example, in OCaml (a statically-typed language), there is an optional compiler/interpreter flag -rectypes that enables support for recursive types, allowing you to define stuff like this:

let rec a () = print_endline "a()"; b
and b () = print_endline "b()"; c
and c () = print_endline "c()"; a
;;

Although this is not "ugly" as you complained about in your C example, what happens underneath is still the same. The compiler simply worries about that for you instead of forcing you to write it out.

As others have pointed out, in Haskell you can use newtype and there won't be any "overhead". But you complain about having to explicitly wrap and unwrap the recursive type, which is "ugly". (Similarly with your C example; there is no "overhead" since at the machine level a 1-member struct is identical to its member, but it is "ugly".)

Another example I want to mention is Go (another statically-typed language). In Go, the type construct defines a new type. It is not a simple alias (like typedef in C or type in Haskell), but creates a full-fledged new type (like newtype in Haskell) because such a type has an independent "method set" of methods that you can define on it. Because of this, the type definition can be recursive:

type Fn func () Fn

IMO this is the best answer to "why is it so hard to implement a state machine using functions in statically typed languages?": It actually isn't! Type inference gives you much of the goodness of dynamic typing *and* confidence in your code at compile time. — Joh, Oct 11 '11 at 08:47

score 2 · Answer 9 · edited May 23 '17 at 11:50

Your problem has been had before: Recursive declaration of function pointer in C

C++ operator overloading can be used to hide the mechanics of what is essentially the same as your your C and Haskell solutions, as Herb Sutter describes in GotW #57: Recursive Declarations.

struct FuncPtr_;
typedef FuncPtr_ (*FuncPtr)();

struct FuncPtr_
{
  FuncPtr_( FuncPtr pp ) : p( pp ) { }
  operator FuncPtr() { return p; }
  FuncPtr p;
};

FuncPtr_ f() { return f; } // natural return syntax

int main()
{
  FuncPtr p = f();  // natural usage syntax
  p();
}

But this business with functions will, in all likelihood, perform worse than the equivalent with numeric states. You should use a switch statement or a state table, because what you really want in this situation is a structured semantic equivalent to goto.

that's the same thing with an abuse of syntax. almost as bad as casting — djenga49, Oct 09 '11 at 00:20
It's like a super concentrated liquid form of evil. But it works. — megazord, Oct 09 '11 at 00:56

Joh · Answer 10 · 2011-10-12T13:51:40.340

An example in F#:

type Cont = Cont of (unit -> Cont)

let rec a() =
    printfn "a()"
    Cont (fun () -> b 42)

and b n =
    printfn "b(%d)" n
    Cont c

and c() =
    printfn "c()"
    Cont a

let rec run (Cont f) =
    let f = f()
    run f

run (Cont a)

Regarding the question "why is it so hard to implement state machines using functions in statically typed languages?": That's because the type of of a and friends is a little bit weird: a function that when returns a function that returns a function that returns a function...

If I remove Cont from my example the F# compiler complains and says:

Expecting 'a but given unit -> 'a. The resulting type would be infinite when unifying 'a and unit -> 'a.

Another answer shows a solution in OCaml whose type inference is strong enough to remove the need for declaring Cont, which shows static typing is not to blame, rather the lack of powerful type inference in many statically typed languages.

I don't know why F# doesn't do it, I would guess maybe this would make the type inference algorithm more complicated, slower, or "too powerful" (it could manage to infer the type of incorrectly typed expressions, failing at a later point giving error messages that are hard to understand).

Note that the Python example you gave isn't really safe. In my example, b represents a family of states parameterized by an integer. In an untyped language, it's easy to make a mistake and return b or b 42 instead of the correct lambda and miss that mistake until the code is executed.

This is identical to my C code. Of course it's shorter because it has type inference and is functional. But my Haskell code is pretty much the same size, but safer due to monadic IO. — megazord, Oct 12 '11 at 04:50
Not identical to your C (or Python) code. See `b` and my remark about family of states. If you use functions to encode your state machine, it's only natural to try to take advantage of parameters to limit the number of states. — Joh, Oct 12 '11 at 13:52

djenga49 · Answer 11 · 2011-10-08T23:40:19.550

-6

The Python code you posted will be transformed into a recursive function, but it will not be tail call optimized because Python has no tail call optimization, so it will stack overflow at some point. So the Python code is actually broken, and would take more work to get it as good as the Haskell or C versions.

Here is an example of what I mean:

so.py:

import threading
stack_size_bytes = 10**5
threading.stack_size(10**5)
machine_word_size = 4

def t1():
    print "start t1"
    n = stack_size_bytes/machine_word_size
    while n:
        n -= 1
    print "done t1"

def t2():
    print "start t2"
    n = stack_size_bytes/machine_word_size+1
    while n:
        n -= 1
    print "done t2"

if __name__ == "__main__":
    t = threading.Thread(target=t1)
    t.start()
    t.join()
    t = threading.Thread(target=t2)
    t.start()
    t.join()

shell:

$ python so.py
start t1
done t1
start t2
Exception in thread Thread-2:
Traceback (most recent call last):
  File "/usr/lib/python2.7/threading.py", line 530, in __bootstrap_inner
    self.run()
  File "/usr/lib/python2.7/threading.py", line 483, in run
    self.__target(*self.__args, **self.__kwargs)
  File "so.py", line 18, in t2
    print "done t2"
RuntimeError: maximum recursion depth exceeded

edited Oct 08 '11 at 23:40

answered Oct 08 '11 at 23:10

djenga49

52
4

Incorrect. The Python functions return the next function to call rather than calling them themselves, so tail call optimization is irrelevant here. – hammar Oct 08 '11 at 23:14
why -1? This is how Python works, if any loop exceeds stack_size_bytes/machine_word_size, you get a stack overflow. You have to be careful in this paradigm. This is why it has built in hash tables, because they don't need as many iterations. – djenga49 Oct 08 '11 at 23:15
@hammar, that is in fact recursion – djenga49 Oct 08 '11 at 23:16
1

`a` returns `b` **not** `b()`, big difference! – pat Oct 08 '11 at 23:18
@pat it doesn't matter, any loop gets compiled to recursive function no matter what – djenga49 Oct 08 '11 at 23:20
@hammar: the code wont overflow because it will take forever due to sleep/print overhead, I'm writing a better example – djenga49 Oct 08 '11 at 23:28
@djenga49 Are you saying that `for i in [1..10]: print i` will be compiled into a recursive function?? – pat Oct 08 '11 at 23:29
@djenga49: Runs without crashing, as expected. (Python 2.7.1, Ubuntu 11.04) – hammar Oct 08 '11 at 23:53
1

@djenga49 I'm running Python 2.7.2 on OSX, and it didn't like 10**5 for a stack size. I change to 2**15, and even changed `machine_word_size` to 1, and it still works fine: `$ python so.py start t1 done t1 start t2 done t2` – pat Oct 08 '11 at 23:57
@djenga: AMD Phenom(tm) II N620 Dual-Core. Also tried increasing the number of iterations. Still no overflow. – hammar Oct 09 '11 at 00:04
2

@djenga49 If what you're saying were generally true, it would be **impossible** to write Python programs that run indefinitely, as all unbounded iteration would inevitably lead to unbounded recursion. This is plainly not the case! – Ben Oct 09 '11 at 00:05
1

@Ben, any function can be represented as a finite map, and thus a hash table. This is why Python is so centered around hash tables. – djenga49 Oct 09 '11 at 00:07
@djenga49 are you talking about memoization as a means to avoid unbounded recursion? – pat Oct 09 '11 at 00:13
@djenga39: Python is so centred around hash tables because it uses them to implement dictionaries, and its class/object system is little more than syntactic sugar for dictionary manipulation. They have nothing to do with its implementation of functions, or of loops. And while you can represent any *mathematical* function with a finite domain as a finite map, Python functions are not mathematical functions: there is no finite map representation of `current_time()`. – Ben Oct 09 '11 at 00:20
@Ben, current_time returns a pure value that is consumed by the IO monad. – djenga49 Oct 09 '11 at 00:23
2

@djenga49 Not in Python it doesn't. Python loops are **not** implemented by transforming them to recursive functions which are then somehow memoized by treating impure operations as monadic computations which can be returned and then executed by IO monad driver (where on earth did you get that idea from?). They are implemented by directly executing each byte code operation in turn, some of which are backwards conditional branches. – Ben Oct 09 '11 at 00:32
1

@djenga49 And I've just executed your code verbatim, and it runs fine. If this really does crash for you with the given error, I find it curious that it happens on the `print` line, after the loop you say causes a stack overflow. Try to give us an example of this "loop with no calls or memory allocation causes stack overflow" behaviour that **doesn't** rely on threading and changing the stack size. It won't take too long; a simple `def foo(): return foo()` dies in less than a second on my system, so it's not like you need to prune it back for a fast fail. – Ben Oct 09 '11 at 00:35
It still crashes for me. The statically typed version is better after all. – megazord Oct 09 '11 at 00:35
OK, what version of Python/OS/CPU causes the stack overflow? – pat Oct 09 '11 at 00:38
@megazord Does your original state machine implementation crash with a stack overflow? I just copied your example code, took out the prints and sleeps, and it's been running the whole time I've been typing this note. Constant memory usage. – Ben Oct 09 '11 at 00:39
@megazord what is still crashing for you? Your original state machine python code, or djenga49's while-loop-as-recursion python code? If the statically typed version is working, it's probably because it's written in a different language, not because it is statically typed! – pat Oct 09 '11 at 00:42
@Ben, the code transformation is hidden from the user, so it tries to link up the original source with the transformed souce as good as possible. I don't see how it isn't crashing for some of you, maybe it's a different version of python or different processor. I'm running Gentoo with kernel 2.6.39-gentoo-r3 32-bit. Maybe has something to do with PAE or PSE, etc. The Python thread.stack_size isn't exactly reliable also. – djenga49 Oct 09 '11 at 00:43
@megazord oh, so the statically typed version of the state-machine code is better than djenga49's while-loop-as-recursion code. – pat Oct 09 '11 at 00:43
@djenga49 how did you come by your knowledge of python internals? I'd like to read about why imperative loops are converted to recursion with memoization. – pat Oct 09 '11 at 00:47
4

@djenga49 So show us a `while True` loop that causes this behaviour. There's no need to get a loop that cleverly executes one too many times, an infinite loop would demonstrate your point much more elegantly. While you're at it, try `from dis import dis; dis(t2)` on your strange system; on mine you can plainly see the backwards jump operation used to implement the loop. – Ben Oct 09 '11 at 00:49

Clean and type-safe state machine implementation in a statically typed language?

11 Answers11

Linked