Okay, there's a lot to unpack here. First, let's discuss your specific code example. The correct way to write your `responseUsers` handler for Scotty is:

```haskell
responseUsers :: ActionM ()
responseUsers = do
  users <- liftIO getAllUsers   -- assuming getAllUsers :: IO [User]; liftIO lifts it into ActionM
  json users                    -- assuming a ToJSON instance, so we serialize directly instead of via show
```
Even if `getAllUsers` takes a day and a half to run and a hundred clients are all making `getAllUsers` requests at once, nothing else will block, and your Scotty server will continue handling requests. To see this, consider the following server:
```haskell
{-# LANGUAGE OverloadedStrings #-}

import Web.Scotty
import Control.Concurrent
import Control.Monad.IO.Class
import qualified Data.Text.Lazy as T

main :: IO ()
main = scotty 8080 $ do
  get "/fast" $ html "<h1>Fast Response</h1><p>I'm ready!"
  get "/slow" $ liftIO (threadDelay 30000000) >> html "<h1>Slow</h1><p>Whew, finally!"
  get "/pure" $ html $ "<h1>Answer</h1><p>The answer is "
                    <> (T.pack . show . sum $ [1..1000000000])
```
If you compile this and start it up, you can open multiple browser tabs to:
http://localhost:8080/slow
http://localhost:8080/pure
http://localhost:8080/fast
and you'll see that the `fast` link returns immediately, even while the `slow` and `pure` links are blocked on an IO action and a pure computation respectively. (There's nothing special about `threadDelay` -- it could have been any IO action, like accessing a database, reading a big file, or proxying to another HTTP server.) You can keep launching additional requests for `fast`, `slow`, and `pure`, and the slow ones will chug away in the background while the server continues to accept more requests. (The `pure` computation is a little different from the `slow` computation -- because its result is a single shared thunk, it will only block the first time around, all threads waiting on it will get the answer at once, and subsequent requests will be fast. If we tricked Haskell into recomputing it for every request, or if it actually depended on information supplied in the request, as might be the case in a more realistic server, it would act more or less like the `slow` computation, though.)
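To make that concrete, here's a hypothetical variant of the `pure` route where the upper bound of the sum comes from the request (the `/pure/:n` route and the use of `param` are my illustration, not part of the server above). Since every request can ask for a different bound, there's no single shared thunk for GHC to cache, and each request blocks for the duration of its own computation, just like `slow`:

```haskell
{-# LANGUAGE OverloadedStrings #-}

import Web.Scotty
import qualified Data.Text.Lazy as T

main :: IO ()
main = scotty 8080 $
  get "/pure/:n" $ do
    -- the bound now comes from the URL, e.g. /pure/1000000000,
    -- so the sum is recomputed per request rather than shared
    n <- param "n"
    html $ "<h1>Answer</h1><p>The answer is "
        <> (T.pack . show . sum $ [1 .. (n :: Integer)])
```

Even so, each such request blocks only its own green thread; concurrent requests to other routes still return immediately.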
You don't need any kind of callback here, and you don't need the main thread to "wait" on the result. The threads that are forked by Scotty to handle each request can perform whatever computation or IO activity is needed and then return the response to the client directly, without affecting any other threads.
What's more, unless you compile this server with `-threaded` and provide a thread count greater than one (with `+RTS -N<n>` at run time, or baked in at compile time via `-with-rtsopts`), it only runs in one OS thread. So, by default, it's doing all of this in a single OS thread automatically!
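You can check this from inside a program: `getNumCapabilities` from `Control.Concurrent` reports how many OS worker threads ("capabilities") the runtime is using. A minimal sketch:

```haskell
import Control.Concurrent (getNumCapabilities)

main :: IO ()
main = do
  n <- getNumCapabilities
  -- without -threaded, or with -threaded but no -N flag, this prints 1
  putStrLn ("Capabilities (OS worker threads): " ++ show n)
```

Compiled with `ghc -threaded` and run with `+RTS -N4`, the same binary would report 4.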
Second, this isn't actually anything special about Scotty. You should think of the Haskell runtime as providing a threaded abstraction layer on top of the OS thread mechanism, and the OS threads are an implementation detail that you don't have to worry about (well, except in unusual situations, like if you're interfacing with an external library that requires certain things to happen in certain OS threads).
So, all Haskell threads, even the "main" thread, are green, and are run on top of a sort of virtual machine that will run just fine on top of a single OS thread, no matter how many green threads block for whatever reason.
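As a rough illustration (this standalone sketch is mine, not part of the answer above), you can fork thousands of green threads that all block simultaneously, and the runtime handles them without needing an OS thread apiece:

```haskell
import Control.Concurrent
import Control.Monad (replicateM_)

main :: IO ()
main = do
  done <- newEmptyMVar
  let n = 10000
  -- fork 10,000 green threads; all of them block in threadDelay at once
  replicateM_ n $ forkIO $ do
    threadDelay 100000
    putMVar done ()
  -- wait until every thread has signalled completion
  replicateM_ n (takeMVar done)
  putStrLn "all green threads finished"
```

This runs happily even without `-threaded`, i.e. on a single OS thread.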
Therefore, a typical pattern for writing an asynchronous request handler is:
```haskell
loop :: IO ()
loop = do
  req <- getRequest
  _ <- forkIO (handleRequest req)   -- the ThreadId isn't needed here
  loop
```
Note that there's no callback needed here. The `handleRequest` function runs in a separate green thread for each request; that thread can perform long-running pure CPU-bound computations, blocking IO operations, and whatever else is needed, and it doesn't need to communicate the result back to the main thread in order to service the request. It can just send the response to the client directly.
Scotty is basically built around this pattern, so it automatically dispatches multiple requests without requiring callbacks or blocking OS threads.