Okay, there's a lot to unpack here. First, let's discuss your specific code example. The correct way to write your `responseUsers` handler for Scotty is:

```haskell
responseUsers :: ActionM ()
responseUsers = do
  users <- liftIO getAllUsers   -- assuming getAllUsers :: IO [User]; liftIO lifts it into ActionM
  json users                    -- assuming a ToJSON instance, so we serialize directly instead of via show
```
Even if `getAllUsers` takes a day and a half to run and a hundred clients are all making `getAllUsers` requests at once, nothing else will block, and your Scotty server will continue handling requests. To see this, consider the following server:
```haskell
{-# LANGUAGE OverloadedStrings #-}

import Web.Scotty
import Control.Concurrent
import Control.Monad.IO.Class
import qualified Data.Text.Lazy as T

main :: IO ()
main = scotty 8080 $ do
  get "/fast" $ html "<h1>Fast Response</h1><p>I'm ready!"
  get "/slow" $ liftIO (threadDelay 30000000) >> html "<h1>Slow</h1><p>Whew, finally!"
  get "/pure" $ html $ "<h1>Answer</h1><p>The answer is "
                    <> (T.pack . show . sum $ [1..1000000000])
```
If you compile this and start it up, you can open multiple browser tabs to:
http://localhost:8080/slow
http://localhost:8080/pure
http://localhost:8080/fast
and you'll see that the `fast` link returns immediately, even while the `slow` and `pure` links are blocked on an IO action and a pure computation respectively. (There's nothing special about `threadDelay` -- it could have been any IO action, like accessing a database, reading a big file, or proxying to another HTTP server.) You can keep launching additional requests for `fast`, `slow`, and `pure`, and the slow ones will chug away in the background while the server continues to accept more requests. (The `pure` computation is a little different from the `slow` computation -- because its result is a single shared thunk, it will only block the first time around, all threads waiting on it will get the answer at once, and subsequent requests will be fast. If we tricked Haskell into recomputing it for every request, or if it actually depended on information supplied in the request, as might be the case in a more realistic server, it would act more or less like the `slow` computation, though.)
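To make that concrete, here's a hypothetical variant of the `pure` route where the upper bound of the sum comes from the request (the `/pure/:n` route and the use of `param` are my illustration, not part of the server above). Since every request can ask for a different bound, there's no single shared thunk for GHC to cache, and each request blocks for the duration of its own computation, just like `slow`:

```haskell
{-# LANGUAGE OverloadedStrings #-}

import Web.Scotty
import qualified Data.Text.Lazy as T

main :: IO ()
main = scotty 8080 $
  get "/pure/:n" $ do
    -- the bound now comes from the URL, e.g. /pure/1000000000,
    -- so the sum is recomputed per request rather than shared
    n <- param "n"
    html $ "<h1>Answer</h1><p>The answer is "
        <> (T.pack . show . sum $ [1 .. (n :: Integer)])
```

Even so, each such request blocks only its own green thread; concurrent requests to other routes still return immediately.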
You don't need any kind of callback here, and you don't need the main thread to "wait" on the result. The threads that are forked by Scotty to handle each request can perform whatever computation or IO activity is needed and then return the response to the client directly, without affecting any other threads.
What's more, unless you compile this server with `-threaded` and provide a thread count greater than one (with `+RTS -N<n>` at run time, or baked in at compile time via `-with-rtsopts`), it only runs in one OS thread. So, by default, it's doing all of this in a single OS thread automatically!
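You can check this from inside a program: `getNumCapabilities` from `Control.Concurrent` reports how many OS worker threads ("capabilities") the runtime is using. A minimal sketch:

```haskell
import Control.Concurrent (getNumCapabilities)

main :: IO ()
main = do
  n <- getNumCapabilities
  -- without -threaded, or with -threaded but no -N flag, this prints 1
  putStrLn ("Capabilities (OS worker threads): " ++ show n)
```

Compiled with `ghc -threaded` and run with `+RTS -N4`, the same binary would report 4.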
Second, this isn't actually anything special about Scotty. You should think of the Haskell runtime as providing a threaded abstraction layer on top of the OS thread mechanism, and the OS threads are an implementation detail that you don't have to worry about (well, except in unusual situations, like if you're interfacing with an external library that requires certain things to happen in certain OS threads).
So, all Haskell threads, even the "main" thread, are green, and are run on top of a sort of virtual machine that will run just fine on top of a single OS thread, no matter how many green threads block for whatever reason.
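As a rough illustration (this standalone sketch is mine, not part of the answer above), you can fork thousands of green threads that all block simultaneously, and the runtime handles them without needing an OS thread apiece:

```haskell
import Control.Concurrent
import Control.Monad (replicateM_)

main :: IO ()
main = do
  done <- newEmptyMVar
  let n = 10000
  -- fork 10,000 green threads; all of them block in threadDelay at once
  replicateM_ n $ forkIO $ do
    threadDelay 100000
    putMVar done ()
  -- wait until every thread has signalled completion
  replicateM_ n (takeMVar done)
  putStrLn "all green threads finished"
```

This runs happily even without `-threaded`, i.e. on a single OS thread.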
Therefore, a typical pattern for writing an asynchronous request handler is:
```haskell
loop :: IO ()
loop = do
  req <- getRequest
  _ <- forkIO (handleRequest req)   -- the ThreadId isn't needed here
  loop
```
Note that there's no callback needed here. The `handleRequest` function runs in a separate green thread for each request; that thread can perform long-running pure CPU-bound computations, blocking IO operations, and whatever else is needed, and it doesn't need to communicate the result back to the main thread in order to service the request. It can just send the response to the client directly.
Scotty is basically built around this pattern, so it automatically dispatches multiple requests without requiring callbacks or blocking OS threads.