How to execute deterministic code from an untrusted third party safely?

Question

I understand that there are several questions similar to mine, however my problem is a little different and I haven't found a proper answer.

I know a simple way to run code from untrusted sources is to create a container, a jail with limited resources, and wait for a timeout; but I would like a different solution. I need the result to be deterministic, that is, this code cannot have any side effects, not even in an isolated environment. The code will receive an input and must always return the same output based on that input.

The natural approach I thought of was to require this code to be purely functional, with no side effects, which led me to Haskell. Is it possible to somehow disable side effects in Haskell (monads) and run purely functional code? How do I execute Haskell code with every possible side effect and all kinds of I/O disabled?

At first I don't mind if the code goes into an infinite loop or uses a lot of memory, but being able to limit execution time and memory usage would be a plus.

It's safe to _evaluate_ a program of type `Foo -> IO ()`, because all it does is compute an IO action. To actually perform the IO, you need to ask the Haskell runtime system to run the action it gives you - usually this is done using `>>=` in code which is reachable from `main`. (To put it another way, `let io = print "hello" in ...` doesn't actually print anything by itself.) The real issue is `unsafePerformIO` and friends. — Benjamin Hodgson, Jul 21 '22 at 22:55
Maybe have a look at [lambdabot](https://hackage.haskell.org/package/lambdabot). It's been a long time since I used it, but I recall it was designed to allow untrusted people to run Haskell snippets from IRC. Perhaps there's also something called "haskell-plugins" that can help in your task. Also see: https://wiki.haskell.org/Safely_running_untrusted_Haskell_code — chi, Jul 21 '22 at 23:13
@BenjaminHodgson could you please tell me more about `unsafePerformIO` and why it is an issue? And how can I avoid it? — Felipe, Jul 22 '22 at 00:41
@chi Let me ask you: does https://hackage.haskell.org/package/mueval really cut off all kinds of I/O, or does it allow some kinds with limited resources? My goal is to completely remove any chance of I/O. — Felipe, Jul 22 '22 at 00:50
@Felipe [`unsafePerformIO`](https://hackage.haskell.org/package/base-4.16.2.0/docs/System-IO-Unsafe.html#v:unsafePerformIO) takes a value of type `IO a` and converts it to a value of type `a`; exactly what you're not supposed to be able to do with `IO`! It does so by "cheating"; hooking the runtime system to actually execute the IO action (of type `IO a`) if it ever needs the result value (of type `a`). From a purity point of view it's utterly broken, but sometimes you can build pure interfaces around things that use impurity internally, and that's what it's for. — Ben, Jul 22 '22 at 01:19
If determinism and avoiding side effects is important, maybe a language like [Dhall](https://dhall-lang.org/) would be a better fit?! — sjakobi, Jul 22 '22 at 01:29
@sjakobi unfortunately this scripting language does odd things like importing content directly from an external URL, so it is not deterministic. — Felipe, Jul 22 '22 at 01:38
@Felipe AFAICS, mueval should prevent all IO. Its web page says that running `let foo = readFile "/etc/passwd" >>= print in foo` will produce the string `<IO ()>` without running the actual action. — chi, Jul 22 '22 at 07:40
Regarding Dhall, if you want to disable HTTP imports, you can build the Haskell interpreter with `-f-with-http`: https://hackage.haskell.org/package/dhall#flags — sjakobi, Jul 22 '22 at 13:36

Ben · Accepted Answer · 2022-07-22T02:10:49.737

Out of the box, Haskell doesn't give you the kind of safety you're looking for. `unsafePerformIO` is an obvious security hole (but there are others too).

`unsafePerformIO` takes a value of type `IO a` and converts it to a value of type `a`; exactly what you're not supposed to be able to do with `IO`! It does so by "cheating": hooking the runtime system to actually execute the IO action (of type `IO a`) if it ever needs the result value (of type `a`). From a purity point of view it's utterly broken, since it allows you to lie to the type system and give a non-`IO` type to a computation that will actually perform arbitrary side effects. But sometimes in advanced usage you can build pure interfaces around things that use impurity internally, and that's what it's for.
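To make that concrete, here is a minimal sketch of a function whose type claims purity while it performs I/O whenever its result is forced. The name `sneaky` and the message it writes are hypothetical, chosen only for illustration; a real attacker would do something far less visible than writing to stderr.

```haskell
import System.IO (hPutStrLn, stderr)
import System.IO.Unsafe (unsafePerformIO)

-- Hypothetical example: the type says "pure function from Int to Int",
-- but forcing the result runs an IO action behind the type system's back.
{-# NOINLINE sneaky #-}
sneaky :: Int -> Int
sneaky x = unsafePerformIO $ do
  hPutStrLn stderr ("side effect while computing on " ++ show x)
  pure (x + 1)

main :: IO ()
main = print (sneaky 41)  -- prints 42 on stdout; the message goes to stderr
```

Nothing in the signature `Int -> Int` warns the caller, which is why the type alone cannot be used as a security boundary for untrusted code.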

"Polite" Haskell doesn't use `unsafePerformIO` to smuggle side effects that matter¹ into pure computations, so we genuinely ignore it when reasoning about pure code. But you're talking about running untrusted code; you can't trust it to be "polite". Using `unsafePerformIO` to smuggle side effects into pure functions is exactly the sort of thing an adversary will put into their code to break out of your jail. So you can't ignore it (nor other unsafe functions; the known-unsafe ones provided by GHC will have `unsafe` in the name).

Basically, Haskell is not inherently safer than C in this regard (indeed someone can use the FFI to call arbitrary C from Haskell and call it pure!); it uses purity as a language feature to help developers write code, not as a security feature to restrict code you don't trust. Indeed even compiling untrusted Haskell code is not actually safe in this sense; compile-time code (e.g. using `TemplateHaskell`) can execute arbitrary side effects!

You may be interested in Safe Haskell; this is an opt-in system (through language extensions) in GHC that tries to lock down the "back door" features of Haskell, so that (among other guarantees) you can trust that a pure computation (that does not have an IO type) is actually pure.
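As a rough illustration of what opting in looks like, the sketch below marks a module with the `Safe` pragma. The `checksum` function is a hypothetical stand-in for untrusted pure code. Under this pragma GHC should refuse to compile the module if the commented-out import of `System.IO.Unsafe` is enabled, because that module is itself marked unsafe. This only shows the language-level restriction; it is not a complete sandbox.

```haskell
{-# LANGUAGE Safe #-}

-- Enabling the next line should make GHC reject this module,
-- because System.IO.Unsafe cannot be imported from Safe code:
-- import System.IO.Unsafe (unsafePerformIO)

-- Hypothetical stand-in for untrusted code: a pure function that can
-- only depend on its argument.
checksum :: [Int] -> Int
checksum = foldl (\acc x -> (acc * 31 + x) `mod` 1000003) 7

main :: IO ()
main = print (checksum [1, 2, 3])  -- prints 209563
```

A pure function compiled this way can still loop forever or exhaust memory, so time and memory limits would have to be enforced separately.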

WARNING: I've never actually attempted to use Safe Haskell, and I can't speak to its suitability for your purpose. My understanding is that you cannot simply turn on `LANGUAGE Safe` and compile and run any old code. It's not that safe. It is a tool that hardens up Haskell's type system guarantees so that you can use those guarantees as part of the restrictions you need to build a sandbox for running untrusted code, but I don't believe Haskell's type system guarantees are sufficient on their own. You should definitely do further research if you want to use Safe Haskell for this purpose.

¹ Of course, which side effects "matter" is a matter of taste and context-dependency, and upstream code might not always agree with you on this.

Thank you @Ben, perfect explanation. I am thinking about creating a new language in Racket using https://docs.racket-lang.org/guide/languages.html to restrict the unsafe native Racket I/O functions and macros, keeping only the safe ones. — Felipe, Jul 22 '22 at 02:13
I ended up with this solution: https://pastebin.com/raw/twnDVzqA to run inside Racket, but I am not sure whether it's 100% correct and 100% safe, so I created another question about it: https://stackoverflow.com/questions/73074984/defining-a-purely-functional-r5rs-env-in-racket — Felipe, Jul 22 '22 at 04:09
@Felipe I'm no Racketeer (is that what Racket programmers call themselves? I would if I were :) ), so I'll leave actually answering that question to people who know anything. But it looks like that's based on blacklisting all the potential side-effecting calls? If I needed something like this and it didn't matter what language I supported, I would see if I could find a language/environment that is safe by construction, rather than trying to lock down all the unsafe features of a general purpose language like Racket or Haskell. I don't know of such a thing to point you at, however. — Ben, Jul 22 '22 at 07:11
You have a point, thank you. Although your answer is excellent and complete, I will wait a little longer before accepting it. Thanks again! — Felipe, Jul 25 '22 at 18:27
