I would like to do something like this (pseudocode):

1. load sensitive encrypted data from file
2. decrypt the data
3. do something with the unencrypted data
4. overwrite the data safely / securely (for example with random data)

The sensitive data should sit in memory as plaintext (unencrypted) for as short a time as possible.

The plain data must not be leaked in any way.

A. Can such a program be written in Haskell or OCaml?

B. Can leaking the data be prevented, for example when the garbage collector silently copies it in the background?

C. Can the plain data be properly overwritten in memory?

As far as I know garbage collectors (GCs) can make copies of data silently in the background. I guess that is done by generational GC algorithms, but I don't know for sure.

I know that an attacker could still get the plain data by capturing the program's memory at the right time / state. I am only considering this to improve security, because I do not have the context (i.e. OS, swapping etc.) under my control.

lehins
user573215
  • I find this an interesting question, but at least as far as GHC is concerned I'm pretty sure you can't get any guarantees of that kind. Generally, the tendency for memory leaks is Haskell's single biggest weakness as of today, in my opinion. – leftaroundabout Jun 02 '20 at 09:24
  • I believe GHC supports "pinned" data that the garbage collector isn't allowed to move around. (It's for interop with external C libraries and the like.) There's a lot of manual memory management involved, but it seems like it *might* do what you're after. – MathematicalOrchid Jun 02 '20 at 10:38
  • There is already a data type like this called `ScrubbedBytes`, which is implemented in the `memory` package and is used precisely for this purpose by the `cryptonite` library: https://www.stackage.org/haddock/nightly-2020-06-01/memory-0.15.0/Data-ByteArray.html#t:ScrubbedBytes It is allocated as pinned, so it doesn't move, and its memory is cleaned before being garbage collected. – lehins Jun 02 '20 at 15:20
  • @lehins very intriguing; why don't you make this answer? – leftaroundabout Jun 02 '20 at 15:54
  • @leftaroundabout In a process of writing it up ;) – lehins Jun 02 '20 at 16:04

2 Answers

I already mentioned this in a comment, but I think it is a really good question and deserves an answer.

There is already a data type, ScrubbedBytes (from the memory package), that has the following properties:

  • It is allocated with newAlignedPinnedByteArray#, which means that as long as the newly allocated MutableByteArray# is referenced anywhere in your code, it will not be GCed, and it will also never be moved or copied around.
  • Upon allocation, a weak pointer is created with mkWeak# and a finalizer is attached to the newly allocated memory. This means that once the scrubbed bytes are no longer referenced in your code, and before the GC deallocates the memory, a scrubber is invoked that writes zeros into the memory.
  • Equality will not short circuit, thus guarding against timing attacks.
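As a rough illustration of the properties above, here is a minimal sketch that uses ScrubbedBytes directly. It assumes the `Data.ByteArray` API of the `memory` package (`alloc`, `length`); `fromBytes` is a hypothetical helper written only for this example, and the byte values are made up.

```haskell
import qualified Data.ByteArray as BA
import Data.ByteArray (ScrubbedBytes)
import Data.Word (Word8)
import Foreign.Storable (pokeByteOff)

-- Hypothetical helper: build a ScrubbedBytes by writing bytes straight into
-- its pinned buffer. The input list itself is ordinary heap data, so a real
-- program would fill the buffer from a file or socket instead.
fromBytes :: [Word8] -> IO ScrubbedBytes
fromBytes ws =
  BA.alloc (length ws) $ \p ->
    mapM_ (\(i, w) -> pokeByteOff p i w) (zip [0 ..] ws)

main :: IO ()
main = do
  secret <- fromBytes [0xde, 0xad, 0xbe, 0xef]
  guess  <- fromBytes [0xde, 0xad, 0xbe, 0xee]
  print (BA.length secret)  -- number of bytes held in the pinned buffer
  print (secret == guess)   -- Eq on ScrubbedBytes does not short circuit
```

Note that only the bytes inside the ScrubbedBytes buffer get scrubbed. Any intermediate copy, such as the list literal here, is managed by the GC as usual.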

There is one small gotcha with this scrubber: there is a small chance that it will not get executed, in particular if the program exits right before the GC would clean up the memory. (See more info on weak pointers.) Therefore, I would recommend implementing it using the bracket pattern. Here is how you can get it done with the primitive package:

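import Control.Exception (finally)
import Control.Monad.Primitive (RealWorld)
import qualified Data.Primitive.ByteArray as BA
import Data.Word (Word8)

-- | Run an action on a freshly allocated pinned buffer and zero the buffer
-- afterwards, even if the action throws an exception.
withScrubbedMutableByteArray ::
     Int -- ^ Number of bytes
  -> (BA.MutableByteArray RealWorld -> IO a)
  -- ^ Action to execute
  -> IO a
withScrubbedMutableByteArray c f = do
  -- Pinned memory is never moved or copied by the GC
  mba <- BA.newPinnedByteArray c
  -- Overwrite all c bytes with zeros once f finishes or throws
  f mba `finally` BA.setByteArray mba 0 c (0 :: Word8)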

Using finally is safer because it gives you a stronger guarantee that the memory will be zeroed out. For example, a user hitting Ctrl-C in a correctly set up program will not prevent the scrubber from running.
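To see the scrubbing take effect, here is a small usage sketch. It repeats the definition so that it compiles on its own, and it deliberately leaks the buffer out of the scoped action purely to inspect it afterwards. That is something real code should never do, and the 0xff "secret" is a placeholder.

```haskell
import Control.Exception (finally)
import Control.Monad.Primitive (RealWorld)
import qualified Data.Primitive.ByteArray as BA
import Data.Word (Word8)

-- Same definition as above, repeated so this example is self-contained.
withScrubbedMutableByteArray ::
     Int -> (BA.MutableByteArray RealWorld -> IO a) -> IO a
withScrubbedMutableByteArray c f = do
  mba <- BA.newPinnedByteArray c
  f mba `finally` BA.setByteArray mba 0 c (0 :: Word8)

main :: IO ()
main = do
  -- For demonstration only: return the buffer from the action so that its
  -- contents can be checked after the scrubber has run.
  leaked <- withScrubbedMutableByteArray 4 $ \mba -> do
    mapM_ (\i -> BA.writeByteArray mba i (0xff :: Word8)) [0 .. 3]
    pure mba
  after <- mapM (BA.readByteArray leaked) [0 .. 3] :: IO [Word8]
  print after  -- prints [0,0,0,0]
```

In real code the secret should only ever be read inside the action passed to withScrubbedMutableByteArray, and nothing derived from it should be returned unless it is safe to keep in ordinary GC-managed memory.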

lehins
  • Can't you use `finalize` in your `finally`? That will allow the memory to be scrubbed and freed early, but ensure that it happens before you exit the block. – dfeuer Jun 02 '20 at 18:26
  • @dfeuer In case of `ScrubbedBytes` I don't think it is possible because there is no way to access the created `Weak` pointer (it is simply discarded). In general though it would be possible, but there is no point, because `finally` will ensure that scrubbing will happen and there is no longer a need to rely on a weak pointer – lehins Jun 02 '20 at 19:34
  • The advantage of the weak pointer (as long as you can grab hold of it—sounds like that API could use a tweak) is that it can scrub and deallocate early if the code drops the reference before the `finally` block completes. – dfeuer Jun 02 '20 at 23:37
  • That link for weak references is broken. Looks like a multiple-paste incident. – Carl Jun 02 '20 at 23:46
  • @Carl thanx, fixed. @dfeuer yes early scrubbing could be an advantage. I am not an avid user of `memory`, so maybe someone can propose an improvement to the API ;) – lehins Jun 03 '20 at 07:36
  • I just opened [an issue](https://github.com/vincenthz/hs-memory/issues/83) for that. – dfeuer Jun 04 '20 at 02:43

In OCaml this can easily be done using Bigarrays, which are not managed by the GC, never copied, and never examined by it. You can use Unix.map_file to load the data and ocaml-struct to handle it nicely (if it is structured). OCaml is used extensively for writing low-level security-related code: see the mirage project (it has tons of cryptography-related libraries), ocaml-tls, a pure implementation of the TLS protocol in OCaml, and Project Everest, which uses OCaml as the target language.

When decrypting, encrypting, or otherwise processing the secret data, be careful not to put it in a boxed type, which includes strings and int64 integers. If you take a look at mirage-crypto, you will find that all algorithms are implemented using integers only, which are represented as immediates and are never touched by the GC.

ivg