28

I don't understand why Box::new doesn't return an Option or Result.

The allocation can fail because memory is not unlimited, or something else could happen; what is the behavior in such cases? I can't find any information about it.

Stargateur
  • 2
    If you ran out of heap space, surely you have bigger problems. I can't think of any situations where you wouldn't want to just panic at this point. – Peter Hall Jul 05 '17 at 10:32
  • 6
    @PeterHall Sometimes you want to handle the error, so that a server can keep running for its other clients, or because you need to clean up before exiting. – Stargateur Jul 05 '17 at 10:35
  • 8
    I'm not sure I would want to handle that failure at the point of allocation though. This falls into the category of unexpected and catastrophic exceptions, which are hard to recover from. In your example of a server request, it would be better to `catch_unwind` from the point of handling the problematic request, so you can continue to accept new requests thereafter. – Peter Hall Jul 05 '17 at 11:07
  • @PeterHall I didn't know about the `catch_unwind()` function. Yes, this is a very good way to solve both of my cases. It does force the user onto an unwinding panic implementation, but that is not a big issue. Note: I appreciate your point, but I think "how to recover from a panic" is another question. – Stargateur Jul 05 '17 at 11:19
  • 3
    @Stargateur, judging by [the source of default OOM handler](https://github.com/rust-lang/rust/blob/692b5722363be2de18a27b46db59950124a5101d/src/liballoc/oom.rs#L15-L20), it just aborts the process. You'll need to use nightly Rust to change OOM behavior. – red75prime Jul 05 '17 at 11:56
  • @PeterHall I thought memory allocation failures resulted in an abort, not a panic? (I would prefer a panic) – CodesInChaos Jul 05 '17 at 15:28
  • 1
    Recovering from failed big allocations (e.g. a multi megabyte vector) is often possible. Recovering from small allocations (e.g. box) rarely is. – CodesInChaos Jul 05 '17 at 15:32
  • @CodesInChaos It would be great if you wrote an answer covering the OOM behavior. No one has actually answered that part of my question :p. (Maybe it's too broad?) – Stargateur Jul 05 '17 at 15:35
  • @CodesInChaos I did not realise this! – Peter Hall Jul 05 '17 at 15:51
  • _"No one actually answer this part of my question"_ — Doesn't Matthieu M's answer cover the OOM behaviour well enough? – Peter Hall Jul 05 '17 at 16:01

4 Answers

37

A more general form of the question is: what to do on Out Of Memory (OOM)?

There are many difficulties in handling OOM:

  • detecting it,
  • recovering from it (gracefully),
  • integrating it in the language (gracefully).

The first issue is detecting it. Many OSes today will, by default, use swap space. In this case, your process is in trouble well before it reaches an OOM situation, because starting to use swap space slows the process down significantly. Other OSes will kill low-priority processes when a higher-priority process requires more memory (OOM killer), or promise more memory than they currently have in the hope that it will not be used or will be available by the time it is needed (overcommit), etc...

The second issue is recovering. At the process level, the only way to recover is to free memory... without allocating any in the meantime. This is not as easy as it sounds: for example, there is no guarantee that panicking and unwinding will not themselves require allocating memory (the act of storing a panic message could allocate if done carelessly). This is why the current Rust runtime aborts by default on OOM.

The third issue is language integration: memory allocations are everywhere, in any use of Box, Vec, String, etc... So, if you shun the panic route and use the Result route instead, you need to tweak nearly every mutating method signature to account for this kind of failure, and this will bubble up through all interfaces.
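As a rough illustration of that bubbling, here is a minimal sketch built on `Vec::try_reserve_exact`, a fallible reservation method that was stabilised in `std` well after this answer was written. The helper names `zeroed_buffer` and `load_image` are hypothetical, chosen only for the example; note how the caller's signature has to change as soon as the inner function becomes fallible.

```rust
use std::collections::TryReserveError;

/// Hypothetical helper: builds a buffer of `len` zero bytes and reports
/// allocation failure to the caller instead of aborting the process.
fn zeroed_buffer(len: usize) -> Result<Vec<u8>, TryReserveError> {
    let mut buf = Vec::new();
    // Fallible reservation: returns Err instead of calling the OOM handler.
    buf.try_reserve_exact(len)?;
    // The capacity is already reserved, so this does not reallocate.
    buf.resize(len, 0);
    Ok(buf)
}

/// Hypothetical caller: because it wants to stay fallible, its own
/// signature must now carry the allocation error as well.
fn load_image(width: usize, height: usize) -> Result<Vec<u8>, TryReserveError> {
    // Saturate rather than overflow; an absurd size is rejected below.
    let len = width.saturating_mul(height);
    zeroed_buffer(len)
}
```

Calling `load_image(4, 4)` should yield a 16-byte buffer, while a request that exceeds the maximum capacity returns an `Err` the caller has to deal with. On a system that overcommits, a successful reservation still does not prove the memory is really available, which is the detection problem described above.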

Finally, it's notable that in domains where memory allocation failure must be handled, memory allocation is often not allowed in the first place. In critical embedded software, for example, all memory is allocated up-front and there is a proof that no more than what is allocated will be required.

This is important, because it means that there are very few situations where (1) dynamic memory allocation is allowed and (2) its failure must be handled gracefully by the process itself.

And at this point, one can only wonder how much complexity budget should be spent on this, and how much complexity this would push onto the 99% of programs which do not care.

Matthieu M.
  • 3
    Image handling, however, can benefit from fallible allocations. An unpacked 16K HDR image requires around 2GB of memory and there's no way to check in advance whether the allocation will succeed. – red75prime Jul 05 '17 at 12:05
  • 10
    @red75prime: Oh, I'm not saying that it's never useful, I'm saying it *rarely* is. As such, not worrying about it is the right default, but it would indeed be nice to have *another* way to deal with fallible allocations (`Box::try_new()` ?). – Matthieu M. Jul 05 '17 at 12:27
  • 1
    A bunch of fallible functions existing in parallel with panicking functions doesn't seem like an elegant solution, taking into account the rarity you mentioned. Maybe something like `fn try_allocT>(f: F) -> Option`? – red75prime Jul 05 '17 at 12:49
  • Can you complete your answer by describing how Rust behaves on this error? What does the documentation say, if anything? Or did Rust choose to leave this as undefined behavior? – Stargateur Jul 05 '17 at 16:07
  • @Stargateur: I linked to the experimental `alloc::oom` module, and the default OOM handler. If you feel like it, you can tweak this handler; note, however, that the signature being `() -> !` does not leave you much choice (panicking, infinite loop, abort/exit, ... but no returning null). This is not a *language* thing however, just a `std` thing. If you implement your own `std`, you can pick another behavior. – Matthieu M. Jul 05 '17 at 16:53
  • And don't forget the systems that happily promise you any memory you ask for and only find they don't have it when you actually try to use it. – Jan Hudec Jul 05 '17 at 19:00
  • @JanHudec: It was implicitly contained in the "etc..." but I added it. I wonder what other surprises there are, given the number of OSes in existence I doubt I'm exhaustive here... – Matthieu M. Jul 05 '17 at 19:07
  • Can I conclude from that answer and the comments above, that Rust will never be used on small embedded systems? I mean ``std`` at least is then unusable. Even tiny systems with little RAM are sometimes forced to deal with heap, for example if some esoteric protocols of some esoteric field bus are required. Sometimes such requirements are discovered/added by customers late in the dev cycle. And then? Major rewrite without std? – BitTickler May 09 '19 at 22:51
  • @BitTickler: There are two layers here: Rust the language and `std` the library. Rust is used successfully on small systems in `#![no_std]` mode **now**, and there are explorations of what customisable/fallible allocation would entail. There's an [Embedded WG](https://rust-embedded.github.io/blog/) specifically interested in using Rust in every kind of embedded environment. – Matthieu M. May 10 '19 at 07:19
  • 1
    The "overcommit" behaviour is a bit chicken-egg. Overcommit exists in part *because* programmers don't typically implement graceful ways of responding to the kernel telling you it can't give you more memory, and so, the default behaviour is to *always* succeed. You can set a kernel flag `vm.overcommit_memory=2` which leads to the kernel being more keen to tell you when you're running out. https://serverfault.com/questions/606185/how-does-vm-overcommit-memory-work#606193 – Kent Fredric May 16 '19 at 00:30
  • @KentFredric: True. Though AFAIK, even kernel developers don't typically implement graceful ways of responding to memory exhaustion. They tend to for large allocations, but regularly "forget" for small allocations (single struct, etc...) \o/ – Matthieu M. May 16 '19 at 06:38
  • @MatthieuM. We are paying for it today: https://lore.kernel.org/rust-for-linux/62371527c2a74bce82881a8a09d65e10@AcuMS.aculab.com/ I strongly think it was a big mistake, and that it wouldn't cost anything to let the user handle this like **any** other API error. – Stargateur Apr 15 '21 at 16:45
12

I found the following communication between the Rust developers regarding some of the lower-level functions in liballoc not returning Options: PR #14230.

Especially the following parts explain some of the reasons behind it:

huonw:

Hm... shouldn't the lowest level library not be triggering task failure? Are we planning to have any lower-level libraries returning Option or something?

alexcrichton:

I found that it was quite common to want to trigger task failure, much more so than I originally realized. I also found that all contexts have some form or notion of failure, although it's not always task failure.

huonw:

I was thinking from the perspective of task failure not being recoverable at the call site, i.e. a higher level library is free to fail, but the absolute lowest building blocks shouldn't, so that people can handle problems as they wish (even if it's just manually triggering task failure). If liballoc isn't designed to be the lowest level allocation library, failing is fine. (BTW, I think you may've misinterpreted my comment, because I wasn't talking about libcore, just liballoc.)

alexcrichton:

Oops, sorry! I believe that the core allocator interface (located in liballoc) will be specced to not fail!(), just the primitives on top of them (for example, the box operator).

Perhaps we could extend the box syntax to allow returning Option one day to accommodate this use case, because I'd definitely like to be able to re-use this code!
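The split discussed in that exchange, a core allocator interface that reports failure with primitives on top that fail, is still visible in today's `std::alloc`: the free function `alloc` returns a null pointer on failure. As a minimal sketch, a fallible box constructor could be layered on top of it. The name `try_box` is hypothetical and is not an API of the standard library.

```rust
use std::alloc::{alloc, Layout};

/// Hypothetical `try_box`: like `Box::new`, but returns `None` when the
/// global allocator reports failure instead of aborting the process.
fn try_box<T>(value: T) -> Option<Box<T>> {
    let layout = Layout::new::<T>();
    if layout.size() == 0 {
        // Zero-sized types never touch the allocator, and `alloc` must
        // not be called with a zero-sized layout.
        return Some(Box::new(value));
    }
    // SAFETY: the layout has a non-zero size, as `alloc` requires.
    let ptr = unsafe { alloc(layout) } as *mut T;
    if ptr.is_null() {
        // Allocation failed; `value` is dropped normally on return.
        return None;
    }
    // SAFETY: `ptr` is non-null, aligned for `T`, and was allocated by the
    // global allocator with `T`'s layout, so a `Box` may take ownership.
    unsafe {
        ptr.write(value);
        Some(Box::from_raw(ptr))
    }
}
```

A caller would then write something like `let b = try_box(42u32)?;` inside a function returning `Option`, which is exactly the kind of signature change the other answers discuss.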

Community
ljedrz
  • 3
    "Perhaps we could extend the box syntax to allow returning Option one day to accommodate this use case, because I'd definitely like to be able to re-use this code!", maybe I should ask them :p. – Stargateur Jul 05 '17 at 11:22
  • 1
    @Stargateur You may not get what you want, but you'll almost certainly get friendly feedback! – Kyle Strand Jul 05 '17 at 21:35
6

This is a language design decision. You have to consider not just the logic of a single operation (Box::new, for example) but also how it affects the language's ergonomics. If memory allocation errors were handled through the Result mechanism, they would bubble up pretty much everywhere. Even if a method doesn't allocate on the heap today, it might need to in the future. A simple change in implementation would then be stuck, because you'd have to change the API, which under semantic versioning means a major release. All that for little benefit, because out-of-memory handling isn't very reliable or useful in the presence of swapping and OOM killers (often you should stop allocating long before you ever get an out-of-memory error).

The subject was much discussed on reddit.

One proposed solution I've seen is to treat the out of memory as a panic, unwinding and terminating the corresponding task.
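To make "unwinding and terminating the corresponding task" concrete, here is a minimal sketch using `std::panic::catch_unwind`. It assumes the default unwinding panic strategy, and it uses an ordinary `panic!` as a stand-in for the failure: as the comments on this page point out, an actual OOM in `std` aborts rather than panics, so this would not catch it. The functions `handle_request` and `serve` are hypothetical names for the example.

```rust
use std::panic;

/// Hypothetical request handler: panics on an empty request, standing in
/// for any unexpected failure inside one unit of work.
fn handle_request(input: &str) -> usize {
    if input.is_empty() {
        panic!("empty request");
    }
    input.len()
}

/// Runs each request in isolation. A panic in one request unwinds to the
/// `catch_unwind` boundary and is recorded as `None`; the loop carries on
/// with the remaining requests.
fn serve(requests: &[&str]) -> Vec<Option<usize>> {
    requests
        .iter()
        .map(|req| panic::catch_unwind(|| handle_request(req)).ok())
        .collect()
}
```

With the input `["abc", "", "hello"]`, the second request fails on its own while the first and third still produce results. Built with `panic = "abort"`, the same program would terminate at the first panic instead.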

ArtemGr
  • 1
    "Even if the method doesn't allocate any memory on the heap currently, it might resort to it in the future.", I disagree. If you choose for your function not to return an error or an option, you are deciding that it never fails. It's your problem if you had a bad design in the first place. In that case you can panic yourself, so you don't have a breaking change if you don't want one. Thanks for the reddit link. – Stargateur Jul 05 '17 at 10:53
  • 4
    @Stargateur You have a point, but it's a fine one. In most software the heap allocation is widely used and to consider it during the interface design would be a strain. – ArtemGr Jul 05 '17 at 11:05
  • As others have commented, OOM doesn't panic so you can't unwind, unfortunately :/ – Peter Hall Jul 05 '17 at 16:02
  • cf. https://www.reddit.com/r/rust/comments/ms2nl7/linus_torvalds_concerns_about_panics_in_rust_code/ on that – ArtemGr Apr 17 '21 at 02:14
2

For anyone reading this in 2021: Linus has the same concerns as you.

I do think that the "run-time failure panic" is a fundamental issue.

Hopefully that will be solved through the Rust team's efforts.

https://github.com/rust-lang/rust/pull/84266

https://lwn.net/ml/linux-kernel/YHdSATy9am21Tj4Z@localhost/

https://lkml.org/lkml/2021/4/14/1099

nuclear