4

For a process I'm trying to run I need a `std::vector` of `std::tuple<long unsigned int, long unsigned int>`. The test I'm doing right now should create a vector of 47,614,527,250 (around 47 billion) tuples, but it crashes right there on creation with the error `terminate called after throwing an instance of 'std::bad_alloc'`. My goal is to use this script with a vector roughly twice that size. The code is this:

```cpp
arc_vector = std::vector<std::tuple<long unsigned int, long unsigned int>>(arcs);
```

where `arcs` is a `long unsigned int` with the cited value.

Can I increase the memory size, and if so, how? This script is running on a 40-core machine with something like 200 GB of memory, so I know memory itself is not an issue.

gioaudino
  • 4
    Are you sure that your OS allows you to allocate 47 billion elements *contiguously*? And even if the OS allows it, are you sure that the memory has that much free space in a single place? (The exception would indicate the answer is **no**) – UnholySheep Jul 04 '18 at 14:11
  • 9
    You need `sizeof(std::tuple<long unsigned int, long unsigned int>)` multiplied by 47 billion bytes of *contiguous* memory available. On a 64-bit machine (with 64-bit `long`) that's over 700 GiB (roughly 760 GB) of memory as one single available chunk. – Some programmer dude Jul 04 '18 at 14:12
  • 5
    By my back of the napkin calculation, you'll need about 800 GB of RAM in your machine. – Eljay Jul 04 '18 at 14:12
  • 1
    Platform/Compiler? On Linux that would require more memory than on Windows – Severin Pappadeux Jul 04 '18 at 14:13
  • 2
    All in all, you probably need to rethink your algorithm and your need to have all the elements in memory at the same time. – Some programmer dude Jul 04 '18 at 14:14
  • You are all so right. The thing is, I'm building this huge thing because it's a parameter for a function of a legacy library. The machine I'm using runs Fedora 27. At first I tried to call `push_back` on the vector, adding a single element at a time, but got the same result (after 2 hours instead of straight away, though, of course) – gioaudino Jul 04 '18 at 14:18
  • Does your Fedora 27 run on a machine with at least one terabyte of RAM? If your answer to this question is "no", then you can't use a vector. Forget it. It's not going to work. It's a waste of time. You'll need to figure out some other way, that does not use vectors, to do whatever you need to be done. Unless you have a terabyte of RAM in your machine, you can't use vectors, you can't use any container, pretty much. – Sam Varshavchik Jul 04 '18 at 14:22
  • What @Eljay said: if `long int` is 32-bit and no alignment bytes are wasted, the vector you are trying to allocate is about 380 GB, so memory IS an issue even in your case with only ~200 GB of RAM (and if you are on the Linux 64-bit ABI, `unsigned long int` is double that size) – pqnet Jul 04 '18 at 14:24
  • 3
    @gioaudino If your legacy library API requires an array of nearly a hundred billion elements, then I suspect the most likely scenario is you misunderstand how that library is supposed to be used. The fact that you are building a vector of `std::tuple` as a parameter for an old library is also suspicious. – François Andrieux Jul 04 '18 at 14:27
  • Do you **need** all the data in memory *at once*? Can you process the data in blocks or chunks? – Thomas Matthews Jul 04 '18 at 16:47

2 Answers

11

47.6 billion tuples times 16 bytes per tuple is roughly 762 billion bytes, which is about 760 GB (around 710 GiB). Your machine has less than a third of the memory required for that, so you really need another approach, regardless of the exact reason your program crashes.
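
A quick way to double-check that arithmetic (an illustrative standalone snippet; the 16-byte tuple size assumes a 64-bit ABI where `unsigned long` is 8 bytes):

```cpp
#include <cstdio>
#include <tuple>

int main() {
    using arc = std::tuple<long unsigned int, long unsigned int>;
    const unsigned long long arcs = 47'614'527'250ULL;       // count from the question

    std::printf("sizeof(arc) = %zu bytes\n", sizeof(arc));   // typically 16 on LP64
    std::printf("total       = %llu bytes (~%.0f GB)\n",
                arcs * sizeof(arc),                           // ~762 billion bytes
                arcs * sizeof(arc) / 1e9);
}
```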

A proposal I can give you is to use a memory-mapped file of about 1 TB to store that array; if you really need to use a vector as the interface, you can write a custom allocator for it that hands out the mapped memory. That should work around your lack of main memory in a quasi-transparent way. If your interface requires a standard vector with the standard allocator, you are better off redesigning that interface.
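
A minimal sketch of that idea on Linux follows (illustrative only: the backing file path is an assumption, and this toy allocator supports a single live allocation per file, so the vector should be constructed at its final size rather than grown with `push_back`):

```cpp
#include <sys/mman.h>
#include <fcntl.h>
#include <unistd.h>
#include <cstddef>
#include <new>
#include <tuple>
#include <vector>

// Allocator that serves memory from a memory-mapped file instead of the heap,
// so the OS pages data in and out on demand and RAM is no longer the hard limit.
template <class T>
struct mmap_allocator {
    using value_type = T;

    mmap_allocator() = default;
    template <class U> mmap_allocator(const mmap_allocator<U>&) noexcept {}

    T* allocate(std::size_t n) {
        const std::size_t bytes = n * sizeof(T);
        int fd = ::open("/data/arc_vector.bin", O_RDWR | O_CREAT, 0600);  // path is an assumption
        if (fd < 0) throw std::bad_alloc();
        if (::ftruncate(fd, static_cast<off_t>(bytes)) != 0) {
            ::close(fd);
            throw std::bad_alloc();
        }
        void* p = ::mmap(nullptr, bytes, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
        ::close(fd);  // the mapping keeps the file alive
        if (p == MAP_FAILED) throw std::bad_alloc();
        return static_cast<T*>(p);
    }

    void deallocate(T* p, std::size_t n) noexcept { ::munmap(p, n * sizeof(T)); }
};

template <class T, class U>
bool operator==(const mmap_allocator<T>&, const mmap_allocator<U>&) { return true; }
template <class T, class U>
bool operator!=(const mmap_allocator<T>&, const mmap_allocator<U>&) { return false; }

int main() {
    using arc = std::tuple<long unsigned int, long unsigned int>;
    // Construct at the final size; note that the vector zero-initializes every
    // element, which already touches (and writes back) every page of the mapping.
    std::vector<arc, mmap_allocator<arc>> arc_vector(1000);  // small size for the demo
}
```

With this, the hundreds of gigabytes live in the file (and the page cache) rather than in anonymous memory, at the cost of disk I/O whenever the access pattern misses the cache.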

Another point to add: check the `ulimit` values for the user running the process, because the virtual memory limit might be stricter than 760 GB.
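
For completeness, the same limit can also be read from inside the process with POSIX `getrlimit` (a small illustrative snippet):

```cpp
#include <sys/resource.h>
#include <cstdio>

int main() {
    rlimit rl{};
    if (::getrlimit(RLIMIT_AS, &rl) == 0) {   // address-space (virtual memory) limit
        if (rl.rlim_cur == RLIM_INFINITY)
            std::printf("virtual memory limit: unlimited\n");
        else
            std::printf("virtual memory limit: %llu bytes\n",
                        static_cast<unsigned long long>(rl.rlim_cur));
    }
}
```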

pqnet
  • Nice solution, but it will likely be very slow... And depending on what the program does it might just become a simple copy of data that is already on disk. In that case it might be a better solution to modify the individual values directly in the input file. – Sander Jul 04 '18 at 14:59
  • 2
    @Sander Whether it will be slow or not depends solely on what the algorithm does. Note that since the working set is bigger than main memory, some disk access is needed anyway. If the algorithm can be executed in a streaming fashion, or with some other predictable memory access pattern, you might get better results by implementing that explicitly, but in most cases you'll end up implementing a poor cache over an on-disk data structure, which the operating system can probably do better: memory mapping is pretty fast compared to stream I/O – pqnet Jul 04 '18 at 15:09
  • Or just add the necessary amount of swap, can be tried very easily. – geza Jul 04 '18 at 19:00
4

You may well have a machine with a lot of memory, but the problem is that you require that memory to be contiguous.

Even with memory virtualisation, that's unlikely.

For that amount of data, you'll need a different storage container. You could roll your own based on a linked list of vectors that subdivide the data, use a vector of pointers to subdivided vectors of your tuples (sketched below), or find a library that already provides such a construction.
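
A minimal sketch of the "vector of pointers to subdivided vectors" idea (illustrative only; the class name and chunk size are assumptions, and this only removes the contiguity requirement, the total amount of memory needed stays the same):

```cpp
#include <cstddef>
#include <memory>
#include <tuple>
#include <vector>

using arc = std::tuple<long unsigned int, long unsigned int>;

// Stores the tuples in fixed-size chunks so that no single huge
// contiguous allocation is ever requested.
class chunked_arcs {
    static constexpr std::size_t chunk_size = std::size_t(1) << 22;  // ~4M arcs (64 MB) per chunk
    std::vector<std::unique_ptr<std::vector<arc>>> chunks_;
    std::size_t size_ = 0;

public:
    explicit chunked_arcs(std::size_t n) : size_(n) {
        for (std::size_t left = n; left > 0; ) {
            const std::size_t this_chunk = left < chunk_size ? left : chunk_size;
            chunks_.push_back(std::make_unique<std::vector<arc>>(this_chunk));
            left -= this_chunk;
        }
    }

    arc& operator[](std::size_t i) { return (*chunks_[i / chunk_size])[i % chunk_size]; }
    const arc& operator[](std::size_t i) const { return (*chunks_[i / chunk_size])[i % chunk_size]; }
    std::size_t size() const { return size_; }
};
```

As the comments note, `std::deque` is a ready-made container along these lines, though the total memory requirement remains the real obstacle.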

Bathsheba
  • 2
    What about `std::deque`? – François Andrieux Jul 04 '18 at 14:15
  • @FrançoisAndrieux: I'd worry about the element access times. – Bathsheba Jul 04 '18 at 14:16
  • Element access complexity is constant for `std::deque`. A linked-list based solution would likely have access time problems though. – François Andrieux Jul 04 '18 at 14:16
  • @FrançoisAndrieux: Thinking some more on this, a std::vector of (not necessarily bare) pointers would be a good starting point. – Bathsheba Jul 04 '18 at 14:17
  • 6
    I don't think contiguous memory is an issue, because you will never allocate 750 GB in one page. The OS will have to split it into multiple pages, so the addresses would be contiguous in virtual address space but not in physical address space – Tyker Jul 04 '18 at 14:19
  • @Tyker: On that point we'll have to agree to disagree. I'm careful enough to point out that you'll still get a problem with most operating systems, even with virtualisation. Fortunately that point is trivial to verify. – Bathsheba Jul 04 '18 at 14:20
  • @Bathsheba probably depends on how much RAM is available and the page size used by the OS – Tyker Jul 04 '18 at 14:25
  • @IvanRubinson: Yes, that looks promising on a cursory glance. I don't think Boost has something, but worth checking there too. – Bathsheba Jul 04 '18 at 14:31
  • 2
    Why do you think that contiguous memory is the problem? On a 64-bit machine, the address space is very, very large. Even if the OS doesn't use the whole 64 bits, the usual 48-bit address space is more than enough. – geza Jul 04 '18 at 18:23
  • @geza: Well the OP *observes* a problem which is pretty trivial to replicate. Reality is good enough for me. – Bathsheba Jul 04 '18 at 18:46
  • 1
    Sorry, I don't understand what you mean by "Reality is good enough for me". The OP doesn't have an adequate amount of memory, so the OS refuses the allocation. Contiguity doesn't matter here, in my opinion. As I've said, a 48-bit address space is easily enough to hold a 1 TB buffer. – geza Jul 04 '18 at 19:35
  • Could you clarify whether you have any reason other than the OP's question for the claim that the memory must be contiguous? If you are basing this on the question, it would be better to present it as a guess in your answer. Your wording suggests you are stating it as a priori knowledge: "you require that memory to be contiguous. Even with memory virtualisation, that's unlikely" – pooya13 Aug 08 '19 at 00:59