I am trying to load a large file (20 GB) into a matrix. However, I get a `bad_alloc` error when the file is loaded into the matrix. My code works on Mac but not on Linux.

Here is my code:

std::ifstream ifs(filename, std::ifstream::binary);

loadModel(ifs);

void loadModel(std::istream& in) {
     input_ = std::make_shared<Matrix>();
     input_->load(in); // bad_alloc
}
Paul Bénéteau
  • How much RAM / virtual memory do you have available on each box? – w08r Feb 22 '20 at 18:38
  • Mac: 16 GB, Linux: 8 GB. – Paul Bénéteau Feb 22 '20 at 18:38
  • You'd normally use memory mapped files when you have more data than memory – Alan Birtles Feb 22 '20 at 18:49
  • Run your application under `gdb` and set a breakpoint on the line `input_->load(in)`. When execution stops there, run [`catch throw`](https://sourceware.org/gdb/current/onlinedocs/gdb/Set-Catchpoints.html#Set-Catchpoints) and continue. `gdb` should then stop the next time an exception is thrown. At that point you can examine the stack, variables, etc. – G.M. Feb 22 '20 at 18:49
  • Sorry, but that's clearly off-topic: You must extract and provide a [mcve], but your code is far away from that. Pay attention to the dependency on the data file as well. In any case, the meaning of `bad_alloc` is documented, so you might want to work on your question as well. – Ulrich Eckhardt Feb 22 '20 at 18:52
  • I can't use gdb because I have a strange error: "at src/a_file.cpp:10311 10311 src/a_file.cpp: No such file or directory." – Paul Bénéteau Feb 22 '20 at 18:55
  • Those errors are because the source files in question have been compiled without [debugging information](https://sourceware.org/gdb/current/onlinedocs/gdb/Compilation.html#Compilation). – G.M. Feb 22 '20 at 19:02

1 Answer

`std::bad_alloc` means that a memory allocation failed. Most likely your matrix does not fit into the memory available on the machine.

You can check the available memory with the `free` command:

$ free
              total        used        free      shared  buff/cache   available
Mem:       32780268     2055964    29109172      193300     1615132    30106808
Swap:        999420           0      999420

This output shows roughly 29 GB available (`free` reports values in kibibytes by default).
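
The same kind of check can be done from inside the program before attempting the load. This is only a sketch: `totalPhysicalBytes` and `fitsInPhysicalMemory` are hypothetical helper names, and `_SC_PHYS_PAGES` is a widely available extension (glibc on Linux, also macOS) rather than something POSIX guarantees. It reports installed RAM only, so it ignores swap and memory already used by other processes.

```cpp
#include <unistd.h>

#include <cstdint>

// Hypothetical helper: total physical RAM in bytes, or 0 if it cannot be
// determined. _SC_PHYS_PAGES is an extension, not part of POSIX.
std::uint64_t totalPhysicalBytes() {
  long pages = ::sysconf(_SC_PHYS_PAGES);
  long pageSize = ::sysconf(_SC_PAGESIZE);
  if (pages <= 0 || pageSize <= 0) return 0;
  return static_cast<std::uint64_t>(pages) *
         static_cast<std::uint64_t>(pageSize);
}

// Hypothetical helper: true if `bytes` is no larger than installed RAM.
// This is a coarse upper bound; the memory actually free is usually smaller.
bool fitsInPhysicalMemory(std::uint64_t bytes) {
  std::uint64_t total = totalPhysicalBytes();
  return total != 0 && bytes <= total;
}
```

Comparing the size of the model file against `totalPhysicalBytes()` before calling `load` would let the program fail with a clear message instead of an uncaught `bad_alloc`.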

Tarek Dakhran
  • I have 7 GB available. Does that mean my server can't hold a variable larger than 7 GB? – Paul Bénéteau Feb 22 '20 at 18:42
  • Yes. 7GB means that you can only have 7GB of data loaded. That's what "gigabyte" means. If you have a ten pound sack of potatoes, you can't get more than ten pounds of potatoes inside it. Same general principle. – Sam Varshavchik Feb 22 '20 at 18:43
  • Okay, but isn't the computer supposed to create virtual memory? When I monitor the memory of my Mac during the process, the RAM consumption of the program actually goes way higher than my 16 GB of physical memory. What about Linux? – Paul Bénéteau Feb 22 '20 at 18:48
  • A computer does not "create virtual memory". You are confusing virtual memory with swap space. Your Mac is probably configured with some amount, possibly dynamic amount, of swap space. A Linux system can also be configured with swap space. See the `Swap` line, above. However the only one who would know whether your Linux system has any amount of swappable RAM would be you. Nobody else here could possibly know that. – Sam Varshavchik Feb 22 '20 at 18:50
  • The general advice here: do not load the whole 20 GB into RAM. Use another approach that loads smaller chunks of data and processes them one at a time. – Tarek Dakhran Feb 22 '20 at 18:51
  • I created more swap space on my Linux machine: it was 2 GB and I set it to 20 GB. My program now runs, but it is very slow (as expected). :'( I don't know how to load smaller chunks of a pre-trained machine learning model. It would be a huge amount of work to make that work. – Paul Bénéteau Feb 22 '20 at 19:06
  • Is there a way to cache the variable allocation or something? In order to not wait 3min for the program to launch every time.... – Paul Bénéteau Feb 22 '20 at 19:17
  • The fastest fix would be to just buy more RAM. – Jesper Juhl Feb 22 '20 at 19:46