I need to do some matrix operations on my computer. The matrices are large, 1,000,000 x 1,000,000 and bigger, and some operations require terabytes of memory, so obviously they cannot simply be loaded into RAM and computed on directly. What approaches can I use to work with matrices of this size on my machine? Assume that the matrices cannot be reduced any further through matrix optimizations and are already stored in compact form. I am thinking about some kind of memory-mapped scheme, but I need ideas.
- Have you tried just using a 64-bit OS and letting the VM subsystem do the heavy lifting? – Paul R Sep 20 '10 at 12:14
- What disks do you store these on? – sbi Sep 20 '10 at 12:18
- What operating system are you running: Windows, Linux, or a POSIX-compliant Unix-based system (like Mac OS X)? Whichever it is, it should be the relevant 64-bit version, as Paul pointed out. – Robert S. Barnes Sep 20 '10 at 16:16
- Yes, I am running 64-bit Gentoo Linux. It runs out of memory when I try to allocate these matrices. – user236215 Sep 21 '10 at 00:27
1 Answer
Two suggestions:
Use the mmap system call to map the files containing both the input and output data (on 32-bit Linux the mmap2 variant takes its file offset in 4096-byte pages, so it can address offsets up to 2^44 bytes into a file; on a 64-bit system plain mmap covers the whole range). This lets you treat the data as if it were already in memory: you access it with ordinary pointer syntax, and the OS takes care of paging it in from and out to disk without you having to worry about it. On top of that, mmap is often significantly faster than explicit read/write file I/O - see this SO post. A sketch of the idea follows below.
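Here is a minimal sketch of the mapping idea in C. The file name matrix.bin and the dimension n are assumptions for illustration; the file is taken to hold an n x n matrix of doubles in row-major order:

```c
#include <fcntl.h>
#include <stdio.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <unistd.h>

int main(void)
{
    const char *path = "matrix.bin";   /* assumed file name */
    const size_t n   = 1000000;        /* assumed matrix dimension */

    int fd = open(path, O_RDWR);
    if (fd < 0) { perror("open"); return 1; }

    struct stat st;
    if (fstat(fd, &st) < 0) { perror("fstat"); return 1; }

    /* Map the whole file; the kernel pages data in and out on demand. */
    double *m = mmap(NULL, st.st_size, PROT_READ | PROT_WRITE,
                     MAP_SHARED, fd, 0);
    if (m == MAP_FAILED) { perror("mmap"); return 1; }

    /* Plain pointer arithmetic now reads and writes the file directly. */
    m[42 * n + 7] += 1.0;              /* element (42, 7) */

    munmap(m, st.st_size);
    close(fd);
    return 0;
}
```

With MAP_SHARED, stores to the mapping are written back to the file by the kernel, and only the pages you actually touch need to fit in RAM at any one time.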
Read "What every programmer should know about memory" by Ulrich Drepper. One of the example problems he deals with is highly optimizing matrix operations.
