I'm trying to memory-map a huge file (approx. 100 GB) in order to store a B-tree with billions of key-value pairs. There is not enough memory to hold all the data, so I map a file from disk and, instead of using malloc, I return and increment a pointer into the mapped region.
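#include <fcntl.h>
#include <stdio.h>
#include <stdlib.h>
#include <sys/mman.h>

#define MEMORY_SIZE 300000000

unsigned char *mem_buffer;  /* next free byte in the mapped region */
void *start_ptr;            /* start of the mapping, kept for munmap */

/* Bump allocator: hands out the next `size` bytes, with no bounds check. */
void *my_malloc(size_t size) {
    unsigned char *ptr = mem_buffer;
    mem_buffer += size;
    return ptr;
}

/* calloc-style variant; it does not zero the returned memory. */
void *my_calloc(size_t size, size_t object_size) {
    unsigned char *ptr = mem_buffer;
    mem_buffer += (size * object_size);
    return ptr;
}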
void init(const char *file_path) {
    /* The mode argument of open() is only used with O_CREAT, so it is omitted. */
    int fd = open(file_path, O_RDWR);
    if (fd < 0) {
        perror("Could not open file for memory mapping");
        exit(1);
    }
    start_ptr = mmap(NULL, MEMORY_SIZE, PROT_READ | PROT_WRITE, MAP_PRIVATE, fd, 0);
    /* Check the mmap result itself before using it. */
    if (start_ptr == MAP_FAILED) {
        perror("Could not memory map file");
        exit(1);
    }
    mem_buffer = (unsigned char *) start_ptr;
    printf("Successfully mapped file.\n");
}

void unmap(void) {
    if (munmap(start_ptr, MEMORY_SIZE) < 0) {
        perror("Could not unmap file");
        exit(1);
    }
    printf("Successfully unmapped file.\n");
}
main method:
int main(int argc, char **argv) {
    init(argv[1]);
    unsigned char *arr = (unsigned char *) my_malloc(6);
    arr[0] = 'H';
    arr[1] = 'E';
    arr[2] = 'L';
    arr[3] = 'L';
    arr[4] = 'O';
    arr[5] = '\0';
    unsigned char *arr2 = (unsigned char *) my_malloc(5);
    arr2[0] = 'M';
    arr2[1] = 'I';
    arr2[2] = 'A';
    arr2[3] = 'U';
    arr2[4] = '\0';
    printf("Memory mapped string1: %s\n", arr);
    printf("Memory mapped string2: %s\n", arr2);
    struct my_btree_node *root = NULL;
    insert(&root, arr, 10);
    insert(&root, arr2, 20);
    print_tree(root, 0, false);
    // cin.ignore();
    unmap();
    return EXIT_SUCCESS;
}
The problem is that I receive "Cannot allocate memory" (errno is 12, ENOMEM) if the requested size is bigger than the actual memory, or a segmentation fault if the requested space is outside of the mapped region. I was told that it is possible to map files bigger than the actual memory.
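The segmentation fault case can at least be turned into a clean failure by making the allocator aware of the size of the mapping. The following is only a minimal sketch of that idea, not the code I am running: `struct arena`, `arena_init`, `arena_alloc` and `arena_calloc` are hypothetical names introduced here, and the arena is assumed to sit on top of whatever region mmap returned.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical arena descriptor: the mapped region plus a fill level. */
struct arena {
    unsigned char *base;  /* start of the mapped region */
    size_t capacity;      /* number of mapped bytes */
    size_t used;          /* bytes handed out so far */
};

static void arena_init(struct arena *a, void *base, size_t capacity) {
    assert(a != NULL && base != NULL);
    a->base = base;
    a->capacity = capacity;
    a->used = 0;
}

/* Returns NULL instead of a pointer past the end of the mapping. */
static void *arena_alloc(struct arena *a, size_t size) {
    if (size > a->capacity - a->used) {
        return NULL;
    }
    void *ptr = a->base + a->used;
    a->used += size;
    return ptr;
}

/* calloc-style variant: rejects count * size overflow, does not zero memory. */
static void *arena_calloc(struct arena *a, size_t count, size_t size) {
    if (size != 0 && count > SIZE_MAX / size) {
        return NULL;
    }
    return arena_alloc(a, count * size);
}
```

With this, a call such as `arena_alloc(&a, 6)` returns NULL once fewer than 6 bytes of the mapping remain, so the caller can report the error instead of crashing. It does not address the ENOMEM failure from mmap itself.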
Will the system manage the file by itself, or am I responsible for mapping only as much as fits in free memory and, when I need to access space further into the file, unmapping and mapping again at another offset?
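To make the second option concrete, this is roughly what I understand manual windowing would look like. It is a sketch under assumptions, not tested against the 100 GB file: `struct window`, `window_map` and `window_unmap` are hypothetical helpers, and it uses MAP_SHARED (unlike my code above, which uses MAP_PRIVATE) so that writes made through one window are carried through to the file and are visible when the same range is mapped again. Since mmap requires the file offset to be a multiple of the page size, the helper rounds the offset down and adjusts the returned pointer.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical window onto part of a file. */
struct window {
    void *map_base;       /* address returned by mmap */
    size_t map_length;    /* length passed to mmap */
    unsigned char *data;  /* points at the requested file offset */
};

/* Maps `length` bytes of `fd` starting at byte `offset`; returns 0 or -1. */
static int window_map(struct window *w, int fd, off_t offset, size_t length) {
    assert(w != NULL && offset >= 0 && length > 0);
    long page = sysconf(_SC_PAGESIZE);
    off_t aligned = offset - (offset % page);   /* round down to a page boundary */
    size_t slack = (size_t)(offset - aligned);  /* extra bytes before `offset` */
    void *p = mmap(NULL, length + slack, PROT_READ | PROT_WRITE,
                   MAP_SHARED, fd, aligned);
    if (p == MAP_FAILED) {
        perror("Could not map window");
        return -1;
    }
    w->map_base = p;
    w->map_length = length + slack;
    w->data = (unsigned char *)p + slack;
    return 0;
}

/* Unmaps the window; returns the result of munmap. */
static int window_unmap(struct window *w) {
    return munmap(w->map_base, w->map_length);
}
```

The obvious drawback is that pointers into a window become invalid after `window_unmap`, so B-tree nodes would have to refer to each other by file offset rather than by raw pointer.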
Thank you
EDIT
OS: Ubuntu 14.04 LTS x86_64
bin/washingMachine: ELF 64-bit LSB executable, x86-64, version 1 (SYSV), dynamically linked (uses shared libs), for GNU/Linux 2.6.24, BuildID[sha1]=9dc831c97ce41b0c6a77b639121584bf76deb47d, not stripped