
I am using the mpz_class (using MPIR 2.5.1 with Visual Studio C++ 2010, and the C++ version of MPIR), and for me it's not feasible to store huge numbers in memory, so I want to do it with binary files.

I already finished this with text files, but when I use 100,000+ bit numbers, binary files should (hopefully) save a lot of space.

I have written a short example to help you understand what I'm trying to do:

ofstream binFile;
binFile.open ("binary.bin", ios::out | ios::binary);

mpz_class test;
test.set_str("999999999999999",10);

binFile.write((char *)(&test), sizeof(test));

cout << "NUMBER: " << test << "\tSIZE: " << sizeof(test) << endl;
binFile.close();

I am trying to write the character-data representing the mpz_class instance. Then, to test it, I tried to read the file:

ifstream binFile2;
binFile2.open("binary.bin", ios::in | ios::binary);

mpz_class num1 = 0; 
binFile2.read ((char *)(&num1), sizeof(num1));

cout << "NUMBER: " << num1 << "\tSIZE: " << sizeof(num1) << endl;
binFile2.close();

Many examples I see online use this method for storing class data into binary files, but my output is this:

NUMBER: 999999999999999 SIZE: 12

NUMBER: 8589934595      SIZE: 12

Why can't I store class data directly, and then read it again? There is no way the instance of mpz_class can be size 12. Is this the size of the pointer?

I have also tried this, but I think it's basically the same thing:

char* membuffer = new char[12]; //sizeof(test) returned 12
binFile2.read (membuffer , sizeof(test));
memcpy(&test, &membuffer, sizeof(test))

Any advice on how to fix this would be appreciated. Thanks.

Dragan R.
  • Have you heard of the [pimpl idiom](http://www.gamedev.net/page/resources/_/technical/general-programming/the-c-pimpl-r1794)? That may be why the object is only 12 bytes. Unless you've written the class yourself and know how it uses memory, you generally can't serialize arbitrary objects to disk by copying the object's memory contents, since it may, e.g., reference heap-allocated data, other objects, etc. – Cameron Aug 29 '12 at 21:27
  • No I haven't heard of it, I also don't know how the class uses memory - I was hoping there was a way to find out how much memory it actually uses and write that to bin files with some delimiter, but I'm not sure how (if possible) to do that. I think the class almost certainly dynamically allocates data, so there's no way to serialize this? – Dragan R. Aug 29 '12 at 21:39

1 Answer

I think you need to spend more time with the GMP manual (section 12.1):

Conversions back from the classes to standard C++ types aren’t done automatically, instead member functions like get_si are provided (see the following sections for details).

So, what you probably need to do is call mpz_class::get_str and mpz_class::set_str. Anyway, the C++ interface is just a light wrapper around the C API, so you're probably better off using the low-level stuff, since it's much better documented. In this case, you would have to use mpz_get_str and mpz_set_str (for integers).

Just keep in mind that mpz_class itself has no member function for direct binary serialization, so through the C++ interface alone you need to work with strings. The C API does provide raw binary routines (mpz_out_raw and mpz_inp_raw for FILE* streams, mpz_export and mpz_import for word arrays), which you can reach through get_mpz_t(). I'm not sure whether there are limits on the size of these beasts, so you should test your code thoroughly if you plan to use such large numbers. If you stay with strings, the best choice is probably to extract a representation in base 62 (the maximum allowed) so that it doesn't blow up your memory (in base 2 it will eat up one byte for every bit), and then write that to file.
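
To put rough numbers on the base choice, here is a small estimate of how many characters a number takes as text compared with raw bytes. text_digits and raw_bytes are hypothetical helpers written for this post, not part of GMP or MPIR, and the digit count is an upper estimate rather than an exact length.

```cpp
#include <cmath>
#include <cstddef>

// Approximate number of characters needed to print a `bits`-bit number
// in the given base: bits * log_base(2), rounded up.
std::size_t text_digits(std::size_t bits, int base) {
    const double per_bit = std::log(2.0) / std::log(static_cast<double>(base));
    return static_cast<std::size_t>(std::ceil(bits * per_bit));
}

// Bytes needed to hold the same magnitude as raw binary.
std::size_t raw_bytes(std::size_t bits) {
    return (bits + 7) / 8;
}
```

For a 100,000-bit number this gives 100,000 characters in base 2, about 30,103 in base 10, roughly 16,800 in base 62, and 12,500 bytes as raw binary. So base 62 text is a bit over half the size of decimal text, and raw binary is smaller still.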

Mihai Todor