In my application, I am allocating memory to store "volume data" which is read from a stack of bitmap images.

I store the data in `unsigned char` arrays. During allocation, I first try to allocate one contiguous memory block for the entire data set; if that fails, I fall back to a scattered allocation (one small memory block per image).

unsigned char **data;

Here is my method to allocate memory; I call it with `tryContinouseBlock=true`.

 bool RzVolume::initVolumeData(int xsize, int ysize, int zsize, int bbpsize,bool tryContinouseBlock) {
        this->nx = xsize;
        this->ny = ysize;
        this->nz = zsize;
        this->bbp_type=bbpsize;

        bool succ = false;

        if (tryContinouseBlock) {
            succ = helper_allocContinouseVolume(xsize, ysize, zsize, bbpsize);
        }

        if (!succ) {
            succ = helper_allocScatteredVolume(xsize, ysize, zsize, bbpsize);
        } else {
            isContinousAlloc = true;
        }
        if (!succ) {
            qErrnoWarning("Critical ERROR - Scattered allocation also failed!!!!");
        }
        return succ;

    }



    bool RzVolume::helper_allocContinouseVolume(int xsize, int ysize, int zsize,
            int bbpsize) {
        try {
            data = new unsigned char*[1];
            int total=xsize*ysize*zsize*bbpsize;
            data[0] = new unsigned char[total];
            qDebug("VoxelData allocated - Continouse! x=%d y=%d Z=%d bytes=%d",xsize,ysize,zsize,xsize * ysize * zsize * bbpsize);
        } catch (const std::bad_alloc &) {
            return false;
        }

        return true;

    }

bool RzVolume::helper_allocScatteredVolume(int xsize, int ysize, int zsize,
        int bbpsize) {
    data = new unsigned char*[zsize];
    //isContinousAlloc=false;
    int allocCount = 0;
    try { //Now try to allocate for each image
        for (int i = 0; i < zsize; i++) {
            data[i] = new unsigned char[xsize * ysize * bbpsize];
            allocCount++;
        }
    } catch (const std::bad_alloc &) {
        //We failed to allocate either way. Failed!

        //deallocate any memory already allocated
        for (int i = 0; i < allocCount; i++) {
            delete[] data[i];
        }
        delete[] data;
        data = NULL;
        return false;
    }
    }
    qDebug("VoxelData allocated - Scattered!");
    return true;
}

I want this code to run on both 32-bit and 64-bit platforms.

Now the problem is that even in a 64-bit environment (with 12 GB of memory), the helper_allocContinouseVolume() method fails when I load data of size 1896*1816*1253 (with bbpsize=1). This is because I use the `int` datatype for the size calculation: a 32-bit signed `int` cannot hold values larger than 2147483647, so the multiplication overflows.

In both 32-bit and 64-bit environments, the following code gives the value "19282112":

 int sx=1896;
 int sy=1816;
 int sz=1253;
 printf("%d",sx*sy*sz);

whereas the correct value should be "4314249408".

So which datatype should I use for this? I want the same code to work in both 32-bit and 64-bit environments.

Ashika Umanga Umagiliya

3 Answers


I encounter the same problem very often when working on workstations with > 32GB of memory and large datasets.

size_t is generally the right datatype to use for all indices in such situations as it "usually" matches the pointer size and stays compatible with memcpy() and other library functions.

The only problem is that on 32-bit, it may be hard to detect cases where it overflows. So it may be worthwhile to use a separate memory computation stage using the maximum integer size to see if it's even possible on 32-bit so that you can handle it gracefully.
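That approach might be sketched like this (the function names are illustrative, not from the question's code; the idea is to do the arithmetic in a 64-bit type first and only then decide whether the result even fits in `size_t` on the current platform):

```cpp
#include <cstddef>   // std::size_t
#include <cstdint>   // SIZE_MAX
#include <new>       // std::nothrow

// Do the size arithmetic in a 64-bit type first; only then decide
// whether the result can be represented as a size_t on this platform.
unsigned long long totalBytes(int nx, int ny, int nz, int bbp) {
    return 1ULL * nx * ny * nz * bbp;   // 1ULL forces 64-bit multiplication
}

unsigned char *allocVolume(int nx, int ny, int nz, int bbp) {
    unsigned long long total = totalBytes(nx, ny, nz, bbp);
    if (total > SIZE_MAX)               // not representable here (32-bit case)
        return 0;                       // caller can fall back gracefully
    return new (std::nothrow) unsigned char[static_cast<std::size_t>(total)];
}
```

On a 64-bit platform the `SIZE_MAX` check is trivially true, and `totalBytes(1896, 1816, 1253, 1)` correctly yields 4314249408 instead of wrapping around.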

Mysticial

Use ptrdiff_t from <stddef.h>.

Reason: it's signed, thus avoiding the problems with implicit promotion where unsigned is involved, and it has the requisite range on any system except 16-bit ones (formally it works nicely also on 16-bit, but only because the standard has the silly requirement of at least 17 (sic) bits).

Cheers and hth. - Alf
  • I agree with the choice of `ptrdiff_t` in general (although I don't follow your arguments concerning 16 bit systems), but if there's any chance of the total size (`nx * ny * nz`) being more than 2147483648, all of the intermediate arithmetic should be done using `long long`, with a check against `std::numeric_limits::max()` before attempting the allocation. (Alternatively, it's possible to guard against overflow before multiplying. Even more robust, since multiplying three `int`s could overflow even `long long`, but a bit more complicated.) – James Kanze Sep 28 '11 at 07:39
  • @James: I would just use `ptrdiff_t` also for the calculation. In general I prefer to `typedef ptrdiff_t Size`, and then use `Size`. The thing about 16-bit and 17-bit is a value range requirement in the C99 standard, and I believe it was there also in C89. Cheers! – Cheers and hth. - Alf Sep 28 '11 at 08:10
  • You can't just use `ptrdiff_t` for the calculation, because it might silently overflow. (His example values for `nx`, `ny` and `nz` will overflow a `ptrdiff_t` on my machine.) As for the 17-bit value, I'll have to find my copy of C99; I don't remember any value range requirements on `ptrdiff_t` (but it's been a long time since I've looked at the C standard). – James Kanze Sep 28 '11 at 08:49
  • @James: guarding against final overflow is a different issue from ensuring no overflow in the intermediate calculations. and thinking about this now we're both wrong. the calculation should simply be done using `double`, with a static assert that `double` is at least 64 bits (e.g. the 64 bit IEEE representation can handle 53-bit integers exactly, which as of 2011 is more than ample). – Cheers and hth. - Alf Sep 28 '11 at 09:01
  • Using double is another solution. (I suppose that on a 64 bit machine, you could possibly overflow 53 bits, but I don't think realistically that it's something you have to consider. Now, anyway.) Although the static assert shouldn't be for 64 bits, but on the number of digits in the mantissa (multiplied by the base, with possibly some fudge factor for the bases other than 2). Checking before each multiplication is also a possibility: the most robust, but also by far the most work. Using `long long` seems to me the simplest solution, however. – James Kanze Sep 28 '11 at 09:23
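The `long long` check that James Kanze describes in these comments might look like this (a sketch only; the function name is hypothetical):

```cpp
#include <cstddef>   // std::ptrdiff_t
#include <limits>    // std::numeric_limits

// Compute nx*ny*nz*bbp in long long, and report whether the result
// also fits in ptrdiff_t on this platform (for the sizes in the
// question it will not on a 32-bit system).
bool fitsAsSize(int nx, int ny, int nz, int bbp, long long &total) {
    total = 1LL * nx * ny * nz * bbp;   // 64-bit intermediate, no wraparound
    return total >= 0 &&
           total <= static_cast<long long>(
                        std::numeric_limits<std::ptrdiff_t>::max());
}
```

If `fitsAsSize` returns false, the allocation can be refused gracefully instead of requesting a silently wrapped-around size.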

size_t is defined to be large enough to describe the largest valid object size. So generally, when allocating objects, that is the right size to use.

ptrdiff_t is defined to be able to describe the difference between any two addresses.

Use the one that fits your purpose. That way you'll be ensured it has the appropriate size.
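To illustrate the two roles (a minimal sketch; the function name is made up for the example):

```cpp
#include <cstddef>   // std::size_t, std::ptrdiff_t

// size_t names an object/allocation size; ptrdiff_t names the
// distance between two pointers into the same array.
std::ptrdiff_t spanOf(std::size_t n) {
    unsigned char *buf = new unsigned char[n];   // allocation size: size_t
    std::ptrdiff_t d = (buf + n) - buf;          // pointer difference: ptrdiff_t
    delete[] buf;
    return d;
}
```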

jalf