
When I run the code below, the whole OS hangs on the second iteration of the loop. If I open the task manager, it clearly shows that there's a huge memory leak. After I start the code execution, all the memory is gone in 4 seconds.

Here's the code:

void matrix_vector_multiplication_comparison()
{    
    for (unsigned DIMS_SIZE = 64; DIMS_SIZE <= 2048; DIMS_SIZE += 64)
    {

        __declspec(align(16))float* m1 = generate_random_1d_matrix(DIMS_SIZE * DIMS_SIZE);
        __declspec(align(16))float* m2 = generate_random_1d_matrix(DIMS_SIZE * DIMS_SIZE);
        __declspec(align(16))float* v1 = generate_random_1d_matrix(DIMS_SIZE);
        __declspec(align(32))float* v2 = generate_random_1d_matrix(DIMS_SIZE);
        __declspec(align(16))float* res1 = new float[DIMS_SIZE];
        __declspec(align(16))float* res2 = new float[DIMS_SIZE];
        __declspec(align(32))float* res3 = new float[DIMS_SIZE];


// ........ other stuff here...........

        delete[] m1;
        delete[] m2;
        delete[] v1;
        delete[] v2;
        delete[] res1;
        delete[] res2;
        delete[] res3;
    }
} 

When I comment out everything in my code and leave only the __declspec(align()) declarations and the delete[]s inside my for loop, the memory leak is still there, which suggests that the problem is actually with those __declspecs.

The functions generate_random_1d_matrix, get_random_float and main look like this:

float* generate_random_1d_matrix(unsigned const int dims)
{
    size_t i;
    float* result = new float[dims * dims];
    for (i = 0; i < dims * dims; ++i)
        result[i] = get_random_float(10, 100);
    return result;
}

inline float get_random_float(float min, float max)
{
    float f = (float)rand() / RAND_MAX;
    return min + f * (max - min);
}

int main()
{
    matrix_vector_multiplication_comparison();
    return 0;
}

Could anybody tell me what's going wrong here and how to solve that memory problem?

Update

I changed the code provided and left only the parts that actually produce the problem.

Denis Yakovenko

2 Answers

delete[] m1, m2, v1, v2, res1, res2, res3;

This does not do what you think it does. You are using the comma operator, while you probably meant to pass multiple things to delete[]. As written, only m1 is deleted and the remaining names are evaluated and discarded. You need to delete every variable on its own:

delete[] m1;
delete[] m2;
delete[] v1;
delete[] v2;
delete[] res1;
delete[] res2;
delete[] res3;
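
To see the effect, here is a minimal sketch that counts destructor calls. The Tracked type and the destroyed_by_comma_delete function are hypothetical names made up for this illustration; they are not part of the question's code.

```cpp
// Hypothetical helper type: counts how many objects have been destroyed.
struct Tracked {
    static int destroyed;
    ~Tracked() { ++destroyed; }
};
int Tracked::destroyed = 0;

// Returns how many Tracked objects the statement `delete[] a, b;` destroys.
int destroyed_by_comma_delete() {
    Tracked::destroyed = 0;
    Tracked* a = new Tracked[2];
    Tracked* b = new Tracked[3];

    // Parsed as `(delete[] a), b;` so only the array behind `a` is freed.
    // Compilers often warn about the unused right operand here.
    delete[] a, b;

    int count = Tracked::destroyed; // only the 2 elements of `a`

    delete[] b; // free the second array properly so the sketch itself does not leak
    return count;
}
```

Calling destroyed_by_comma_delete() should report 2 rather than 5, because the three elements behind b are untouched by the comma form.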
nvoigt

Try lowering 2048 to a more reasonable number. As it is, you are trying to allocate millions of floats in large blocks, which doesn't seem reasonable. (It might actually be tens of millions.)

Even at just 128, you are trying to allocate 128^4*2 floats, which is about 537 million (roughly 2 GiB for the two matrices alone). I lowballed a little in my previous explanation. Even 64 is probably approaching too high.
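
The arithmetic can be checked with a small sketch. The two function names are hypothetical, 64-bit integers are used so the counts do not wrap, and the byte figures in the comments assume a 4-byte float.

```cpp
#include <cstdint>

// Hypothetical helper: floats allocated for ONE matrix by the code as posted.
// The caller passes DIMS_SIZE * DIMS_SIZE, and the generator squares it again.
std::uint64_t floats_per_matrix_as_posted(std::uint64_t dims_size) {
    std::uint64_t dims = dims_size * dims_size; // argument passed by the caller
    return dims * dims;                          // new float[dims * dims]
}

// Hypothetical helper: floats one DIMS_SIZE x DIMS_SIZE matrix actually needs.
std::uint64_t floats_per_matrix_intended(std::uint64_t dims_size) {
    return dims_size * dims_size;
}

// As posted, per matrix:
//   DIMS_SIZE = 64   -> 16,777,216 floats         (64 MiB)
//   DIMS_SIZE = 128  -> 268,435,456 floats        (1 GiB)
//   DIMS_SIZE = 2048 -> 17,592,186,044,416 floats (64 TiB)
// Intended, per matrix:
//   DIMS_SIZE = 2048 -> 4,194,304 floats          (16 MiB)
```

Since m1 and m2 are both allocated this way, the second loop iteration (DIMS_SIZE = 128) already asks for about 2 GiB, which matches the hang on the second iteration.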

I'm almost positive the problem is that in generate_random_1d_matrix you use dims*dims where you should be using just dims. It's a 1d matrix after all.
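
A minimal sketch of that fix, assuming the caller keeps passing the total element count (for example DIMS_SIZE * DIMS_SIZE for a matrix and DIMS_SIZE for a vector). The parameter name count and the std::size_t type are choices made for this sketch, not the original signature.

```cpp
#include <cstddef>
#include <cstdlib>

// Same helper as in the question: a random float in [min, max].
inline float get_random_float(float min, float max) {
    float f = static_cast<float>(std::rand()) / RAND_MAX;
    return min + f * (max - min);
}

// `count` is the total number of floats to allocate. It is NOT squared again,
// so a call with DIMS_SIZE * DIMS_SIZE yields exactly one square matrix.
float* generate_random_1d_matrix(std::size_t count) {
    float* result = new float[count];
    for (std::size_t i = 0; i < count; ++i)
        result[i] = get_random_float(10, 100);
    return result; // the caller still releases this with delete[]
}
```

With this version, generate_random_1d_matrix(2048 * 2048) allocates about 4.2 million floats per matrix instead of 2048^4.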

Ethan Fine
  • even if I lower the `DIMS_SIZE` to 128 instead of 2048, it still hangs – Denis Yakovenko Nov 21 '15 at 21:19
  • well, I actually must use 2048 in my function... any ideas how to make this possible? – Denis Yakovenko Nov 21 '15 at 21:21
  • I edited my post. I did my math wrong. You actually square dim_size TWICE, so the allocation grows with the fourth power of dim_size, not the second. You're trying to allocate tens of millions even at just 64 – Ethan Fine Nov 21 '15 at 21:21
  • Why do you need giant 1d matrices with millions of floats? If you can explain the purpose I might be able to help – Ethan Fine Nov 21 '15 at 21:23
  • it's for the sake of studying. This is actually a university task. With AVX such a big matrix can be multiplied by another one in like 1-2 seconds. – Denis Yakovenko Nov 21 '15 at 21:25
  • No, it can't. I think you misunderstood your assignment. 2048^4 = 17592186044416. That is the size of the matrices you claim to want to make. – Ethan Fine Nov 21 '15 at 21:27
  • it can actually. why is it squared? it is 2048^2 – Denis Yakovenko Nov 21 '15 at 21:28
  • generate_random_1d_matrix(DIMS_SIZE * DIMS_SIZE) is the first squaring, new float[dims * dims] is the second squaring. – Ethan Fine Nov 21 '15 at 21:29
  • Man, here it is! THANKS, it was actually my mistake with squares! You're a lifesaver! If I remove the square in my `generate_1d_matrix` function, everything works perfectly even with 2048 sizes. Thanks! – Denis Yakovenko Nov 21 '15 at 21:35