5

In my program I load some images, extract some features from them, and use a cv::Mat to store these features. Based on the number of images I know the cv::Mat will be 700,000 x 256 in size (rows x cols), which is about 720 MB. But when I run my program, once the matrix reaches about 400,000 x 256 (400 MB) and tries to add more, it simply crashes with a Fatal Error. Can anyone confirm that 400 MB is indeed the limit of cv::Mat's storage capacity? Should I check for more issues? Are there possible ways to overcome this problem?

DimChtz
  • It is not. You should be able to allocate memory as long as it is available. You should check for more issues. – Robert Prévost Nov 10 '16 at 01:59
  • @RobertPrévost But `cv::Mat::push_back()` throws an "outOfMemory" exception while my system has more than enough RAM. – DimChtz Nov 10 '16 at 02:01
  • `cv::Mat::push_back` probably allocates another matrix of at least the same size. Do you have enough space to hold twice the data? – Robert Prévost Nov 10 '16 at 02:04
  • `cv::Mat::push_back` just adds one more row. – DimChtz Nov 10 '16 at 02:05
  • Write a small program that in a `while(1)` loop adds a row and prints out the `data` pointer of the `cv::Mat` as `size_t`. You will eventually see that it changes value. Or you will get an out of memory exception. – Robert Prévost Nov 10 '16 at 02:10
  • Well, it might be hard to see in this fashion if you don't start out small enough. It really depends on the allocator. But essentially, when it can't give you back more space, the allocator gives a new pointer and the values are copied. At that time you have to have twice the memory. A similar result is true for std::vector::push_back. – Robert Prévost Nov 10 '16 at 02:17
  • @RobertPrévost Just before the crash my program uses 1.1 GB of RAM. Maybe at that point it doubles (like you said), which means 2.2 GB, exceeding the 2 GB limit of 32-bit applications? – DimChtz Nov 10 '16 at 02:20

3 Answers

4

Digging into the source code: when you use push_back,

it checks whether there is enough space for a new element; if not, it reallocates the matrix with space for (current_size * 3 + 1) / 2 elements (see here). In your example, at around 400,000 x 256 (102,400,000 elements in total) it triggers another allocation, so it tries to allocate space for 307,200,001 / 2 = 153,600,000 elements. But in order to grow, it needs to allocate the new space and then copy the data over.

From matrix.cpp:

// allocate a new matrix big enough for the reserved number of rows
Mat m(dims, size.p, type());
size.p[0] = r;
if( r > 0 )
{
    // copy the existing rows into the first r rows of the new matrix
    Mat mpart = m.rowRange(0, r);
    copyTo(mpart);
}

// replace *this with the new matrix, releasing the old data
*this = m;

So it essentially:

  1. Allocates a new matrix, using the default constructor for all newly created elements
  2. Copies over the data and then deletes the old data
  3. Creates a new header for this matrix, with enough rows
  4. Points the elements to the newly allocated data (freeing the old allocated memory)

Meaning that, in your case, it needs enough space for (600,000 + 400,000) * 256 elements, roughly 1 GB of data with 4-byte integers. It also creates an auxiliary matrix of one row and, in this case, 600,000 columns, which accounts for 2,400,000 extra bytes.

So, by the next iteration, when it reaches 600,000 rows, it tries to allocate 900,000 x 256 elements (~900 MB), plus the existing 600,000 x 256 elements (~600 MB), plus the 600,000-element auxiliary row (~2.4 MB). So, just by appending this way (using push_back), you incur several large reallocations.
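To make those numbers concrete, here is a minimal sketch that just replays the growth rule quoted above; the starting size is the one from the question, and everything else is arithmetic:

#include <iostream>
#include <cstddef>

int main()
{
    // Replay the growth rule (current * 3 + 1) / 2,
    // starting from the question's 400,000 x 256 elements.
    std::size_t elems = 400000ULL * 256;
    for (int step = 0; step < 3; ++step) {
        std::size_t grown = (elems * 3 + 1) / 2;
        std::cout << elems << " -> " << grown << " elements (~"
                  << grown * 4 / 1000000 << " MB with 4-byte ints)\n";
        elems = grown;
    }
}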

In other words: since you already know the approximate size of the matrix, using reserve is a must. It is several times faster, because it avoids the reallocations and copies.
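A minimal sketch of what that looks like; the row count and element type here are assumptions based on the question:

#include <opencv2/opencv.hpp>

int main()
{
    const int cols = 256;
    const size_t expected_rows = 700000;   // known in advance, per the question

    cv::Mat features(0, cols, CV_32S);     // empty, but with the final column count and type
    features.reserve(expected_rows);       // one allocation up front

    for (size_t i = 0; i < expected_rows; ++i)
    {
        // push_back now only copies the new row; no reallocation happens
        features.push_back(cv::Mat(1, cols, CV_32S, cv::Scalar(0)));
    }
}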

Also, as a workaround, you could try inserting into the transposed matrix and then, after the process is done, transposing it back.

Side question: shouldn't this implementation use `realloc` instead of `malloc`/`memcpy`?

Bruno Ferreira
1

I created a matrix as follows, using CV_8UC4, as it gives roughly 700 MB. No problem whatsoever. So no, 400 MB is not the limit, and 700 MB is not the limit either. I tried it with twice as much (1,400,000 rows, 1.4 GB) and still did not hit a limit (my default image viewer could not display the resulting BMP file, though).

#include <opencv2/opencv.hpp>

const unsigned int N_rows = 700000;
const unsigned int N_cols = 256;
cv::Mat m(N_rows, N_cols, CV_8UC4);   // one up-front allocation, ~700 MB
for (int r = 0; r < m.rows; ++r)
{
    for (int c = 0; c < m.cols; ++c)
    {
        // write the first byte of each 4-byte pixel via the raw data pointer
        m.data[(r * N_cols + c) * 4] = c % 256;
    }
}
cv::imwrite("test.bmp", m);

Possible ways to overcome the problem:

  • Allocate enough space in the cv::Mat at the beginning, maybe even with a little extra to make sure. If your problem is caused by re-allocations, this will help.
  • If your application is 32-bit, as your comment suggests, convert it to 64-bit to increase the memory limit.
  • Even for a 32-bit application, 720 MB shouldn't be a problem. However, your application probably uses more memory for other stuff. If that's the case, it may help to move your image processing into a separate process, so it gets its own separate 2 GB. Inter-process communication is a pain, though.
  • If you still have to handle files that don't fit in memory, maybe look into memory-mapped files. I think OpenCV has at least some support for those, but I can't say more, having never used them.
  • Use a collection of smaller matrices, reading/writing/splitting/joining them depending on what you need (see the sketch after this list).
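For the last option, a minimal sketch; the chunk size is arbitrary, and note that the final join via cv::vconcat still needs one contiguous block of the full size:

#include <opencv2/opencv.hpp>
#include <vector>

int main()
{
    const int cols = 256;

    // Accumulate features in fixed-size chunks instead of growing one cv::Mat
    std::vector<cv::Mat> chunks;
    for (int i = 0; i < 7; ++i)
        chunks.push_back(cv::Mat::zeros(100000, cols, CV_8UC4)); // placeholder data

    // Join into a single 700,000 x 256 matrix only if you really need one
    cv::Mat all;
    cv::vconcat(chunks, all);
}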
Headcrab
1

There is no strict limit on the size of a cv::Mat. You should be able to allocate memory as long as it is available.

Here is a small program that shows what can happen to the data pointer when calling cv::Mat::push_back a number of times. Playing around with the values for rows and cols can result in one or many values printed for a.data before eventually an out-of-memory exception is thrown.

#include <opencv2/opencv.hpp>
#include <iostream>

int main()
{
  int rows = 1, cols = 10000;
  cv::Mat a(rows, cols, CV_8UC1);
  while(1) {
    a.push_back(cv::Mat(1, cols, CV_8UC1));    // append one row
    std::cout << (size_t) a.data << std::endl; // address of the underlying data
  }
}

What the above code does for various values of rows and cols really depends on the allocator, so it is worth trying both small and large initial sizes for a.

Remember that, like std::vector, the elements in a cv::Mat are contiguous. Access to the underlying data can be obtained through the cv::Mat::data member. Calling std::vector::push_back or cv::Mat::push_back continuously may result in a reallocation of the underlying memory. In this case, the memory has to be moved to a new address, and roughly twice the amount of memory may be necessary to move from old to new (barring any tricky algorithm that uses less memory).
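For comparison, the same effect can be observed with std::vector alone; a minimal sketch (growth factors vary by standard library):

#include <iostream>
#include <vector>

int main()
{
    std::vector<int> v;
    std::size_t cap = 0;
    for (int i = 0; i < 10000000; ++i) {
        v.push_back(i);
        if (v.capacity() != cap) {       // a reallocation just happened
            cap = v.capacity();
            std::cout << cap << " elements at "
                      << static_cast<const void*>(v.data()) << std::endl;
        }
    }
}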

Robert Prévost
  • Well, I add one row each time I call `cv::Mat::push_back`, which is 700,000 calls in total. I guess what I need to do is reduce the total number of calls to `cv::Mat::push_back`. – DimChtz Nov 10 '16 at 02:45
  • 2
    If I remember correctly, a fairly common memory allocation strategy for contiguous structures that allow dynamic growth is to allocate twice the previous size when they run out of space. Which probably means roughly _three_ times the size is required to move: the new memory block at twice the old size, plus the old memory block, which cannot be deleted until the data is copied. – Headcrab Nov 10 '16 at 02:48
  • @Headcrab Is it possible to use `cv::Mat` with non-contiguous data? – DimChtz Nov 10 '16 at 02:58
  • 1
    @DimChtz Frankly, I have no idea, but I doubt it, because `cv::Mat` provides raw data access with pointer arithmetic (see an example in my answer), it won't work if the underlying data is not contiguous. But you can probably split your data into several smaller `cv::Mat`'s, and with a little additional labor have the same effect. And I am not certain `cv::Mat` reallocates memory the way I said, although it might, but it might as well use a different strategy. – Headcrab Nov 10 '16 at 03:05
  • @DimChtz, as far as I know the data must be contiguous. Another strategy might be to allocate the entire `cv::Mat` once and not use `cv::Mat::push_back`. Otherwise, perhaps a different data structure might be better (e.g., `std::vector`). – Robert Prévost Nov 10 '16 at 03:10
  • @Headcrab In case I split the data into several smaller `cv::Mat`s, can I add them to a single bigger `cv::Mat` without `cv::Mat::push_back()` and, of course, without contiguous data? – DimChtz Nov 10 '16 at 03:11
  • @DimChtz You can join them as you please, until you exceed your memory limit, but you cannot escape the curse of contiguous data, not with a single `cv::Mat`. I updated my answer, have a look. – Headcrab Nov 10 '16 at 03:59