I have n vectors (1, 2, ..., n). Every vector has more than 10,000 elements, and I need to compute the Cartesian product of these vectors. I have code that works, but only with fewer than 1,000 elements per vector and fewer than 4 vectors. I'd like to write the Cartesian product to a file, but if the output file grows beyond 1 GB I get: "terminate called after throwing an instance of 'std::bad_alloc' what(): std::bad_alloc".

My primary question is: how can I fix this memory allocation error?

Here is a runnable piece of my code:

#include <iostream>
#include <vector>
#include <algorithm>
#include <time.h>
#include <fstream>
#include <math.h>

using namespace std;

vector<double> makeVectorByRange(double min, double max, double step){
    vector<double> out = {};
    for( ; min <= max; min+=step){
        out.push_back(min);
    }
    return out;
} 


void cart_product_solve_and_write_to_file (const vector<vector<double>>& inpV) {
    vector<vector<double>> out = {{}};

    std::ofstream outputFile;
    std::fixed;

    for (auto& u : inpV) {
        vector<vector<double>> r;
        r.clear();
        for (auto& x : out) {

            //make/open file, append
            outputFile.open ("out.csv", std::ofstream::out | std::ofstream::app);
            outputFile.precision(8);

            for (auto y : u) {
                r.push_back(x);
                r.back().push_back(y);

                if( r.back().size() == inpV.size() ){

                    // write the input parameters of griewank to file
                    for(double asd : r.back()){
                        outputFile << asd << ";";
                    }

                    outputFile << "; \n";
                    outputFile << std::flush;

                    r.back().clear();
                }
            }
            //close file
            outputFile.close();
        }
        out.swap(r);
    }  
}

// Pure cartesian product. This function returns the cartesian product as a vector, but if the input vectors are too big, it has an error
/*
vector < vector<double> > cartesian_product (const vector< vector<double> >& inpV) {
    vector< vector<double> > out = {{}};
    for (auto& u : inpV) {
        vector< vector<double> > r;
        for (auto& x : out) {
            for (auto y : u) {
                r.push_back(x);
                r.back().push_back(y);
            }
        }
        out.swap(r);
    }
    return out;
}
*/

int main(){

    clock_t tStart = clock();

    // it works
    // const vector<vector<int> > test ={ {0,1,2,3,4}, {5,6,7}, {8,9,10,11,12,13} };
    // vector<vector<int> > cart_prod = cartesian_product(test);

    vector <vector<double> > test = {};
    test.push_back( makeVectorByRange( 0, 0.5, 0.001) );
    test.push_back( makeVectorByRange( 0, 0.5, 0.001) );
    test.push_back( makeVectorByRange( 0, 0.5, 0.001) );

    cart_product_solve_and_write_to_file(test);

    printf("Time taken: %.6fs\n", (double)(clock() - tStart)/CLOCKS_PER_SEC);

    return 0;
}

Quentin
vizvezetek
  • Hm. The Cartesian product of vectors is usually something you _iterate_ over, so that you see only one combination at a time. Why do you want to store all combinations? – M Oehm Apr 01 '19 at 12:36
  • _how can I fix this memory allocation error?_ Iterate over all the combinations and store them in the file. Do not keep them in memory at all. – Daniel Langr Apr 01 '19 at 12:37
  • In addition, you know the target size of the vector (it is determined by the input vector sizes), so you should use reserve() to avoid reallocations, not only for speed but also to prevent memory fragmentation (which can make a big block impossible to allocate even though enough free, but fragmented, memory is available). But as the others said, you should ideally not store the results in memory at all; just write them to the file if that is all you need. – EmDroid Apr 01 '19 at 12:45
  • And also, if you really want to use a lot of memory, make sure that you build the application as 64-bit and not 32-bit (although even in 64-bit, trying to allocate a large amount of memory might not fail if you have enough virtual memory available, but can still bring your computer to a halt because of swapping). – EmDroid Apr 01 '19 at 12:46
  • [OT]: Why open/close the stream repeatedly? – Jarod42 Apr 01 '19 at 12:47
  • @MOehm Because I'd like to use these numbers with D-dimensional functions, for example the Ackley function (http://www.sfu.ca/~ssurjano/ackley.html). – vizvezetek Apr 01 '19 at 12:47
  • @Jarod42 Because when the run stops (because of the errors), I still have partial results. – vizvezetek Apr 01 '19 at 12:54
  • You can just do flush instead (of closing/opening the file). – EmDroid Apr 01 '19 at 12:56
  • This loop `for (auto& x : out)` is dead code since `out` is always empty, so this code is not only messy but apparently invalid. `out` starts as empty and is then swapped with `r`, which is always empty before the swap. – Marek R Apr 01 '19 at 12:56
  • @DanielLangr Thank you! But could you show me how? Where should I modify this code? This is my problem. Thanks again. – vizvezetek Apr 01 '19 at 12:57
  • @DanielLangr Yes, I know. What you see in main() is a test. This function works with 4, 40 and 400 vectors too. But if I have 400 vectors, I don't want to write 400 nested loops. – vizvezetek Apr 01 '19 at 13:18
  • @vizvezetek Sorry, I didn't understand your code at first. Will write an answer. – Daniel Langr Apr 01 '19 at 13:24
  • @MarekR It isn't. Run it and you can see! ;) – vizvezetek Apr 01 '19 at 13:25
  • @DanielLangr No problem. I'm waiting for your answer. – vizvezetek Apr 01 '19 at 13:28

1 Answer

You need to iterate over all combinations of the resulting cartesian product. This is typically achieved by recursion. In each recursion level you then iterate over elements of one input vector.

Here is a sample solution that prints the resulting combinations to std::cout. You can easily modify it to print to a file by passing the recursive function an additional reference parameter to an open std::ofstream object.

#include <iostream>
#include <vector>

template <typename T>
void vector_cartesian_product_helper(
  const std::vector<std::vector<T>>& v, std::vector<T>& combination, size_t level)
{
  if (level == v.size()) {
    for (auto elem : combination)
      std::cout << elem << ";";
    std::cout << "\n";
  }
  else {
    for (const auto& elem : v[level]) {
      combination[level] = elem;
      vector_cartesian_product_helper(v, combination, level + 1);
    }
  }
}

template <typename T>
void vector_cartesian_product(const std::vector<std::vector<T>>& v)
{
  std::vector<T> combination(v.size());
  vector_cartesian_product_helper(v, combination, 0);
}

int main(){
  std::vector<std::vector<int>> test = {{0,1,2,3,4}, {5,6,7}, {8,9,10,11,12,13}};
  vector_cartesian_product(test);
}

Live demo: https://wandbox.org/permlink/PoyEviWGDtEpvN1z

It works with vectors of any size and uses only O(N) additional memory, where N is the number of vectors.

Daniel Langr