0

I have a 3D vector of structs, and I refer to each struct's location in the vector by its i, j and k. However, with millions of these, storing the indices in every struct takes up a lot of memory.

How can I efficiently figure out which i, j, k a particular struct object has without storing that information in the struct itself? Can I do it with some sort of memory arithmetic?

#include <iostream>
#include <vector>
#include <string>

using namespace std;

int main() {
    struct MyStruct {
        size_t i;
        size_t j;
        size_t k;
        bool some_bool;
    };

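    size_t max_i = 1000000;
    size_t max_j = 10;
    size_t max_k = 10;

    // Allocate the full max_i x max_j x max_k grid up front so that [i][j][k] is valid.
    vector<vector<vector<MyStruct>>> three_d_struct_v(
        max_i, vector<vector<MyStruct>>(max_j, vector<MyStruct>(max_k)));

    for(size_t i = 0; i < max_i; i++) {
        for(size_t j = 0; j < max_j; j++) {
            for(size_t k = 0; k < max_k; k++) {
                three_d_struct_v[i][j][k] = MyStruct{i,j,k,false};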
            }
        }
    }


    return 0;
}
ParoX

3 Answers

1

There is quite a simple way to do this with real arrays. A multilevel std::vector<> won't do, because the memory allocated by the different inner vectors is not contiguous. With the language's built-in arrays, though, it is quite simple:

//Get the memory
bool* myData = new bool[max_i*max_j*max_k];
inline size_t getIndex(size_t i, size_t j, size_t k) { return (i*max_j + j)*max_k + k; }
inline size_t getI(size_t index) { return index/max_j/max_k; }
inline size_t getJ(size_t index) { return (index/max_k)%max_j; }
inline size_t getK(size_t index) { return index%max_k; }

Now you can talk about the indices much in the same way you could talk about pointers to your structs. If you really must do it the C++ way, you can convert references and indices like this:

bool& referenceToElement = myData[anIndex];
size_t recoveredIndex = &referenceToElement - myData;

However, in C you can do much better:

bool (*myData)[max_j][max_k] = malloc(max_i*sizeof(*myData));
myData[i][j][k] = true;    //True 3D array access!

The calculation performed by myData[i][j][k] is precisely the same as the calculation of myData[getIndex(i, j, k)] in the C++ example above. And, as before, you can retrieve the index using pointer arithmetic.

C++ also has multidimensional arrays, but it requires all array dimensions except the first to be compile-time constants (and you need to use new instead of malloc()). In C (since C99), there is no such restriction: the array sizes may be calculated at runtime.
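
For the original question, the same pointer arithmetic carries over to a single flat std::vector, since its storage is contiguous. The following is only a minimal sketch under stated assumptions: the wrapper name Grid3D, the helper struct Index3, and the member names are made up for illustration and do not come from the question, and the struct is assumed to no longer store i, j, k.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// The struct no longer stores its own coordinates.
struct MyStruct {
    bool some_bool = false;
};

// Hypothetical helper type holding recovered coordinates.
struct Index3 {
    std::size_t i, j, k;
};

// Hypothetical wrapper: one contiguous block, with k varying fastest.
class Grid3D {
public:
    Grid3D(std::size_t max_i, std::size_t max_j, std::size_t max_k)
        : max_j_(max_j), max_k_(max_k), data_(max_i * max_j * max_k) {}

    MyStruct& at(std::size_t i, std::size_t j, std::size_t k) {
        return data_[(i * max_j_ + j) * max_k_ + k];
    }

    // Recover (i, j, k) from a reference to an element stored in this grid.
    // Passing an object that does not live in data_ is undefined behaviour.
    Index3 indexOf(const MyStruct& element) const {
        const std::size_t flat =
            static_cast<std::size_t>(&element - data_.data());
        return Index3{flat / (max_j_ * max_k_),
                      (flat / max_k_) % max_j_,
                      flat % max_k_};
    }

private:
    std::size_t max_j_, max_k_;
    std::vector<MyStruct> data_;
};
```

With this layout, a call such as grid.indexOf(grid.at(2, 1, 4)) should give back 2, 1, 4 without any per-element index storage. The sketch only stays valid as long as data_ is never resized, because a reallocation would move every element.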

cmaster - reinstate monica
  • What about just a 1 level vector, using a wrapper class such as "class Array3D" from here: http://stackoverflow.com/questions/2178909/how-to-initialize-3d-array-in-c Is that contiguous? – ParoX Jan 29 '14 at 20:34
  • Yes, a 1 level vector is guaranteed to be contiguous. Only the multilevel one is not, because each inner vector will allocate its own little bit of memory. – cmaster - reinstate monica Jan 29 '14 at 20:54
  • I unmarked this as a solution because it's a bit incomplete. Specifically referencing in the way you mention won't work for vectors because it returns an iterator and all sorts of mess. I had issues getting a wrapper to do what I want. – ParoX Feb 01 '14 at 05:13
  • That is precisely why I did *not* suggest using a vector. All I said was, that the memory of a vector is contiguous *after you asked me*. Of course, you can always get a pointer from an iterator by using `&*iterator`, whatever that iterator is; but for vectors, the iterator should already be a pointer with which you can do all the pointer arithmetic required (you were not using reverse iterators, were you?). Some people are religious about using vectors and other RAII types, I'm not. But I didn't want to unnecessarily offend people by warning against the usage of vectors... – cmaster - reinstate monica Feb 01 '14 at 08:16
0

Is this what you are looking for? You can consider m to be an index of all max_i * max_j * max_k structs.

This is untested. std::div has no overload for size_t, so the values are cast to long long and std::lldiv is used.

#include <cstdlib> // lldiv

size_t max_i = 1000000;
size_t max_j = 10;
size_t max_k = 10;

size_t N = max_i * max_j * max_k; // beware of overflow

for( size_t m=0 ; m<N ; ++m )
{
    // Peel off k first, because k varies fastest in m.
    lldiv_t q = lldiv( static_cast<long long>(m), static_cast<long long>(max_k) );
    size_t k = q.rem;

    q = lldiv( q.quot, static_cast<long long>(max_j) );
    size_t j = q.rem;

    q = lldiv( q.quot, static_cast<long long>(max_i) );
    size_t i = q.rem;

    // Now i, j, k are set. Do as thou shall.
}
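
The same decomposition can also be written with plain / and %, which work directly on size_t. A small sketch of the round trip between (i, j, k) and a single number m; the names Coords, pack and unpack are made up for illustration:

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical helper type holding the three coordinates.
struct Coords {
    std::size_t i, j, k;
};

// m = (i*max_j + j)*max_k + k, so k varies fastest.
// The caller is assumed to have checked that the product cannot overflow.
inline std::size_t pack(std::size_t i, std::size_t j, std::size_t k,
                        std::size_t max_j, std::size_t max_k) {
    return (i * max_j + j) * max_k + k;
}

// Inverse of pack: peel off k, then j; whatever remains is i.
inline Coords unpack(std::size_t m, std::size_t max_j, std::size_t max_k) {
    Coords c;
    c.k = m % max_k;
    m /= max_k;
    c.j = m % max_j;
    c.i = m / max_j;
    return c;
}
```

For the element asked about in the comments, [635454][4][3] with max_j = max_k = 10, pack gives 63545443 and unpack recovers the three indices from that value.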
Matt
  • I am having a hard time following this. How can this help me reference a particular struct? Let's say I want to reference `three_d_struct_v[635454][4][3]` – ParoX Jan 29 '14 at 19:49
  • @BHare I think I have misunderstood your problem so this answer may not be helpful. However, this brings up the question, if you know 635454, 4, 3 then why would you need to reference the struct, since these would be the i, j, k values you are looking for? – Matt Jan 29 '14 at 19:51
  • As I understand, OP wants to know i,j,k having only the struct but without keeping i,j,k in the struct, something like, check where the struct is in the memory and find out the numbers. – Luke B. Jan 29 '14 at 19:56
  • Correct. I want to know where any particular struct is (if it exists) within the i,j,k indices of the vector without storing i,j,k in the struct – ParoX Jan 29 '14 at 19:58
  • @BHare If there is no risk of overflow, you can store a single number `m` instead of all three `i`, `j`, `k` using the formula `m = (i*max_j+j)*max_k+k`. Then the above body of the `for` loop tells you how to get `i`, `j`, `k` back from `m`. – Matt Jan 29 '14 at 20:04
0

In your case, you can actually figure this out fairly easily by storing a minimal amount of metadata.

Since two of your dimensions are relatively small, it's possible to store the starting position of all j/k combinations.

size_t max_i = 1000000;
size_t max_j = 10;
size_t max_k = 10;

You would want to restructure your vectors to be stored as [k][j][i]. If you store the starting addresses of the 100 possible j/k combinations in a std::map, then you can find the j/k values by looking up the largest starting address that is not greater than the address of your struct. From there, you calculate the offset in bytes and divide by the size of your struct to get i.

It becomes far less practical if max_j and max_k become large.
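
One way this lookup could be sketched is shown below. It is an illustration under stated assumptions, not a tested design: the class name JkIndexedGrid and the helper struct Index3 are hypothetical, max_i is assumed to be greater than 0, and the inner vectors must never be resized after construction, since that would invalidate the recorded addresses.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <utility>
#include <vector>

struct MyStruct {
    bool some_bool = false;
};

// Hypothetical helper type holding recovered coordinates.
struct Index3 {
    std::size_t i, j, k;
};

// Hypothetical container stored as [k][j][i]: each innermost vector
// holds every i for one (j, k) combination.
class JkIndexedGrid {
public:
    JkIndexedGrid(std::size_t max_i, std::size_t max_j, std::size_t max_k)
        : rows_(max_k, std::vector<std::vector<MyStruct>>(
                           max_j, std::vector<MyStruct>(max_i))) {
        // Record the starting address of each innermost vector.
        for (std::size_t k = 0; k < max_k; ++k)
            for (std::size_t j = 0; j < max_j; ++j)
                starts_.emplace(rows_[k][j].data(), std::make_pair(j, k));
    }

    // Copying would leave starts_ pointing into the source object.
    JkIndexedGrid(const JkIndexedGrid&) = delete;
    JkIndexedGrid& operator=(const JkIndexedGrid&) = delete;

    MyStruct& at(std::size_t i, std::size_t j, std::size_t k) {
        return rows_[k][j][i];
    }

    // Find the largest recorded start address that is <= &element,
    // then take the element offset from that start as i.
    Index3 indexOf(const MyStruct& element) const {
        auto it = starts_.upper_bound(&element);
        assert(it != starts_.begin());
        --it;
        const std::size_t i = static_cast<std::size_t>(&element - it->first);
        return Index3{i, it->second.first, it->second.second};
    }

private:
    std::vector<std::vector<std::vector<MyStruct>>> rows_;
    // std::map orders pointers with std::less, which is a total order
    // even for pointers into different allocations.
    std::map<const MyStruct*, std::pair<std::size_t, std::size_t>> starts_;
};
```

Each lookup costs a search over max_j * max_k map entries, which is why the approach stops being attractive once those two dimensions grow large.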

patros
  • j and k are small in the example but they have no limits besides being more than 0 and less than unsigned long int. – ParoX Jan 29 '14 at 21:55
  • @BHare then it is probably not a great idea to give them such small values when asking for help. – patros Jan 29 '14 at 22:06