0

When I write linear algebra programs in C++, I use the Armadillo library. It is based on templates, which provide me a way to define vectors of any length that don't necessarily require additional memory allocation, since they are statically assigned an appropriate memory buffer at compile-time. When I use arma::Col<double>::fixed<3> the compiler creates a "new type" on the fly so that the vector contains a buffer of exactly 3 doubles.

Now I'm working on a linear algebra program in C, and I'm using the GNU Scientific Library (GSL). In order to instantiate a 3D vector, I do: gsl_vector_alloc(3) that returns a gsl_vector*. The problem is that this operation causes the dynamic allocation of a small portions of memory, and this happens millions of times during program runtime. My program is wasting a lot of resources to perform tens of millions of malloc/free operations.

I've inspected the internal structure of gsl_vector:

typedef struct
{
   size_t size;
   size_t stride;
   double * data;
   gsl_block * block;
   int owner;
} gsl_vector;

For the library to work correctly, data should point to the first element of the vector, usually inside a gsl_block structure like this:

typedef struct
{
   size_t size;
   double * data;
} gsl_block;

which contains another data pointer. So, for instantiating a simple 3D vector, this sequence of mallocs happen:

  1. A gsl_vector structure is malloc'd (something around 40 bytes on x86_64).
  2. A gsl_block structure is malloc'd (16 bytes) and the block pointer of the gsl_vector is set to the memory address of the gsl_block just allocated
  3. An array of 3 doubles is malloc'd and its memory address is assigned to both data pointers (the one in gsl_block and the one in gsl_vector).

I obtained a 40% performance gain by removing two mallocs. I created my custom gsl_vector creation routine, that allocates an array of 3 doubles and sets the data pointer of the gsl_vector to the address of this array. Then I return a gsl_vector (not a pointer).
But doing so, I still get millions of malloc(3 * sizeof(double)) operations.

I didn't manage to "embed" the array of 3 doubles inside the gsl_vector struct, since if the data pointer points to something which is inside the struct itself (hacky!), then the pointer is no longer valid when the vector is copied elsewhere!

Do you have any ideas (apart switching to C++ or rolling my own linear algebra library)? I am open to any suggestion.

gd1
  • 11,300
  • 7
  • 49
  • 88

2 Answers2

3

It looks to me as if you are misunderstanding the purpose of the gls_block data structure. It seems to me that you should just use it to allocate a large chunk of data in an gsl_block data structure and then split that chunk up for the use in multiple gsl_vector. If you'd allocate all your gsl_vector in one go by allocating an array of them, you are almost there. You just have two calls to malloc and some bookkeeping during initialization.

What this would impose on you, is that you'd have to think precisely in advance which gsl_vector you need. But that is the "price" to pay when you use a language that has no builtin garbage collection. If you invest in that, most of the times it has the advantage of structuring your code, you'll probably learn a lot on how you could organize your computation.

Jens Gustedt
  • 76,821
  • 6
  • 102
  • 177
  • Hi Jens, the problem with gsl_vectorFlex is that the flexible array member has to be the last one, and the (precompiled) GSL implementation will eventually reach the other members of the struct, instead, because it has no notion of your modified struct. As of gsl_block, I perfectly know why is there, and there's even a GSL function to preallocate a block of appropriate length. But if I could easily pre-allocate memory for all the vectors I needed, then I wouldn't made this question. – gd1 Apr 13 '13 at 18:03
  • Moreover, you cannot append variable length array at the end and make *data point to it (within the same struct), because if the object is copied (and I have no control over it), then the pointer is no longer valid. – gd1 Apr 13 '13 at 18:06
  • @gd1, I didn't say it's easy. And no, for the second part, I didn't mean to have a flexible member (variable length array is something different!) in `gsl_vector`. You wouldn't need that then if you have the flexible array member in `gsl_block`. – Jens Gustedt Apr 13 '13 at 18:48
  • Ok but there is no need for a flexible member in gsl_block. You can allocate blocks of any length via appropriate APIs of the library itself and the gsl_block will just retain a pointer. – gd1 Apr 13 '13 at 21:33
0

C is a bit too primitive to do this properly.

If you need to use a function from GSL, you might be able to still use C++ Armadillo vectors and matrices with it.

For example, you can obtain a pointer to the memory used by a vector or matrix via the .memptr() member function. This also works for fixed size matrices/vectors.

Alternatively, you can tell Armadillo to use an already allocated block of memory, by giving it a pointer during vector or matrix construction.

mtall
  • 3,574
  • 15
  • 23