
I need to write a custom malloc for GPU programming. Will this work correctly?

void* malloc(int size, int* bytesUsed, uchar* memory){
  int startIdx = (*bytesUsed);
  (*bytesUsed) += size;
  return (void*)(memory+startIdx);
}

I'm new to C programming, so I might have made pointer-arithmetic errors. The idea is that bytesUsed holds the index into memory of the first free byte, so the function saves that index, increments bytesUsed by size, and returns the saved index as a pointer.

Elliot Gorokhovsky
  • If you're on POSIX check out http://linux.die.net/man/2/sbrk – lost_in_the_source Jul 14 '16 at 22:24
  • What happens when you need to `free`? – Oliver Charlesworth Jul 14 '16 at 22:24
  • @OliverCharlesworth I don't need to free; when the workgroup finishes all memory is seized and reused for the next workgroup. (I'm using openCL). – Elliot Gorokhovsky Jul 14 '16 at 22:25
  • If you are new to C why try to run before you are able to walk? Answer to the question is no – Ed Heal Jul 14 '16 at 22:26
  • @stackptr I'm on the GPU using openCL :(. – Elliot Gorokhovsky Jul 14 '16 at 22:26
  • @EdHeal Because I need to port some computations to the GPU... I'm using pyopencl and doing all the host stuff in python, but there's a certain amount of unavoidable C stuff. And I need malloc because I'm building a tree node by node in shared (local in openCL-speak) memory. – Elliot Gorokhovsky Jul 14 '16 at 22:27
  • @EdHeal I would also appreciate an explanation of how I can fix it... – Elliot Gorokhovsky Jul 14 '16 at 22:28
  • @RenéG I suppose you might find [this](https://forums.khronos.org/showthread.php/7441-Memory-allocation-inside-kernel) conversation on `malloc` in OpenCL useful – Eli Korvigo Jul 14 '16 at 22:30
  • @EliKorvigo Yes, I saw that; the solution they propose is to pass a pointer to a large array and manually manage memory inside. Which is what I'm trying to do. – Elliot Gorokhovsky Jul 14 '16 at 22:31
  • For a start - is free possible? Is running out of memory catered for? – Ed Heal Jul 14 '16 at 22:32
  • @EdHeal I don't need free and I don't need to worry about running out: this is for a very specific one-time use. I only need unsafe malloc without free (by unsafe I mean I'm fine with segfaults if I go over since I know I'm not going to go over). What I'm asking is, is what I wrote good enough for that? – Elliot Gorokhovsky Jul 14 '16 at 22:33
  • Famous last words "don't need to worry". Fatal flaw not checking if there is enough memory and returning `NULL`. After that use `unsigned` types, then the size limit will catch accidental negatives. – Weather Vane Jul 14 '16 at 22:34
  • Should be `return memory+startIdx;` – user3386109 Jul 14 '16 at 22:34
  • Order independent transparency deals with dynamic allocation in shaders, perhaps you could borrow their technique. – harold Jul 14 '16 at 22:34
  • @user3386109 Good catch! You're completely right. I will fix that. Thanks for actually looking at the code instead of just being pedantic about stuff that doesn't matter for what I'm doing. – Elliot Gorokhovsky Jul 14 '16 at 22:35
  • Writing unsafe code is a very bad habit to get into – Ed Heal Jul 14 '16 at 23:03

2 Answers

I'm not sure whether this simple stack-based solution will work for you:

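#include <stddef.h>   /* size_t, NULL */
#include <stdint.h>   /* uint8_t */

/* A macro rather than a const variable: in C, the size of a
   file-scope array must be a constant expression. */
#define ALLOCSIZE 1024
typedef uint8_t byte;

static byte buf[ALLOCSIZE];
static byte *pbuf = buf;   /* next free byte */

byte *alloc(size_t n)
{
    /* if there is room */
    if ((size_t)(buf + ALLOCSIZE - pbuf) >= n) {
        pbuf += n;
        return pbuf - n;
    } else
        return NULL;
}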

I didn't provide a free, since you said you didn't need to deallocate.
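
As a rough illustration of how a stack-style allocator like this behaves, the sketch below sets up a deliberately tiny arena and checks that consecutive requests return adjacent regions and that a request that does not fit returns NULL. The names `ARENA_SIZE`, `arena_alloc` and `arena_selfcheck` are made up for this example and are not part of the answer's code.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define ARENA_SIZE 16                /* hypothetical, tiny on purpose */

static uint8_t arena[ARENA_SIZE];
static uint8_t *arena_top = arena;   /* first free byte */

/* Bump allocation: hand out the current top, then advance it. */
static uint8_t *arena_alloc(size_t n)
{
    if ((size_t)(arena + ARENA_SIZE - arena_top) < n)
        return NULL;                 /* not enough room left */
    uint8_t *p = arena_top;
    arena_top += n;
    return p;
}

/* Must be called on a fresh arena: checks layout and exhaustion. */
static void arena_selfcheck(void)
{
    uint8_t *a = arena_alloc(4);
    uint8_t *b = arena_alloc(8);
    uint8_t *c = arena_alloc(8);     /* only 4 bytes remain */
    assert(a == arena);              /* first block starts the arena */
    assert(b == a + 4);              /* second block is adjacent */
    assert(c == NULL);               /* oversized request is refused */
}
```

Because nothing is ever freed, the only state is the single `arena_top` pointer. Note that this sketch does not align the returned pointers.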

lost_in_the_source

[Edit 2023] sizeof(max_align_t) corrected to alignof(max_align_t).

There are some issues:

  1. The largest problem is alignment. The returned pointer needs to be aligned. Since this malloc() is not given the pointer type needed, use max_align_t, "which is an object type whose alignment is as great as is supported by the implementation in all contexts" (C11dr §7.19 2). Note: *bytesUsed needs this alignment too, so apply similar code if other code affects it.

     if (size%alignof(max_align_t)) {
       size += alignof(max_align_t) - size%alignof(max_align_t);
     }
     // or
     size = (size + alignof(max_align_t) - 1)/alignof(max_align_t)*alignof(max_align_t);
    
  2. No detection for out-of-memory.

  3. Avoid re-using standard library names. Code can define them later, if needed.

     // void* malloc(int size, int* bytesUsed, uchar* memory);
     void* RG_malloc(int size, int* bytesUsed, uchar* memory);
    
     // if needed
     #define malloc RG_malloc
    
  4. malloc() expects a different type for allocations: size_t, not int.

     // void* malloc(int size, int* bytesUsed, uchar* memory);
     void* malloc(size_t size, size_t* bytesUsed, uchar* memory);
    
  5. Cast is not needed.

     // return (void*)(memory+startIdx);
     return memory + startIdx;
    
  6. It is clearer to use unsigned char than uchar, which hopefully is not something else.
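
The two rounding forms from item 1 can be checked against each other with a small sketch like the following. It assumes a hosted C11 compiler (for `alignof` and `max_align_t`); the helper names `round_up`, `round_up_rem` and `round_up_selfcheck` are hypothetical.

```c
#include <assert.h>
#include <stdalign.h>
#include <stddef.h>

/* Round size up to the next multiple of align (align must be nonzero). */
static size_t round_up(size_t size, size_t align)
{
    return (size + align - 1) / align * align;
}

/* Same result via the remainder form. */
static size_t round_up_rem(size_t size, size_t align)
{
    size_t rem = size % align;
    return rem ? size + (align - rem) : size;
}

/* Compare both forms for a range of small sizes. */
static void round_up_selfcheck(void)
{
    const size_t a = alignof(max_align_t);
    for (size_t n = 0; n < 4 * a; n++) {
        size_t r = round_up(n, a);
        assert(r == round_up_rem(n, a));  /* both forms agree */
        assert(r % a == 0);               /* result is a multiple of a */
        assert(r >= n && r - n < a);      /* smallest such multiple */
    }
}
```

For example, with an alignment of 8 a request of 5 bytes becomes 8, while a request of 16 stays 16. The division form can wrap around for sizes within `align - 1` of `SIZE_MAX`, which this sketch does not guard against.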

Putting this all together

void* malloc(size_t size, size_t* bytesUsed, unsigned char* memory){
  size = (size + alignof(max_align_t) - 1)/alignof(max_align_t)*alignof(max_align_t);
  // Fail when the space remaining is smaller than the rounded-up request.
  if (RG_ALLOC_SIZE - *bytesUsed < size) {
    return NULL;
  }
  size_t startIdx = *bytesUsed;  // See note above concerning alignment.
  *bytesUsed += size;
  return memory + startIdx;
}

Additionally, RG_free() is not coded. If that is needed, this simple allocation scheme would need significant additions.
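
To tie this back to the use case mentioned in the comments (building a tree node by node), here is a minimal host-side sketch of an allocate-only scheme over a caller-supplied buffer. It is plain C11, not OpenCL C: a kernel would need address-space qualifiers and its own alignment constant, which are not shown. `POOL_SIZE`, `pool_alloc`, `struct node`, `new_node` and `pool_selfcheck` are all names invented for the example.

```c
#include <assert.h>
#include <stdalign.h>
#include <stddef.h>

#define POOL_SIZE 256   /* hypothetical capacity of the caller's buffer */

/* Bump allocator over a caller-supplied buffer; NULL if it will not fit.
   Assumes memory is suitably aligned and *bytesUsed <= POOL_SIZE. */
static void *pool_alloc(size_t size, size_t *bytesUsed, unsigned char *memory)
{
    const size_t a = alignof(max_align_t);
    size = (size + a - 1) / a * a;          /* keep *bytesUsed aligned */
    if (POOL_SIZE - *bytesUsed < size)
        return NULL;
    void *p = memory + *bytesUsed;
    *bytesUsed += size;
    return p;
}

struct node { int key; struct node *left, *right; };

/* Allocate and initialise one tree node from the pool. */
static struct node *new_node(int key, size_t *used, unsigned char *mem)
{
    struct node *n = pool_alloc(sizeof *n, used, mem);
    if (n) { n->key = key; n->left = n->right = NULL; }
    return n;
}

/* Build a three-node tree, then confirm an oversized request fails. */
static void pool_selfcheck(void)
{
    static alignas(max_align_t) unsigned char mem[POOL_SIZE];
    size_t used = 0;
    struct node *root = new_node(2, &used, mem);
    assert(root);
    root->left = new_node(1, &used, mem);
    root->right = new_node(3, &used, mem);
    assert(root->left && root->right);
    assert(root->left->key == 1 && root->right->key == 3);
    assert(used % alignof(max_align_t) == 0);
    assert(pool_alloc(POOL_SIZE, &used, mem) == NULL);
}
```

Resetting `used` to 0 releases every node at once, which matches the "no free, reuse everything for the next workgroup" pattern described in the question's comments.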

chux - Reinstate Monica