5

I'm reading "Vulkan Memory Allocation - Host Memory", and it seems that VkAllocationCallbacks can be implemented using naive malloc/realloc/free functions.

typedef struct VkAllocationCallbacks {
   void*                                   pUserData;
   PFN_vkAllocationFunction                pfnAllocation;
   PFN_vkReallocationFunction              pfnReallocation;
   PFN_vkFreeFunction                      pfnFree;
   PFN_vkInternalAllocationNotification    pfnInternalAllocation;
   PFN_vkInternalFreeNotification          pfnInternalFree;
} VkAllocationCallbacks;

But I see only two possible reasons to implement my own VkAllocationCallbacks:

  • Log and track memory usage by the Vulkan API;
  • Implement a kind of heap memory management, i.e. a large chunk of memory that is used and reused, over and over. Obviously, this can be overkill and suffer from the same sort of problems as managed memory (as in the Java JVM).

Am I missing something here? What sort of applications would be worth implementing VkAllocationCallbacks for?

Alex Byrth
  • 2
    "*Am I missing something here ?*" Why do you feel that these reasons are insufficient justification for the feature? – Nicol Bolas Apr 29 '16 at 18:39
  • My point is more that your question seems to suggest that these reasons are not sufficient to warrant allowing allocation callbacks. – Nicol Bolas Apr 29 '16 at 19:40

3 Answers

11

From the spec:

Since most memory allocations are off the critical path, this is not meant as a performance feature. Rather, this can be useful for certain embedded systems, for debugging purposes (e.g. putting a guard page after all host allocations), or for memory allocation logging.

With an embedded system, you might have grabbed all the memory right at the start, so you don't want the driver calling malloc because there might be nothing left in the tank. Guard pages and memory logging (for debug builds only) could be useful for the cautious/curious.

I read on a slide somewhere (can't remember where, sorry) that you definitely should not implement allocation callbacks that just feed through to malloc/realloc/free because you can generally assume that the drivers are doing a much better job than that (e.g. consolidating small allocations into pools).

I think that if you're not sure whether you ought to be implementing allocation callbacks, then you don't need to implement allocation callbacks and you don't need to worry that maybe you should have.

I think they're there for those specific use cases and for those who really want to be in control of everything.
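To make the embedded case above a little more concrete, here is a minimal sketch of a bump ("arena") allocator that hands out memory from a buffer grabbed at startup. FixedArena, arena_init and arena_alloc are hypothetical names made up for this illustration, not part of Vulkan. A real pfnAllocation callback would forward its size and alignment arguments to something like arena_alloc and reach the arena through pUserData. Freeing is a no-op in this scheme (the whole arena is reset at once), and reallocation would need extra size bookkeeping that is not shown. Keep in mind that, as noted in the comments, the implementation may still make internal allocations that bypass the callbacks.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical arena over a caller-provided buffer (illustration only). */
typedef struct {
    unsigned char* base;     /* start of the preallocated buffer */
    size_t         capacity; /* total bytes in the buffer */
    size_t         used;     /* bytes handed out so far, including padding */
} FixedArena;

void arena_init(FixedArena* a, void* buffer, size_t capacity) {
    a->base = (unsigned char*)buffer;
    a->capacity = capacity;
    a->used = 0;
}

/* Returns `size` bytes aligned to `alignment` (a power of two),
 * or NULL when the arena is exhausted, as pfnAllocation must on failure. */
void* arena_alloc(FixedArena* a, size_t size, size_t alignment) {
    assert(alignment != 0 && (alignment & (alignment - 1)) == 0);
    uintptr_t cur = (uintptr_t)(a->base + a->used);
    /* Padding needed to round the current position up to the alignment. */
    size_t pad = (alignment - (size_t)(cur & (alignment - 1))) & (alignment - 1);
    size_t left = a->capacity - a->used;
    if (pad > left || size > left - pad) {
        return NULL;
    }
    void* p = a->base + a->used + pad;
    a->used += pad + size;
    return p;
}
```

The appeal of this design is that an allocation is just a few additions and a bounds check, and running out of memory is reported by returning NULL instead of calling into a system heap that may already be empty.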

rawrex
Columbo
  • "*With an embedded system, you might have grabbed all the memory right at the start, so you don't want the driver calling malloc because there might be nothing left in the tank.*" It should be noted that Vulkan implementations are still allowed to allocate memory, regardless of your callbacks. That's what the "internal" allocation callbacks are for; they are called when the implementation allocates memory directly. It's still required to inform you when it does so, but it is not required to allow you to interfere. – Nicol Bolas Apr 29 '16 at 19:43
  • @NicolBolas - In the spec we can read: "If pfnAllocation is unable to allocate the requested memory, it must return NULL. If the allocation was successful, it must return a valid pointer to memory allocation containing at least size bytes, and with the pointer value being a multiple of alignment." I guess the functions in VkAllocationCallbacks do double duty: (i) act as malloc/realloc/free and (ii) act as callbacks. I haven't tried it yet, but I guess the user implementation overrides the driver's default allocator, if set. I'll try it and give feedback. – Alex Byrth Apr 29 '16 at 19:54
  • 1
    You misunderstood my point. The implementation is allowed to make "internal" allocations. It is required to *inform* you of this (which is what `pfnInternalAllocation` and `pfnInternalFree` are for). But if you look at those function signatures, the allocate function doesn't return a pointer. And the free function doesn't take a pointer. They exist solely for the implementation to inform you that it allocated/freed memory, but you can't allocate/free the memory for the implementation. – Nicol Bolas Apr 29 '16 at 20:00
  • Actually, PFN_vkAllocationFunction and PFN_vkReallocationFunction both return (void*). Also, PFN_vkFreeFunction has a void* pMemory parameter, which is the pointer to be freed. – Alex Byrth Apr 29 '16 at 20:31
  • 1
    @AlexByrth All three of the function pointer types you mentioned are irrelevant. `pfnInternalAllocation` is of type `PFN_vkInternalAllocationNotification` and `pfnInternalFree` is of type `PFN_vkInternalFreeNotification`. Both return `void`. – Colonel Thirty Two Apr 29 '16 at 21:14
  • @ColonelThirtyTwo, thanks for your reply. Indeed, those PFN_*Notification functions return void. But I was talking about the remaining functions; afaik they are delegates for the allocate, reallocate and free operations. As shown in Vulkan.h (Rev. 1.0.8), they do return (void*): line 1129 of the header has "typedef void* (VKAPI_PTR *PFN_vkAllocationFunction)" and line 1135 has "typedef void* (VKAPI_PTR *PFN_vkReallocationFunction)". – Alex Byrth Apr 30 '16 at 15:39
  • From specs 10.1 Host Memory "Vulkan provides applications the opportunity to perform host memory allocations on behalf of the Vulkan implementation." – Alex Byrth Apr 30 '16 at 15:43
  • 1
    @AlexByrth The point Nicol Bolas was making is that Vulkan implementations are allowed to use their own allocators if they wish, and disregard the one you pass in. That makes it fairly unsuitable for the case of an embedded system allocating all of the memory up front. – Colonel Thirty Two Apr 30 '16 at 16:58
  • 1
    @AlexByrth: "*Vulkan implementations are allowed to use their own allocators if they wish, and disregard the one you pass in.*" I want to clarify that point. It's not so much that implementations will disregard what you passed in. Generally speaking, implementations will use your allocation functions when provided. But some implementations may need to allocate memory in a way that would be difficult for you, for example if they need to allocate memory that can contain executable code. `malloc` can't do that; you'd need a system-level call, one which would be different on different platforms. – Nicol Bolas May 02 '16 at 01:39
4

This answer is an attempt to clarify and correct some of the information in the other answers...

Whatever you do, don't use plain malloc/free/realloc for a Vulkan allocator. Vulkan can, and probably does, use aligned memory copies to move memory. Returning allocations that do not satisfy the requested alignment can cause memory corruption, and the corruption may not show itself in an obvious way. Use aligned allocation functions instead. On Windows these are _aligned_malloc/_aligned_realloc/_aligned_free from 'malloc.h'; _aligned_realloc is not well known, but it has been there for years. Elsewhere, C11 aligned_alloc or POSIX posix_memalign (both paired with plain free) cover allocation, but there is no standard aligned realloc, so reallocation has to be built from allocate, copy and free. BTW, the allocations on my test card had alignment requests all over the place.
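For platforms without an aligned realloc, one common workaround is to over-allocate with malloc and keep a small header in front of each block. The sketch below illustrates that idea only; AlignedHeader and the *_sketch function names are invented for this example. The NULL and zero-size handling follows what the spec requires of pfnReallocation (a NULL original behaves like an allocation, a zero size behaves like a free, and a failed reallocation leaves the original block intact).

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical bookkeeping stored immediately before each aligned block. */
typedef struct {
    void*  base; /* pointer originally returned by malloc */
    size_t size; /* usable size requested by the caller */
} AlignedHeader;

/* Returns `size` bytes aligned to `alignment` (a power of two), or NULL. */
void* aligned_alloc_sketch(size_t size, size_t alignment) {
    assert(alignment != 0 && (alignment & (alignment - 1)) == 0);
    /* Room for the header plus worst-case padding. */
    size_t extra = sizeof(AlignedHeader) + alignment - 1;
    if (size > SIZE_MAX - extra) return NULL;
    unsigned char* base = malloc(size + extra);
    if (base == NULL) return NULL;
    uintptr_t start = (uintptr_t)base + sizeof(AlignedHeader);
    uintptr_t aligned = (start + alignment - 1) & ~(uintptr_t)(alignment - 1);
    unsigned char* user = base + (aligned - (uintptr_t)base);
    AlignedHeader h = { base, size };
    memcpy(user - sizeof h, &h, sizeof h); /* header sits just before the block */
    return user;
}

void aligned_free_sketch(void* p) {
    if (p == NULL) return; /* freeing NULL must be safe */
    AlignedHeader h;
    memcpy(&h, (unsigned char*)p - sizeof h, sizeof h);
    free(h.base);
}

void* aligned_realloc_sketch(void* p, size_t size, size_t alignment) {
    if (p == NULL) return aligned_alloc_sketch(size, alignment);
    if (size == 0) { aligned_free_sketch(p); return NULL; }
    AlignedHeader h;
    memcpy(&h, (unsigned char*)p - sizeof h, sizeof h);
    void* q = aligned_alloc_sketch(size, alignment);
    if (q == NULL) return NULL; /* the original block stays valid */
    memcpy(q, p, h.size < size ? h.size : size);
    aligned_free_sketch(p);
    return q;
}
```

The price is one header plus up to alignment - 1 bytes of padding per block, and every reallocation copies, so this is a correctness fallback and not a performance feature.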

One thing that is non-obvious about passing an application-specific allocator to Vulkan is that at least some Vulkan objects "remember" the allocator. For example, I passed an allocator to vkCreateInstance and was very surprised to see messages coming from my allocator when creating other objects (for which I had passed nullptr as the allocator). It made sense once I stopped to think about it, since objects that interact with the Vulkan instance may cause the instance to make additional allocations.

This all plays into Vulkan's performance, since individual allocators could be written and tuned for a specific allocation task, which could have an impact on process startup time. More importantly, a "block" allocator that places instance allocations, for example, near each other (instead of scattered all over memory) could improve cache locality and therefore overall performance. I realize that this kind of performance "enhancement" is very speculative, but in a carefully tuned application it could have an impact. (Not to mention the numerous other performance-critical paths in Vulkan that deserve more attention.)

Whatever you do, don't use the aligned_alloc class of functions as a "release" allocator: they had very poor performance compared to Vulkan's built-in allocator (on my test card). Even in simple programs there was a very noticeable performance difference. (Sorry, I didn't collect any timing information, but no way was I going to repeatedly sit through those lengthy startup times.)

When it comes to debugging, even something as simple as plain old printfs inside the allocators can be enlightening. It is also easy to add simple statistics collection, but expect a severe performance penalty. The callbacks can also serve as debug hooks without writing a fancy debug allocator or adding yet another debug layer.
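As a rough illustration of the "simple statistics" idea, the sketch below keeps a few counters in a struct that could be passed through pUserData. AllocStats and the stats_* helpers are hypothetical names for this example. The allocation and free callbacks would call stats_on_alloc and stats_on_free before doing the real work. Because the free callback receives no size, this version only counts calls and requested bytes; tracking live bytes would need a per-block header.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical counters, reachable from the callbacks via pUserData. */
typedef struct {
    size_t alloc_calls;       /* number of allocation requests seen */
    size_t free_calls;        /* number of non-NULL frees seen */
    size_t bytes_requested;   /* sum of all requested sizes */
    size_t largest_alignment; /* largest alignment ever requested */
} AllocStats;

/* Call from the allocation callback with its size/alignment arguments. */
void stats_on_alloc(AllocStats* s, size_t size, size_t alignment) {
    assert(s != NULL);
    s->alloc_calls += 1;
    s->bytes_requested += size;
    if (alignment > s->largest_alignment) {
        s->largest_alignment = alignment;
    }
}

/* Call from the free callback; freeing NULL is a no-op and is not counted. */
void stats_on_free(AllocStats* s, const void* memory) {
    assert(s != NULL);
    if (memory != NULL) {
        s->free_calls += 1;
    }
}

/* Print a one-line summary, e.g. at shutdown. */
void stats_report(const AllocStats* s, FILE* out) {
    fprintf(out,
            "allocs: %zu, frees: %zu, bytes requested: %zu, largest alignment: %zu\n",
            s->alloc_calls, s->free_calls, s->bytes_requested, s->largest_alignment);
}
```

Comparing alloc_calls with free_calls at shutdown gives a cheap leak hint, and largest_alignment shows directly whether a malloc-based fallback would have been safe on that driver.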

BTW, my test card was an NVIDIA card using release drivers.

2

I implemented my own VkAllocationCallbacks using plain C's malloc()/realloc()/free(). It is a naive implementation that completely ignores the alignment parameter. Since malloc on a 64-bit OS returns pointers with 16-byte (!) alignment, which is a pretty large alignment, that was not a problem in my tests. See Reference.

For completeness: a 16-byte-aligned pointer is also 8-, 4- and 2-byte aligned.

My code is the following:

  #include <stdio.h>
  #include <stdlib.h>
  #include <string.h>
  #include <vulkan/vulkan.h>

  /**
   * PFN_vkAllocationFunction implementation
   */
  static VKAPI_ATTR void* VKAPI_CALL allocationFunction(void* pUserData, size_t size, size_t alignment, VkSystemAllocationScope allocationScope){
    printf("pAllocator's allocationFunction: <%s>, size: %zu, alignment: %zu, allocationScope: %d",
        (const char*)pUserData, size, alignment, (int)allocationScope);
    // the allocation itself - ignores alignment, for now
    void* ptr = malloc(size); //_aligned_malloc(size, alignment);
    if (ptr != NULL)
        memset(ptr, 0, size);   // only touch the block if malloc succeeded
    printf(", return ptr* : 0x%p \n", ptr);
    return ptr;
  }

  /**
   * The PFN_vkFreeFunction implementation
   */
  static VKAPI_ATTR void VKAPI_CALL freeFunction(void* pUserData, void* pMemory){
    printf("pAllocator's freeFunction: <%s> ptr: 0x%p\n",
        (const char*)pUserData, pMemory);
    // now, the free operation! (free(NULL) is a safe no-op)
    free(pMemory);
  }

  /**
   * The PFN_vkReallocationFunction implementation
   */
  static VKAPI_ATTR void* VKAPI_CALL reallocationFunction(void* pUserData, void* pOriginal, size_t size, size_t alignment, VkSystemAllocationScope allocationScope){
    printf("pAllocator's REallocationFunction: <%s>, size %zu, alignment %zu, allocationScope %d \n",
        (const char*)pUserData, size, alignment, (int)allocationScope);
    // a zero size must behave like a free; realloc(p, 0) is not portable
    if (size == 0) {
        free(pOriginal);
        return NULL;
    }
    return realloc(pOriginal, size);
  }

  /**
   * PFN_vkInternalAllocationNotification implementation
   */
  static VKAPI_ATTR void VKAPI_CALL internalAllocationNotification(void* pUserData, size_t size, VkInternalAllocationType allocationType, VkSystemAllocationScope allocationScope){
    printf("pAllocator's internalAllocationNotification: <%s>, size %zu, allocationType %d, allocationScope %d \n",
        (const char*)pUserData, size, (int)allocationType, (int)allocationScope);
  }

  /**
   * PFN_vkInternalFreeNotification implementation
   */
  static VKAPI_ATTR void VKAPI_CALL internalFreeNotification(void* pUserData, size_t size, VkInternalAllocationType allocationType, VkSystemAllocationScope allocationScope){
    printf("pAllocator's internalFreeNotification: <%s>, size %zu, allocationType %d, allocationScope %d \n",
        (const char*)pUserData, size, (int)allocationType, (int)allocationScope);
  }

  /**
   * Create pAllocator
   * @param info - String for tracking Allocator usage
   */
  static VkAllocationCallbacks* createPAllocator(const char* info){
    VkAllocationCallbacks* m_allocator = (VkAllocationCallbacks*)malloc(sizeof(VkAllocationCallbacks));
    if (m_allocator == NULL)
        return NULL;
    memset(m_allocator, 0, sizeof(VkAllocationCallbacks));
    m_allocator->pUserData = (void*)info;
    // no function-pointer casts: a signature or calling-convention
    // mismatch now fails at compile time
    m_allocator->pfnAllocation = allocationFunction;
    m_allocator->pfnReallocation = reallocationFunction;
    m_allocator->pfnFree = freeFunction;
    m_allocator->pfnInternalAllocation = internalAllocationNotification;
    m_allocator->pfnInternalFree = internalFreeNotification;
    // storePAllocator(m_allocator);
    return m_allocator;
  }

I used the Cube.c example from the Vulkan SDK to test my code and assumptions. A modified version is available here: GitHub

A sample of output:

pAllocator's allocationFunction: <Device>, size: 800, alignment: 8, allocationScope: 1, return ptr* : 0x00000000061ECE40 
pAllocator's allocationFunction: <RenderPass>, size: 128, alignment: 8, allocationScope: 1, return ptr* : 0x000000000623FAB0 
pAllocator's allocationFunction: <ShaderModule>, size: 96, alignment: 8, allocationScope: 1, return ptr* : 0x00000000061F2C30 
pAllocator's allocationFunction: <ShaderModule>, size: 96, alignment: 8, allocationScope: 1, return ptr* : 0x00000000061F8790 
pAllocator's allocationFunction: <PipelineCache>, size: 152, alignment: 8, allocationScope: 1, return ptr* : 0x00000000061F2590 
pAllocator's allocationFunction: <Device>, size: 424, alignment: 8, allocationScope: 1, return ptr* : 0x00000000061F8EB0 
pAllocator's freeFunction: <ShaderModule> ptr: 0x00000000061F8790
pAllocator's freeFunction: <ShaderModule> ptr: 0x00000000061F2C30
pAllocator's allocationFunction: <Device>, size: 3448, alignment: 8, allocationScope: 1, return ptr* : 0x000000000624D260 
pAllocator's allocationFunction: <Device>, size: 3448, alignment: 8, allocationScope: 1, return ptr* : 0x0000000006249A80 

Conclusions:

  • The user-implemented PFN_vkAllocationFunction, PFN_vkReallocationFunction and PFN_vkFreeFunction really do perform the malloc/realloc/free operations on behalf of Vulkan. I'm not sure whether they perform ALL allocations, as Vulkan may choose to allocate/free some portions by itself.

  • The output of my implementation shows that the typical requested alignment is 8 bytes on my Win 7 64-bit/NVIDIA setup. This suggests there is room for optimization, such as a kind of managed memory where you grab a large chunk of memory and sub-allocate from it for your Vulkan app (a memory pool). It may reduce memory usage (think 8 bytes before and up to 8 bytes after each allocated block). It may also be faster, as a malloc() call can take longer than handing out a pointer into your own, already allocated pool.

  • At least with my current Vulkan drivers, PFN_vkInternalAllocationNotification and PFN_vkInternalFreeNotification are never called. Perhaps a bug in my NVIDIA drivers. I'll check on my AMD card later.

  • pUserData can be used for debug info, for management, or both. You can actually use it to pass a C++ object and do all the required performance work there. It's fairly obvious, but you can also change it for each vkCreateXXX call or object.

  • You can use a single, generic VkAllocationCallbacks allocator for the whole application, but I guess that using customised allocators may lead to better results. In my test, VkSemaphore creation shows a typical pattern of intense alloc/free of small chunks (72 bytes), which a customised allocator could address by reusing a previously freed chunk. malloc()/free() already reuse memory when possible, but it is tempting to try our own memory manager, at least for short-lived small blocks of memory.

  • Memory alignment may be an issue when implementing VkAllocationCallbacks (MSVC provides _aligned_malloc, _aligned_realloc and _aligned_free, but standard C has no aligned realloc), but only if Vulkan requests alignments larger than malloc's default (8 bytes for x86, 16 for AMD64, etc.; I must check the ARM defaults). So far it seems Vulkan actually requests memory with a lower alignment than malloc()'s default, at least on 64-bit OSes.
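The "reuse a previously freed chunk" idea from the conclusions can be sketched as a fixed-size block pool with a free list. Everything here is an assumption made for illustration: the SmallPool type, the pool_* function names, the 128-byte block size (chosen only so that the 72-byte requests seen above fit) and the block count. Requests that do not fit return NULL, so a real callback would fall back to a general-purpose allocator, and pool_free reports whether the pointer belonged to the pool for the same reason. The sketch assumes a C11 compiler (max_align_t, _Alignof).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define POOL_BLOCK_SIZE  128 /* assumption: large enough for 72-byte requests */
#define POOL_BLOCK_COUNT 64  /* assumption: arbitrary capacity for the sketch */

/* A block is either on the free list (next is valid) or in use (bytes). */
typedef union PoolBlock {
    union PoolBlock* next;
    max_align_t      force_alignment; /* keeps every block maximally aligned */
    unsigned char    bytes[POOL_BLOCK_SIZE];
} PoolBlock;

typedef struct {
    PoolBlock  blocks[POOL_BLOCK_COUNT];
    PoolBlock* free_list;
} SmallPool;

void pool_init(SmallPool* p) {
    p->free_list = NULL;
    /* Push in reverse so the first allocation returns blocks[0]. */
    for (size_t i = POOL_BLOCK_COUNT; i-- > 0;) {
        p->blocks[i].next = p->free_list;
        p->free_list = &p->blocks[i];
    }
}

/* Returns a block, or NULL if the request does not fit or the pool is empty. */
void* pool_alloc(SmallPool* p, size_t size, size_t alignment) {
    if (size > POOL_BLOCK_SIZE || alignment > _Alignof(max_align_t)) return NULL;
    if (p->free_list == NULL) return NULL;
    PoolBlock* b = p->free_list;
    p->free_list = b->next;
    return b;
}

/* Returns 1 if ptr came from this pool (and recycles it), 0 otherwise. */
int pool_free(SmallPool* p, void* ptr) {
    uintptr_t lo = (uintptr_t)p->blocks;
    uintptr_t hi = lo + sizeof p->blocks;
    uintptr_t q = (uintptr_t)ptr;
    if (ptr == NULL || q < lo || q >= hi) return 0;
    PoolBlock* b = (PoolBlock*)ptr;
    b->next = p->free_list; /* most recently freed block is reused first */
    p->free_list = b;
    return 1;
}
```

Both operations are constant time, and the most recently freed block is handed out again first. Whether this actually beats the driver's own allocator is something only measurement on the target driver can tell.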

Final Thought:

You can live happily until the end of time by just setting every VkAllocationCallbacks* pAllocator you find to NULL ;) Vulkan's default allocator quite possibly already does it all better than you would.

BUT...

One of the highlighted benefits of Vulkan was that the developer would be put in control of everything, including memory management. Khronos presentation, slide 6

Alex Byrth
  • My findings just confirm what @Columbo said in the accepted answer. Thanks! – Alex Byrth May 02 '16 at 00:08
  • "*They are not dummy functions.*" Nobody said or implied that they were dummy functions. "*Perhaps a bug in drivers.*" Or because the implementation isn't allocating memory internally and thus doesn't need to notify you of it. Or because you never did anything that would require internal memory allocation. Also, it's a really good idea to avoid drawing conclusions from the behavior of a single implementation. – Nicol Bolas May 02 '16 at 01:33
  • "*The *pUserData can be used to both info and management. Actually, you can used it to pass a C++ object, and play all required performance job.*" I'm curious: was any of that somehow not obvious from the API and specification? Passing a pointer to callback function pointers is a time-honored tradition for allowing such callbacks to use, for example, C++ objects or other state data. – Nicol Bolas May 02 '16 at 01:35
  • @NicolBolas I guess I misunderstood most of your initial points. Next time I'll try harder to get your ideas. – Alex Byrth May 02 '16 at 17:30
  • 1
    Using C-style cast: m_allocator->pfnAllocation = (PFN_vkAllocationFunction)(&allocationFunction); Can hide potential calling conventions mismatch and cause a nasty crash. – Dorian Oct 31 '22 at 13:42