2

I understand that the C standard library allows for the resizing of a memory allocation through the use of the realloc function, like in the following example:

char *a = malloc(10);
char *b = realloc(a, 8);

In this case, a and b could potentially be equal, and the allocation has effectively been shrunk by two bytes from the right.

However, I'm wondering if there's a way to shrink a memory allocation from the left, something like this:

char *a = malloc(10);
char *b = /* ... */;  // shrink a from the left by 2 bytes

Where (a + 2) == b, and the original 2 bytes at the start of a are now free for the allocator to use. This should happen without having to copy the data to a new location in memory. Just shrink the allocation.

I'm aware that using realloc to shrink the memory from the right or manually copying the data to a new, smaller allocation might be an option, but these methods don't suit my needs.

Is there any way to achieve this using C's standard library or any other library that provides this functionality?

I'm not asking for a 100% guarantee: realloc could also return a pointer to a different location, but in practice it is likely to shrink in place, and that level of likelihood is all I need here.

Thank you in advance for your insights.

I could shift all bytes to the left and try to shrink, but it involves copying.
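
For reference, the copying variant I'd like to avoid would look roughly like this (just a sketch, reusing the sizes from the example above):

char *a = malloc(10);
/* ... fill all 10 bytes ... */

/* Shift the tail down by 2, then shrink from the right as usual. */
memmove(a, a + 2, 8);        /* this is exactly the copy I want to avoid */
char *b = realloc(a, 8);
if (b != NULL)
    a = b;                   /* realloc may move the block or fail */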

Ddystopia

4 Answers

5

When you allocate something on the heap, pretty much every implementation out there has to allocate bookkeeping data in addition to the user data. The bytes you see through the malloc & friends interface are just the user data. A common way for the standard library/OS to implement heap allocation is to place a segment header first, immediately followed by the user data.

This means that to the immediate "left" (lower addresses) of the data part of the segment sits the header. Leaving a gap there between header and data doesn't make any sense. Rather, you'd have to move the whole thing to a new memory location.
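
As a purely illustrative sketch (real allocators use different and more elaborate structures), such a segment might be laid out roughly like this:

#include <stddef.h>

/* Hypothetical segment layout, for illustration only - not how any
   particular standard library actually defines it. */
struct segment_header {
    size_t size;                 /* size of the user data that follows         */
    struct segment_header *next; /* allocator bookkeeping, e.g. free-list link */
};

/* malloc() hands out a pointer to the bytes right after the header, so the
   header always sits immediately to the "left" of the user data. Handing back
   the first few user bytes would only leave a dead gap between header and data. */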


but these methods don't suit my needs.

And what are those needs, exactly? When using dynamic memory we must be aware that:

  • It is much slower than statically allocated memory at the point of allocation and initialization.
  • It always comes with a memory overhead. Overhead in terms of headers/look-up tables etc, overhead in terms of dealing with alignment, overhead in terms of heap fragmentation.

For example:

char *a = malloc(10);
char *b = realloc(a, 8);

Probably consumes much more RAM overall than, let's say, char c[12];. And it will almost certainly be some ~100 to 1000 times slower.

It is also quite likely that the execution time overhead involved in calling realloc is far more expensive than a memcpy/memmove call.

So what is the actual problem you are trying to solve here?

Lundin
  • This is true for "small" amounts, but `malloc()` will probably use dedicated pages when you allocate large buffers, which then have their meta-data stored not before the data but in an extra table somewhere in the kernel. – 12431234123412341234123 Jun 21 '23 at 11:19
  • @12431234123412341234123 The dirty details of it all are a very complex subject and I'll pass on explaining the specifics for any given OS. For those truly interested, I rather recommend looking at bare metal library implementations of heaps, since those are much more straightforward. – Lundin Jun 21 '23 at 11:25
  • Closer to my original problem would be this: in Rust, `std::alloc::RcBox` has this layout: ```rust #[repr(C)] struct RcBox<T: ?Sized> { strong: Cell<usize>, weak: Cell<usize>, value: T, } ``` And when it is converted back to `Box<T>`, it makes a new allocation of size `size_of::<T>()`, copies `value` and drops the original `RcBox`. I wanted to try to implement it more efficiently. My original idea was to cut off `strong` and `weak`, so it would have the same layout as the boxed value. Now I think that is impossible, so I'll try to shift the bytes of `value` from right to left + realloc – Ddystopia Jun 21 '23 at 11:41
  • @Touchme Naturally you can always develop some sort of bloat lib on top of the raw C standard library. That's what higher level languages do, C++ std::vector would fit the bill quite well too. So if you want such features and don't care anything about performance, then go with a higher level language. C is close to the metal, it is fast and memory efficient, but it is also brittle and generally evil. `malloc` was originally designed as just a thin wrapper around various raw Unix API calls. As most of the C standard library. – Lundin Jun 21 '23 at 11:51
2

With more information about your system, we could answer the question in a more direct way (whether it is a good idea or not is a different question).

If you are using a POSIX system, you may be able to use mmap() and munmap() for this. This only makes sense when you need large amounts of memory, multiple times the page size, that are worth freeing. You can reserve memory with mmap() (check man mmap). After you no longer need the first N bytes of it (N must be larger than a page), you can use munmap() to free M pages (N >= M*pageSize), and still use the memory after that.

I made a short demonstration for that:

#include <stdio.h>
#include <sys/mman.h>

int main(void)
  {
    //Map pages for the memory needed
    unsigned char *t=mmap(NULL,1024*1024,PROT_READ|PROT_WRITE,MAP_PRIVATE|MAP_ANONYMOUS,-1,0);
    printf("mmap returned address %p for 1MiB\n",(void*)t);
    if(t==MAP_FAILED)   //mmap returns MAP_FAILED, not NULL, on failure
      { perror("mmap"); return -1; }
    printf("Writing to the mapped pages\n");
    t[12]='A';

    printf("unmap the first 16 KiB we don't need anymore\n");
    if(munmap(t,16*1024)<0)
      { perror("munmap"); return -1; }
    printf("Writing after the unmapped pages, to the still mapped pages\n");
    t[16*1024+12]='A';

    #if 0 //Set to 1 if you want to see the segmentation fault
      printf("Create segmentation fault\n");
      t[15]='A'; //accessing the unmapped pages will cause a segmentation fault or any other UB.
    #endif

    printf("unmap the rest of the memory\n");
    if(munmap(t+16*1024,(1024-16)*1024)<0)
      { perror("munmap 2"); return -1; }
    printf("Test done successfully\n");
    return 0;
  }


This demonstration does not check the page size; for a real program you have to get the page size and make sure to only map and unmap whole pages. You will also have a harder time when you want to grow the reserved memory (make sure you don't collide with other mappings, ..., maybe move_pages() can come in handy when available on your system). There are probably similar solutions for other, non-POSIX, platforms that have an MMU.
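
For example, a real program would first query the page size and round the prefix it wants to give back down to whole pages, roughly like this (a sketch with minimal error handling, meant to slot into main() of the demonstration above; `unneeded` is just a placeholder for however many leading bytes you no longer need):

    //sysconf() is declared in <unistd.h>. Only whole pages can be unmapped,
    //so round the unneeded prefix down to a multiple of the page size.
    long pageSize=sysconf(_SC_PAGESIZE);
    if(pageSize<0)
      { perror("sysconf"); return -1; }

    size_t unneeded=20000;                                  //placeholder value
    size_t m=(unneeded/(size_t)pageSize)*(size_t)pageSize;  //whole pages only
    if(m>0 && munmap(t,m)<0)
      { perror("munmap"); return -1; }
    //the data you still need now starts at t+m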

2

Even though managing memory like this yourself is always an intriguing idea, let me try to explain why it is also a bad idea in most circumstances:

Managing a heap efficiently is an actual science in itself. This article provides a good overview and has nice references for a deep dive.

I will take Linux as an example here. It shows why efficiently freeing "from the left" as a user is really difficult to do - assuming an ordinary computer system.

Let's say you have a system with maybe 200 MB of memory and you can ignore everything besides the heap. You run the system with multiple programs and they all allocate memory. That would look something like this:

[figure: programs allocate memory]

Now imagine some of the programs start to free memory (blocks in blue). Suddenly your heap does not look that pretty anymore. New programs start to allocate memory, more of the old ones start to free, and in no time your memory looks like Swiss cheese - holes all over it. This is memory fragmentation. Now you have to keep a list of the free memory spaces where you are still able to allocate (even ignoring compaction of memory in this case).

[figure: fragmented memory]

To make matters worse: If you free memory, how do you know if the memory blocks next to that free chunk are still in use? You would have to process the list every time you free. If you do not, your memory can be free, but still fragmented like this:

[figure: freed memory]
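
That bookkeeping is usually some kind of free list, roughly like this simplified sketch (for illustration only - not how any particular libc actually defines it):

#include <stddef.h>

/* Simplified free-list node, for illustration only. */
struct free_block {
    size_t size;              /* size of this free chunk             */
    struct free_block *next;  /* next free chunk, ordered by address */
};

/* To merge neighbours when a chunk is freed, the allocator has to find out
   whether the chunks directly before and after it are on this list - which
   means searching or indexing the list on every free(). */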

There are solutions: Linux does this, among other things, with a so-called buddy allocator. The core idea is to only allocate chunks in power-of-two sizes and to divide those.

This causes some overhead when allocating, but boy is it blazingly fast when freeing large chunks of memory. If a small chunk of memory is freed, the operating system (OS) checks whether its "buddy" is already free. If it is, the OS can free the larger chunk consisting of the two small ones. You can then check for the corresponding "buddy" again and free even bigger parts of the memory without much of a search. Repeat until you have freed the biggest possible chunk of memory.

[figure: the buddy system]
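
A toy version of that buddy check could look like this (illustration only, assuming power-of-two block sizes and offsets measured from the start of the managed area; real implementations track far more state):

#include <stddef.h>

/* A block of size 'size' (a power of two) starting at byte offset 'offset'
   has its buddy at the offset that differs only in that one size bit. */
static size_t buddy_offset(size_t offset, size_t size)
{
    return offset ^ size;
}

/* If the block at buddy_offset(offset, size) is also free, both can be merged
   into one block of 2*size, and the same check repeats one level up. */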

This is a much simplified explanation of what really happens (leaving out things like slab allocation and the switch from your user-space function to the operating system). Still - imagine trying to free only minor parts of memory on the "left" in this system. It would cause even more overhead, and it hardly scales well.

My advice: let the operating system do its job. And this sort of memory management is an OS job. Work with what you got.

led
  • When you're talking about the OS: why care about fragmentation? Most likely the OP is using a system with an MMU that uses virtual memory. The kernel can redirect any address of a process to any page the kernel "wants" to. Of course this is a bit different for the user space, but then it is no longer the job of the kernel (or OS if you want) but of the clib. And I also think there are faster solutions than processing the list every time without using a `buddy-allocator` (which is probably one of the fastest ways). – 12431234123412341234123 Jun 22 '23 at 17:11
  • You are right about the fragmentation. I mainly wanted to point out why free-space management is difficult as it is. Starting with page tables and virtualisation seemed like overkill to me. It is imprecise and I will rework this as soon as I get a chance to. Regarding the buddy allocator: I mean - there are probably a thousand good ways of doing this - I guess a dedicated garbage collector could do this even faster. Again - a huge simplification. I might have to rephrase that too. – led Jun 24 '23 at 20:36
0

Neither realloc nor any other standard library function allows you to control in what direction the memory region is contracted. In fact, it's possible that calling realloc(a, 8) will contract it from the left side, though that's not how it's typically implemented.

Solution 1 - don't realloc, just do pointer arithmetic

You could do this manually using pointer arithmetic though:

char *a = malloc(10);
char *b = a + 2;

// TODO: free(a), or
//       free(b - 2)

Note that free(b) wouldn't be allowed, because free must be called with exactly the same pointer as was returned by malloc or realloc.

Solution 2 - contract from the right, view array in reverse

Another possibility is interacting with the array in reverse, always. Instead of indexing it with a[i], index it with a[end - i - 1].

int end = 10;
char *a = malloc(end);
// first byte of the reverse array is at a[9]

end = 8;
a = realloc(a, end);
// first byte is now at a[7]

This way, even if we contract from the right, it is as if we contracted from the left.
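
For example, iterating over that reversed view is just (a small usage sketch of the indexing scheme above):

// Walk the reversed view; this loop works unchanged before and after the
// shrink, because "element 0" is always the last byte currently allocated.
for (int i = 0; i < end; ++i) {
    char value = a[end - 1 - i];  // logical element i of the reversed view
    (void)value;                  // ... use value here ...
}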

Jan Schultke
  • Re “In fact, it's possible that calling `realloc(a, 8)` will contract it from the left side”: This is not possible unless all fundamental types have an alignment requirement of two bytes or less in the C implementation. `malloc` and `realloc` are required to return an address with fundamental alignment, so if the initial address is aligned, that address changed by two bytes will not have fundamental alignment unless fundamental alignment is two bytes or less. – Eric Postpischil Jun 21 '23 at 10:50
  • Solution 2 is very clever, thank you. Maybe it will help someone in the future, but it is not suitable for me. – Ddystopia Jun 21 '23 at 11:34