"Weak reference" in Linux memory manager?

Question

In Java, a weakly referenced object can be garbage collected when memory runs out. In Linux, malloc() always returns a strong reference, i.e. the memory is never freed until the caller calls free().

I want to allocate a buffer for caching that can be freed automatically when memory is running out, like the following:

cache_t cache;
if (!cache_alloc(&cache))
    die("Out of memory");

cache_lock(&cache); // re-allocates the cache memory if it was collected

if (!cache.user_init) { // "user_init" may be reset if the cache memory was collected
    // lazy-init the cache...
    load_contents(cache.mem, ...);
    cache.user_init = 1;
}

// work with the cache...
stuff_t *stuff = (stuff_t *) cache.mem;
...

cache_unlock(&cache);

It seems the buff and cache columns in the output of vmstat are disk I/O related:

$ vmstat
procs -----------memory---------- ---swap-- -----io---- -system-- ----cpu----
 r  b   swpd   free   buff  cache   si   so    bi    bo   in   cs us sy id wa
 0  0  51604 554220  13384 314852    3   10   411   420  702 1063  8  3 75 14

I want to know whether the cache in my example could be reflected in the "cache" column of the vmstat output.

The OS does it for you anyway with the MMU, paging etc., so why bother? – Vlad Lazarenko, Sep 28 '12 at 01:44
@VladLazarenko I think the OP is talking about virtual address space running out, not physical memory. Nothing that the OS does addresses the former. — Jim Balter, Sep 28 '12 at 02:05

score 1 · Accepted Answer · answered Sep 28 '12 at 01:45

There really isn't a good way of doing it - the C memory model simply doesn't allow for the same kind of behavior that the Java memory model allows. Java's memory model of course builds on the C model when interfacing with the operating system, which is why the Java heap must be manually limited by the application launcher.

The "buff" and "cache" columns relate to the page/disk cache and internal buffers used by the kernel. These caches are automatically handled by the kernel - for instance, reading a file will place the contents in the "cache" usage number, in the same way that running out of memory will commit it to a swap device ("swpd").
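To see the kernel-managed numbers described above from inside a program, one option is to read them from /proc/meminfo. The following is a minimal sketch, assuming a Linux system with /proc mounted; `meminfo_kb` is a hypothetical helper name, and the `Buffers` and `Cached` fields are only the closest counterparts of the vmstat columns, not necessarily identical to them.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: returns the value (in kB) of one /proc/meminfo
 * field such as "Buffers" or "Cached", or -1 if it cannot be read. */
long meminfo_kb(const char *field)
{
    assert(field != NULL);

    FILE *f = fopen("/proc/meminfo", "r");
    if (f == NULL)
        return -1;

    char line[256];
    size_t n = strlen(field);
    long value = -1;

    while (fgets(line, sizeof line, f) != NULL) {
        /* Lines look like "Cached:   314852 kB". */
        if (strncmp(line, field, n) == 0 && line[n] == ':') {
            if (sscanf(line + n + 1, "%ld", &value) != 1)
                value = -1;
            break;
        }
    }

    fclose(f);
    return value;
}
```

Calling `meminfo_kb("Cached")` before and after reading a large file should show the page cache growing, while memory obtained with malloc() in your own process is not counted there.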

Yann Ramin


You are right. But Java does a lot more than "just free it when memory is out". I'm not much concerned about OSes other than Linux, and I guess maybe there's a Linux-specific solution. – Lenik Sep 28 '12 at 01:54
When Linux (and other OSes) give you a memory address (via `sbrk`, `mmap`, etc), they are committing that this memory address will always be valid. There is no direct `this may be available` kind of pointer in a C memory model. – Yann Ramin Sep 28 '12 at 01:57
I remember Win32 has something like allocating a buffer with `MOVEABLE` and `DISCARDABLE` flags, which returns a memory handle. – Lenik Sep 28 '12 at 01:57
Certainly, every pointer in C is a strong reference. But the "mem" pointer in "cache_t" is supposed to be moved or freed by the MMU, not by C. So the function is implementable, I think. – Lenik Sep 28 '12 at 02:02
There's nothing that the Java memory model provides that can't be done in C ... you just have to do it yourself. – Jim Balter Sep 28 '12 at 02:07
I can write my own version of malloc, say `mymalloc`, and allocate all normal buffers in my application using `mymalloc`. `mymalloc` frees the "cache" if the original malloc returns NULL. The problem is, when other applications in the system use up the memory, I have no chance to free the cache, except by setting up a polling thread to monitor the available memory. I think that's not efficient. – Lenik Sep 28 '12 at 02:08
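The `mymalloc` idea from this comment can be sketched as follows. This is only an illustration under assumptions: `cache_mem`, `cache_fill` and `cache_drop` are hypothetical names, the cache is a single global buffer, and nothing here is thread-safe. Note also that with Linux's default overcommit settings malloc() rarely returns NULL for ordinary sizes, so the retry path may almost never trigger in practice.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical droppable cache: one global buffer. */
static void *cache_mem = NULL;

/* Allocate the cache buffer; returns 1 on success, 0 on failure. */
int cache_fill(size_t size)
{
    assert(cache_mem == NULL);
    cache_mem = malloc(size);
    return cache_mem != NULL;
}

/* Give the cache memory back to the allocator. */
void cache_drop(void)
{
    free(cache_mem);
    cache_mem = NULL;
}

/* malloc wrapper: if the first attempt fails and a cache exists,
 * drop the cache and try once more. */
void *mymalloc(size_t size)
{
    void *p = malloc(size);
    if (p == NULL && size != 0 && cache_mem != NULL) {
        cache_drop();
        p = malloc(size);
    }
    return p;
}
```

As the comment says, this only reacts to allocation failures inside the same process; it does nothing when other processes put pressure on physical memory.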
@YannRamin You do know that Java runs on Linux (and other OSes), don't you? Memory allocated by sbrk or mmap can be unallocated. – Jim Balter Sep 28 '12 at 02:10
@XièJìléi "when other applications in the system used up the memory" -- you're talking about physical memory? This is what swapping is for. If processes have allocated memory they aren't using, the OS will reclaim the physical pages. – Jim Balter Sep 28 '12 at 02:12
@Jim Balter: Yes, there is swap. But swapping is very slow. If my cache occupies a lot of memory (1G for example), I don't want it to be swapped out; I would rather have it freed. The swap space could also be used up, though. – Lenik Sep 28 '12 at 02:16
The idea behind the question is, I think maybe I can install a system-wide malloc hook, or maybe some special malloc function for Linux with special flags. It's not that hard to implement.. – Lenik Sep 28 '12 at 02:24
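For readers on newer kernels: Linux 4.5 and later offer something close to the "special flags" idea through `madvise()` with `MADV_FREE`, which lets the kernel discard private anonymous pages under memory pressure instead of swapping them, unless the process writes to them again first. The sketch below reuses the `cache_alloc`, `cache_lock` and `cache_unlock` names from the question, but the struct layout, the per-page marker byte and `cache_mark_valid` are assumptions made for illustration, not an established API.

```c
#define _DEFAULT_SOURCE
#include <assert.h>
#include <stddef.h>
#include <sys/mman.h>
#include <unistd.h>

/* Assumed layout: byte 0 of every page is a nonzero marker, byte 1 is
 * scratch, and the payload would start at byte 2 of each page. */
#define CACHE_MARKER 0xA5

typedef struct {
    unsigned char *mem;  /* page-aligned anonymous mapping */
    size_t len;          /* pages * page */
    size_t page;         /* system page size */
} cache_t;

int cache_alloc(cache_t *c, size_t pages)
{
    assert(c != NULL);
    long ps = sysconf(_SC_PAGESIZE);
    if (ps <= 0 || pages == 0)
        return 0;
    c->page = (size_t)ps;
    c->len = pages * c->page;
    void *p = mmap(NULL, c->len, PROT_READ | PROT_WRITE,
                   MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (p == MAP_FAILED)
        return 0;
    c->mem = p;          /* fresh pages are zero-filled: no markers yet */
    return 1;
}

/* Pin the cache for use. Returns 1 if every page still carries its
 * marker, 0 if the caller has to reload the contents. */
int cache_lock(cache_t *c)
{
    int intact = 1;
    for (size_t off = 0; off < c->len; off += c->page) {
        volatile unsigned char *pg = c->mem + off;
        pg[1] = 1;       /* write first: cancels a pending MADV_FREE */
        if (pg[0] != CACHE_MARKER)
            intact = 0;  /* page was discarded or never initialised */
    }
    return intact;
}

/* Call after (re)loading the contents. */
void cache_mark_valid(cache_t *c)
{
    for (size_t off = 0; off < c->len; off += c->page)
        c->mem[off] = CACHE_MARKER;
}

/* Tell the kernel it may discard the pages. Returns 1 on success. */
int cache_unlock(cache_t *c)
{
#ifdef MADV_FREE
    return madvise(c->mem, c->len, MADV_FREE) == 0;
#else
    (void)c;
    return 0;            /* headers too old: the cache stays resident */
#endif
}
```

The marker is checked per page because the kernel may discard some pages of the range and keep others; a discarded page reads back as zeros.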
@XièJìléi You aren't understanding. If programs have allocated large amounts of memory that they aren't using, that memory will get swapped out ... once. It will never get swapped back in. And there's no "system-wide malloc" ... malloc isn't a system function. You seem unclear on the difference between virtual memory (address space) that is per-process, and physical memory that is shared by all processes. – Jim Balter Sep 28 '12 at 02:27
But hey, if you think you got an answer (that says there isn't a good way to do it, while you say it's not that hard to implement), I won't argue. Good luck with that. – Jim Balter Sep 28 '12 at 02:30
@Jim Balter: "It will never get swapped back in" then what is swap used for? I'm confused.. sorry for my poor English. By "malloc-hook" I mean a hook in the underlying memory manager; malloc() is certainly not a system function. There must be an allocation function in the system which does the actual memory allocation. – Lenik Sep 28 '12 at 02:31
@XièJìléi It will never be swapped in because (ex hypothesi) the program isn't using it, but the OS doesn't know that, so it has to swap it out to reuse the physical page. Try it: run a bunch of processes that malloc large chunks of memory and then go to sleep ... there will be no impact (other than use of more space in the swap file). – Jim Balter Sep 28 '12 at 02:34
@XièJìléi "There must be an allocate function in the system which do the actual memory allocation" -- no, there is no such function ... not for **physical** memory. malloc calls sbrk or mmap, which allocate **address space**. – Jim Balter Sep 28 '12 at 02:38
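The distinction between address space and physical memory drawn in this comment can be observed with a small experiment. The sketch below assumes Linux; `reserve_address_space` and `resident_pages` are hypothetical helper names. It maps a region with `mmap()` and uses `mincore()` to count how many of its pages are actually backed by physical memory.

```c
#define _DEFAULT_SOURCE
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <sys/mman.h>
#include <unistd.h>

/* Reserve len bytes of address space; no page is touched yet.
 * Returns NULL on failure. */
void *reserve_address_space(size_t len)
{
    void *p = mmap(NULL, len, PROT_READ | PROT_WRITE,
                   MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    return p == MAP_FAILED ? NULL : p;
}

/* Count the pages of [addr, addr + len) that are resident in physical
 * memory according to mincore(). Returns -1 on error. */
long resident_pages(void *addr, size_t len)
{
    long ps = sysconf(_SC_PAGESIZE);
    assert(ps > 0);

    size_t pages = (len + (size_t)ps - 1) / (size_t)ps;
    unsigned char *vec = malloc(pages);
    if (vec == NULL)
        return -1;

    long count = -1;
    if (mincore(addr, len, vec) == 0) {
        count = 0;
        for (size_t i = 0; i < pages; i++)
            count += vec[i] & 1;   /* low bit set: page is resident */
    }
    free(vec);
    return count;
}
```

Reserving, say, 64 MiB and writing a single byte should leave only a small part of the mapping resident, since physical pages are assigned on first touch rather than at mmap() time.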
@XièJìléi Again, "You seem unclear on the difference between virtual memory (address space) that is per-process, and physical memory that is shared by all processes" -- if you were clear, then you would specify which you mean when you write "memory". – Jim Balter Sep 28 '12 at 02:40
@JimBalter: Yes, I am aware of course. However, the Java memory model as seen by Java-language applications has a different set of possibilities. – Yann Ramin Sep 28 '12 at 03:07
@YannRamin There is nothing available in the Java memory model that isn't available in C. "the C memory model simply doesn't allow for the same kind of behavior that the Java memory model allows" is utter nonsense ... since Java can be implemented in C, any memory model offered to Java programs can be offered to C programs. This isn't even about memory models, it's about OS memory management facilities. – Jim Balter Sep 28 '12 at 04:53