
I'm writing a cache server in Java that will cache image data (JPGs, PNGs, TIFFs, etc.) in memory for fast access over HTTP. The images are rendered by another service, which is an expensive operation, so I want to cache them on my cache server.

There are several reasons why I'm writing it from scratch, so the answer I'm looking for is not [some clever software product].

Question: How can I keep a certain set of data objects in main memory, and ensure that the data is actually in main memory when I need it, and not pushed to disk by a virtual memory manager? That is, how can I do this in Java?

Further information: Objects could be referenced at any interval, e.g. days or, to be a bit extreme, years apart :-)

EDIT: I have found this SO post which asks "can you keep objects in contiguous memory?" - This is not the question I'm asking, although it could help, if objects were referenced all the time, I presume. And btw, the answer to that question was "no", except obviously for value-types in arrays.

  • "The images are rendered by another service, which is an expensive operation, so I want to cache them on my cache server." *what* is an expensive operation? The rendering, the network transfer, or the disk access? If the cache server is accessed remotely, e.g. over HTTP, caching will only help avoid disk access which may not be your bottleneck. – Emerson Farrugia Jan 22 '12 at 21:47
  • What will your implementation do that any of the existing cache libraries/frameworks/systems/apps don't? – Dave Newton Jan 22 '12 at 21:54
  • Expensive with regards to CPU. – Pimin Konstantin Kefaloukos Jan 22 '12 at 21:56
  • Dave Newton: it is a cache system for geographical map images, like GeoWebCache and Map Proxy, but differs from these systems in important ways. – Pimin Konstantin Kefaloukos Jan 22 '12 at 21:58

2 Answers


I strongly doubt you can do this in Java alone. You'll probably have to use something like mlock through JNI, as well as the requisite JNI incantations to pin the cached object graphs in memory so the GC doesn't move them. And [insert miracle here] to compact the pinned memory into contiguous pages, because that's what mlock operates on.

millimoose
  • +1 for mlock and JNI. I'm not familiar with mlock (and not too familiar with JNI), but will look into it. I assume that mlock is a Linux only thing? – Pimin Konstantin Kefaloukos Jan 22 '12 at 21:51
  • Linux+BSD+Mac. For Windows I'll need another approach, and (unfortunately?) I need Windows support too. – Pimin Konstantin Kefaloukos Jan 22 '12 at 22:01
  • Regarding the miracle part: Could I place the bytes from several images in a single byte array, and pin this (presumably contiguous) data in memory using mlock, and combine this with keeping indices into the byte array in another (memory resident) datastructure? – Pimin Konstantin Kefaloukos Jan 22 '12 at 22:09
  • @PiminKonstantinKefaloukos: The die.net man page mentions the system call dates back to SVr4, so I'd guess it's actually widely available on Unixen. (Windows will probably require a second implementation of the JNI extension.) – millimoose Jan 22 '12 at 22:09
  • @PiminKonstantinKefaloukos I honestly have no idea how to approach the miracle part. As you mention in your question, there is no "sane" way of having Java allocate objects contiguously, so you'll have to manage pages of memory yourself for this. This could be easier in C if you can find an allocator library that lets you specify an arbitrary memory region to work in. – millimoose Jan 22 '12 at 22:15
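
The single-buffer idea from the comments above could be sketched roughly as follows. This is only an illustration under assumptions: the class name `SlabImageCache` and its methods are hypothetical, the slab is a direct `ByteBuffer` (off-heap, so the GC does not move its contents), and the actual locking call (mlock on Unix, or a Windows equivalent) would still have to be made from native code via JNI, which is not shown here.

```java
import java.nio.ByteBuffer;
import java.util.HashMap;
import java.util.Map;

/** Hypothetical sketch: many images packed into one off-heap slab, plus an offset index. */
final class SlabImageCache {
    /** Position and length of one image inside the slab. */
    private static final class Slot {
        final int offset;
        final int length;
        Slot(int offset, int length) {
            this.offset = offset;
            this.length = length;
        }
    }

    // One contiguous native allocation. A JNI helper (not shown, assumed) could
    // obtain its address and pass the whole range to mlock.
    private final ByteBuffer slab;
    private final Map<String, Slot> index = new HashMap<>();

    SlabImageCache(int capacityBytes) {
        this.slab = ByteBuffer.allocateDirect(capacityBytes);
    }

    /** Appends the image to the slab; returns false if the key exists or the slab is full. */
    synchronized boolean put(String key, byte[] image) {
        if (index.containsKey(key) || image.length > slab.remaining()) {
            return false;
        }
        int offset = slab.position();
        slab.put(image);
        index.put(key, new Slot(offset, image.length));
        return true;
    }

    /** Copies the image bytes out of the slab, or returns null if the key is unknown. */
    synchronized byte[] get(String key) {
        Slot slot = index.get(key);
        if (slot == null) {
            return null;
        }
        byte[] out = new byte[slot.length];
        ByteBuffer view = slab.duplicate(); // independent position, same memory
        view.position(slot.offset);
        view.get(out);
        return out;
    }
}
```

This sketch is append-only: there is no eviction or compaction, so a real cache would need its own free-space management inside the slab, which is the "manage pages of memory yourself" part mentioned above.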

I assume you want to keep access time predictably low, so you want to avoid paging. In Java you have a very limited set of tools to manage memory. In fact, it is the operating system's job to track which pages are inactive and can be pushed to disk. I am not even sure whether there is any API in major operating systems to control this behaviour.

That being said, you must focus on fooling the system into believing that pages are actually needed, even though they haven't been used for a long time. I think you already know the answer: just write an asynchronous task that touches every object in your cache every second or so. This should make the operating system believe that your process is still actively using these areas of memory.

Sad but should be effective.
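
A minimal sketch of such a touching task, under some assumptions: the cached data is held in `ByteBuffer`s, the page size is guessed at 4096 bytes, and the class name `CacheToucher` is made up for illustration. Whether this really keeps pages resident depends on the operating system's paging policy, so treat it as a heuristic rather than a guarantee.

```java
import java.nio.ByteBuffer;
import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;

/** Hypothetical sketch: periodically reads one byte per page of every registered buffer. */
final class CacheToucher {
    // Assumption: 4 KiB pages. The real page size is platform-specific.
    static final int ASSUMED_PAGE_SIZE = 4096;

    private final List<ByteBuffer> buffers = new CopyOnWriteArrayList<>();
    private final ScheduledExecutorService scheduler =
            Executors.newSingleThreadScheduledExecutor(r -> {
                Thread t = new Thread(r, "cache-toucher");
                t.setDaemon(true); // do not keep the JVM alive just for touching
                return t;
            });

    // Written after each pass so the JIT cannot discard the reads as dead code.
    private volatile long sink;

    void register(ByteBuffer buffer) {
        buffers.add(buffer);
    }

    /** Reads one byte at each page-sized stride; returns the number of reads performed. */
    int touchAll() {
        long sum = 0;
        int touched = 0;
        for (ByteBuffer buffer : buffers) {
            // Absolute get(int) does not change the buffer's position.
            for (int i = 0; i < buffer.limit(); i += ASSUMED_PAGE_SIZE) {
                sum += buffer.get(i);
                touched++;
            }
        }
        sink = sum;
        return touched;
    }

    /** Schedules touchAll() to run repeatedly with the given period. */
    void start(long period, TimeUnit unit) {
        scheduler.scheduleAtFixedRate(this::touchAll, period, period, unit);
    }

    void stop() {
        scheduler.shutdownNow();
    }
}
```

Usage would be something like `toucher.register(buffer); toucher.start(1, TimeUnit.SECONDS);`. Note that the stride starts at the beginning of each buffer, which is not necessarily page-aligned, so a smaller stride may be needed to be sure every page is hit.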

Tomasz Nurkiewicz