47

Is there a way to tell Linux that it shouldn't swap out a particular process's memory to disk?

It's a Java app, so ideally I'm hoping for a way to do this from the command line.

I'm aware that you can set the global swappiness to 0, but is this wise?

sanity

7 Answers

25

You can do this via the mlockall(2) system call under Linux; this will work for the whole process, but do read about the argument you need to pass.
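
As a rough illustration of the argument in question, here is a minimal Python sketch that calls mlockall through ctypes. It assumes Linux with glibc; the flag values are the ones used on x86 and ARM (a few architectures use different values), and the helper name `lock_all_memory` is made up for this example. It locks only the calling Python process; a JVM would need the equivalent call made from inside the JVM process itself.

```python
import ctypes
import ctypes.util
import os

# Flag values from <sys/mman.h> on x86/ARM Linux (assumption: they differ
# on some architectures, so check your headers).
MCL_CURRENT = 1  # lock pages that are mapped right now
MCL_FUTURE = 2   # also lock pages mapped later (heap growth, new mmaps)


def lock_all_memory(flags=MCL_CURRENT | MCL_FUTURE):
    """Ask the kernel to keep all of this process's pages in RAM.

    Raises OSError (typically ENOMEM or EPERM) when RLIMIT_MEMLOCK or
    missing privileges prevent the lock.
    """
    libc_name = ctypes.util.find_library("c") or "libc.so.6"
    libc = ctypes.CDLL(libc_name, use_errno=True)
    if libc.mlockall(flags) != 0:
        err = ctypes.get_errno()
        raise OSError(err, os.strerror(err))
```

Passing only `MCL_CURRENT` locks what is mapped at the time of the call; adding `MCL_FUTURE` extends the lock to later allocations, which can make those allocations fail once the locked-memory limit is reached.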

Do you really need to pull the whole thing in-core? If it's a java app, you would presumably lock the whole JVM in-core. I don't know of a command-line method for doing this, but you could write a trivial program to call fork, call mlockall, then exec.

You might also look to see if one of the access pattern notifications in madvise(2) meets your needs. Advising the VM subsystem about a better paging strategy might work out better if it's applicable for you.
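
For a feel of what such a hint looks like, here is a small Python sketch (Python 3.8+ on Linux assumed) that creates an anonymous mapping and advises the kernel that it will be read sequentially. The function name `advise_sequential` and the default size are illustrative only; which advice value fits depends entirely on the application's access pattern.

```python
import mmap


def advise_sequential(size=1 << 20):
    """Create an anonymous mapping and hint sequential access to the kernel."""
    buf = mmap.mmap(-1, size)          # -1 means anonymous memory, no file
    buf.madvise(mmap.MADV_SEQUENTIAL)  # advice applies to the whole mapping
    return buf
```

Other advice values, such as `mmap.MADV_RANDOM` or `mmap.MADV_WILLNEED`, are passed the same way. These are hints only; unlike mlockall, they do not guarantee that pages stay resident.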

Note that, a long time ago, SunOS had a mechanism similar to madvise called vadvise(2).

Cristian Ciupitu
Thomas Kammeyer
  • @sanity did you try this? How did it work out for you? I was considering doing something similar. – Bill K Oct 27 '10 at 18:30
  • 15
    ["Memory locks are not inherited by a child created via fork(2) and are automatically removed (unlocked) during an execve(2) or when the process terminates."](http://www.kernel.org/doc/man-pages/online/pages/man2/mlock.2.html) - the fork/mlock/exec approach will not work – Mat Jun 24 '12 at 18:41
  • @Mat appears to be right... OK... what about something like writing something to call via JNI that calls mlockall? That starts to feel pretty hackish though... – Thomas Kammeyer Jun 25 '12 at 16:09
  • If you don't want to mess around with custom JNI libraries you could probably use JNA to call mlockall directly. – plugwash Nov 03 '15 at 12:58
  • 2
    Extension, tricky hack: if the process won't make this `mlockall()` call itself, you can attach `gdb` to it and have `gdb` make the call on its behalf. :-) – peterh Feb 01 '16 at 17:07
15

If you wish to change the swappiness for a single process, add it to a cgroup and set the value for that cgroup:

https://unix.stackexchange.com/questions/10214/per-process-swapiness-for-linux#10227
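
A minimal sketch of that idea in Python, assuming a cgroup v1 memory controller mounted at /sys/fs/cgroup/memory and root privileges. The group name `noswap` and the helper name are hypothetical, and the `root` parameter exists only so the function can be pointed at another directory.

```python
from pathlib import Path


def set_cgroup_swappiness(group, pid, swappiness=0,
                          root="/sys/fs/cgroup/memory"):
    """Create a memory cgroup, set its swappiness, and move a PID into it."""
    cg = Path(root) / group
    cg.mkdir(parents=True, exist_ok=True)  # mkdir in cgroupfs creates the group
    (cg / "memory.swappiness").write_text(f"{swappiness}\n")
    (cg / "cgroup.procs").write_text(f"{pid}\n")  # attach the process
    return cg
```

For example, `set_cgroup_swappiness("noswap", 1234)` would place the process with PID 1234 in a group whose swappiness is 0, leaving the global setting untouched.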

Community
Matthew Buckett
3

There exists a class of applications that you never want to swap. One such class is databases. Databases use memory as caches and buffers for their disk areas, and it makes absolutely no sense for these ever to be put to swap. A particular piece of memory may hold relevant data that is not needed for a week, until one day a client asks for it. Without the caching/swapping, the database would simply find the relevant record on disk, which would be quite fast; but with swapping, your service might suddenly take a long time to respond.

mysqld includes code to lock its memory through the OS (the --memlock option, which relies on the mlock family of system calls). On Linux, since at least 2.6.9, this works for non-root processes that have the CAP_IPC_LOCK capability [1]. When using --memlock, the process must still stay within the LimitMEMLOCK limit [2]. One of the (few) good things about systemd is that you can grant the mysqld process these capabilities without requiring a special program. It can also set the rlimits, as you would expect with ulimit. Here is an override file for mysqld that does the requisite steps, including a few others that you might need for a process such as a database:

[Service]
# Prevent mysql from swapping
CapabilityBoundingSet=CAP_IPC_LOCK

# Let mysqld lock all memory to core (don't swap)
LimitMEMLOCK=-1 

# Do not kill this process if low on memory
OOMScoreAdjust=-900 

# Use a higher I/O scheduling class
IOSchedulingClass=realtime    

Type=simple    
ExecStart=
ExecStart=/usr/sbin/mysqld --memlock $MYSQLD_OPTS
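
To check which locked-memory limit a process actually ends up with, the process can query RLIMIT_MEMLOCK, the resource limit that LimitMEMLOCK sets. The following Python sketch is only an illustration; the helper name `memlock_limit` is invented here, and it reports the limit of whichever process runs it rather than of mysqld.

```python
import resource


def memlock_limit():
    """Return the (soft, hard) locked-memory limits as readable strings."""
    def fmt(value):
        if value == resource.RLIM_INFINITY:
            return "unlimited"
        return f"{value} bytes"

    soft, hard = resource.getrlimit(resource.RLIMIT_MEMLOCK)
    return fmt(soft), fmt(hard)
```

A result of `("unlimited", "unlimited")` would correspond to the `LimitMEMLOCK=-1` setting above; a small byte count means memory locking can fail with ENOMEM.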

Note: The standard community MySQL package currently ships with Type=forking and adds --daemonize to the ExecStart line of the service. This is inherently less stable than the above method.

UPDATE: I am not 100% happy with this solution. After several days of runtime, I noticed the process still had enormous amounts of swap! Examining /proc/XXXX/smaps, I note the following:

  • The largest contributor of swap is a stack segment! 437 MB and fluctuating. This presents obvious performance issues. It also indicates a stack-based memory leak.
  • There are zero Locked pages. This indicates that the memlock option in MySQL (or Linux) is broken. In this case, it wouldn't matter much, because MySQL can't memlock the stack.
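
The kind of check described above can be sketched in Python as follows. The function sums the `Swap:` and `Locked:` fields across all mappings in the text of an smaps file; the function name is made up for this example, and it assumes the usual `Field:   N kB` layout of /proc/PID/smaps.

```python
import re

# Matches lines such as "Swap:                 64 kB"
_FIELD_RE = re.compile(r"^(\w+):\s+(\d+) kB$")


def sum_smaps_fields(smaps_text, fields=("Swap", "Locked")):
    """Total the given per-mapping fields (in kB) over a whole smaps dump."""
    totals = {name: 0 for name in fields}
    for line in smaps_text.splitlines():
        match = _FIELD_RE.match(line.strip())
        if match and match.group(1) in totals:
            totals[match.group(1)] += int(match.group(2))
    return totals
```

Given the contents of /proc/XXXX/smaps as a string, a large `Swap` total next to a `Locked` total of 0 would match the observation above.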
Otheus
2

You can do that with the mlock family of syscalls. I'm not sure, however, whether you can do it for a different process.

jpalecek
  • I would be careful about how much of the total system memory is locked, though. I suspect that you could bring down the system through thrashing if you weren't careful. – Dana the Sane Feb 23 '09 at 16:04
2

As superuser you can 'nice' it to the highest priority level, -20, and hope that's enough to keep it from being swapped out. It usually is. Positive numbers lower the scheduling priority. Normal users cannot nice upwards (to negative numbers).

SumoRunner
  • 3
    unless I'm mistaken, this will only affect the CPU time the process is granted, not its tendency towards main memory usage. – intuited May 22 '10 at 21:28
  • 4
    @intuited: I believe he means that the VM subsystem will also look at the `nice` value to decide how important a process is, and not swap it out. Sounds plausible; I don't know if Linux really does this, though. – sleske Feb 09 '11 at 02:30
  • `ionice` might also be useful here – fread2281 Jun 03 '15 at 01:02
1

Except in extremely unusual circumstances, asking this question means that You're Doing It Wrong(tm).

Seriously, if Linux wants to swap and you're trying to keep your process in memory then you're putting an unreasonable demand on the OS. If your app is that important then 1) buy more memory, 2) remove other apps/daemons from the machine, or dedicate a machine to your app, and/or 3) invest in a really fast disk subsystem. These steps are reasonable for an important app. If you can't justify them, then you probably can't justify wiring memory and starving other processes either.

dwc
  • 32
    If it were cryptographically-related, it's entirely reasonable to want to stay in memory. gnome-keyring, for instance, takes quite a few steps to keep important bits of itself off of swap. (Other than that, reasonable observations.) – Paul Fisher Feb 23 '09 at 17:37
  • 1
    Good point, Paul. I do feel that this qualifies as an unusual circumstance. – dwc Feb 23 '09 at 17:54
  • 2
    It's quite usual in banking and payment card industry with all the strong cryptography requirements. – Aleksander Adamowski Mar 24 '10 at 15:05
  • 1
    See the following advice ("recommendation MEM06-C - Ensure that sensitive data is not written out to disk") from CERT that applies to C programs on POSIX and Windows: https://www.securecoding.cert.org/confluence/display/seccode/MEM06-C.+Ensure+that+sensitive+data+is+not+written+out+to+disk It's pretty standard stuff when secure coding is required. Similar stuff is needed for Java. – Aleksander Adamowski Mar 24 '10 at 15:32
  • BTW, encrypting swap space would alleviate the problem, but it still wouldn't prevent a privileged system user from reading sensitive data from the crypto block device that backs swap space - possibly even long after the original program's termination (until the particular swap area gets overwritten). – Aleksander Adamowski Mar 24 '10 at 15:32
  • 23
    What about UI-related processes that don't use much memory, e.g. the desktop manager's panels, that sort of thing? User interface stuff should be ready to go at all times, or at least more ready than stuff that's basically batch processing. Ideally apps would be built so that their interface had a higher real-memory priority than their guts, so that it would still be possible to, e.g., scan through the menus while the app was busy doing something. Some apps do work this way, but it requires multithreading. But apps that are just launchers etc. should be given higher memory "stickiness". – intuited May 22 '10 at 21:08
  • 1
    Java should probably be coded in a way that its data area is never swapped--Java is too likely to re-arrange all of its data, which can cause some extreme thrashing. However, keeping the entire JVM and libraries in memory might be overkill. – Bill K Oct 27 '10 at 18:29
  • 9
    This is quite an old answer, but I must say "you generally do it wrong because somebody smarter said so" is not a valid reason. I want my process not to be swapped and I have 32GB of memory - so I don't care what kernel developers thought when they developed their system, I don't want my process to be swapped - end of. And I do want my swap to be there even though I have a lot of resources. I would say answers like yours don't bring anything new to the discussion, and I would rather not see such answers when I look for a solution to my problem. – Greg0ry Mar 18 '16 at 10:39
1

Why do you want to do this?
If you are trying to increase the performance of this app then you are probably on the wrong track. The OS will swap out a process to free up memory for disk cache, even if there is free RAM; the kernel knows best (actually, the smart guys who wrote the scheduler know best).
If you have a process that needs responsiveness (it's swapped out while not used and you need it to restart quickly), then nicing it to a high priority, using mlock, or using a real-time kernel might help.

Martin Beckett