
I am tarring and then compressing a bunch of files and directories on my Ubuntu Server VPS for a backup. It only has 1GB of RAM and 128MB of swap (I can't add more - OVH use OpenVZ as their virtualisation software), and every time tar runs it uses a ton of memory for its buffer, causing everything else to get swapped out - even when using nice -n 10.

Is there any way to force tar to use a small buffer and reduce its memory usage? I am worried that once the backup gets to a certain size, my server will go down because tar won't have enough memory for its buffer.

I am using bzip2 to compress, and I have already limited its memory usage with the -4 option.

Edit: Here is what htop looks like when I have had tar running for a while:

[htop screenshot showing memory and swap usage while tar is running]

Here is a link to the full gif

Edit 2: Here is the tar command I am using:

nice -n 20 tar --exclude "*node_modules*" --exclude "*.git/*" --exclude "/srv/www-mail/rainloop/v*"  -cf archive.tar /home /var/log /var/mail /srv /etc
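As a sketch (with placeholder paths, not my real ones), the same kind of tar invocation can be piped straight into bzip2 -4 so the uncompressed archive is never written to disk:

```shell
# Sketch with placeholder paths: pipe tar straight into bzip2 -4
# (low-memory mode) so no intermediate .tar file is created.
mkdir -p /tmp/backup_demo/src
echo "sample data" > /tmp/backup_demo/src/file.txt

nice -n 19 tar -C /tmp/backup_demo -cf - src \
  | bzip2 -4 > /tmp/backup_demo/archive.tar.bz2

# List the archive contents to verify the round trip
bzip2 -dc /tmp/backup_demo/archive.tar.bz2 | tar -tf -
```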
MadHatter
starbeamrainbowlabs
  • How do you see that `tar` is using much memory? I guess it just causes Linux to remove useful "hot" data from its cache and replace it with useless "cold" data that is being backed up (and not needed in the cache) – Marki555 Jul 08 '15 at 21:23
  • @Marki555 I used `htop` to observe my memory and swap usage. I used [this tutorial](http://www.cyberciti.biz/faq/linux-which-process-is-using-swap/) to view which processes were using the most swap before and after, and I noticed that `tar`ing a large amount of stuff causes almost everything else to get swapped out :/ – starbeamrainbowlabs Jul 09 '15 at 05:28
  • Can you include the output of `htop` into your question? – Marki555 Jul 09 '15 at 06:48
  • @Marki555 Sure, I will update the question as soon as I get the chance. – starbeamrainbowlabs Jul 09 '15 at 06:51
  • @Marki555 Done - I've edited the question. I ran the `tar` command in a separate SSH terminal. It's the yellow part of the "Mem" bar that is the problem. I think that stands for the cache? The other problem is now how to clear the buffer.... – starbeamrainbowlabs Jul 09 '15 at 14:28
  • Hold on. Does this have something to do with the fact that I was using `/tmp` to store the archive? – starbeamrainbowlabs Jul 09 '15 at 14:33
  • 2
    If your `/tmp` is mounted as `tmpfs`, then yes, it does. tar itself doesen't seem to use much memory in the screenshot. – Fox Jul 09 '15 at 14:48
  • I don't see a `tar` command here. Exactly what are you running? – Michael Hampton Jul 09 '15 at 16:52
  • @MichaelHampton Sorry, I meant to include that in the question. Question updated. – starbeamrainbowlabs Jul 09 '15 at 18:07
  • Are you putting `archive.tar` in `/tmp` then? – Michael Hampton Jul 09 '15 at 18:10
  • @MichaelHampton Yes I was. I have changed it to a different folder now and I still get the same problem. – starbeamrainbowlabs Jul 09 '15 at 19:00

1 Answer


Your image shows quite the contrary, actually.

As you can see in the RES column, tar's memory consumption is quite low. Your RAM usage appears to increase simply because Linux is actively caching the data read by the tar command. This, in turn, causes memory pressure and dirty page writeback (basically, the system flushes its write cache to make room for the growing read cache), and possibly evicts useful data from the I/O cache.

Unfortunately, it seems that tar itself cannot be instructed to use O_DIRECT or POSIX_FADVISE (both of which can be used to "bypass" the cache). So, with tar alone, there is no real solution here...
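For completeness, here is a hedged sketch of what the POSIX_FADVISE hint looks like when a tool *does* support it: after a large sequential read, the process advises the kernel that the file's pages can be dropped from the page cache, so the read does not evict other processes' hot data. The wrapper function below is hypothetical (tar does not do this); it is the same basic idea that cache-bypassing wrappers rely on, and it is Linux-only:

```python
import os
import tempfile

def read_without_caching(path, chunk_size=1 << 20):
    """Read a file, then advise the kernel to drop its pages from the
    page cache (POSIX_FADV_DONTNEED) so a large sequential read does
    not push other processes' hot data out. Linux-only sketch."""
    data = bytearray()
    fd = os.open(path, os.O_RDONLY)
    try:
        while True:
            chunk = os.read(fd, chunk_size)
            if not chunk:
                break
            data.extend(chunk)
        # Length 0 means "from offset to end of file" on Linux.
        os.posix_fadvise(fd, 0, 0, os.POSIX_FADV_DONTNEED)
    finally:
        os.close(fd)
    return bytes(data)

# Demo on a throwaway file
with tempfile.NamedTemporaryFile(delete=False) as f:
    f.write(b"x" * 4096)
    demo_path = f.name

content = read_without_caching(demo_path)
print(len(content))  # 4096
os.unlink(demo_path)
```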

shodanshok
  • Thanks for your explanation. Is there a different tool I can use then that doesn't fill up the read cache? – starbeamrainbowlabs Jul 11 '15 at 10:42
  • Unfortunately, only some tools support direct I/O operations. The most common tool is `dd`, and you can use it to compress a single file with something like `dd if=srcfile bs=1M iflag=direct | bzip2 > newfile.bz2`. However, this is clearly no match for tarring a full directory tree – shodanshok Jul 11 '15 at 13:30
  • Thanks for the help. Perhaps I need more ram then...? – starbeamrainbowlabs Jul 11 '15 at 16:04
  • 1
    You probably need more RAM _and_ a faster disk subsystem. As a workaround, you can try to totally disable filesystem caching during the tar/bz2 process, then reenable it. To disable caching, remount your filesystem with the `sync` option. For example, using your / filesystem for the tar/bz2 process, you should issue `mount / -o remount,sync`. Then, **after** completion, you can remount it with caching enabled using `mount / -o remount,async` – shodanshok Jul 11 '15 at 16:28
  • Unfortunately I get `mount: permission denied` if I try the remount sync command. The `async` one works though. I think this must be because OpenVZ doesn't support it on their VPS classic? Apparently I am running on an SSD. As for my kernel, I am using (and can't change from) `Linux 2.6.32-042stab108.5`. – starbeamrainbowlabs Jul 11 '15 at 16:57
  • 2
    Update: I have found a tool called [nocache](https://github.com/Feh/nocache) which prevents read files from being cached - this seems to solve the problem :D – starbeamrainbowlabs Jul 11 '15 at 17:23
  • Interesting utility... I wrote something similar some time ago, just for testing. Anyway, if my reply helped you, please mark it as the accepted answer. – shodanshok Jul 11 '15 at 19:32
  • Done - thanks for reminding me! Your answer was helpful in working out what the problem actually was so I could go about finding a solution :) – starbeamrainbowlabs Jul 12 '15 at 13:16
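To close the loop, the accepted fix amounts to wrapping the original command with `nocache` (the tool linked in the comments above, which LD_PRELOADs wrappers that fadvise cached pages away after reads). The paths below are demo placeholders, and the sketch falls back to plain `tar` when `nocache` is not installed:

```shell
# Sketch of the accepted fix: wrap tar with nocache so files it reads are
# dropped from the page cache. Demo paths are placeholders; falls back to
# plain tar if nocache is not installed.
mkdir -p /tmp/nocache_demo/data
echo "payload" > /tmp/nocache_demo/data/f.txt

WRAP=""
command -v nocache >/dev/null 2>&1 && WRAP="nocache"

$WRAP nice -n 19 tar -C /tmp/nocache_demo -cf /tmp/nocache_demo/out.tar data
tar -tf /tmp/nocache_demo/out.tar
```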