7

I have a Thecus N8900 NAS, a Linux-based file server that serves files over NFS to six clients. For some reason that Thecus support has yet to explain, it runs a script that checks /proc/meminfo every 60 seconds and, if the disk cache exceeds 50% of available RAM, runs "echo 3 > /proc/sys/vm/drop_caches" to flush the cache.
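
For illustration only, a check like the one described might look roughly like the sketch below. This is not the vendor's script: it assumes the comparison is between the Cached and MemTotal fields of /proc/meminfo, and the names parse_meminfo and cache_exceeds are made up for the example.

```python
def parse_meminfo(text):
    """Parse /proc/meminfo-style text into a dict of integer values.

    Most fields are reported in kB; a few (e.g. HugePages_Total) are
    plain counts, so the unit suffix is simply ignored here.
    """
    values = {}
    for line in text.splitlines():
        key, sep, rest = line.partition(":")
        fields = rest.split()
        if sep and fields and fields[0].isdigit():
            values[key.strip()] = int(fields[0])
    return values


def cache_exceeds(meminfo, threshold=0.5):
    """Return True if Cached is more than `threshold` of MemTotal.

    Assumption: "disk cache" means the Cached field and "available RAM"
    means MemTotal; the real script may compare different fields.
    """
    return meminfo["Cached"] > threshold * meminfo["MemTotal"]
```

On the NAS this could be fed the contents of /proc/meminfo, e.g. cache_exceeds(parse_meminfo(open("/proc/meminfo").read())), to see how often the threshold is actually crossed.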

Leaving aside whether that makes sense, the "echo 3 > /proc/sys/vm/drop_caches" command itself can take hours to complete, which seems far too long to me.

The big problem is that when this happens, the load on the machine spikes, as does the disk utilization, making all NFS traffic crawl until the command finally completes, at which point things are responsive again.

The NAS itself has 16 GB of RAM and 7 drives in a RAID 6 configuration (plus a hot spare), with no drive problems at all according to S.M.A.R.T. tests.

So the question is: what would cause the drop_caches command to take so long?

rmm
  • The only reason something like that would be there is to cover up some _even worse_ failing of the system. Is it too late to get a refund? – Michael Hampton Dec 10 '13 at 21:55
  • That was what I was beginning to wonder. The device has been in place for well over a year with no problems, so even though dropping the caches may not make much sense, it didn't seem to take so long or cause any noticeable slowdowns until recently. – rmm Dec 10 '13 at 22:02
  • ...adding Thecus to my "never buy from" list. – EEAA Dec 10 '13 at 22:48
  • Looks like it could be caused by some kernel bug. Something like this [write-back cache loop](https://lkml.org/lkml/2004/4/29/7). – Dima Chubarov Dec 11 '13 at 05:49
  • How long does it take if 'sync' is run immediately before the 'echo ...' command? Or for that matter, how long does 'sync' take on its own? – rickhg12hs Dec 12 '13 at 01:32
  • A sync takes a few seconds on its own. Haven't tried syncing then dropping caches because there is still one that has been running for well over 24 hours now. I'm going to have to reboot the NAS later tonight and see what happens. – rmm Dec 12 '13 at 05:15
  • If you can afford to (and accept the risk), I would try disabling this cron job (or whatever other scheduler they are using) so that this command, which in my opinion is unnecessary, is not run. – Huygens Dec 16 '13 at 08:59

3 Answers

1

Dropping the caches shouldn't take much time at all. Are you sure it's really not returning from that echo command for several hours?

It makes sense that the machine is slower after the caches are dropped, since files that it could previously read from cache now have to be read from disk.
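
One way to check whether the same process is still alive and what it is doing is to read its state from /proc. The following is a minimal sketch, assuming a Linux /proc filesystem; the helper names are made up for the example. A state of D means uninterruptible sleep, which is also why such a process does not respond to kill.

```python
def parse_proc_state(stat_line):
    """Return the one-letter state field from a /proc/<pid>/stat line.

    The command name is wrapped in parentheses and may itself contain
    spaces or parentheses, so the line is split after the last ')'.
    """
    after_comm = stat_line[stat_line.rindex(")") + 1:]
    return after_comm.split()[0]


def read_proc_state(pid):
    """Read and parse the state of a live process (Linux only)."""
    with open(f"/proc/{pid}/stat") as f:
        return parse_proc_state(f.read())
```

Calling read_proc_state with the PID of the echo process a few times over several minutes would show whether it stays in D the whole time.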

sciurus
  • Yes, ps shows '32403 root 9028 D sh -c echo 3 > /proc/sys/vm/drop_caches', process ID never changes so it appears to be the same one. – rmm Dec 10 '13 at 22:00
  • Six hours later and it's still running. Can't kill it either. – rmm Dec 11 '13 at 04:10
1

The command itself should complete almost instantaneously. The consequences, i.e. everything having to be cached again, can take a lot of time. Running it doesn't make sense; if you can remove it completely, that would be a good idea.

Maybe you are looking at the wrong command: does the script also run a sync before echo 3 > /proc/sys/vm/drop_caches, as in sync; echo 3 > /proc/sys/vm/drop_caches? The sync operation, which flushes all pending writes to disk, may take a while to complete. And although sync also has a performance cost, it does make some sense: in case of a sudden power failure, the data has already been written to disk, so you are safe.
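
To tell the two steps apart, they can be timed separately. Below is a rough sketch under stated assumptions: it must run as root on the NAS, the helper names are made up for the example, and writing 3 to drop_caches really does drop the caches, so expect the same slowdown afterwards.

```python
import os
import time


def timed(action):
    """Run a zero-argument callable and return the elapsed seconds."""
    start = time.monotonic()
    action()
    return time.monotonic() - start


def write_value(path, value):
    """Write `value` plus a newline to `path`, like `echo value > path`."""
    with open(path, "w") as f:
        f.write(f"{value}\n")


def time_sync_then_drop(drop_path="/proc/sys/vm/drop_caches"):
    """Time sync and the drop_caches write separately (needs root)."""
    sync_seconds = timed(os.sync)  # flush dirty pages to disk first
    drop_seconds = timed(lambda: write_value(drop_path, 3))
    return sync_seconds, drop_seconds
```

If the first number is large and the second small, the time is going into flushing writes rather than into dropping the cache.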

pqnet
0

Could it be this?

https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/mm/vmscan.c?id=1399af7e54896c774d67f1c1acc491b07149421d

  • While this link may answer the question, it is better to include the essential parts of the answer here and provide the link for reference. Link-only answers can become invalid if the linked page changes. - [From Review](/review/late-answers/500848) – bjoster Oct 26 '21 at 13:41