
I've set up an Ubuntu Server with microk8s, with the dns, dashboard and prometheus addons. It's running some Cardano nodes.

On the (built-in) Grafana dashboard "Default / Nodes" I see spikes in the Disk IO "on time" every 5 minutes like clockwork:

[Figure: 5 minute Disk IO spikes]

I'm curious what this is. I don't know of anything that Ubuntu or microk8s would be doing exactly every 5 minutes. How can I identify the cause (ideally without installing additional software on the host)? I can't find anything in any log files that gives a clue.

Danny Tuppeny
  • Use `iotop` to identify processes doing IO. – berndbausch Apr 24 '21 at 01:06
  • @berndbausch thanks - I tried that, and can see some processes writing data, but almost everything is `kube-apiserver`. It may be useful if I could see the files written rather than the processes - I see there are some similar tools like sysdig that may be worth trying. – Danny Tuppeny Apr 24 '21 at 09:58
  • You can start with `lsof -p PID_APISERVER`, but this will only help if the process doesn't change files too often. In the latter case, check `open` system calls with `strace`. Personally, I don't know `sysdig`; good luck! – berndbausch Apr 24 '21 at 10:37
  • @berndbausch thanks! Using `lsof` I found mention of dqlite, which led me to https://microk8s.io/docs/ha-recovery which has a path for the database, and in there I found "snapshot-foo" files with 5-minute gaps in timestamps. So I guess it's microk8s snapshotting its config every 5 min and writing this 18MB file. Thanks! – Danny Tuppeny Apr 24 '21 at 11:17

1 Answer


Using `iotop` I was able to identify the process writing data every 5 minutes. I ran it as:

    sudo iotop -a -o

`-a` makes the values accumulate and `-o` shows only the processes that are actually reading or writing. Five minutes after the previous spike, `kube-apiserver` appeared at the top of the list with an increase of around 18MB of data.
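If `iotop` is not available on the host, a similar accumulated-write figure can be read from the kernel's per-process counters in `/proc/<pid>/io`. This is only a minimal sketch, assuming a Linux host; the helper names `write_bytes` and `bytes_delta` are made up for illustration, and the `pgrep` lookup in the comment assumes the process is named `kube-apiserver`, as it was in my case:

```shell
# Print the cumulative number of bytes a process has caused to be written
# to storage, taken from the "write_bytes" field of a /proc/<pid>/io file.
# The exact match on "write_bytes:" skips "cancelled_write_bytes:".
write_bytes() {
    awk '$1 == "write_bytes:" { print $2 }' "$1"
}

# Print the difference between two counter readings (after minus before).
bytes_delta() {
    echo $(( $2 - $1 ))
}

# Example, from a root shell (reading another user's /proc entry needs root):
#   pid=$(pgrep -o kube-apiserver)      # assumed process name
#   before=$(write_bytes "/proc/$pid/io")
#   sleep 300
#   after=$(write_bytes "/proc/$pid/io")
#   bytes_delta "$before" "$after"
```

Sampling across one 5-minute window should show whether the process accounts for the spike, though unlike `iotop -a -o` it only checks one process at a time.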

Using `lsof` as suggested by berndbausch, I scanned through the open files and noticed some mentions of dqlite (which I knew was a database that microk8s used). That led me to https://microk8s.io/docs/ha-recovery, which lists `/var/snap/microk8s/current/var/kubernetes/backend` as the storage path.
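For hosts without `lsof`, the open files of a process can also be listed by resolving the symlinks under `/proc/<pid>/fd`. A rough sketch, assuming Linux; `open_files` is a hypothetical helper name, not an existing tool:

```shell
# Print the target of every open file descriptor of a process, one per line,
# by resolving the symlinks under /proc/<pid>/fd (Linux only).
# Descriptors that close while we iterate are silently skipped.
open_files() {
    local fd
    for fd in "/proc/$1/fd"/*; do
        readlink "$fd" 2>/dev/null
    done | sort -u
}

# Example, from a root shell:
#   open_files "$(pgrep -o kube-apiserver)"   # assumed process name
```

Note that this shows only open descriptors (files, sockets, pipes). `lsof` additionally lists memory-mapped files such as shared libraries, so its output can contain hints that this listing does not.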

Looking in that directory, I found some snapshot files that were timestamped exactly 5 minutes apart. Each was 18MB in size, matching the increase I had seen in `iotop`. I waited another 5 minutes, and the oldest one disappeared and a new one appeared.
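The timestamp comparison can be done with a small helper instead of by eye. This is a sketch under some assumptions: it relies on GNU `find` (for `-printf`), the function name `mtime_gaps` is made up, the `snapshot-` prefix is taken from the file names I saw, and file names are assumed to contain no spaces:

```shell
# Print the gap in seconds between each file and the previous one, for files
# in a directory whose names start with a prefix, ordered by modification time.
# Output format: "<seconds since previous file> <file name>".
mtime_gaps() {
    local dir="$1" prefix="$2"
    find "$dir" -maxdepth 1 -type f -name "${prefix}*" -printf '%T@ %f\n' |
        sort -n |
        awk 'NR > 1 { printf "%d %s\n", $1 - prev, $2 } { prev = $1 }'
}

# Example (may need root to read the directory):
#   mtime_gaps /var/snap/microk8s/current/var/kubernetes/backend snapshot-
```

Gaps close to 300 seconds would match the 5-minute spikes on the dashboard.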

Mystery solved!

I can't find any documentation about microk8s/dqlite doing 5-minute snapshots (or whether it's configurable), though it was good to learn a few more tools while tracking it down.

Danny Tuppeny