My laptop, running Debian testing, has recently been terribly sluggish at operations involving writing to the disk.

I have no idea where the problem comes from and would love some help tracking this down and fixing it.

Here are the symptoms I noticed:

  • iotop will typically show in "DISK WRITE" a bandwidth very close to 500KB/s for any process currently writing to the disk (e.g. cp, dpkg, ...).
  • This occurs regardless of the CPU load.
  • Running 1 cp process results in a total write bandwidth around 500KB/s. Running 10 cp processes results in a total write bandwidth of around 5MB/s.
  • This is on an ext4 filesystem on an LVM volume on an SSD. The previous point strongly suggests the limit doesn't come from the hardware, but just in case, I cloned the system to another SSD and got the same result.
  • This problem doesn't affect the machine after a fresh boot; it only seems to show up after a while (which means after some suspend-to-RAM and wake-up cycles, though I have no idea whether that's related).
  • The slowdown is particularly noticeable when building Emacs. One phase of the build generates a so-called "pdump file" of about 7MB via many small write calls, and iotop tells me that the process ends up performing a total of more than 400MB of disk writes (at 500KB/s, hence taking more than 10 minutes to write this miserable 7MB file). This suggests that the file is being "sync'd" at a terribly fine granularity, although I don't see anything in the source code justifying this behavior.
  • I ran fsck -f after one of the reboots and it did not report any problem.
  • dmesg does not contain any unusual message from the ext4 or lvm layers nor from the block device layer.
  • The problem affects all 3 filesystems I'm using on the SSD (all 3 using ext4 in the same LVM volume group). It does not affect the tmpfs-mounted /tmp.
  • This machine has been running the latest Debian testing and showing those signs for a few months now, with various kernels (now running 5.2.0-2-686-pae, not sure what was the first kernel version with which I saw this problem).
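The many-small-writes symptom can be reproduced outside the Emacs build with a short script. The following is a minimal sketch, not something from the original report: `timed_small_writes` is a hypothetical helper name, and the chunk size and total size are arbitrary assumptions. It writes a temporary file in small pieces, forces it to the device with `fsync`, and reports how long that took.

```python
import os
import tempfile
import time


def timed_small_writes(directory, total_bytes=1_000_000, chunk_size=512):
    """Write total_bytes in chunk_size pieces to a temp file in `directory`.

    Returns (bytes_written, seconds); the final fsync is included in the timing.
    The sizes are illustrative defaults, not values taken from the Emacs build.
    """
    chunk = b"\0" * chunk_size
    written = 0
    fd, path = tempfile.mkstemp(dir=directory)
    try:
        start = time.monotonic()
        while written < total_bytes:
            n = min(chunk_size, total_bytes - written)
            written += os.write(fd, chunk[:n])
        # Flush to the device so the page cache cannot hide a slow disk.
        os.fsync(fd)
        elapsed = time.monotonic() - start
    finally:
        os.close(fd)
        os.remove(path)
    return written, elapsed
```

Running it once against a directory on one of the ext4 filesystems and once against the tmpfs-mounted /tmp, then dividing the byte count by the elapsed time, gives a rough KB/s figure to compare with what iotop shows.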

As requested, here's some extra info.

% df -h
Filesystem                  Size    Used Avail Use% Mounted on
udev                        3.9G       0  3.9G   0% /dev
tmpfs                       796M     26M  770M   4% /run
/dev/mapper/Alfajor-root     19G     16G  2.3G  88% /
tmpfs                       3.9G     30M  3.9G   1% /dev/shm
tmpfs                       5.0M    8.0K  5.0M   1% /run/lock
tmpfs                       3.9G       0  3.9G   0% /sys/fs/cgroup
tmpfs                       512M    8.0K  512M   1% /tmp
/dev/mapper/Alfajor-cache   7.8G    6.7G  1.1G  87% /var/cache
/dev/mapper/Alfajor-home     41G     37G  1.8G  96% /home
tmpfs                       796M     12K  796M   1% /run/user/122
% free -m
              total        used        free      shared  buff/cache   available
Mem:           7952        1188        2724         212        4039        5858
Swap:          4095         213        3882
% uname -a
Linux alfajor 5.3.0-3-686-pae #1 SMP Debian 5.3.15-1 (2019-12-07) i686 GNU/Linux
%
Stefan
  • Check the filesystem for errors, and show the current status of your system, both now and during the build. – djdomi Dec 23 '19 at 07:28
  • I updated my question with that info. – Stefan Dec 23 '19 at 20:11
  • I'm still missing the output of `df -h` and `free -m`, also taken during the build. – djdomi Dec 24 '19 at 12:32
  • I don't see anything relevant in there, but I just added it (don't know what you mean by "build time". It's a "Debian vanilla" kernel. I put `uname -a`). – Stefan Dec 24 '19 at 16:00
  • OK, can you please run the same commands while building Emacs? To me it looks like you're running into swapping. – djdomi Dec 25 '19 at 01:19
  • No, the problem shows up already during `cp`, so it's not a question of swapping. And there's no swapping involved (I keep the exact same `213` MB used in the swap) during the whole build. – Stefan Dec 25 '19 at 17:29

1 Answer

I'm still not completely sure where the problem comes from, but I switched my kernel from i686-pae to amd64 and that seems to have solved it (the rest of my system is still running Debian's i386 port). I suspect it has to do with a problem in the kernel's management of PAE memory (presumably because some buffers need to be within the first 2GB or 4GB of RAM in order to be accessible).
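One way to see how the running kernel splits memory is to look at /proc/meminfo: a 32-bit kernel built with high-memory support reports `LowTotal` and `HighTotal` fields there, while a 64-bit kernel does not. The sketch below only reads those fields; the function names are hypothetical, and it does not confirm or rule out the suspicion above. It just shows how much RAM the kernel can address directly.

```python
def parse_meminfo(text):
    """Return {field: value} for lines shaped like 'Name:   1234 kB'."""
    values = {}
    for line in text.splitlines():
        name, sep, rest = line.partition(":")
        parts = rest.split()
        if sep and parts and parts[0].isdigit():
            values[name.strip()] = int(parts[0])
    return values


def low_high_split(meminfo):
    """Return (low_kB, high_kB), or None if the kernel reports no such split."""
    if "LowTotal" not in meminfo or "HighTotal" not in meminfo:
        return None
    return meminfo["LowTotal"], meminfo["HighTotal"]


def read_meminfo(path="/proc/meminfo"):
    """Read and parse the kernel's memory summary (Linux only)."""
    with open(path) as f:
        return parse_meminfo(f.read())
```

Calling `low_high_split(read_meminfo())` under the i686-pae kernel and again under the amd64 kernel should return a pair of sizes in the first case and `None` in the second.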

Stefan