When writing sequentially to files on my large-ish (70 TiB) btrfs filesystem, I am noticing that something is causing prolonged write blocks (seconds to several minutes). This is inconvenient at best and is causing application timeouts in some cases.
My understanding is that btrfs commits should happen in the background and not block my writes, and most of the time this is indeed what I see when writing sequentially into an existing file, i.e. without new blocks being allocated. As soon as I am writing at high data rates into a file that is being extended, things get messy quickly.
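Background flushing also depends on the generic writeback thresholds, so for reference they can be inspected like this (just the command as a sketch, my values omitted):
$ sysctl vm.dirty_background_ratio vm.dirty_ratio vm.dirty_expire_centisecs vm.dirty_writeback_centisecs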
My test case is a simple
dd if=/dev/zero of=/mnt/btrfs/subvolume/testfile
which starts out writing at 150 MB/s, mainly dirtying blocks in the host's page cache. The file system has some background activity here which causes the reads seen in the statistics collected for this device with iostat -xdmt 1:
Time: rrqm/s wrqm/s r/s w/s rMB/s wMB/s avgrq-sz avgqu-sz await svctm %util
01:09:59 PM 0.00 0.00 3761.00 0.00 14.69 0.00 8.00 0.38 0.10 0.10 38.00
01:10:00 PM 0.00 0.00 2868.00 0.00 11.20 0.00 8.00 0.42 0.15 0.15 41.60
01:10:01 PM 0.00 0.00 4002.00 0.00 15.63 0.00 8.00 0.43 0.11 0.11 42.80
01:10:02 PM 0.00 0.00 3331.00 0.00 13.01 0.00 8.00 0.54 0.16 0.16 54.40
01:10:03 PM 0.00 0.00 2674.00 0.00 10.45 0.00 8.00 0.62 0.23 0.23 62.00
01:10:04 PM 0.00 0.00 2771.00 0.00 10.82 0.00 8.00 0.59 0.21 0.21 58.80
01:10:05 PM 0.00 12.00 3004.00 2744.00 11.73 264.09 98.28 5.80 1.01 0.09 50.80
01:10:06 PM 0.00 24.00 2661.00 1970.00 10.39 92.86 45.66 75.36 16.12 0.16 74.80
01:10:07 PM 0.00 0.00 1556.00 628.00 6.08 171.23 166.27 9.84 4.83 0.33 72.00
01:10:08 PM 0.00 0.00 1298.00 0.00 5.07 0.00 8.00 0.41 0.31 0.27 35.20
So far all looks well, and the commit at 01:10:05 caused ~500 MB of data to be flushed to disk. This, however, is what happened subsequently:
Time: rrqm/s wrqm/s r/s w/s rMB/s wMB/s avgrq-sz avgqu-sz await svctm %util
01:10:09 PM 0.00 0.00 1459.00 115.00 5.70 28.75 44.82 0.62 0.38 0.36 56.80
01:10:10 PM 0.00 8.00 268.00 6816.00 1.05 567.26 164.30 101.84 12.83 0.14 96.80
01:10:11 PM 0.00 4.00 4.00 5396.00 0.02 857.71 325.30 155.78 30.02 0.19 100.40
01:10:12 PM 0.00 16.00 10.00 3179.00 0.04 489.67 314.50 162.47 43.53 0.31 100.00
01:10:13 PM 0.00 13.00 6.00 2785.00 0.02 412.01 302.35 148.54 45.91 0.36 100.00
01:10:14 PM 0.00 8.00 0.00 2745.00 0.00 431.29 321.77 141.90 47.04 0.36 100.00
01:10:15 PM 0.00 9.00 16.00 4235.00 0.06 660.77 318.37 144.02 47.59 0.24 100.00
01:10:16 PM 0.00 8.00 11.00 2771.00 0.04 455.26 335.18 145.08 45.15 0.36 100.00
01:10:17 PM 0.00 18.00 17.00 2688.00 0.07 416.14 315.12 149.61 52.68 0.37 100.00
01:10:18 PM 0.00 15.00 32.00 4133.00 0.12 641.32 315.41 154.58 42.92 0.24 100.00
01:10:19 PM 0.00 3.00 23.00 1921.00 0.09 306.26 322.74 141.80 47.59 0.51 100.00
01:10:20 PM 0.00 10.00 13.00 3553.00 0.05 546.22 313.73 144.95 51.66 0.28 100.00
01:10:21 PM 0.00 6.00 24.00 4623.00 0.09 688.63 303.53 154.29 35.90 0.22 100.00
01:10:22 PM 0.00 8.00 61.00 10151.00 0.24 176.37 35.42 136.80 13.06 0.10 100.00
01:10:23 PM 0.00 0.00 22.00 9858.00 0.09 43.74 9.09 142.66 14.88 0.10 100.00
01:10:24 PM 0.00 0.00 4.00 14740.00 0.02 68.51 9.52 137.17 9.33 0.07 100.00
01:10:25 PM 0.00 0.00 4.00 9070.00 0.02 39.50 8.92 143.02 14.69 0.11 100.00
01:10:26 PM 0.00 0.00 6.00 10659.00 0.02 46.90 9.01 143.00 14.29 0.09 100.00
01:10:27 PM 0.00 0.00 8.00 13098.00 0.03 60.62 9.48 140.74 10.88 0.08 100.00
01:10:28 PM 0.00 0.00 7.00 10284.00 0.03 46.80 9.32 141.98 12.37 0.10 100.00
01:10:29 PM 0.00 0.00 8.00 8562.00 0.03 38.33 9.17 143.43 17.97 0.12 100.00
01:10:30 PM 0.00 0.00 7.00 10996.00 0.03 50.37 9.38 141.34 13.11 0.09 100.00
01:10:31 PM 0.00 0.00 6.00 12012.00 0.02 57.04 9.72 141.58 11.72 0.08 100.00
01:10:32 PM 0.00 0.00 7.00 12666.00 0.03 57.08 9.23 143.91 11.35 0.08 100.00
01:10:33 PM 0.00 0.00 7.00 7741.00 0.03 34.26 9.06 142.60 18.57 0.13 100.00
Time: rrqm/s wrqm/s r/s w/s rMB/s wMB/s avgrq-sz avgqu-sz await svctm %util
01:10:34 PM 0.00 0.00 3.00 7262.00 0.01 30.16 8.50 144.23 19.86 0.14 100.00
01:10:35 PM 0.00 0.00 7.00 6518.00 0.03 27.27 8.57 144.97 20.80 0.15 100.00
01:10:36 PM 0.00 0.00 5.00 4828.00 0.02 19.70 8.36 143.86 30.95 0.21 100.00
01:10:37 PM 0.00 0.00 0.00 7858.00 0.00 35.13 9.16 144.19 18.65 0.13 100.00
01:10:38 PM 0.00 0.00 4.00 11041.00 0.02 48.77 9.05 143.72 13.04 0.09 100.00
01:10:39 PM 0.00 0.00 11.00 8409.00 0.04 35.82 8.72 144.18 16.87 0.12 100.00
01:10:40 PM 0.00 0.00 1440.00 1206.00 5.62 5.09 8.29 33.14 14.05 0.29 77.20
01:10:41 PM 0.00 0.00 2464.00 0.00 9.62 0.00 8.00 0.60 0.24 0.24 59.60
01:10:42 PM 0.00 0.00 2230.00 950.00 8.71 259.55 172.77 9.96 1.99 0.18 58.40
01:10:43 PM 0.00 0.00 10.00 3031.00 0.04 827.06 557.02 143.37 47.04 0.33 100.00
01:10:44 PM 0.00 0.00 7.00 1918.00 0.03 523.48 556.96 144.06 73.96 0.52 100.40
01:10:45 PM 0.00 0.00 8.00 2206.00 0.03 601.99 556.89 143.66 64.11 0.45 100.00
01:10:46 PM 0.00 0.00 16.00 2286.00 0.06 623.87 555.09 141.53 61.57 0.43 100.00
01:10:47 PM 0.00 1.00 826.00 533.00 3.23 145.17 223.63 35.18 31.19 0.64 86.80
01:10:48 PM 0.00 0.00 2596.00 0.00 10.14 0.00 8.00 0.64 0.25 0.25 64.00
01:10:49 PM 0.00 0.00 1725.00 570.00 6.74 154.55 143.93 3.89 1.69 0.33 75.60
This amounts to 30 seconds of a completely full device queue with a blocked dd process in the meantime. The process remains in the D (uninterruptible sleep) state for the duration of the block.
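If it helps with diagnosis, the kernel stack of the blocked writer can be sampled during a stall roughly like this (a sketch assuming a single dd process and a kernel with CONFIG_STACKTRACE; needs root):
# cat /proc/$(pidof dd)/stack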
This is not as cleanly reproducible as I would like it to be: the block/delay times vary significantly between runs. What is happening here, and how can I remedy the blockages?
Environment
SLES 11 SP3. Mount options: /dev/sdb1 on /mnt/btrfs type btrfs (rw,noatime,thread_pool=16). I tried mounting with nodatacow as well, but dd continues playing Sleeping Beauty with me.
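For completeness, the nodatacow test mount looked roughly like this (a sketch; device and mount point as above):
# mount -o rw,noatime,thread_pool=16,nodatacow /dev/sdb1 /mnt/btrfs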
$ uname -a
Linux host-1 3.0.101-0.47.71-default #1 SMP Thu Nov 12 12:22:22 UTC 2015 (b5b212e) x86_64 x86_64 x86_64 GNU/Linux
# cat /sys/bus/scsi/devices/*/queue_depth
256
256
256
The btrfs file system spans a single 72 TiB disk, which is a RAID-60 virtual disk on an LSI MPT RAID controller (MegaRAID 2208 w/ 1 GB FBWC). It is not my root file system and is not used for any system-internal activity (swap, logs, dumps, ...).
The platform hardware is a single-CPU E5-2620 (no NUMA) with 64 GB of RAM.
Edit
My real-world use case is a Windows client writing data to a dynamically expanding vhdx (virtual disk) file through Samba (3.6.3), but the behavior is essentially the same as with dd: the file grows rapidly and sequentially as blocks are written to the virtual disk, and it blocks in just the same way.
I have done some additional troubleshooting on this, concentrating on processes in the D state. There I noticed that a number of btrfs-related kernel threads are stuck in D along with Samba. As I cannot attach strace to kernel threads, I tried working with the wchan information output by ps, which is supposed to tell me which kernel function a process or thread is waiting in while it is in D:
while true; do ps -eo pid,stat,pcpu,nwchan,wchan=WIDE-WCHAN-COLUMN -o args | awk -v date="$(date +%r)" '$2 ~ /D/ { print date " " $4 " " $5 " " $6 }'; sleep 0.2; done
And this is what it looks like in action.
Here, the regular 30-second commit starts:
11:15:11 AM /usr/sbin/smbd lock_page
11:15:11 AM [flush-btrfs-1] -
11:15:12 AM /usr/sbin/smbd lock_page
11:15:12 AM [flush-btrfs-1] -
11:15:12 AM [btrfs-endio-wri] wait_on_page_lock
11:15:12 AM /usr/sbin/smbd lock_page
11:15:12 AM [flush-btrfs-1] -
11:15:12 AM [btrfs-submit-1] get_request_wait
11:15:12 AM /usr/sbin/smbd lock_page
11:15:12 AM /usr/sbin/smbd lock_page
11:15:12 AM [flush-btrfs-1] -
11:15:12 AM [btrfs-submit-1] get_request_wait
11:15:12 AM [btrfs-transacti] wait_on_page_lock
11:15:12 AM /usr/sbin/smbd lock_page
11:15:12 AM [flush-btrfs-1] lock_page
11:15:13 AM [btrfs-transacti] wait_on_page_lock
11:15:13 AM /usr/sbin/smbd lock_page
11:15:13 AM [flush-btrfs-1] lock_page
11:15:13 AM [btrfs-transacti] wait_on_page_lock
11:15:13 AM /usr/sbin/smbd lock_page
11:15:13 AM [flush-btrfs-1] lock_page
11:15:13 AM [btrfs-transacti] wait_on_page_lock
11:15:13 AM /usr/sbin/smbd -
11:15:13 AM [btrfs-transacti] wait_on_page_lock
11:15:13 AM /usr/sbin/smbd lock_page
11:15:13 AM [btrfs-transacti] wait_on_page_lock
11:15:13 AM /usr/sbin/smbd lock_page
At this point, the commit seems to have completed. An iostat -xdmt run confirms that nothing is written to disk after 11:15:13:
01/07/2016 11:15:11 AM
Device: rrqm/s wrqm/s r/s w/s rMB/s wMB/s avgrq-sz avgqu-sz await svctm %util
sdb 0.00 0.00 3437.00 0.00 13.43 0.00 8.00 0.50 0.15 0.15 50.40
01/07/2016 11:15:12 AM
sdb 0.00 0.00 1056.00 4475.00 4.12 1221.39 453.78 58.13 9.59 0.16 87.60
01/07/2016 11:15:13 AM
sdb 0.00 0.00 1477.00 1082.00 5.77 293.28 239.33 55.36 23.63 0.36 91.20
01/07/2016 11:15:14 AM
sdb 0.00 0.00 2846.00 0.00 11.12 0.00 8.00 0.89 0.32 0.24 68.80
01/07/2016 11:15:15 AM
sdb 0.00 0.00 1591.00 0.00 6.21 0.00 8.00 0.64 0.40 0.34 54.40
Then, at around 11:15:17, the writing process blocks entirely and does not respond again until around 11:15:29:
11:15:14 AM /usr/sbin/smbd -
11:15:14 AM /usr/sbin/smbd lock_page
11:15:14 AM /usr/sbin/smbd lock_page
11:15:14 AM /usr/sbin/smbd lock_page
11:15:15 AM /usr/sbin/smbd lock_page
11:15:15 AM [btrfs-transacti] wait_on_page_lock
11:15:15 AM /usr/sbin/smbd lock_page
11:15:16 AM /usr/sbin/smbd lock_page
11:15:16 AM /usr/sbin/smbd lock_page
11:15:17 AM /usr/sbin/smbd lock_page
11:15:17 AM [btrfs-transacti] get_request_wait
11:15:17 AM /usr/sbin/smbd get_request_wait
11:15:17 AM [btrfs-transacti] get_request_wait
11:15:17 AM /usr/sbin/smbd get_request_wait
11:15:17 AM [btrfs-transacti] get_request_wait
11:15:17 AM /usr/sbin/smbd lock_page
11:15:18 AM [btrfs-transacti] get_request_wait
11:15:18 AM /usr/sbin/smbd lock_page
11:15:18 AM /usr/sbin/smbd lock_page
11:15:18 AM [btrfs-transacti] get_request_wait
11:15:18 AM /usr/sbin/smbd lock_page
11:15:18 AM [btrfs-transacti] get_request_wait
11:15:18 AM /usr/sbin/smbd lock_page
11:15:18 AM [btrfs-transacti] get_request_wait
11:15:18 AM /usr/sbin/smbd lock_page
11:15:19 AM [btrfs-transacti] get_request_wait
11:15:19 AM /usr/sbin/smbd lock_page
11:15:19 AM [btrfs-transacti] get_request_wait
11:15:19 AM /usr/sbin/smbd lock_page
11:15:19 AM [btrfs-transacti] get_request_wait
11:15:19 AM /usr/sbin/smbd lock_page
11:15:19 AM [btrfs-transacti] get_request_wait
11:15:19 AM /usr/sbin/smbd lock_page
11:15:19 AM [btrfs-transacti] get_request_wait
11:15:19 AM /usr/sbin/smbd lock_page
11:15:20 AM /usr/sbin/smbd lock_page
11:15:20 AM [btrfs-transacti] get_request_wait
11:15:20 AM /usr/sbin/smbd lock_page
11:15:20 AM [btrfs-transacti] get_request_wait
11:15:20 AM /usr/sbin/smbd lock_page
11:15:20 AM [btrfs-transacti] get_request_wait
11:15:20 AM /usr/sbin/smbd lock_page
11:15:20 AM [btrfs-transacti] get_request_wait
11:15:20 AM /usr/sbin/smbd lock_page
11:15:21 AM [btrfs-transacti] get_request_wait
11:15:21 AM /usr/sbin/smbd lock_page
11:15:21 AM [btrfs-transacti] get_request_wait
11:15:21 AM /usr/sbin/smbd lock_page
11:15:21 AM [btrfs-transacti] get_request_wait
11:15:21 AM /usr/sbin/smbd lock_page
11:15:21 AM [btrfs-transacti] get_request_wait
11:15:21 AM /usr/sbin/smbd lock_page
11:15:22 AM [btrfs-transacti] get_request_wait
11:15:22 AM /usr/sbin/smbd lock_page
11:15:22 AM [btrfs-transacti] get_request_wait
11:15:22 AM /usr/sbin/smbd lock_page
11:15:22 AM [btrfs-transacti] get_request_wait
11:15:22 AM /usr/sbin/smbd lock_page
11:15:22 AM [btrfs-transacti] get_request_wait
11:15:22 AM /usr/sbin/smbd lock_page
11:15:22 AM [btrfs-transacti] get_request_wait
11:15:22 AM /usr/sbin/smbd lock_page
11:15:23 AM [btrfs-transacti] get_request_wait
11:15:23 AM /usr/sbin/smbd lock_page
11:15:23 AM [btrfs-transacti] get_request_wait
11:15:23 AM /usr/sbin/smbd lock_page
11:15:23 AM [btrfs-transacti] get_request_wait
11:15:23 AM /usr/sbin/smbd lock_page
11:15:23 AM [btrfs-transacti] get_request_wait
11:15:23 AM /usr/sbin/smbd lock_page
11:15:23 AM [btrfs-transacti] get_request_wait
11:15:23 AM /usr/sbin/smbd lock_page
11:15:24 AM [btrfs-transacti] get_request_wait
11:15:24 AM /usr/sbin/smbd lock_page
11:15:24 AM [btrfs-transacti] get_request_wait
11:15:24 AM /usr/sbin/smbd lock_page
11:15:24 AM /usr/sbin/smbd lock_page
11:15:24 AM [btrfs-transacti] get_request_wait
11:15:24 AM /usr/sbin/smbd get_request_wait
11:15:24 AM [btrfs-transacti] get_request_wait
11:15:24 AM /usr/sbin/smbd lock_page
11:15:25 AM [btrfs-transacti] get_request_wait
11:15:25 AM /usr/sbin/smbd lock_page
11:15:25 AM [btrfs-transacti] get_request_wait
11:15:25 AM /usr/sbin/smbd lock_page
11:15:25 AM [btrfs-transacti] get_request_wait
11:15:25 AM /usr/sbin/smbd lock_page
11:15:25 AM [btrfs-transacti] get_request_wait
11:15:25 AM /usr/sbin/smbd lock_page
11:15:26 AM [btrfs-transacti] get_request_wait
11:15:26 AM /usr/sbin/smbd lock_page
11:15:26 AM [btrfs-transacti] get_request_wait
11:15:26 AM /usr/sbin/smbd lock_page
11:15:26 AM [btrfs-transacti] get_request_wait
11:15:26 AM /usr/sbin/smbd lock_page
11:15:26 AM [btrfs-transacti] get_request_wait
11:15:26 AM /usr/sbin/smbd lock_page
11:15:26 AM [btrfs-transacti] get_request_wait
11:15:26 AM /usr/sbin/smbd lock_page
11:15:27 AM [btrfs-transacti] get_request_wait
11:15:27 AM /usr/sbin/smbd lock_page
11:15:27 AM [btrfs-transacti] get_request_wait
11:15:27 AM /usr/sbin/smbd lock_page
11:15:27 AM [btrfs-transacti] get_request_wait
11:15:27 AM /usr/sbin/smbd lock_page
11:15:27 AM [btrfs-transacti] get_request_wait
11:15:27 AM /usr/sbin/smbd lock_page
11:15:27 AM [btrfs-transacti] get_request_wait
11:15:27 AM /usr/sbin/smbd lock_page
11:15:28 AM [btrfs-transacti] get_request_wait
11:15:28 AM /usr/sbin/smbd lock_page
11:15:28 AM [btrfs-transacti] get_request_wait
11:15:28 AM /usr/sbin/smbd lock_page
11:15:28 AM [btrfs-transacti] get_request_wait
11:15:28 AM /usr/sbin/smbd lock_page
11:15:28 AM [btrfs-transacti] get_request_wait
11:15:28 AM /usr/sbin/smbd lock_page
11:15:29 AM [btrfs-transacti] get_request_wait
11:15:29 AM /usr/sbin/smbd lock_page
11:15:29 AM [btrfs-transacti] get_request_wait
11:15:29 AM /usr/sbin/smbd lock_page
11:15:29 AM [btrfs-transacti] get_request_wait
11:15:29 AM /usr/sbin/smbd lock_page
11:15:29 AM [btrfs-transacti] get_request_wait
11:15:29 AM /usr/sbin/smbd lock_page
11:15:29 AM [btrfs-transacti] get_request_wait
11:15:29 AM /usr/sbin/smbd lock_page
11:15:30 AM /usr/sbin/smbd lock_page
11:15:30 AM /usr/sbin/smbd lock_page
11:15:30 AM /usr/sbin/smbd lock_page
11:15:30 AM /usr/sbin/smbd lock_page
This is reflected in the iostat data as write load with a high avgqu-sz:
Device: rrqm/s wrqm/s r/s w/s rMB/s wMB/s avgrq-sz avgqu-sz await svctm %util
11:15:16 AM
sdb 0.00 0.00 3865.00 0.00 15.10 0.00 8.00 0.46 0.12 0.11 43.20
11:15:17 AM
sdb 0.00 0.00 706.00 9021.00 2.76 116.05 25.01 62.22 6.13 0.10 95.60
11:15:18 AM
sdb 0.00 0.00 13.00 7344.00 0.05 32.70 9.12 141.22 19.35 0.14 100.00
11:15:19 AM
sdb 0.00 0.00 12.00 6219.00 0.05 25.79 8.49 142.56 22.29 0.16 100.00
11:15:20 AM
sdb 0.00 0.00 17.00 4987.00 0.07 20.30 8.34 144.62 29.01 0.20 100.00
11:15:21 AM
sdb 0.00 0.00 2.00 5965.00 0.01 24.98 8.58 143.66 24.63 0.17 100.00
11:15:22 AM
sdb 0.00 0.00 0.00 8145.00 0.00 39.87 10.03 143.13 17.25 0.12 100.00
11:15:23 AM
sdb 0.00 0.00 3.00 4371.00 0.01 18.66 8.74 143.18 32.77 0.23 100.00
11:15:24 AM
sdb 0.00 0.00 2.00 4142.00 0.01 17.32 8.56 144.14 34.99 0.24 100.00
11:15:25 AM
sdb 0.00 0.00 4.00 5014.00 0.02 20.66 8.44 142.62 28.58 0.20 100.00
11:15:26 AM
sdb 0.00 0.00 2.00 4321.00 0.01 17.68 8.38 146.49 33.98 0.23 100.00
11:15:27 AM
sdb 0.00 0.00 0.00 5741.00 0.00 25.59 9.13 145.61 25.26 0.17 100.00
11:15:28 AM
sdb 0.00 0.00 6.00 6775.00 0.02 28.79 8.70 145.66 21.69 0.15 100.00
11:15:29 AM
sdb 0.00 0.00 8.00 9546.00 0.03 43.75 9.38 141.55 14.60 0.10 100.00
11:15:30 AM
sdb 0.00 0.00 544.00 5247.00 2.12 24.03 9.25 73.91 13.26 0.16 92.00
11:15:31 AM
sdb 0.00 0.00 2294.00 0.00 8.96 0.00 8.00 0.62 0.27 0.27 62.40
Looking up what get_request_wait is supposed to mean, I stumbled upon "Understanding the Linux Kernel", which also covers the I/O scheduler and states:
The number of request descriptors may become, under very heavy loads and high disk activity, a bottleneck. A dearth of free descriptors may force processes to wait until an ongoing data transfer terminates. Thus, a wait queue is used to queue processes waiting for a free request element. The get_request_wait() tries to get a free request descriptor and puts the current process to sleep in the wait queue, if none is found.
(emphasis mine)
I tried playing with the value in /sys/block/sdb/queue/nr_requests, increasing it to 4096 (from 128). This gave me even higher queue sizes (avgqu-sz around 4k in iostat) during the stalls, but the overall picture remained the same: many small (8 KB) write operations at roughly the same rate (4k-12k IOPS) for a prolonged period of time, blocking further writes to the file.
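For reference, the change was made and verified roughly like this (a sketch; sdb is the btrfs device):
# cat /sys/block/sdb/queue/nr_requests
128
# echo 4096 > /sys/block/sdb/queue/nr_requests
# cat /sys/block/sdb/queue/nr_requests
4096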