Why can't dd handle sparse files in shell scripts?

Question

I have the following sparse file that I want to flash to an SD card:

647M -rw-------  1 root     root     4.2G Sep 21 16:53 make_sd_card.sh.xNws4e

As you can see, it takes ~647M on disk for an apparent size of 4.2G. If I flash it directly with dd, in my shell, it's really fast, ~6s:

$ time (sudo /bin/dd if=make_sd_card.sh.xNws4e of=/dev/mmcblkp0 conv=sparse; sync)
8601600+0 records in
8601600+0 records out
4404019200 bytes (4.4 GB, 4.1 GiB) copied, 6.20815 s, 709 MB/s

real    0m6.284s
user    0m1.920s
sys     0m4.336s

But when I run the very same commands inside a shell script, it behaves as if it were copying all the zeroes and takes much longer (~2m10):

$ time sudo ./plop.sh ./make_sd_card.sh.xNws4e
+ dd if=./make_sd_card.sh.xNws4e of=/dev/mmcblk0 conv=sparse
8601600+0 records in
8601600+0 records out
4404019200 bytes (4.4 GB, 4.1 GiB) copied, 127.984 s, 34.4 MB/s
+ sync

real    2m9.885s
user    0m3.520s
sys     0m15.560s

If I watch the Dirty field of /proc/meminfo, I can see that this counter is much higher when running dd from a shell script than directly from the shell.
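
For reference, one way to read that counter is sketched below. It assumes Linux, where /proc/meminfo contains a line of the form `Dirty: <n> kB`; the helper name `dirty_kb` is made up for illustration.

```shell
#!/bin/bash
# Print the amount of dirty (not yet written back) page-cache memory, in kB.
# Assumption: Linux /proc/meminfo with a line such as "Dirty:   1234 kB".
# dirty_kb is a hypothetical helper name, not part of the original script.
dirty_kb() {
    awk '$1 == "Dirty:" { print $2 }' /proc/meminfo
}

echo "Dirty: $(dirty_kb) kB"
```

Running it repeatedly (for example under `watch`) while dd is working shows how much data is still waiting to reach the device.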

My shell is bash and, for the record, the script is:

#!/bin/bash
set -xeu
dd if=$1 of=/dev/mmcblk0 conv=sparse bs=512
sync

[EDIT] I'm resurrecting this topic because a developer I work with has found two commands, bmap_create and bmap_copy, which seem to do exactly what I was clumsily trying to achieve with dd. In Debian, they are part of the bmap-tools package. With them, it takes 1m2s to flash a 4.1GB sparse SD image with a real size of 674MB, whereas it takes 6m26s with dd or cp.

As an aside, `set -e` is [rightly controversial](http://mywiki.wooledge.org/BashFAQ/105); its use (as opposed to explicit, manual error handling) tends to generate subtle and surprising bugs (i.e. functions that behave differently depending on whether other code is branching on that function's return value). — Charles Duffy, Sep 21 '18 at 16:07
The link is interesting, but it's a matter of trade-offs. Longer and more complicated code, written in an attempt to handle errors "properly", can be worse than simplistic yet simple and less error-prone code. — ncarrier, Sep 23 '18 at 08:08
"Simple"? If you're using primitives that have a wide array of unexpected behaviors, you've introduced a lot of complexity (and room for unexpected errors) into your system, whether it's visually obvious or not. — Charles Duffy, Sep 23 '18 at 14:43
Shell has a lot of these tradeoffs -- maybe the most obvious one is quoting; `if=$1` is shorter than `if="$1"`, but the former does a whole lot of extra and probably-unwanted things (splitting the value on characters in `IFS`, evaluating each piece as a glob, applying runtime configuration such as `failglob` and `nullglob` to decide how to proceed if any element looks like a glob but doesn't match, etc.). `set -e` is another case following that pattern; what looks simple is actually complex and fault-prone; what looks complex is well-defined and clear to the (well-informed) reader. — Charles Duffy, Sep 23 '18 at 14:44
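
The quoting point can be checked with a small sketch. The filename containing a space and the helper `count_args` are hypothetical, chosen only to make the word splitting visible:

```shell
#!/bin/bash
# Count how many arguments a command receives from an unquoted
# versus a quoted expansion.
# count_args and the filename below are hypothetical, for illustration only.
count_args() {
    echo "$#"
}

image="my image.img"    # a filename containing a space

# Unquoted: the value is split on IFS into "if=my" and "image.img".
unquoted_count=$(count_args if=$image)

# Quoted: the value stays a single argument, "if=my image.img".
quoted_count=$(count_args if="$image")

echo "unquoted: $unquoted_count argument(s)"
echo "quoted:   $quoted_count argument(s)"
```

With the unquoted form, dd would receive `image.img` as a separate operand that it does not recognize, instead of the intended file name.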

score 3 · Accepted Answer · answered Sep 21 '18 at 16:16

This difference is caused by a typo in the non-scripted invocation, which did not actually write to your memory card. There is no difference in dd behavior between scripted and interactive invocation.

Keep in mind what a sparse file is: a file on a filesystem that can store metadata tracking which blocks hold data at all, so that blocks containing only zeros are never allocated any storage on disk.
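
As a rough illustration, the gap between apparent size and allocated size can be reproduced with a throwaway file. This is a sketch assuming GNU coreutils (`truncate`, `stat -c`) and a filesystem that supports sparse files; the variable names are made up for the example.

```shell
#!/bin/bash
# Create a sparse file and compare its apparent size with its allocated size.
# Assumptions: GNU coreutils (truncate, stat -c) and a filesystem that
# supports sparse files. The temp file and variable names are illustrative.
sparse_demo=$(mktemp)

# Extend the file to 64 MiB without writing any data, so no data blocks
# need to be allocated.
truncate -s 64M "$sparse_demo"

# Apparent size: what ls -l reports.
apparent_bytes=$(stat -c %s "$sparse_demo")

# Allocated size: number of allocated blocks times the size of one block.
allocated_bytes=$(( $(stat -c %b "$sparse_demo") * $(stat -c %B "$sparse_demo") ))

echo "apparent:  $apparent_bytes bytes"
echo "allocated: $allocated_bytes bytes"

rm -f "$sparse_demo"
```

This is the same discrepancy as in the question's listing, where the file occupies 647M on disk for an apparent size of 4.2G.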

This concept -- of a sparse file -- is specific to files. You can't have a sparse block device.

The distinction between your two lines of code is that one of them (the fast one) has a typo (mmcblkp0 instead of mmcblk0), so it's referring to a block device name that doesn't exist. Thus, it creates a file. Files can be sparse. Thus, it creates a sparse file. Creating a sparse file is fast.

The other one, without the typo, writes to the block device. Block devices can't be sparse. Thus, it always takes the full execution time to run.
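
A typo like this can be caught before any data is written. As a minimal sketch (the function names `is_block_device` and `flash_image` and the `bs=4M` block size are assumptions for illustration, not taken from the question), a wrapper can refuse to run dd unless the target already exists as a block device:

```shell
#!/bin/bash
# Hypothetical guard: refuse to run dd unless the target is an existing
# block device. A mistyped name such as /dev/mmcblkp0 then fails early
# instead of silently creating a regular file under /dev.
# is_block_device and flash_image are illustrative names.
is_block_device() {
    [ -b "$1" ]
}

flash_image() {
    local image=$1 target=$2
    if ! is_block_device "$target"; then
        echo "refusing to write: $target is not a block device" >&2
        return 1
    fi
    # bs=4M is an assumed block size, not a value from the question.
    dd if="$image" of="$target" bs=4M && sync
}

# Example call (not executed here):
#   flash_image ./make_sd_card.sh.xNws4e /dev/mmcblk0
```

The `-b` test is true only for a path that exists and is a block special file, so both a missing device node and a regular file are rejected.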

answered Sep 21 '18 at 16:16

Charles Duffy

Community Wiki because the question is off-topic (questions about problems caused by a typo are #2 in the "Some questions are still off-topic" list at https://stackoverflow.com/help/on-topic). – Charles Duffy Sep 21 '18 at 16:18
Thank you very much, I feel sooooo stupid about this typo. I naively thought that because the file was sparse, dd could somehow optimize the copy to the disk. And the time "gained" thanks to the typo strengthened this belief. – ncarrier Sep 23 '18 at 08:02
There's a chicken-and-egg problem here. How could I know the question was off-topic before it was answered? :) – ncarrier Sep 23 '18 at 08:13
To be fair, one *could* have a tool written to use, say, a special ioctl that tells the target device "just wipe this page" when there's no content in the source, and on some devices that could well be a faster operation -- it wasn't a completely unreasonable assumption. And, yeah, needing more eyes to identify a typo as cause is the usual case. :) – Charles Duffy Sep 23 '18 at 14:49
Please see my edit: such a tool seems to exist, **bmap-tools**. – ncarrier Mar 22 '19 at 08:44
