I need to download thousands of files with curl. I know how to parallelize with xargs -Pn (or GNU parallel), but I've just discovered that curl itself can parallelize downloads with the -Z|--parallel option, introduced in curl 7.66 (see curl-goez-parallel), which might be cleaner or easier to share.
I need to use the -o|--output option together with --create-dirs. URLs need to be percent-encoded, and the URL path becomes the folder path, which also needs to be escaped, since paths can contain single quotes, spaces, and the usual suspects (hence the -O option is not safe and the -OJ option doesn't help).
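For a single file, what I have in mind is roughly this (a sketch: the base URL http://site and the example path are made up, and I lean on python3 for the percent-encoding step since plain shell has no built-in for it):

    # One file only: percent-encode the raw path for the URL, keep the raw
    # (decoded) path as the local output path, and quote everything carefully.
    rawpath="dir with spaces/it's a file.txt"
    encoded=$(python3 -c 'import sys, urllib.parse as u; print(u.quote(sys.argv[1]))' "$rawpath")
    curl --create-dirs -o "$rawpath" "http://site/$encoded"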
If I understand correctly, the full curl command should be built like so:
    curl -Z -o path/to/file1 http://site/path/to/file1 -o path/to/file2 http://site/path/to/file2 [-o path/to/file3 http://site/path/to/file3, etc.]
This does work, but what's the best way to deal with thousands of URLs? Can a config file used with -K config be useful? What if the -o path/to/file_x http://site/path/to/file_x pairs are the output of another program? I haven't found any way to record commands in a file, one command per line, say.
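In case it clarifies what I'm after, this is the shape I imagine: the other program (called generate_pairs below, purely hypothetical) emits tab-separated local-path/URL pairs, which get turned into repeated output/url lines and piped into curl through -K - (read the config from stdin). I haven't verified that -K accepts repeated pairs like this, which is essentially my question:

    # generate_pairs (hypothetical) prints "<local path>\t<percent-encoded URL>" per line.
    # Config-file syntax: one long option per line, no leading dashes, values quoted.
    # Sketch only; assumes paths contain no embedded double quotes.
    {
      echo 'parallel'
      echo 'create-dirs'
      generate_pairs | while IFS=$'\t' read -r localpath url; do
        printf 'output = "%s"\n' "$localpath"
        printf 'url = "%s"\n' "$url"
      done
    } | curl -K -

If that's not how -K is meant to be used, a pointer to the right mechanism would be welcome.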