A page contains links to a set of .zip files, all of which I want to download. I know this can be done with wget or curl. How is it done?
3 Answers
The command is:
wget -r -np -l 1 -A zip http://example.com/download/
Options meaning:
-r, --recursive specify recursive download.
-np, --no-parent don't ascend to the parent directory.
-l, --level=NUMBER maximum recursion depth (inf or 0 for infinite).
-A, --accept=LIST comma-separated list of accepted extensions.
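Since --accept takes a comma-separated list, you can cover more than one spelling of the extension in one go. A small sketch reusing the example URL from above, accepting both lower- and upper-case extensions:

wget -r -np -l 1 -A "zip,ZIP" http://example.com/download/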
- The `-nd` (no directories) flag is handy if you don't want any extra directories created (i.e., all files will be in the root folder). – Steve Davis Nov 06 '13 at 23:19
- How do I tweak this solution so that it goes deeper from the given page? I tried -l 20, but wget stops immediately. – Wrench Nov 27 '15 at 14:46
- If the files aren't in the same directory as the starting URL, you might need to get rid of `-np`. If they're on a different host, you'll need `--span-hosts`. – Dan Sep 26 '18 at 15:28
- Is there a way to keep the directory structure of the website, but exclude the root folder only, so that the current directory is the root folder of the website instead of a folder named after the website's URL? – Aaron Franke Jun 02 '21 at 13:39
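Putting the comment suggestions together, a sketch only, with http://example.com/downloads.html standing in as a hypothetical listing page: dropping -np lets wget leave the starting directory, -H lets it follow links to other hosts, and -nd keeps every file in the current folder:

# hypothetical listing page; adjust the URL and the -l depth to your case
wget -r -l 1 -nd -H -A zip http://example.com/downloads.html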
The above solution does not work for me. For me, only this one works:
wget -r -l1 -H -t1 -nd -N -np -A.mp3 -erobots=off [url of website]
Options meaning:
-r recursive
-l1 maximum recursion depth (1=use only this directory)
-H span hosts (visit other hosts in the recursion)
-t1 Number of retries
-nd Don't make new directories, put downloaded files in this one
-N turn on timestamping
-A.mp3 download only mp3s
-erobots=off execute "robots=off" as if it were part of .wgetrc (i.e., ignore robots.txt)
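Adapted to the question's .zip files, only the accept pattern changes; [url of website] remains a placeholder for the page that links to the archives:

wget -r -l1 -H -t1 -nd -N -np -A.zip -erobots=off [url of website]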

- Source: http://www.commandlinefu.com/commands/view/12498/download-all-music-files-off-of-a-website-using-wget – James Jeffery Sep 18 '14 at 16:10
- Yes, thanks! I didn't remember where it came from; I just have it lying in my scripts. – K.-Michael Aye Sep 18 '14 at 19:33
- +1 for the `-H` switch. This is what was preventing the first answer (which is what I tried before looking on SO) from working. – Alex Jun 26 '17 at 16:10
- I'm getting a "Mandatory arguments to long options are mandatory for short options too" error with this one. :( – François Leblanc Mar 26 '18 at 15:00
- Command-line Jedi option skills. Works a treat; make sure you have a well-formed HTML document and not just a blob of HTML body. – chopstik Apr 26 '18 at 14:39
- Actually, I just realized that my source was NOT commandlinefu, because I answered this on 2013-07-10 while the commandlinefu entry is from 2013-07-13. But I also note that I did not come up with this; I found it somewhere... – K.-Michael Aye Nov 01 '18 at 00:14
- DOH, I must have thought September = 7! Who made this month the 9th month?? :) (Of course, the Romans...) – K.-Michael Aye Apr 15 '21 at 02:43
For other scenarios with some parallel magic I use:
curl [url] | grep -i [file ending] | sed -n 's/.*href="\([^"]*\).*/\1/p' | parallel -N5 wget
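A concrete, hedged instance for the question's .zip files; it assumes the page's href values are absolute URLs and that GNU parallel is installed, with -j5 running up to five downloads at once:

# extract every href ending in .zip from the example listing page and fetch them in parallel
curl -s http://example.com/download/ | sed -n 's/.*href="\([^"]*\.zip\)".*/\1/p' | parallel -j5 wget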
