146

Let's say I have a text file of hundreds of URLs in one location, e.g.

http://url/file_to_download1.gz
http://url/file_to_download2.gz
http://url/file_to_download3.gz
http://url/file_to_download4.gz
http://url/file_to_download5.gz
....

What is the correct way to download each of these files with wget? I suspect there's a command like wget -flag -flag text_file.txt

ShanZhengYang
    Anybody end up here after trying to get US topos at nationalmap.gov? – Dave Nov 10 '17 at 03:40
    Besides wget -i, you'll want to add some switches so you don't get banned from the servers for hammering them, and so that wget doesn't keep retrying a failed download for too long. `-w`, `-t` and `-T` may be of interest (see the example below). – barlop Jun 05 '20 at 13:18
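To illustrate those switches, a sketch of such a command (the numbers here are just examples, not recommendations for any particular server):

wget -i text_file.txt -w 2 -t 3 -T 30

Here -w 2 waits 2 seconds between retrievals, -t 3 retries each URL at most 3 times, and -T 30 gives up on a connection after 30 seconds.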

5 Answers

280

A quick man wget gives me the following:

[..]

-i file

--input-file=file

Read URLs from a local or external file. If - is specified as file, URLs are read from the standard input. (Use ./- to read from a file literally named -.)

If this function is used, no URLs need be present on the command line. If there are URLs both on the command line and in an input file, those on the command lines will be the first ones to be retrieved. If --force-html is not specified, then file should consist of a series of URLs, one per line.

[..]

So: wget -i text_file.txt
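Per the excerpt above, you can also read the URL list from standard input by passing - as the file name; a small sketch of the same idea:

cat text_file.txt | wget -i -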

randomnickname
59

try:

wget -i text_file.txt

(check man wget)

ceyquem
27

Run it in parallel with

cat text_file.txt | parallel --gnu "wget {}"
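If you want to cap the number of simultaneous downloads, GNU Parallel's -j option should do it; a sketch with an arbitrary limit of 4 jobs:

parallel --gnu -j 4 wget {} < text_file.txt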
Yuseferi
    If Parallel's demand for citation is annoying, use xargs: `cat text_file.txt | xargs -n10 -P4 wget`. This tells xargs to call wget with 10 URLs at a time and to run 4 wget processes at once. For a slightly nicer experience, here's what I do: `cat text_file.txt | shuf | xargs -n10 -P4 wget --continue`. This (1) shuffles the URLs, so when you stop and restart it's more likely to start downloading new files right away, and (2) asks wget to continue partial downloads (which you might have if you Ctrl-C while wget is running). – Ahmed Fasih Mar 27 '22 at 07:00
26

If you also want to preserve the original file name, try:

wget --content-disposition --trust-server-names -i list_of_urls.txt
ilCosmico
10

If you're on OpenWrt or using an old version of wget that doesn't give you the -i option:

#!/bin/bash
input="text_file.txt"
# read the file line by line and download each URL
while IFS= read -r line
do
  wget "$line"
done < "$input"

Furthermore, if you don't have wget, you can use curl or whatever you use for downloading individual files.
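As a sketch, a curl-based version of the same loop could look like this (the -f, -L and -O flags are standard curl options):

#!/bin/sh
input="text_file.txt"
while IFS= read -r line
do
  # -f: fail on HTTP errors, -L: follow redirects, -O: keep the remote file name
  curl -fLO "$line"
done < "$input"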

0x48piraj