
I have a txt file, list.txt, that has a bunch of URLs, one per line. I want to set up a cron job that wgets/curls each URL in the file once a day but does not save anything to the computer.

I tried running this in the terminal first:

wget -i /root/list.txt -O /dev/null

Understandably, the command doesn't work. It saves list.txt itself to /dev/null instead of fetching the files from the URLs inside list.txt, and then it reports "no urls found".

So how do I do this properly: wget each URL from a list without saving anything on the computer?

  • What is the content of `/root/list.txt`? As which user do you run the `cron`? – Romeo Ninov Nov 07 '22 at 13:17
  • In general `wget` is specifically designed to download content from the web, whereas `curl` is much more versatile and, for example, allows you to use the [`HEAD` method](https://www.rfc-editor.org/rfc/rfc9110.html#name-methods), which may save you bandwidth and time when you're discarding the output anyway. – diya Nov 07 '22 at 13:19
  • @RomeoNinov "bunch of URLs, one every line" – Sahriar Saikat Nov 07 '22 at 14:04

2 Answers


If you don't want to download the URLs, suppress the download by using --spider. You can also remove the clutter with -q, which has the additional benefit that actual errors are still handled by crond and forwarded if it is set up properly.

wget -i /root/list.txt --spider -q
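
For the once-a-day part of the question, a root crontab entry along these lines could be used (a minimal sketch; the 03:00 run time is an arbitrary example, and the path comes from the question):

# edit root's crontab with: crontab -e
# run the check daily at 03:00 (example time); with -q, cron only mails actual errors
0 3 * * * wget -i /root/list.txt --spider -q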
Gerald Schneider
  • Thanks, that worked! – Sahriar Saikat Nov 07 '22 at 14:06
  • How do I make it work for URLs? This only works when the list is a local file. When I try `wget -i https://example.com/list.txt --spider`, it shows `example.txt.tmp: No such file or directory No URLs found in https://example.com/list.txt.` – Sahriar Saikat Nov 19 '22 at 06:57
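
Regarding that follow-up comment: wget's -i option also accepts - to read the URL list from standard input, so a remote list can be fetched first and piped in. A sketch, using the example.com URL from the comment:

# fetch the remote list, then feed the URLs to wget on stdin (-i -)
curl -s https://example.com/list.txt | wget --spider -q -i -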

Not sure why you're using wget here, as the primary goal of that tool is to download file(s).

With curl and a simple loop it should work, something like this:

for url in $(cat list.txt); do curl -s -o /dev/null "$url"; done

Nothing is saved to disk (-o /dev/null discards the response body); each site in your list just gets a hit.
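
If the list can contain blank lines or URLs with unusual characters, a while-read loop is slightly more robust. A sketch, again assuming the /root/list.txt path from the question:

# read one URL per line, skip empty lines, and discard the response body
while IFS= read -r url; do
    [ -n "$url" ] && curl -fsS -o /dev/null "$url"
done < /root/list.txt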

SBO