
I have a txt file, list.txt, that has a bunch of URLs, one per line. I want to set up a cron job that wgets/curls each URL in the file once a day but does not save anything to the computer.

I tried running this in the terminal first:

wget -i /root/list.txt -O /dev/null

Understandably, the command doesn't work. It saves list.txt itself to /dev/null instead of fetching the files from the URLs inside list.txt, and then it reports "no urls found".

So how do I do this properly: wget each URL from a list without saving anything on the computer?

  • What is the content of `/root/list.txt`? As which user do you run the `cron`? – Romeo Ninov Nov 07 '22 at 13:17
  • In general `wget` is specifically designed to download content from the web, whereas `curl` is much more versatile and, for example, allows you to use the [`HEAD` method](https://www.rfc-editor.org/rfc/rfc9110.html#name-methods), which may save you bandwidth and time when you're discarding the output anyway. – diya Nov 07 '22 at 13:19
  • @RomeoNinov "bunch of URLs, one every line" – Sahriar Saikat Nov 07 '22 at 14:04

2 Answers


If you don't want to download the URLs, suppress the download by using --spider. You can also remove the clutter with -q, which has the additional benefit that actual errors are still handled by crond and forwarded if it is set up properly.

wget -i /root/list.txt --spider -q
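
For the once-a-day part of the question, a root crontab entry along these lines could be used (a minimal sketch; the 03:00 run time is an arbitrary example, and the path comes from the question):

# edit root's crontab with: crontab -e
# run the check daily at 03:00 (example time); with -q, cron only mails actual errors
0 3 * * * wget -i /root/list.txt --spider -q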
Gerald Schneider
  • Thanks, that worked! – Sahriar Saikat Nov 07 '22 at 14:06
  • How do I make it work for URLs? This only works when the list is a local file. When I try `wget -i https://example.com/list.txt --spider`, it shows `example.txt.tmp: No such file or directory No URLs found in https://example.com/list.txt.` – Sahriar Saikat Nov 19 '22 at 06:57
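
Regarding that follow-up comment: wget's -i option also accepts - to read the URL list from standard input, so a remote list can be fetched first and piped in. A sketch, using the example.com URL from the comment:

# fetch the remote list, then feed the URLs to wget on stdin (-i -)
curl -s https://example.com/list.txt | wget --spider -q -i -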

Not sure why you're using wget here, as the primary goal of that tool is to download file(s).

With curl and a simple loop it should work, something like this:

for url in $(cat list.txt); do curl -s -o /dev/null "$url"; done

Nothing is saved to disk (-o /dev/null discards the response body); each site in your list just gets a hit.
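
If the list can contain blank lines or URLs with unusual characters, a while-read loop is slightly more robust. A sketch, again assuming the /root/list.txt path from the question:

# read one URL per line, skip empty lines, and discard the response body
while IFS= read -r url; do
    [ -n "$url" ] && curl -fsS -o /dev/null "$url"
done < /root/list.txt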

SBO