I have written a small bash script for crawling an XML sitemap of URLs. It retrieves 5 URLs in parallel using xargs.
Now I want an email to be sent once all URLs have been crawled, so the script has to wait until all xargs subprocesses have finished before sending the mail.
I have tried adding a pipe after the xargs:
#!/bin/bash
wget --quiet --no-cache -O- http://some.url/test.xml | egrep -o "http://some.url[^<]+" | xargs -P 5 -r -n 1 wget --spider | mail...
and with wait:
#!/bin/bash
wget --quiet --no-cache -O- http://some.url/test.xml | egrep -o "http://some.url[^<]+" | xargs -P 5 -r -n 1 wget --spider
wait
mail ...
Neither of these works: the email is sent immediately after the script starts.
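If I understand the bash manual correctly, wait only blocks on jobs that the current shell itself put into the background with &, so it would be a no-op after a foreground pipeline. A minimal sketch of that behavior (the sleep commands are just placeholders):

```shell
#!/bin/bash
# wait only blocks on jobs this shell backgrounded with '&'.
sleep 1 &
sleep 1 &
wait                        # returns once both background sleeps have exited
echo "background jobs done" # printed only after both sleeps finish
```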
How can I achieve this? Unfortunately, I don't have the parallel program on my server (managed hosting).
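For reference, this is the sequencing I am after, sketched with placeholder sleep/echo commands standing in for the real wget and mail calls. My understanding is that a foreground xargs should already block until all of its parallel workers have exited, so the shell reaches the next line only afterwards:

```shell
#!/bin/bash
# Placeholder commands stand in for wget and mail; the point is that
# xargs runs in the foreground, so the shell reaches the final echo
# only after all 5 parallel workers have exited.
printf 'url-%s\n' 1 2 3 4 5 6 |
  xargs -P 5 -r -I{} sh -c 'sleep 0.2; echo "crawled {}"'
echo "crawl finished, sending mail here"
```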