
Here is a simple bash script that reports the HTTP status code of each URL in a file:

    # Read one URL per line from the file given as the first argument
    while read -r url
    do
        urlstatus=$(curl -o /dev/null --silent --head --write-out '%{http_code}' --max-time 5 "${url}")
        echo "$url  $urlstatus" >> urlstatus.txt
    done < "$1"

I am reading URLs from a text file, but the script processes only one at a time, which takes too long. GNU parallel and xargs also process one line at a time (tested).

How can I process several URLs simultaneously to cut the run time? In other words, I want to parallelize over the lines of the URL file rather than over bash commands (which is what GNU parallel and xargs do).

 Input file is txt file and lines are separated  as
    ABC.Com
    Bcd.Com
    Any.Google.Com

Something  like this

1 Answer


"GNU parallel and xargs also process one line at a time (tested)"

Can you give an example of this? If you use -j, you should be able to run many more than one process at a time.
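For reference, xargs runs one process at a time unless -P is given, which may be what the test in the question showed. A minimal sketch that makes the concurrency visible, using sleep as a stand-in for curl (the three one-second jobs are illustrative and not taken from the question):

```shell
# Three 1-second jobs; sleep stands in for a slow curl call (illustrative only).
start=$(date +%s)
# -P 3: run up to three processes at once; -I{}: one input line per command.
printf '%s\n' 1 1 1 | xargs -P 3 -I{} sleep {}
elapsed=$(( $(date +%s) - start ))
# Run one after another, the same jobs would need about 3 seconds.
echo "elapsed: ${elapsed}s"
```

With GNU parallel the corresponding option is -j, for example `parallel -j 3 sleep ::: 1 1 1`.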

I would write it like this:

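# Print "<url>  <status>" for one URL.
doit() {
    url="$1"
    urlstatus=$(curl -o /dev/null --silent --head --write-out '%{http_code}' "${url}" --max-time 5)
    echo "$url  $urlstatus"
}
# Make the function visible to the shells that GNU parallel starts.
export -f doit
# -j0: run as many jobs in parallel as possible; -k: keep output in input order.
cat "$1" | parallel -j0 -k doit >> urlstatus.txt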

Based on the input:

Input file is txt file and lines are separated  as
ABC.Com
Bcd.Com
Any.Google.Com
Something  like this
www.google.com
pi.dk

I get the output:

Input file is txt file and lines are separated  as  000
ABC.Com  301
Bcd.Com  301
Any.Google.Com  000
Something  like this  000
www.google.com  302
pi.dk  200

Which looks about right:

000 if domain does not exist
301/302 for redirection
200 for success
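To tally results over a large urlstatus.txt, the three cases above can be folded into a small helper. This is only a sketch: classify and summarize are hypothetical names, and the "other" bucket (4xx, 5xx and so on) is an addition that the list above does not cover. Note also that curl reports 000 whenever it received no HTTP response at all, so timeouts land there too, not only non-existent domains.

```shell
# Hypothetical helper: map a status code to a coarse label.
classify() {
    case "$1" in
        000) echo "no response" ;;   # DNS failure, timeout, refused connection
        2??) echo "success" ;;
        3??) echo "redirect" ;;
        *)   echo "other" ;;         # assumption: everything else is lumped together
    esac
}

# Read "url  status" lines on stdin and append the label after a tab.
summarize() {
    while read -r line; do
        status=${line##* }           # last space-separated field is the status
        printf '%s\t%s\n' "$line" "$(classify "$status")"
    done
}
```

Usage would be along the lines of `summarize < urlstatus.txt`.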
Ole Tange
  • I will test and let you know – user7423959 Jan 19 '17 at 02:18
  • Hey, I got the same status code, 000. Can you tell me how you are executing your script from the terminal? It may help. – user7423959 Jan 19 '17 at 04:33
  • `cat input.txt | parallel -j0 -k doit >> urlstatus.txt;` As you can see, I also get 000 for the domains that do not exist. I am wondering whether you actually gave us an extract from your input. If the 6 lines are not actually in your input file, could you please give 10 lines from your _actual_ input file? – Ole Tange Jan 19 '17 at 07:42
  • I'll explain the whole process: 1. I copied your bash script, saved it as bash.sh, and gave it execute permission. 2. My input file is big, but I also tested on a small 10-line file. Here is the list, one entry per line, saved as top.txt: www.yahoo.com, www.google.com, facebook.com, amazon.com, bing.com, apple.com, www.microsoft.com, www.windows.com. 3. I then went to the terminal and typed ./bash.sh top.txt. 4. It gives the result 000 for each URL. 5. Can you help me find where I am going wrong? Thanks. – user7423959 Jan 19 '17 at 09:18
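The thread does not establish why every URL returned 000. One common cause worth ruling out, offered here only as an assumption, is an input file saved with Windows line endings: each URL then carries a trailing carriage return, and curl typically fails on it. A sketch for checking and cleaning such a file (count_cr_lines and clean_urls are hypothetical helper names):

```shell
# Hypothetical helper: count the lines of file $1 that contain a carriage return.
count_cr_lines() {
    grep -c "$(printf '\r')" "$1"
}

# Hypothetical helper: print file $1 without carriage returns,
# surrounding whitespace, or empty lines.
clean_urls() {
    tr -d '\r' < "$1" |
        sed -e 's/^[[:space:]]*//' -e 's/[[:space:]]*$//' -e '/^$/d'
}
```

If count_cr_lines reports anything above zero, `clean_urls top.txt | parallel -j0 -k doit >> urlstatus.txt` feeds the cleaned lines to the function from the answer. Replacing --silent with --silent --show-error in the curl call should also make curl print its own error message for each failing URL.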