
I need to create a shell script on Mac that monitors network traffic: whenever a specified URL (for example, *.google.com) is hit from any browser or program, the script should prompt or perform some operation. Could anyone guide me on how to do this?

Sazzad Hissain Khan
  • If you need to monitor traffic between points X and Y, it's important to state what are X and Y. For example, X could be computers on a local ethernet network, or the wifi at a coffee shop, or just one particular server. And Y also needs to be well-defined, for example the gateway server of a network that all machines in X must inevitably pass through. Another important thing to clarify is the protocol you need to monitor. Is it simply HTTP? Or HTTPS too? – janos Jul 16 '17 at 10:26
  • I simply want to run my script when a user hits a URL from any browser/script/anything on the machine. Is it possible? @janos – Sazzad Hissain Khan Jul 17 '17 at 05:30
  • I don't know if it is possible to do it as simply as you want. I will not state anything about OSX because I don't have any knowledge about it. I am assuming everything is on the same machine. In Linux you can combine `tcpdump` and `incron`. Like: `tcpdump dst port 80 or dst port 443 > /some/log/file` and configure `incrontab` to run your `bash` script every time `/some/log/file` is updated. – Azize Jul 18 '17 at 20:58
  • Another option is to use `http_proxy` and `https_proxy` environment variables, then point to apache installed locally and use `mod_actions` to execute your script. But it will only work with applications that understand and respect those environment variables. – Azize Jul 19 '17 at 08:07
  • Can you please guide with little bit more directions? Thanks @Azize – Sazzad Hissain Khan Jul 19 '17 at 11:59
  • Just to confirm, is everything in the same machine? I will elaborate it as answer, because comment field is too small. – Azize Jul 19 '17 at 12:09
  • How about capturing the http/https traffic using [tshark](https://www.wireshark.org/docs/man-pages/tshark.html) into a file and scanning this file for your required input and do processing as per your requirement? – Rishikesh Darandale Jul 21 '17 at 13:11

3 Answers


These environment variables set the proxy for my programs, like curl, wget, and the browser.

$ env | grep -i proxy
NO_PROXY=localhost,127.0.0.0/8,::1
http_proxy=http://138.106.75.10:3128/
https_proxy=https://138.106.75.10:3128/
no_proxy=localhost,127.0.0.0/8,::1

Here you can see that curl respects them and always connects through my proxy; in your case the proxy setting would be http://localhost:3128.

$ curl -vvv www.google.com
* Rebuilt URL to: www.google.com/
*   Trying 138.106.75.10...
* Connected to 138.106.75.10 (138.106.75.10) port 3128 (#0)
> GET http://www.google.com/ HTTP/1.1
> Host: www.google.com
> User-Agent: curl/7.47.0
> Accept: */*
> Proxy-Connection: Keep-Alive
> 
< HTTP/1.1 302 Found
< Cache-Control: private
< Content-Type: text/html; charset=UTF-8
< Referrer-Policy: no-referrer
< Location: http://www.google.se/?gfe_rd=cr&ei=3ExvWajSGa2EyAXS376oCw
< Content-Length: 258
< Date: Wed, 19 Jul 2017 12:13:16 GMT
< Proxy-Connection: Keep-Alive
< Connection: Keep-Alive
< Age: 0
< 
<HTML><HEAD><meta http-equiv="content-type" content="text/html;charset=utf-8">
<TITLE>302 Moved</TITLE></HEAD><BODY>
<H1>302 Moved</H1>
The document has moved
<A HREF="http://www.google.se/?gfe_rd=cr&amp;ei=3ExvWajSGa2EyAXS376oCw">here</A>.
</BODY></HTML>
* Connection #0 to host 138.106.75.10 left intact
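For the asker's single-machine case, the variables would instead point at a proxy on localhost; a minimal sketch, assuming the proxy listens on port 3128 as in the Apache config below:

```shell
# Route HTTP(S) clients that honor these variables through the local proxy.
# 3128 is the port the local Apache forward proxy listens on.
export http_proxy=http://localhost:3128/
export https_proxy=http://localhost:3128/
# Make sure no hosts are excluded while testing.
unset no_proxy NO_PROXY
```

Remember that these only affect programs started from a shell where they are exported; GUI browsers usually take their proxy from system network settings instead.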

Install apache on your machine and configure it as a forward proxy, as in the example below; the trick is to combine mod_actions and mod_proxy:

Listen 127.0.0.1:3128
<VirtualHost 127.0.0.1:3128>
  Script GET "/cgi-bin/your-script.sh"

  ProxyRequests On
  ProxyVia On
  <Proxy http://www.google.com:80>
    ProxySet keepalive=On
    Require all granted
  </Proxy>
</VirtualHost>

I never tried it, but theoretically it should work.
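The Script directive above points at /cgi-bin/your-script.sh, which Apache would run for each proxied GET. A minimal sketch of such a handler, assuming a hypothetical log path /tmp/url-hits.log:

```shell
#!/bin/sh
# CGI handler invoked by mod_actions for each proxied GET request.
# Apache sets REQUEST_URI; log it with a timestamp, then emit a
# minimal valid CGI response so the request is not left hanging.
echo "$(date '+%Y-%m-%d %H:%M:%S') ${REQUEST_URI:-unknown}" >> /tmp/url-hits.log
printf 'Content-type: text/plain\r\n\r\n'
```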

Azize

If you want to monitor or capture network traffic, tcpdump is your friend: it requires no proxy servers or additional installs, and should work on stock macOS as well as other *nix variants.

Here's a simple script:

sudo tcpdump -ql dst host google.com | while read line; do echo "Match found"; done

The while read loop will keep running until manually terminated; replace echo "Match found" with your preferred command. Note that this will trigger multiple times per page load; you can use tcpdump -c 1 if you only want it to run until it sees relevant traffic.

As Azize mentions, you could also have tcpdump output to a file in one process and monitor that file in another. incron is not available on Mac OS X, but you can wrap tail -f in a while read loop:

sudo tcpdump -l dst host google.com > /tmp/output &
tail -fn 1 /tmp/output | while read line; do echo "Match found"; done

There's a good similar script available on GitHub. You can also read up on tcpdump filters if you want to make the filter more sophisticated.

  • it gives me the error below: 'pktap_filter_packet: pcap_add_if_info(en3, 1) failed: pcap_add_if_info: pcap_compile_nopcap() failed ' – Sazzad Hissain Khan Jul 24 '17 at 03:27
  • That usually means there's an error in your `tcpdump` command line options. What was the exact command you ran? – Jens Haeusser Jul 24 '17 at 18:46
  • when I ran `sudo tcpdump -l dst host *.google.com` it gave the error. When I ran `sudo tcpdump -l dst host google.com` and hit google.com from browser it just kept listening but no result shown. Even for `sudo tcpdump -ql dst host google.com | while read line; do echo "Match found"; done` it does not show anything. – Sazzad Hissain Khan Jul 25 '17 at 03:34
  • Unfortunately `tcpdump` (or the underlying `pcap` to be precise) does not allow wildcards. With the interesting global traffic balancing Google does, `google.com` worked for me; you could try being more explicit with `sudo tcpdump -l dst host www.google.com` , but that wouldn't necessarily capture all traffic to *.google.com . – Jens Haeusser Jul 26 '17 at 09:00
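Since pcap filters have no wildcards, one workaround is to capture broadly (e.g. by port) and post-filter tcpdump's text output in the shell. A sketch; matches_google is a helper name invented here:

```shell
# Return success if a tcpdump text line mentions any *.google.com host
# (or bare google.com preceded by a space).
matches_google() {
    case "$1" in
        *.google.com*|*" google.com"*) return 0 ;;
        *) return 1 ;;
    esac
}

# Usage (the capture itself needs root):
#   sudo tcpdump -ql dst port 443 | while read line; do
#       matches_google "$line" && echo "Match found"
#   done
```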

First, put the URLs we want to monitor in a sample.txt file:

FILEPATH=/tmp/url
INFILE=$FILEPATH/sample.txt
OUTFILE=$FILEPATH/url_status.txt

> $OUTFILE

# Note: timeout is from GNU coreutils; stock macOS lacks it
# (install coreutils and use gtimeout instead).
for url in `cat $INFILE`
do
        echo -n "$url |" >> $OUTFILE
        timeout 20s curl -Is $url | head -1 >> $OUTFILE
done

grep '200' $OUTFILE | awk '{print $1"   Url is working fine"}' > $FILEPATH/working.txt
grep -v '200' $OUTFILE | awk '{print $1"   Url is not working"}' > $FILEPATH/notworking.txt

COUNT=`cat $FILEPATH/notworking.txt | wc -l`

if [ $COUNT -eq 0 ]
        then
                echo "All urls are working fine"
        else
                echo "Issue in the following urls:"
                cat $FILEPATH/notworking.txt
fi
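The grep '200' classification above can be isolated into a small function so the logic can be checked without network access; classify_status is a name invented for this sketch:

```shell
# Classify a curl status line the same way the script's grep '200' does.
classify_status() {
    case "$1" in
        *200*) echo "Url is working fine" ;;
        *)     echo "Url is not working" ;;
    esac
}
```

For example, `classify_status "HTTP/1.1 200 OK"` prints "Url is working fine", while any other status line is reported as not working.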