10

How can I list, from the command line, the URLs of requests made from the server (a *nix machine) to another machine?

For instance, I am on the command line of server ALPHA_RE. I ping google.co.uk and then bbc.co.uk. I would like to see, at the prompt:

google.co.uk bbc.co.uk

So: not the IP address of the machine I am pinging, and NOT the URL of any intermediate server that passes my request on to google.co.uk or bbc.co.uk, but the actual final URLs.

Note that only packages available in the normal Ubuntu repositories can be used, and it has to work from the command line.

Edit: The ultimate goal is to see which API URLs a PHP script (run by a cron job) requests, and which API URLs the server requests 'live'. These mainly make GET and POST requests to several URLs, and I am interested in knowing the parameters:

Does it make a request to:

foobar.com/api/whatisthere?and=what&is=there&too=yeah

or to:

foobar.com/api/whatisthathere?is=it&foo=bar&green=yeah

And do the cron jobs or the server make any other GET or POST requests? All this regardless of what response (if any) these APIs give.

Also, the list of APIs is unknown, so you cannot grep for one particular URL.

Edit: the old ticket specified that I could not install anything on that server (no extra packages; only the "normal" commands like tcpdump, sed, grep, ...), but as getting this information with tcpdump alone is pretty hard, installing packages is now allowed.

Cedric
    If you down vote questions, please consider leaving a comment explaining why you have down voted it. After all, it is the only way for people to understand your action, and to improve their future questions. – Cedric Nov 09 '15 at 09:21
  • You mentioned URLs in your question but your example uses ping which only uses hostnames/IP addresses, do you want to know all DNS lookups (to get from name to IP) or are you interested in a specific type of traffic? – Bert Neef Nov 10 '15 at 05:57
  • Thanks for your question. I am interested in the full URL, not just the hostname – Cedric Nov 10 '15 at 11:44
  • For which type of traffic? HTTP? – Bert Neef Nov 10 '15 at 12:29
  • Yes, one is definitely http ; the other one is using sockets – Cedric Nov 10 '15 at 13:44
  • using sockets as in any kind of application which connects to a remote host over a network socket? – Bert Neef Nov 10 '15 at 18:36

2 Answers

16

You can use tcpdump and grep to get information about the network traffic from the host; the following command line should get you all lines containing Host:

tcpdump -i any -A -vv -s 0 | grep -e "Host:"

If I run the above in one shell and start a Links session to stackoverflow I see:

Host: www.stackoverflow.com
Host: stackoverflow.com

If you want to know more about the actual HTTP request, you can also add patterns to the grep for GET, PUT or POST requests (e.g. -e "GET"), which gives you the relative URL; combine it with the previously determined host to get the full URL.

EDIT: based on your edited question I have made some modifications. First, a tcpdump approach:

[root@localhost ~]# tcpdump -i any -A -vv -s 0 | egrep -e "GET" -e "POST" -e "Host:"
tcpdump: listening on any, link-type LINUX_SLL (Linux cooked), capture size 65535 bytes
E..v.[@.@.......h.$....P....Ga  .P.9.=...GET / HTTP/1.1
Host: stackoverflow.com
E....x@.@..7....h.$....P....Ga.mP...>;..GET /search?q=tcpdump HTTP/1.1
Host: stackoverflow.com

And an ngrep one:

[root@localhost ~]# ngrep -d any -vv -w byline | egrep -e "Host:" -e "GET" -e "POST"
  GET //meta.stackoverflow.com HTTP/1.1..Host: stackoverflow.com..User-Agent:
  GET //search?q=tcpdump HTTP/1.1..Host: stackoverflow.com..User-Agent: Links

My test case was running links stackoverflow.com, putting tcpdump in the search field and hitting enter.

This gets you all the URL info on one line. A nicer alternative might be to run a reverse proxy (e.g. nginx) on your own server, modify the hosts file (as shown in Adam's answer), and have the reverse proxy forward all queries to the actual host; the logging features of the reverse proxy will give you the URLs, and its logs would probably be a bit easier to read.
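A minimal sketch of that reverse-proxy idea, assuming nginx is installed; the hostname is the question's example and the upstream IP is a placeholder you would replace with the API server's real address:

```nginx
# /etc/nginx/conf.d/api-logger.conf -- hypothetical example
server {
    listen 80;
    server_name api.foobar.com;   # hostname pointed at 127.0.0.1 in /etc/hosts

    # Every request line (method, path and query string) ends up here:
    access_log /var/log/nginx/api-urls.log;

    location / {
        # Forward to the real API by IP so the proxy does not loop back
        # to itself via the modified hosts file:
        proxy_pass http://203.0.113.10;          # real server's IP (placeholder)
        proxy_set_header Host api.foobar.com;    # preserve the original Host header
    }
}
```

Then `tail -f /var/log/nginx/api-urls.log` shows the requested URLs as they happen.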

EDIT 2: If you use a command line such as:

ngrep -d any -vv -w byline | egrep -e "Host:" -e "GET" -e "POST" --line-buffered | perl -lne 'print $3.$2  if /(GET|POST) (.+?) HTTP\/1\.1\.\.Host: (.+?)\.\./'

you should see the actual URLs.

Bert Neef
  • This command line provided does not give any result - I have tried to change the grep -e "Host:" expression, but no luck. – Cedric Nov 10 '15 at 11:59
  • hmmm, let me check again, do you have ngrep available as well? – Bert Neef Nov 10 '15 at 12:39
  • I can use various flavour of grep, but not ngrep (egrep fgrep grep msggrep pgrep rgrep ) – Cedric Nov 10 '15 at 13:39
  • ngrep is a packet capture utility, a bit like tcpdump, can be easier to use in some scenarios. the command works on my system when I run links www.google.com I get output about google in the window with the tcpdump. Which command are you trying to run? And what happens when you run the tcpdump command without the pipe to grep? – Bert Neef Nov 10 '15 at 18:35
  • sudo tcpdump -i any -A -vv -s 0 | egrep 'localdomain.*>' | grep -v ssh is a variation on your answer. – Cedric Nov 13 '15 at 11:13
  • Hi Bert, thank you very much for all these questions - I have requested the permission to install extra packages on prod. server, so it is possible to use ngrep and other packages now. – Cedric Nov 14 '15 at 10:28
  • Ok, does my ngrep sample give enough info or is something missing? Or does it need some tweaking for readability? – Bert Neef Nov 14 '15 at 15:19
2

A simple solution is to modify your /etc/hosts file to intercept the API calls and redirect them to your own web server:

127.0.0.1 api.foobar.com
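A quick way to see the intercepted requests, assuming Python 3 is available (the hostname is the question's example): with the hosts entry in place, HTTP requests to api.foobar.com land on your machine, and Python's built-in server prints each request line as it arrives.

```shell
# With api.foobar.com pointing at 127.0.0.1 in /etc/hosts, serve the
# hijacked requests locally; the built-in server logs each request line,
# including the query string, e.g.:
#   "GET /api/whatisthere?and=what&is=there HTTP/1.1" 404 -
# Binding port 80 requires root.
sudo python3 -m http.server 80
```

The script's requests will fail (the paths don't exist locally), but the full URLs, parameters included, show up in the server's output.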
Adam