
I get whois information for a bunch of URLs by fetching pages like the following with wget:

wget -qO- https://www.whois.com/whois/SampleDomain

In this first phase I don't want to create a file for each URL, so I use the -qO- option.

I want to extract 10 fields for every domain (such as Creation Date and Registrant Name).

My question is: how can I build a CSV file in which every row represents a domain and each column holds the value of one whois field?

– John

1 Answer


With xmlstarlet, GNU grep and GNU paste, here is a first step:

# Fetch the whois page, repair the HTML, extract the text of the
# <pre> block, grab the two fields, and join them with a comma.
wget -qO - https://www.whois.com/whois/stackoverflow.com |
  xmlstarlet format --html --recover 2>/dev/null |
  xmlstarlet select --template --value-of '//pre' |
  grep -Po '^(Creation Date|Registrant Name): \K.*(?= )' |
  paste -d , - -

Output:

2003-12-26T19:18:07Z,Sysadmin Team
– Cyrus
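
The question also asks for one CSV row per domain. A minimal sketch of how the pipeline above could be wrapped in a loop, assuming a hypothetical input file domains.txt (one domain per line) and a hypothetical output file whois.csv:

# domains.txt is a hypothetical list, one domain per line.
# Each iteration prints the domain, then the two extracted fields,
# giving one CSV row per domain.
while read -r domain; do
  printf '%s,' "$domain"
  wget -qO - "https://www.whois.com/whois/$domain" |
    xmlstarlet format --html --recover 2>/dev/null |
    xmlstarlet select --template --value-of '//pre' |
    grep -Po '^(Creation Date|Registrant Name): \K.*(?= )' |
    paste -d , - -
done < domains.txt > whois.csv

A short sleep between iterations would be polite to whois.com, which may throttle rapid requests.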
  • Thanks for your answer. How can I output the result, e.g. 2003-12-26T19:18:07Z,Sysadmin Team, to a file? – John Nov 01 '17 at 18:28
  • I've updated my answer. – Cyrus Nov 01 '17 at 18:47
  • Thanks again for your update. For multiple fields, such as Registrant Name|Registrant Organization|Registrant City|Registrant Country|Registrar IANA ID|Creation Date|Updated Date|Registry Expiry Date, and the following address (https://www.whois.com/whois/academicreviews.us), it does not print the appropriate result! – John Nov 01 '17 at 19:00
  • There, the lines do not end with a blank character but with a semicolon. Replace `(?= )` with `(?=( |;))`, and with `paste` use one `-` per column. – Cyrus Nov 01 '17 at 19:12
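
Putting that last comment into practice, the multi-field variant would look roughly like this: the lookahead now accepts a trailing blank or a semicolon, and paste gets one `-` per field (eight here). A sketch, not guaranteed against every whois layout:

# Extract eight fields and join them into one comma-separated row.
wget -qO - https://www.whois.com/whois/academicreviews.us |
  xmlstarlet format --html --recover 2>/dev/null |
  xmlstarlet select --template --value-of '//pre' |
  grep -Po '^(Registrant Name|Registrant Organization|Registrant City|Registrant Country|Registrar IANA ID|Creation Date|Updated Date|Registry Expiry Date): \K.*(?=( |;))' |
  paste -d , - - - - - - - -

Note that grep emits matches in the order they appear in the whois record, so the column order follows the record rather than the order of the alternation in the pattern.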