I have a hundred Whois files of different top level domains(.com, .se, .uk, .cz etc.). Each has a different format. My main task is to extract information such as registrar, registrant, expiry date, updated date etc. The below code works for com. net. org & info. I am using Java SE 6.
Admin contact: "\\bAdmin\\sEmail:\\s*\\w+\\-*\\w*\\.*\\w*@\\w+(\\.\\w+)+"
Technical contact: "\\bTech\\sEmail:\\s*\\w+\\-*\\w*\\.*\\w*@\\w+(\\.\\w+)+"
Whois Registrant: "\\bRegistrant\\sName:\\s*\\w+\\-*\\.*\\w+\\s*\\w*"
Registrar: "\\bRegistrar:\\w+\\.*\\w*"
Registered on Date: "\\bCreation\\sDate:\\s*\\d+-\\d+-\\d+T\\d+:\\d+:\\d+Z"
Expiry Date: "\\bExpiry\\sDate:\\s*\\d+-\\d+-\\d+T\\d+:\\d+:\\d+Z"
Updated Date: "\\bUpdated\\sDate:\\s*\\d+-\\d+-\\d+T\\d+:\\d+:\\d+Z"
Name Servers: "\\bName\\sServer:\\s*\\w+\\d*\\.*\\w*\\-*\\w*(\\.\\w+)+"
Registrant Status: "\\bDomain\\sStatus:\\s*\\w+"
How do I add alternatives for each of the above points for other TLDs. For example : I would like to have Name Servers:
"\\bName\\sServer:\\s*\\w+\\d*\\.*\\w*\\-*\\w*(\\.\\w+)+"
OR
alternative pattern
OR
alternative Pattern
Is it doable? If not is there an alternative way?