
I have a hundred Whois files for different top-level domains (.com, .se, .uk, .cz, etc.). Each has a different format. My main task is to extract information such as registrar, registrant, expiry date, updated date, etc. The patterns below work for .com, .net, .org and .info. I am using Java SE 6.

   Admin contact: "\\bAdmin\\sEmail:\\s*\\w+\\-*\\w*\\.*\\w*@\\w+(\\.\\w+)+"
   Technical contact: "\\bTech\\sEmail:\\s*\\w+\\-*\\w*\\.*\\w*@\\w+(\\.\\w+)+"
   Whois Registrant: "\\bRegistrant\\sName:\\s*\\w+\\-*\\.*\\w+\\s*\\w*"
   Registrar: "\\bRegistrar:\\w+\\.*\\w*"
   Registered on Date: "\\bCreation\\sDate:\\s*\\d+-\\d+-\\d+T\\d+:\\d+:\\d+Z"
   Expiry Date: "\\bExpiry\\sDate:\\s*\\d+-\\d+-\\d+T\\d+:\\d+:\\d+Z"
   Updated Date: "\\bUpdated\\sDate:\\s*\\d+-\\d+-\\d+T\\d+:\\d+:\\d+Z"
   Name Servers: "\\bName\\sServer:\\s*\\w+\\d*\\.*\\w*\\-*\\w*(\\.\\w+)+"
   Registrant Status: "\\bDomain\\sStatus:\\s*\\w+"

How do I add alternatives to each of the above patterns for other TLDs? For example, I would like to have Name Servers:

"\\bName\\sServer:\\s*\\w+\\d*\\.*\\w*\\-*\\w*(\\.\\w+)+" 
OR 
alternative pattern 
OR 
alternative Pattern

Is this doable? If not, is there an alternative way?

Mallik Kumar

1 Answer


Alternative patterns can be combined with the | (alternation) operator:

"\\bName\\sServer:\\s*\\w+\\d*\\.*\\w*\\-*\\w*(\\.\\w+)+|alternative pattern|alternative Pattern"

(If this isn't what you need, then your question should be reformulated.)
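As a rough illustration of the alternation above, here is a minimal Java sketch that compiles one pattern with two alternative labels and collects the captured host names. The class name `NameServerExtractor`, the method `findNameServers`, the `nserver:` label and the sample text are all assumptions made up for this example; check the labels against your actual Whois files. Since `|` has the lowest precedence in a regex, the sketch wraps the labels in a non-capturing group `(?:...)` so that both alternatives share the same value part. It avoids anything newer than Java SE 6.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class NameServerExtractor {
    // "nserver" is an assumed alternative label, used here only for illustration.
    // (?:A|B) restricts the alternation to the label; group 1 captures the host name.
    private static final Pattern NAME_SERVER = Pattern.compile(
            "\\b(?:Name\\sServer|nserver):\\s*([\\w-]+(?:\\.[\\w-]+)+)");

    // Returns every host name that follows one of the accepted labels.
    static List<String> findNameServers(String whoisText) {
        List<String> hosts = new ArrayList<String>();
        Matcher m = NAME_SERVER.matcher(whoisText);
        while (m.find()) {
            hosts.add(m.group(1));
        }
        return hosts;
    }

    public static void main(String[] args) {
        // Hypothetical sample mixing two label styles.
        String sample = "Name Server: ns1.example.com\n"
                + "nserver:      a.ns.example.cz\n";
        System.out.println(findNameServers(sample));
        // prints [ns1.example.com, a.ns.example.cz]
    }
}
```

If the alternatives differ in more than the label, keeping one compiled `Pattern` per format and trying them in order may be easier to maintain than a single long alternation.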

laune