
I have this list of URLs:

https://download.my-domain.com/auth/login
https://download.my-domain.com
http://localhost:60162/API/script/authbar.js
http://localhost:28173/logout.aspx
http://my-domain.com/logout.aspx
http://my-domain.com/logout.aspx/
http://my-domain.com/
http://my-domain.com
http://my-domain.tk/
http://my-domain.gov
download.my-domain.com/auth/login
www.download.my-domain.com/auth/login
http://www.google.com
https://www.google.com
http://www.google.com/
https://www.google.com/

and I tried

((\.)?[a-zA-Z0-9-]+\.([a-zA-Z0-9]{2,4}))/?

but it also matches logout.aspx. Any help would be appreciated.

TYIA

Expected Result:

my-domain.com

localhost (without the port)

google.com

my-domain.tk

my-domain.gov

Usage:

For cookie domain

Community
Vincent Dagpin

5 Answers


The following might work for you:

[-a-z0-9_]+(?!://)(?:\.[-a-z0-9_]+)?(?=[/:]|$)
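As a rough illustration, here is how the pattern could be applied to a few of the URLs from the question in JavaScript. The helper name `extractDomain` is made up for this sketch, and it assumes lowercase hosts, since the pattern has no case-insensitive flag.

```javascript
// Pattern from this answer, written as a JavaScript regex literal.
var pattern = /[-a-z0-9_]+(?!:\/\/)(?:\.[-a-z0-9_]+)?(?=[\/:]|$)/;

// Hypothetical helper: returns the first match, or null when nothing matches.
function extractDomain(url) {
  var match = url.match(pattern);
  return match ? match[0] : null;
}

var samples = [
  'https://download.my-domain.com/auth/login',
  'http://localhost:60162/API/script/authbar.js',
  'http://my-domain.tk/',
  'www.download.my-domain.com/auth/login',
  'https://www.google.com'
];

var results = samples.map(extractDomain);
console.log(results);
// [ 'my-domain.com', 'localhost', 'my-domain.tk', 'my-domain.com', 'google.com' ]
```

Only the first match is taken on purpose: with the `g` flag the same pattern would also match path segments such as `auth`, because they too are followed by a `/`.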


Ulugbek Umirov

Maybe this one is more likely to work for your example:

([a-zA-Z0-9-]+(\.(com|net|org|info|coop|co\.uk|org\.uk|ac\.uk|uk|tk|gov)))|localhost
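A small JavaScript sketch of how this pattern behaves on some of the question's URLs. The helper name `findDomain` is hypothetical, and only the first match in each URL is used.

```javascript
// Pattern from this answer: a label followed by one of the listed TLDs,
// or the literal "localhost".
var pattern = /([a-zA-Z0-9-]+(\.(com|net|org|info|coop|co\.uk|org\.uk|ac\.uk|uk|tk|gov)))|localhost/;

// Hypothetical helper: first match in the URL, or null when there is none.
function findDomain(url) {
  var match = url.match(pattern);
  return match ? match[0] : null;
}

var found = [
  'http://my-domain.com/logout.aspx',
  'http://localhost:28173/logout.aspx',
  'https://download.my-domain.com/auth/login',
  'http://my-domain.gov'
].map(findDomain);

console.log(found);
// [ 'my-domain.com', 'localhost', 'my-domain.com', 'my-domain.gov' ]
```

Because `aspx` is not in the list, `logout.aspx` is never matched. The trade-off is that the list has to be maintained by hand, and the pattern has no trailing boundary, so a host such as `my-domain.community` would yield `my-domain.com`.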
Zhen Zhang

I did this with minimal use of regular expressions in JavaScript because I was bored. I imagine it would be fairly easy to convert to C#.

var urls = [
  'https://download.my-domain.com/auth/login',
  'https://download.my-domain.com',
  'http://localhost:60162/API/script/authbar.js',
  'http://localhost:28173/logout.aspx',
  'http://my-domain.com/logout.aspx',
  'http://my-domain.com/logout.aspx/',
  'http://my-domain.com/',
  'http://my-domain.com',
  'http://my-domain.tk/',
  'http://my-domain.gov',
  'download.my-domain.com/auth/login',
  'www.download.my-domain.com/auth/login',
  'http://www.google.com',
  'https://www.google.com',
  'http://www.google.com/',
  'https://www.google.com/'
];

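var domains = urls.map(function (url) {
  // Drop the scheme, then keep everything before the first '/' (host and optional port).
  var host = url.replace(/^https?:\/\//, '').split('/')[0];

  // Strip the port, if any, so 'localhost:60162' becomes 'localhost'.
  host = host.split(':')[0];

  // Keep the last two labels: 'www.google.com' -> 'google.com'.
  // A single-label host such as 'localhost' is returned unchanged.
  return host.split('.').slice(-2).join('.');
});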
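An alternative sketch that leans on the built-in `URL` class (available in modern browsers and Node.js) to do the host and port parsing. The helper name `cookieDomain` and the choice to assume `http://` for inputs without a scheme are assumptions made for this example, not part of the original answer.

```javascript
// Hypothetical helper: reduce a URL (with or without a scheme) to its last two host labels.
function cookieDomain(input) {
  // The URL constructor needs a scheme, so assume http:// when none is given.
  var hasScheme = /^[a-z][a-z0-9+.-]*:\/\//i.test(input);
  var withScheme = hasScheme ? input : 'http://' + input;

  // hostname excludes the port, e.g. 'localhost:60162' -> 'localhost'.
  var hostname = new URL(withScheme).hostname;

  // Keep the last two labels; a single-label host such as 'localhost' is unchanged.
  return hostname.split('.').slice(-2).join('.');
}

var checked = [
  'http://localhost:60162/API/script/authbar.js',
  'www.download.my-domain.com/auth/login',
  'https://www.google.com/',
  'http://my-domain.gov'
].map(cookieDomain);

console.log(checked);
// [ 'localhost', 'my-domain.com', 'google.com', 'my-domain.gov' ]
```

Keeping only the last two labels is a simplification: it gives the wrong answer for suffixes such as `co.uk`, where a list of public suffixes would be needed.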
Bill Criswell

Instead of

((\.)?[a-zA-Z0-9-]+\.([a-zA-Z0-9]{2,4}))/?

you should list every top-level domain you expect, for example:

((\.)?[a-zA-Z0-9-]+\.(com|net|org|info|coop|co\.uk|org\.uk|ac\.uk|uk|tk|gov))/?
coder hacker

Try this:

var regex = new Regex(@"^(?>https?://|)([-A-Z0-9+&@#%?=~_|!,.;]+)", RegexOptions.IgnoreCase);

If you want to ignore lines ending in logout.aspx, try the following instead:

var regex = new Regex(@"^(?>https?://|)([-A-Z0-9+&@#%?=~_|!,.;]+)[-A-Z0-9+&@#%?=~_|!,:/.;]*(?<!logout\.aspx/?)$", RegexOptions.IgnoreCase);
Andre Artus