You don't need CLR and RegEx for this.
DECLARE @x TABLE(id INT, url VARCHAR(2048));
INSERT @x VALUES
(1,'http://test.com/mary/archive/project.aspx'),
--> I want http::/test.com/mary
(2,'http://www.testmary.com/company'),
--> I want http://www.testmary.com/
(3,'http://marytest.com/b/about/'),
--> I want http://marytest.com/
(4,'http://mary.test.com/b/mary/project.aspx'),
--> I want http://mary.test.com/b/mary/
(5,'mary.test.com');
--> I want mary.test.com
SELECT ID,
[output] = SUBSTRING(url, 1, LEN(url) - CHARINDEX('yram', REVERSE(url))
+ COALESCE(NULLIF(CHARINDEX('/', SUBSTRING(url, LEN(url) - CHARINDEX('yram',
REVERSE(url)) + 1, 2048)),0),2048))
FROM @x
WHERE url LIKE '%mary%';
Results:
ID output
-- --------------------------------
1 http://test.com/mary/
2 http://www.testmary.com/
3 http://marytest.com/
4 http://mary.test.com/b/mary/
5 mary.test.com
The only thing I didn't understand is why the first row in the output should be missing a trailing slash, while the other rows include it. In my query this trailing slash is included in all rows. If it shouldn't be included only on that one row, you'll need to explain why.
I do recommend getting a better handle on data cleansing, though. Why would you allow some URLs without the http:// prefix?