2

I need to extract the portion of a URL that comes after "home".

For example, the URLs can be:

https://www.example.com/site/country/home/products
https://www.example.com/site/country/home/products/consumer
https://www.example.com/site/country/home/products/consumer/kids

The segments "site" and "country" in the URL might change.

All I need in the output is:

 /products 
 /products/consumer 
 /products/consumer/kids

I tried using a regex, but it didn't work in the above situation.

  • Find the first index of "/home/" and take the substring after that? – David Jan 03 '19 at 13:15
  • What regex did you try? Also, why not just use [String.IndexOf()](https://learn.microsoft.com/de-de/dotnet/api/system.string.indexof?view=netframework-4.7.2)? – Corion Jan 03 '19 at 13:15
  • [Something like this should work](https://dotnetfiddle.net/Oy2M22) `string rest = url.Substring(url.IndexOf("/products"));` – ikerbera Jan 03 '19 at 13:25

4 Answers

2

As suggested by Corion and David in the comments, the simplest method in this case is probably to find the index of /home/ and strip everything up to that point, keeping the second / so that the result starts with a slash:

string home = "/home/";
int homeIndex = url.IndexOf(home);
string relativeUrl = url.Substring(homeIndex + home.Length - 1);
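
The snippet above assumes that /home/ is always present; when it is not, IndexOf returns -1 and the substring starts at the wrong offset. A minimal sketch of a guarded version follows. The class name UrlTail, the method name AfterHome, and the "/" fallback (borrowed from the regex variant in this answer) are assumptions for illustration, not part of the original code:

```csharp
using System;

public static class UrlTail
{
    // Returns everything after the first "/home", keeping the slash that follows it.
    // Falls back to "/" when the marker is missing (an assumed convention).
    public static string AfterHome(string url)
    {
        const string marker = "/home/";
        int index = url.IndexOf(marker, StringComparison.Ordinal);
        if (index < 0)
        {
            return "/";
        }
        // Skip "/home" (marker.Length - 1 characters) so the result starts with "/".
        return url.Substring(index + marker.Length - 1);
    }

    public static void Main()
    {
        // Prints "/products/consumer"
        Console.WriteLine(AfterHome("https://www.example.com/site/country/home/products/consumer"));
    }
}
```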

Using a regular expression, you want to match the /home/ substring, and capture the second / and everything following it:

Match match = Regex.Match(url, @"/home(/.*)");
string relativeUrl = "/";
if (match.Success) {
    relativeUrl = match.Groups[1].Value;
}
Billy Brown
1

Here is some simple C# code that I think may help you:

string sub = "https://www.example.com/site/country/home/products";
string temp = "";
string[] ss = sub.Split('/');
for (int i = 0; i < sub.Length; i++)
{
    if (ss[i] == "home")
    {
        i++;
        for (int j = i; j < ss.Length; j++)
            temp += '/' + ss[j];

        break;
    }
}
Console.WriteLine(temp);
amal mansour
  • Although it's more general than most answers here, there is no need to split and loop; just find the index of `home` and take the substring. – Billy Brown Jan 03 '19 at 13:34
  • I was only giving an idea. I thought `IndexOf` only takes a single character, and that `IndexOfAny` takes an array of characters but does not match them as a sequence. – amal mansour Jan 03 '19 at 13:37
  • It's not really a bad idea, but the code is a lot more work than is needed, and this approach is better using Crowcoder's answer with `System.Uri`. Additionally, you should check `ss.Length` instead of `sub.Length` in the `for` loop. `string.IndexOf` will find the starting index of a substring in the string, so it works in this case. – Billy Brown Jan 03 '19 at 13:39
0

This is easy with a regex. The following pattern works for the scenario in the question:

Regex: '(?<=\/home).*\b'

There is no need to worry about the portion before home. The lookbehind anchors the match just after /home, so the match contains everything that follows it.
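
As a rough sketch of how this pattern could be applied in C#, the following uses Regex.Match and reads the whole match rather than a capture group. The class name LookbehindDemo, the method name AfterHome, and the empty-string fallback are assumptions for illustration; the slash is written unescaped here, which is equivalent in .NET:

```csharp
using System;
using System.Text.RegularExpressions;

public static class LookbehindDemo
{
    // Matches everything after "/home" up to the last word boundary.
    // Returns an empty string when "/home" does not occur (an assumed convention).
    public static string AfterHome(string url)
    {
        Match match = Regex.Match(url, @"(?<=/home).*\b");
        return match.Success ? match.Value : "";
    }

    public static void Main()
    {
        // Prints "/products/consumer/kids"
        Console.WriteLine(AfterHome("https://www.example.com/site/country/home/products/consumer/kids"));
    }
}
```

Because of the trailing \b, a URL that ends in a slash, such as .../home/products/, yields /products without the final slash.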

Deep
0

You could use the System.Uri class to extract the segments of the URL:

Uri link = new Uri("https://www.example.com/site/country/home/products/consumer/kids");
string[] segs = link.Segments;

int idxOfHome = Array.IndexOf(segs, "home/");

string restOfUrl = string.Join("", segs, idxOfHome+1, segs.Length  - (idxOfHome + 1));

Yields:

products/consumer/kids
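
Note that this result has no leading slash, and Uri.Segments returns each segment in its escaped form. A possible variant that prepends the slash, decodes the result with Uri.UnescapeDataString, and handles a missing home/ segment is sketched below. The class name SegmentDemo, the method name AfterHome, and the "/" fallback are assumptions for illustration:

```csharp
using System;

public static class SegmentDemo
{
    // Joins the path segments that follow "home/" and decodes percent-escapes.
    // Returns "/" when there is no "home/" segment (an assumed convention).
    public static string AfterHome(string url)
    {
        string[] segments = new Uri(url).Segments;
        int homeIndex = Array.IndexOf(segments, "home/");
        if (homeIndex < 0)
        {
            return "/";
        }
        string rest = string.Join("", segments, homeIndex + 1, segments.Length - (homeIndex + 1));
        // Segments are escaped (for example "for%20men"), so decode before returning.
        return "/" + Uri.UnescapeDataString(rest);
    }

    public static void Main()
    {
        // Prints "/products/consumer/for men"
        Console.WriteLine(AfterHome("https://www.example.com/site/country/home/products/consumer/for%20men"));
    }
}
```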

Crowcoder
  • The problem with segments is, if I have a space in the url it returns the encoded form. In my case for **/products/consumer/for men** it was returning **/products/consumer/for%40men** then I just used Trim and IndexOf. – SitecoreSXADeveloper Jan 04 '19 at 11:42
  • You could URL-decode it, but that's not a space, it's `@`. – Crowcoder Jan 04 '19 at 11:56