9

Here is what I'm trying to achieve. Let's say we have two example URLs:

url1 = "http://emy.dod.com/kaskaa/dkaiad/amaa//////////"
url2 = "http://www.example.com/"

How can I extract the stripped-down URLs?

url1 = "http://emy.dod.com/kaskaa/dkaiad/amaa"
url2 = "http://www.example.com"

URI.parse in Ruby sanitizes certain types of malformed URLs but is ineffective in this case.

If we use a regex, /^(.*)\/$/ removes only a single slash from url1 and is ineffective for url2.

Is anybody aware of how to handle this type of URL parsing?

The point is that I don't want my system to treat http://www.example.com/ and http://www.example.com as two different URLs. The same goes for http://emy.dod.com/kaskaa/dkaiad/amaa//// and http://emy.dod.com/kaskaa/dkaiad/amaa/.

Aliaksei Kliuchnikau
splintercell
    @other_people_reading_this_question If, like me, you only need to remove one trailing slash, you can use `String#chomp`. E.g: `"/path/to/directory/".chomp("/")` – Ajedi32 Jul 23 '13 at 14:59

3 Answers

28

If you just need to remove all slashes from the end of the URL string, you can try the following regex:

"http://emy.dod.com/kaskaa/dkaiad/amaa//////////".sub(/(\/)+$/,'')
"http://www.example.com/".sub(/(\/)+$/,'')

The regex /(\/)+$/ matches one or more slashes at the end of the string, and sub replaces that match with an empty string.
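As a minimal sketch of the same idea wrapped in a method: the name strip_trailing_slashes is hypothetical, and the \z anchor is swapped in for $ here, since in Ruby $ matches at the end of any line while \z matches only at the very end of the string. For single-line URLs the two behave the same.

```ruby
# Hypothetical helper (name is an assumption, not from the original answer).
def strip_trailing_slashes(url)
  # %r{/+\z} matches one or more slashes anchored at the very end of the string.
  # String#sub returns a new string; the argument is left unchanged.
  url.sub(%r{/+\z}, '')
end

puts strip_trailing_slashes("http://emy.dod.com/kaskaa/dkaiad/amaa//////////")
puts strip_trailing_slashes("http://www.example.com/")
puts strip_trailing_slashes("http://www.example.com")
```

With this, http://www.example.com/ and http://www.example.com normalize to the same string, so they can be compared or stored as one URL.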

Hope this helps.

Arturo Herrero
Aliaksei Kliuchnikau
4

Although this thread is a bit old and the top answer is quite good, I suggest another way to do this:

/^(.*?)\/*$/

You could see it in action here: https://regex101.com/r/vC6yX1/2

The magic here is *?, which does a lazy match. So the entire expression can be read as:

Match as few characters as possible and capture them, while matching as many slashes as possible at the end.

In plainer English, it removes all trailing slashes.
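A rough Ruby sketch of the lazy-capture idea, for illustration only. The method name is hypothetical, and the pattern assumes a /* quantifier after the lazy group, since that quantifier is what lets the tail absorb every trailing slash. It also uses Ruby's \A and \z string anchors instead of ^ and $.

```ruby
# Hypothetical helper (name is an assumption, not from the original answer).
def strip_trailing_slashes_lazy(url)
  # (.*?) captures as little as possible; \/* then takes every trailing slash;
  # \z anchors the match at the end of the string.
  # String#[] with a capture index returns the text of group 1.
  url[/\A(.*?)\/*\z/, 1]
end

puts strip_trailing_slashes_lazy("http://emy.dod.com/kaskaa/dkaiad/amaa//////////")
puts strip_trailing_slashes_lazy("http://www.example.com/")
```

With a greedy (.*) in the same position, the group would swallow the trailing slashes itself and nothing would be stripped, which is why the lazy quantifier matters here.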

nevets
    upvote for being a pure regex, instead of using language-specific functions – theEpsilon Feb 03 '17 at 19:40
  • thanks for voting for so long @theEpsilon :D i think this request could be done using only regex, instead of some language specific features. – nevets Feb 04 '17 at 10:16
0
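# Returns the part of path up to and including its last non-slash character
# (nil if path contains no non-slash character).
def without_trailing_slash(path)
  # Greedy .* followed by [^/] makes the match end at the last non-slash character.
  path[%r(.*[^/])]
end

path = "http://emy.dod.com/kaskaa/dkaiad/amaa//////////"

puts without_trailing_slash(path) # prints http://emy.dod.com/kaskaa/dkaiad/amaa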
crantok