62

I'm using Ruby 1.9.3 and would like to get the host name from the video URL below.

I tried this code:

require 'uri'
url = "https://ferrari-view.4me.it/view-share/playerp/?plContext=http://ferrari-%201363948628-stream.4mecloud.it/live/ferrari/ngrp:livegenita/manifest.f4m&cartellaConfig=http://ferrari-4me.weebo.it/static/player/config/&cartellaLingua=http://ferrari-4me.weebo.it/static/player/config/&poster=http://pusher.newvision.it:8080/resources/img1.jpg&urlSkin=http://ferrari-4me.weebo.it/static/player/swf/skin.swf?a=1363014732171&method=GET&target_url=http://ferrari-4me.weebo.it/static/player/swf/player.swf&userLanguage=IT&styleTextColor=#000000&autoPlay=true&bufferTime=2&isLive=true&highlightColor=#eb2323&gaTrackerList=UA-23603234-4"  
puts URI.parse(url).host  

It throws the exception URI::InvalidURIError: bad URI(is not URI?).

I also tried to encode the URL and then parse it, like below:

puts URI.parse(URI.parse(url)).host

It throws the same exception, URI::InvalidURIError: bad URI(is not URI?).

However, the first snippet works for the URL below:

url = "http://www.youtube.com/v/GpQDa3PUAbU?version=3&autohide=1&autoplay=1"

How can I fix this? Any suggestions are welcome. Thanks.

Avi
prabu

6 Answers

121

This URL is not valid, but it works in a browser because browsers are less strict about special characters like :, /, etc.

You should encode your URI first:

encoded_url = URI.encode(url)

Then parse it:

URI.parse(encoded_url)
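
On newer Rubies the same encode-then-parse idea needs a different call, because URI.encode (an alias of URI.escape) was removed in Ruby 3.0. A minimal sketch, assuming URI::RFC2396_Parser#escape as the stand-in and using a shortened version of the question's URL (an assumption made to keep the example readable):

```ruby
require 'uri'

# Shortened stand-in for the question's URL (assumption): it keeps the
# two unescaped '#' characters that appear in the original query string.
url = "https://ferrari-view.4me.it/view-share/playerp/?styleTextColor=#000000&highlightColor=#eb2323"

# URI.encode no longer exists on Ruby 3.0+. RFC2396_Parser#escape is the
# method URI.encode used to delegate to, so it is used here as a substitute.
escaper = URI::RFC2396_Parser.new
encoded_url = escaper.escape(url.strip)

# After escaping, '#' has become '%23' and the string parses as a URI.
host = URI.parse(encoded_url).host
puts host
```

This escaping leaves reserved characters such as square brackets untouched, so it will probably not help with every malformed URL.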
Konrad Szczęśniak
  • 1
    You really saved my time. In my case I faced **Net::HTTPBadResponse - wrong status line: "{":** problem and same solution applicable for that – Taimoor Changaiz Dec 05 '13 at 17:21
  • I end up with `NoMethodError: undefined method 'gsub' for nil:NilClass` when I try this. However, when I run the gem manually, the code works. – FilBot3 May 15 '15 at 21:13
  • I was also seeing `undefined method 'gsub' for nil:NilClass`, but it turned out that my original `url` was nil when I didn't expect it. – Andrew Oct 13 '15 at 23:38
  • 1
    This is not reliable, ex. URI.parse(URI.encode("https://thepiratebay.org/torrent/16110726/Scorpion.S03E05.720p.HDTV.X264-DIMENSION[ettv]")) still renders the error – Laser Oct 26 '16 at 05:29
  • @Laser url without protocol (http) is not valid either – Konrad Szczęśniak Nov 14 '16 at 15:56
  • @KonradSzczęśniak try the actual link in quotes, stackoverflow converted the literal string example into the format there. – Laser Nov 15 '16 at 06:21
  • @KonradSzczęśniak `URI.parse(URI.encode("https://thepiratebay.org/torrent/16110726/Scorpion.S03E05.720p.HDTV.X264-DIMENSION[ettv]"))` – Laser Nov 15 '16 at 06:22
21

Addressable::URI is a better, more RFC-compliant replacement for URI:

require "addressable/uri"
Addressable::URI.parse(url).host
#=> "ferrari-view.4me.it"

Run `gem install addressable` first.

pguardiario
  • 3
    I think this response is more accurate, because ruby URI.encode does not work with some URIs – Daniel Cukier Feb 25 '14 at 12:04
  • Addressable::URI.parse("https://www. with_space.com").host will not work, it will raise an Addressable::URI::InvalidURIError – astropanic Mar 27 '18 at 11:24
  • Steam uses urls like `https://steamcommunity.com/profiles/[U:1:123456789]` and `URI.encode` doesn't help but `Addressable` does. – WojciechKo Jan 02 '19 at 13:09
3

Try this:

safeurl = URI.encode(url.strip)
response = RestClient.get(safeurl)
Lucia
0

Your URI query is not valid. Several characters in it should be encoded with URI::encode(). For instance, characters such as # or & have a special meaning in a URI and are not valid as literal data in a query value.

Below is a working version of your code:

    require 'uri'

    plContext = URI::encode("http://ferrari-%201363948628-stream.4mecloud.it/live/ferrari/ngrp:livegenita/manifest.f4m")
    cartellaConfig = URI::encode("http://ferrari-4me.weebo.it/static/player/config/")
    cartellaLingua = URI::encode("http://ferrari-4me.weebo.it/static/player/config/")
    poster = URI::encode("http://pusher.newvision.it:8080/resources/img1.jpg")
    urlSkin = URI::encode("http://ferrari-4me.weebo.it/static/player/swf/skin.swf?a=1363014732171")
    target_url = URI::encode("http://ferrari-4me.weebo.it/static/player/swf/player.swf")
    url = "https://ferrari-view.4me.it/view-share/playerp/?"
    url << "plContext=#{plContext}"
    url << "&cartellaConfig=#{cartellaConfig}"
    url << "&cartellaLingua=#{cartellaLingua}"
    url << "&poster=#{poster}"
    url << "&urlSkin=#{urlSkin}"
    url << "&method=GET"
    url << "&target_url=#{target_url}"
    url << "&userLanguage=IT"
    url << "&styleTextColor=#{URI::encode("#000000")}"
    url << "&autoPlay=true&bufferTime=2&isLive=true&gaTrackerList=UA-23603234-4"
    url << "&highlightColor=#{URI::encode("#eb2323")}"  
    puts url
    puts URI.parse(url).host
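
The same per-parameter idea can also be sketched with URI.encode_www_form, which builds the whole query string from key/value pairs. This is an alternative to the code above, not the answer's own method: it uses form encoding, so the output differs from what URI::encode produces (for example ":" and "/" are escaped too). Only a few of the question's parameters are included here to keep the sketch short.

```ruby
require 'uri'

base = "https://ferrari-view.4me.it/view-share/playerp/"

# A subset of the question's parameters, as raw (unencoded) values.
params = {
  "plContext"      => "http://ferrari-%201363948628-stream.4mecloud.it/live/ferrari/ngrp:livegenita/manifest.f4m",
  "poster"         => "http://pusher.newvision.it:8080/resources/img1.jpg",
  "styleTextColor" => "#000000",
  "highlightColor" => "#eb2323",
  "autoPlay"       => "true"
}

# encode_www_form escapes every key and value and joins the pairs with '&'.
query = URI.encode_www_form(params)
built_url = "#{base}?#{query}"

uri = URI.parse(built_url)
puts built_url
puts uri.host
```

URI.decode_www_form(uri.query) reverses the encoding and returns the original pairs, which is a quick way to check that nothing was lost.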
toch
0

URI.parse is right: that URI is illegal. Just because it accidentally happens to work in your browser doesn't make it legal. You cannot parse that URI, because it isn't a URI.
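
If the goal is to handle such input gracefully rather than repair it, the exception itself can serve as the validity check. A minimal sketch, where host_or_nil is a hypothetical helper name and the invalid string is a shortened stand-in for the question's URL (both are assumptions, not part of the original post):

```ruby
require 'uri'

# Hypothetical helper: returns the host of a valid URI string,
# or nil when URI.parse rejects the string.
def host_or_nil(url)
  URI.parse(url).host
rescue URI::InvalidURIError
  nil
end

# The YouTube URL from the question, which parses fine.
valid = "http://www.youtube.com/v/GpQDa3PUAbU?version=3&autohide=1&autoplay=1"

# Shortened stand-in for the failing URL: the first '#' starts the fragment,
# and the second unescaped '#' is not allowed inside it.
invalid = "https://ferrari-view.4me.it/view-share/playerp/?styleTextColor=#000000&highlightColor=#eb2323"

puts host_or_nil(valid).inspect
puts host_or_nil(invalid).inspect
```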

Jörg W Mittag
0
uri = URI.parse(URI.encode(url.strip))
rusllonrails