I wrote a flawed implementation for detecting whether two URLs are on the same domain. I realize this isn't rocket science, but it seems like there would be a standard library that has this built in. My Google-fu has failed me. Are there any libraries out there I can require, or attribute the cargo-culted code to?

The same-origin policy (SOP) says the scheme and port must be identical AND the hosts must match, where the subdomain(s) is/are a subset of the origin.

So:

google.com matches google.com
a.google.com matches google.com

It turns out that the rule

a.b.google.com matches [b.google.com, google.com] but not [c.b.google.com, a.google.com]

is only applicable when you can manipulate document.domain, which is not the case for me. Direct host matches are required.
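
For reference, the document.domain rule quoted above can be sketched as a whole-label suffix check. This is only an illustration: `relaxable_to?` is a hypothetical helper name, and the sketch assumes both arguments are bare host names.

```ruby
# Hypothetical helper: could a page on `host` relax its domain to `candidate`?
# True only when `candidate` is the host itself or a parent domain of it,
# matched on whole labels (hence the leading dot in the suffix check).
def relaxable_to?(host, candidate)
  host = host.downcase
  candidate = candidate.downcase
  host == candidate || host.end_with?(".#{candidate}")
end

relaxable_to?("a.b.google.com", "b.google.com")   # => true
relaxable_to?("a.b.google.com", "google.com")     # => true
relaxable_to?("a.b.google.com", "c.b.google.com") # => false
relaxable_to?("a.b.google.com", "a.google.com")   # => false
```

Note that this sketch does not reject public suffixes such as "com", so it is looser than what a browser would allow.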

oreoshake
  • Can you give one or more examples of URL pairs and what would you expect the result of that utility would be for them? This would help a lot. – dimitarvp Nov 09 '12 at 19:40
  • Where did you find that last rule? `a.b.google.com matches [b.google.com, google.com] but not [c.b.google.com, a.google.com]` – Sandro Nov 09 '12 at 20:34
  • https://code.google.com/p/browsersec/wiki/Part2#Same-origin_policy see the second bullet point in combination with https://developer.mozilla.org/en-US/docs/Same_origin_policy_for_JavaScript – oreoshake Nov 09 '12 at 22:16

1 Answer

From this Wikipedia article it looks like the scheme, host and port have to be the same to satisfy the same origin policy.

http://en.wikipedia.org/wiki/Same_origin_policy

require 'uri'

class SameOrigin
  def self.test(str1, str2)
    uri1 = URI.parse(str1)
    uri2 = URI.parse(str2)
    uri1.scheme == uri2.scheme && uri1.host == uri2.host && uri1.port == uri2.port
  end
end

SameOrigin.test "http://google.com", "http://google.com"     # => true
SameOrigin.test "http://google.com:80", "http://google.com"  # => true
SameOrigin.test "http://google.com", "http://www.google.com" # => false
SameOrigin.test "https://google.com", "http://google.com"    # => false
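
If the inputs are not guaranteed to be well-formed absolute URLs, a slightly more defensive variant might look like the sketch below. The helper names `origin_of` and `same_origin?` are made up for this example. It downcases the host so that letter case alone does not affect the result, and it treats unparseable or relative strings as never matching, which is an assumption about the desired behavior.

```ruby
require 'uri'

# Hypothetical helper: reduce a URL to its [scheme, host, port] origin tuple.
# Returns nil when the string cannot be parsed or has no scheme or host.
def origin_of(str)
  uri = URI.parse(str)
  return nil if uri.scheme.nil? || uri.host.nil?
  [uri.scheme.downcase, uri.host.downcase, uri.port]
rescue URI::InvalidURIError
  nil
end

# Two URLs share an origin only if both parse and their tuples are equal.
def same_origin?(str1, str2)
  o1 = origin_of(str1)
  o2 = origin_of(str2)
  !o1.nil? && o1 == o2
end

same_origin? "http://Google.com", "http://google.com:80" # => true
same_origin? "http://google.com", "google.com"           # => false
same_origin? "http://google.com", "http://goo gle.com"   # => false
```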

If you want a relaxed test that ignores subdomains, you could use the Domainatrix library I found and change the code to something like the following, though it runs a little slowly for me. Another option is to use a regex to find the domain of a URL. The regex is faster but may not work in all cases. I found the regex in this question, by the way:

Remove subdomain from string in ruby

require 'rubygems'
require 'domainatrix'
require 'uri'

class SameOrigin
  def self.relaxed_test(str1, str2)
    d1 = Domainatrix.parse(str1)
    d2 = Domainatrix.parse(str2)

    uri1 = URI.parse(str1)
    uri2 = URI.parse(str2)

    # Compare the registrable domain and public suffix, ignoring subdomains.
    uri1.scheme == uri2.scheme &&
      d1.domain == d2.domain &&
      d1.public_suffix == d2.public_suffix &&
      uri1.port == uri2.port
  end

  def self.relaxed_test2(str1, str2)
    uri1 = URI.parse(str1)
    uri2 = URI.parse(str2)

    re = /^(?:(?>[a-z0-9-]*\.)+?|)([a-z0-9-]+\.(?>[a-z]*(?>\.[a-z]{2})?))$/i
    domain1 = uri1.host.gsub(re, '\1').strip
    domain2 = uri2.host.gsub(re, '\1').strip

    uri1.scheme == uri2.scheme && domain1 == domain2 && uri1.port == uri2.port
  end
end

SameOrigin.relaxed_test "http://google.com", "http://google.com"     # => true
SameOrigin.relaxed_test "http://google.com:80", "http://google.com"  # => true
SameOrigin.relaxed_test "http://google.com", "http://www.google.com" # => true
SameOrigin.relaxed_test "https://google.com", "http://google.com"    # => false

SameOrigin.relaxed_test2 "http://google.com", "http://google.com"     # => true
SameOrigin.relaxed_test2 "http://google.com:80", "http://google.com"  # => true
SameOrigin.relaxed_test2 "http://google.com", "http://www.google.com" # => true
SameOrigin.relaxed_test2 "https://google.com", "http://google.com"    # => false
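
To illustrate where the regex approach can go wrong, here is a small sketch that applies the same regex directly to host names. `regex_domain` is a hypothetical helper name introduced only for this example.

```ruby
# Hypothetical helper: the same regex and gsub call as in relaxed_test2,
# applied to a bare host name.
def regex_domain(host)
  re = /^(?:(?>[a-z0-9-]*\.)+?|)([a-z0-9-]+\.(?>[a-z]*(?>\.[a-z]{2})?))$/i
  host.gsub(re, '\1').strip
end

regex_domain("www.google.com") # => "google.com"
regex_domain("www.bbc.co.uk")  # => "bbc.co.uk"
regex_domain("bbc.co.uk")      # => "co.uk"
```

The last case shows the limitation: a bare host under a two-part suffix is reduced to the suffix itself, so relaxed_test2 would treat bbc.co.uk and www.bbc.co.uk as different while treating any two bare .co.uk hosts as the same. A library backed by the public suffix list, such as Domainatrix, avoids this kind of mistake.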
Sandro