184

I'm trying to find a relatively easy and reliable method to extract the base URL from a string variable using JavaScript (or jQuery).

For example, given something like:

http://www.sitename.com/article/2009/09/14/this-is-an-article/

I'd like to get:

http://www.sitename.com/

Is a regular expression the best bet? If so, what statement could I use to assign the base URL extracted from a given string to a new variable?

I've done some searching on this, but everything I find in the JavaScript world seems to revolve around gathering this information from the actual document URL using location.host or similar.

kenorb
Bungle

21 Answers

228

Edit: Some have complained that the original code did not take the protocol into account, so I updated it, since this is marked as the accepted answer. For those who prefer one-liners: sorry, but that is what code minifiers are for. Code should be human-readable, and in my opinion this version is better.

var pathArray = "https://somedomain.com".split( '/' );
var protocol = pathArray[0];
var host = pathArray[2];
var url = protocol + '//' + host;

Or use David's solution below.
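
As a minimal sketch, the same split steps can be wrapped in a reusable function. The name `baseUrlFromString` is hypothetical, and the code assumes the input is an absolute URL of the form `scheme://host/...`; it does no validation.

```javascript
// Hypothetical helper: assumes an absolute URL such as "scheme://host/path".
function baseUrlFromString(str) {
  // "http://host/a/b".split('/') gives ["http:", "", "host", "a", "b"]
  var pathArray = str.split('/');
  var protocol = pathArray[0]; // e.g. "http:"
  var host = pathArray[2];     // e.g. "www.sitename.com" (includes any port)
  return protocol + '//' + host + '/';
}

console.log(baseUrlFromString('http://www.sitename.com/article/2009/09/14/this-is-an-article/'));
// -> http://www.sitename.com/
```

Because the host is taken as everything between the second and third slash, a query string or credentials placed directly after the host would be included in the result.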

itzhar
  • 6
    Thanks for the reply, but again, I'm trying to extract the base URL from a string, rather than the actual document URL. I don't think this will help me - though please correct me if I'm wrong. – Bungle Sep 14 '09 at 11:04
  • 2
    pathArray = String("http://YourHost.com/url/nic/or/not").split( '/' ); host = pathArray[2]; –  Sep 14 '09 at 11:05
  • split works for strings, does not matter where are they coming from –  Sep 14 '09 at 11:07
  • 4
    Got it - thanks Rafal and daddywoodland! I ended up using: url = 'http://www.sitename.com/article/2009/09/14/this-is-an-article/'; pathArray = (url).split('/'); host = 'http://' + pathArray[2]; I think Rafal's example just omitted the "http://" that is present in all of the strings that I'm processing, in which case the pathArray[2] is the one you need. Without the "http://" prefix, pathArray[0] would be the one. Thanks again. – Bungle Sep 14 '09 at 11:21
  • Ahhh, I see what happened. Rafal's example was indeed correct, but the "http://" prefix meant that it was interpreted as a link in his comment, so it was truncated from the visible text. – Bungle Sep 14 '09 at 11:22
  • Yep, it depends on whether you have http or not. You can always add one more test to check whether the first array element is http :) –  Sep 14 '09 at 11:35
  • 4
    Why all the variable declaration? `url = 'sitename.com/article/2009/09/14/this-is-an-article'; newurl = 'http://' + url.split('/')[0];` – ErikE Aug 21 '10 at 02:27
  • `pathArray = window.location.href.split( '/' );` yes? – Chalist Jun 03 '13 at 12:39
  • 1
    pathArray = window.location.href.split( '/' ); protocol = pathArray[0]; host = pathArray[2]; url = protocol + '://' + host; `//now url === "http:://stackoverflow.com"` checkout `::` –  Sep 06 '13 at 03:44
  • I agree with @TamilVendhan, the code above generates 2 colon's in the protocol section of the URL. Code should probably read `url = protocol + '//' + host;` – ewitkows May 06 '14 at 20:09
  • 1
    I updated one line to get rid of query string: host = pathArray[2].split('?')[0]; – Dariux Sep 23 '14 at 11:05
  • The first part of the solution in this answer does not consider credentials in the URL, i.e. http://username:password@host.com/some-path. This can be resolved by adding if(host && host.indexOf("@") !== -1) host = host.split("@")[1]; – Jan H Sep 19 '16 at 12:09
  • 1
    why not simply location.href.split('/', 3).join('/')? – dan Apr 15 '17 at 03:06
  • here is the one liner using URL API, no need to split and construct the url manually https://stackoverflow.com/a/50449208/6333644 – devansvd May 21 '18 at 12:45
  • Please stop. The up-to-date answer is the one provided by [devansvd below](https://stackoverflow.com/questions/1420881/how-to-extract-base-url-from-a-string-in-javascript#answer-50449208) – davidmpaz May 06 '20 at 07:07
155

WebKit-based browsers, Firefox as of version 21 and current versions of Internet Explorer (IE 10 and 11) implement location.origin.

location.origin includes the protocol, the domain and optionally the port of the URL.

For example, location.origin of the URL http://www.sitename.com/article/2009/09/14/this-is-an-article/ is http://www.sitename.com.
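
`URL` objects expose the same `origin` property, which makes the composition easy to check against a plain string. This is only an illustration, assuming an environment that provides the global `URL` constructor (modern browsers, Node.js); the port numbers are made up for the example.

```javascript
// A non-default port is kept in the origin
var withPort = new URL('http://www.sitename.com:8080/article/2009/09/14/this-is-an-article/').origin;
console.log(withPort);
// -> http://www.sitename.com:8080

// The default port for the scheme (80 for http) is dropped
var defaultPort = new URL('http://www.sitename.com:80/article/').origin;
console.log(defaultPort);
// -> http://www.sitename.com
```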

To target browsers without support for location.origin use the following concise polyfill:

if (typeof location.origin === 'undefined')
    location.origin = location.protocol + '//' + location.host;
David
  • 36
    `window.location.hostname` will miss of the port number if given, so use `window.location.host`. So the complete 'basename' including the trailing slash would be: `window.location.protocol+"//"+window.location.host + "/";` – sroebuck Aug 30 '11 at 09:39
  • 4
    Actually, window.location.hostname is still useful if, as in my case, you need to provide a different port number. – Darrell Brogdon Mar 13 '12 at 06:46
44

You don't need jQuery; just use

location.hostname
Timo Tijhof
daddywoodland
39

There is no reason to do splits to get the path, hostname, etc. from a string that is a link. You just need to use a link element:

//create a new element link with your link
var a = document.createElement("a");
a.href="http://www.sitename.com/article/2009/09/14/this-is-an-article/";

//hide it from view when it is added
a.style.display="none";

//add it
document.body.appendChild(a);

//read the link's "features"
alert(a.protocol);
alert(a.hostname);
alert(a.pathname);
alert(a.port);
alert(a.hash);

//remove it
document.body.removeChild(a);

You can easily do the same with jQuery by appending the element and reading its attributes.

Update: There is now new URL(), which simplifies this:

const myUrl = new URL("https://www.example.com:3000/article/2009/09/14/this-is-an-article/#m123")

const parts = ['protocol', 'hostname', 'pathname', 'port', 'hash'];

parts.forEach(key => console.log(key, myUrl[key]))
epascarello
32

The URL API avoids splitting the string and reconstructing the URL manually.

 let url = new URL('https://stackoverflow.com/questions/1420881');
 alert(url.origin);
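
Since `new URL()` throws on strings it cannot parse, a small wrapper can be handy when the input is not guaranteed to be a valid absolute URL. The following is a sketch under that assumption; `originOf` is a hypothetical name, not part of the URL API.

```javascript
// Hypothetical helper: returns the origin of an absolute URL string,
// or null when the string cannot be parsed as one.
function originOf(str) {
  try {
    return new URL(str).origin;
  } catch (e) {
    return null; // new URL() throws a TypeError on invalid input
  }
}

console.log(originOf('http://www.sitename.com/article/2009/09/14/this-is-an-article/'));
// -> http://www.sitename.com
console.log(originOf('http://username:password@host.com/some-path'));
// -> http://host.com (credentials are not part of the origin)
console.log(originOf('sitename.com/article'));
// -> null (no scheme, so it is not an absolute URL)
```

Append `'/'` to the result if the trailing slash from the question is required.
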
devansvd
21
var host = location.protocol + '//' + location.host + '/';
Timo Tijhof
kta
16
String.prototype.url = function() {
  const a = $('<a />').attr('href', this)[0];
  // or if you are not using jQuery 
  // const a = document.createElement('a'); a.setAttribute('href', this);
  let origin = a.protocol + '//' + a.hostname;
  if (a.port.length > 0) {
    origin = `${origin}:${a.port}`;
  }
  const {host, hostname, pathname, port, protocol, search, hash} = a;
  return {origin, host, hostname, pathname, port, protocol, search, hash};

}

Then :

'http://mysite:5050/pke45#23'.url()
 //OUTPUT : {host: "mysite:5050", hostname: "mysite", pathname: "/pke45", port: "5050", protocol: "http:",hash:"#23",origin:"http://mysite:5050"}

For your request, you need :

 'http://mysite:5050/pke45#23'.url().origin

Review 07-2017: It can also be made more elegant, with more features:

const parseUrl = (string, prop) => {
  const a = document.createElement('a');
  a.setAttribute('href', string);
  const {host, hostname, pathname, port, protocol, search, hash} = a;
  const origin = `${protocol}//${hostname}${port.length ? `:${port}` : ''}`;
  // look the requested part up by name instead of passing it to eval()
  const parts = {origin, host, hostname, pathname, port, protocol, search, hash};
  return prop ? parts[prop] : parts;
}

Then

parseUrl('http://mysite:5050/pke45#23')
// {origin: "http://mysite:5050", host: "mysite:5050", hostname: "mysite", pathname: "/pke45", port: "5050"…}


parseUrl('http://mysite:5050/pke45#23', 'origin')
// "http://mysite:5050"

Cool!

Abdennour TOUMI
12

If you're using jQuery, this is a kinda cool way to manipulate elements in javascript without adding them to the DOM:

var myAnchor = $("<a />");

//set href    
myAnchor.attr('href', 'http://example.com/path/to/myfile')

//your link's features
var hostname = myAnchor.prop('hostname'); // example.com
var pathname = myAnchor.prop('pathname'); // /path/to/myfile
//...etc
Wayne
  • 1
    I think it should be `myAnchor.prop('hostname')`. I'm guessing that jQuery has changed in the last 5 years... Thanks for the answer! – Dehli Oct 28 '15 at 20:44
10

A lightweight but complete approach to getting basic values from a string representation of a URL is Douglas Crockford's regexp:

var yourUrl = "http://www.sitename.com/article/2009/09/14/this-is-an-article/";
var parse_url = /^(?:([A-Za-z]+):)?(\/{0,3})([0-9.\-A-Za-z]+)(?::(\d+))?(?:\/([^?#]*))?(?:\?([^#]*))?(?:#(.*))?$/;
var parts = parse_url.exec( yourUrl );
var result = parts[1]+':'+parts[2]+parts[3]+'/' ;
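
To make the capture groups easier to follow, here is a sketch that names each group and also keeps the port when one is present. The helper name `baseFromRegex` and the sample URL are assumptions made for illustration.

```javascript
var parse_url = /^(?:([A-Za-z]+):)?(\/{0,3})([0-9.\-A-Za-z]+)(?::(\d+))?(?:\/([^?#]*))?(?:\?([^#]*))?(?:#(.*))?$/;

// Hypothetical helper: returns scheme + host (+ port) with a trailing slash,
// or null when the expression does not match.
function baseFromRegex(url) {
  var parts = parse_url.exec(url);
  if (!parts) return null;
  var scheme = parts[1]; // e.g. "http" (undefined when omitted)
  var slash = parts[2];  // e.g. "//"
  var host = parts[3];   // e.g. "www.sitename.com"
  var port = parts[4];   // e.g. "8080" (undefined when omitted)
  // parts[5], parts[6] and parts[7] hold the path, query and hash
  return (scheme ? scheme + ':' : '') + slash + host + (port ? ':' + port : '') + '/';
}

console.log(baseFromRegex('http://www.sitename.com:8080/article/?q=1#top'));
// -> http://www.sitename.com:8080/
```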

If you are looking for a more powerful URL manipulation toolkit, try URI.js. It supports getters, setters, URL normalization, etc., all with a nice chainable API.

If you are looking for a jQuery plugin, then jquery.url.js should help you.

A simpler way to do it is by using an anchor element, as @epascarello suggested. This has the disadvantage that you have to create a DOM Element. However this can be cached in a closure and reused for multiple urls:

var parseUrl = (function () {
  var a = document.createElement('a');
  return function (url) {
    a.href = url;
    return {
      host: a.host,
      hostname: a.hostname,
      pathname: a.pathname,
      port: a.port,
      protocol: a.protocol,
      search: a.search,
      hash: a.hash
    };
  }
})();

Use it like so:

parseUrl('http://google.com');
alexandru.topliceanu
8

A good way is to use JavaScript's native URL API object, which exposes many useful parts of a URL.

For example:

const url = 'https://stackoverflow.com/questions/1420881/how-to-extract-base-url-from-a-string-in-javascript'

const urlObject = new URL(url);

console.log(urlObject);


// RESULT:
//________________________________
// hash: "",
// host: "stackoverflow.com",
// hostname: "stackoverflow.com",
// href: "https://stackoverflow.com/questions/1420881/how-to-extract-base-url-from-a-string-in-javascript",
// origin: "https://stackoverflow.com",
// password: "",
// pathname: "/questions/1420881/how-to-extract-base-url-from-a-string-in-javascript",
// port: "",
// protocol: "https:",
// search: "",
// searchParams: [object URLSearchParams]
// ... + some other methods

As you can see here you can just access whatever you need.

For example: console.log(urlObject.host); // "stackoverflow.com"

doc for URL

V. Sambor
8

If you are extracting information from window.location.href (the address bar), then use this code to get http://www.sitename.com/:

var loc = location;
var url = loc.protocol + "//" + loc.host + "/";

If you have a string, str, that is an arbitrary URL (not window.location.href), then use regular expressions:

var url = str.match(/^(([a-z]+:)?(\/\/)?[^\/]+\/).*$/)[1];

I, like everyone in the Universe, hate reading regular expressions, so I'll break it down in English:

  • Find one or more lowercase letters followed by a colon (the protocol, which can be omitted)
  • Followed by // (can also be omitted)
  • Followed by any characters except / (the hostname and port)
  • Followed by /
  • Followed by whatever (the path, less the beginning /).

No need to create DOM elements or do anything crazy.
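
To see the breakdown above in action, here is a short sketch that runs the same expression against a few sample strings (the inputs are chosen for illustration). The last call shows an edge case worth knowing: when there is no slash after the host, the first slash of the protocol separator satisfies the pattern instead.

```javascript
var re = /^(([a-z]+:)?(\/\/)?[^\/]+\/).*$/;

var withPath = 'http://www.sitename.com/article/2009/09/14/this-is-an-article/'.match(re)[1];
console.log(withPath);
// -> http://www.sitename.com/

var noProtocol = 'www.sitename.com/article/'.match(re)[1];
console.log(noProtocol);
// -> www.sitename.com/

var noSlash = 'http://www.sitename.com'.match(re)[1];
console.log(noSlash);
// -> http:/
```

Note that `match` returns `null` for a string with no slash at all, so guard the `[1]` access if the input is not known to be a URL.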

Timo Tijhof
BMiner
7

You can use the code below to get different parts of the current URL:

alert("document.URL : "+document.URL);
alert("document.location.href : "+document.location.href);
alert("document.location.origin : "+document.location.origin);
alert("document.location.hostname : "+document.location.hostname);
alert("document.location.host : "+document.location.host);
alert("document.location.pathname : "+document.location.pathname);
Nimesh07
7

I use a simple regex that extracts the host from the URL:

function get_host(url){
    return url.replace(/^((\w+:)?\/\/[^\/]+\/?).*$/,'$1');
}

and use it like this

var url = 'http://www.sitename.com/article/2009/09/14/this-is-an-article/'
var host = get_host(url);

Note, if the url does not end with a / the host will not end in a /.
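
If a trailing slash is always wanted, one option is a thin wrapper that appends it when it is missing. This is only a sketch; `get_host_with_slash` is a hypothetical name, and `get_host` is repeated here so the snippet runs on its own.

```javascript
function get_host(url) {
  return url.replace(/^((\w+:)?\/\/[^\/]+\/?).*$/, '$1');
}

// Hypothetical wrapper: normalizes the result so it always ends in "/".
function get_host_with_slash(url) {
  var host = get_host(url);
  return host.charAt(host.length - 1) === '/' ? host : host + '/';
}

console.log(get_host_with_slash('http://www.sitename.com'));
// -> http://www.sitename.com/
```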

Here are some tests:

describe('get_host', function(){
    it('should return the host', function(){
        var url = 'http://www.sitename.com/article/2009/09/14/this-is-an-article/';
        assert.equal(get_host(url),'http://www.sitename.com/');
    });
    it('should not have a / if the url has no /', function(){
        var url = 'http://www.sitename.com';
        assert.equal(get_host(url),'http://www.sitename.com');
    });
    it('should deal with https', function(){
        var url = 'https://www.sitename.com/article/2009/09/14/this-is-an-article/';
        assert.equal(get_host(url),'https://www.sitename.com/');
    });
    it('should deal with no protocol urls', function(){
        var url = '//www.sitename.com/article/2009/09/14/this-is-an-article/';
        assert.equal(get_host(url),'//www.sitename.com/');
    });
    it('should deal with ports', function(){
        var url = 'http://www.sitename.com:8080/article/2009/09/14/this-is-an-article/';
        assert.equal(get_host(url),'http://www.sitename.com:8080/');
    });
    it('should deal with localhost', function(){
        var url = 'http://localhost/article/2009/09/14/this-is-an-article/';
        assert.equal(get_host(url),'http://localhost/');
    });
    it('should deal with numeric ip', function(){
        var url = 'http://192.168.18.1/article/2009/09/14/this-is-an-article/';
        assert.equal(get_host(url),'http://192.168.18.1/');
    });
});
Michael_Scharf
3
function getBaseURL() {
    var url = location.href;  // entire url including querystring - also: window.location.href;
    var baseURL = url.substring(0, url.indexOf('/', 14));


    if (baseURL.indexOf('http://localhost') != -1) {
        // Base Url for localhost
        var url = location.href;  // window.location.href;
        var pathname = location.pathname;  // window.location.pathname;
        var index1 = url.indexOf(pathname);
        var index2 = url.indexOf("/", index1 + 1);
        var baseLocalUrl = url.substr(0, index2);

        return baseLocalUrl + "/";
    }
    else {
        // Root Url for domain name
        return baseURL + "/";
    }

}

You then can use it like this...

var str = 'http://en.wikipedia.org/wiki/Knopf?q=1&t=2';
var url = str.toUrl();

The value of url will be...

{
"original":"http://en.wikipedia.org/wiki/Knopf?q=1&t=2",
"protocol":"http:",
"domain":"wikipedia.org",
"host":"en.wikipedia.org",
"relativePath":"wiki"
}

The "var url" also contains two methods.

var paramQ = url.getParameter('q');

In this case the value of paramQ will be 1.

var allParameters = url.getParameters();

The value of allParameters will be the parameter names only.

["q","t"]

Tested on IE, Chrome and Firefox.

Christoph
shaikh
3

Instead of having to account for window.location.protocol and window.location.origin, and possibly missing a specified port number, etc., just grab everything up to the 3rd "/":

// get nth occurrence of a character c in the calling string
String.prototype.nthIndex = function (n, c) {
    var index = -1;
    while (n-- > 0) {
        // continue searching just past the previous match
        index = this.indexOf(c, index + 1);
        if (index === -1) return -1; // fewer than n occurrences
    }
    return index;
}

// get the base URL of the current page by taking everything up to the third "/" in the URL
function getBaseURL() {
    return document.URL.substring(0, document.URL.nthIndex(3,"/") + 1);
}
sova
2

Implementation:

const getOriginByUrl = url => url.split('/').slice(0, 3).join('/');

Test:

getOriginByUrl('http://www.sitename.com:3030/article/2009/09/14/this-is-an-article?lala=kuku');

Result:

'http://www.sitename.com:3030'
Alexander
2

This works:

location.href.split(location.pathname)[0];
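
One case to check before relying on this: on a page at the site root, `location.pathname` is `/`, so the split happens at the first slash of the URL. The sketch below uses plain strings as stand-ins for `location.href` and `location.pathname` so that it can run outside a browser; `splitAtPath` is a hypothetical name.

```javascript
// Hypothetical stand-in for location.href.split(location.pathname)[0]
function splitAtPath(href, pathname) {
  return href.split(pathname)[0];
}

console.log(splitAtPath('http://www.sitename.com/article/2009/', '/article/2009/'));
// -> http://www.sitename.com

console.log(splitAtPath('http://www.sitename.com/', '/'));
// -> http:
```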
Timo Tijhof
Alain Beauvois
1

You can do it using a regex:

/(http:\/\/)?(www)[^\/]+\//i

Does it fit?

Clement Herreman
  • 1
    Hmm, from my limited regex skills, it looks like that's at least close. I'll add some more information to the question to see if I can help narrow down the best regex. – Bungle Sep 14 '09 at 11:08
  • 1
    I ended up using .split('/') on the string just because it was an easier solution for me. Thanks for your help, though! – Bungle Sep 14 '09 at 11:24
  • 2
    https URLs? Host names not starting with www? Why capture the www anyway? – Tim Down Sep 14 '09 at 15:08
  • 1
    I don't know, the OP asked how to catch a url, and in his example there was http & www. – Clement Herreman Sep 14 '09 at 16:47
1

To get the origin of any URL, including paths within a website (/my/path), scheme-less URLs (//example.com/my/path), or full URLs (http://example.com/my/path), I put together a quick function.

In the snippet below, all three calls should log https://stacksnippets.net.

function getOrigin(url)
{
  if(/^\/\//.test(url))
  { // no scheme, use current scheme, extract domain
    url = window.location.protocol + url;
  }
  else if(/^\//.test(url))
  { // just path, use whole origin
    url = window.location.origin + url;
  }
  return url.match(/^([^/]+\/\/[^/]+)/)[0];
}

console.log(getOrigin('https://stacksnippets.net/my/path'));
console.log(getOrigin('//stacksnippets.net/my/path'));
console.log(getOrigin('/my/path'));
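
The same three cases can also be handled by the `URL` constructor, which accepts a base URL as its second argument and resolves path-only and scheme-less inputs against it. This is a sketch of that alternative, not the function above; `getOriginWithBase` and the `base` value are assumptions, with `base` standing in for the current page URL (`window.location.href` in a browser).

```javascript
// Hypothetical variant: resolves `url` against `base`, then reads the origin.
function getOriginWithBase(url, base) {
  return new URL(url, base).origin;
}

var base = 'https://stacksnippets.net/js';

console.log(getOriginWithBase('https://stacksnippets.net/my/path', base));
console.log(getOriginWithBase('//stacksnippets.net/my/path', base));
console.log(getOriginWithBase('/my/path', base));
// each call logs https://stacksnippets.net
```
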
Tom Kay
0

This works for me:

var getBaseUrl = function (url) {
  if (url) {
    var parts = url.split('://');
    
    if (parts.length > 1) {
      return parts[0] + '://' + parts[1].split('/')[0] + '/';
    } else {
      return parts[0].split('/')[0] + '/';
    }
  }
};
abelabbesnabi
0
var tilllastbackslashregex = new RegExp(/^.*\//);
baseUrl = tilllastbackslashregex.exec(window.location.href);

window.location.href gives the current URL from the browser's address bar.

It can be anything like https://stackoverflow.com/abc/xyz or https://www.google.com/search?q=abc. tilllastbackslashregex.exec() runs the regex and returns a match array whose first element is the text up to the last slash (despite the variable name, this is a forward slash, not a backslash), i.e. https://stackoverflow.com/abc/ or https://www.google.com/ respectively.

  • 5
    Please add brief description. – Preet Dec 05 '17 at 06:40
  • 6
    **From review queue**: May I request you to please add some context around your source-code. Code-only answers are difficult to understand. It will help the asker and future readers both if you can add more information in your post. – RBT May 30 '19 at 08:49