How can I extract the following parts of a URL using PHP functions?

  • The Domain
  • The path without the file
  • The file
  • The file Extension
  • The file without Extension
  • The scheme
  • The port
  • The query
  • The fragment
  • (add any other that you think would be useful)

Ex. 1: https://stackoverflow.com/users/test/login.php?q=san&u=post#top

  • The Domain (stackoverflow.com)
  • The path without the file (/users/test/)
  • The file (login.php)
  • The file Extension (.php)
  • The file without Extension (login)
  • The scheme (https:)
  • The port (return empty string)
  • The query (q=san&u=post)
  • The fragment (top)

Ex. 2: stackoverflow.com/users/test/login.php?q=san&u=post#top

  • The Domain (stackoverflow.com)
  • The path without the file (/users/test/)
  • The file (login.php)
  • The file Extension (.php)
  • The file without Extension (login)
  • The scheme (return empty string)
  • The port (return empty string)
  • The query (q=san&u=post)
  • The fragment (top)

Ex. 3: /users/test/login.php?q=san&u=post#top

  • The path without the file (/users/test/)
  • The file (login.php)
  • The file Extension (.php)
  • The file without Extension (login)
  • The query (q=san&u=post)
  • The fragment (top)
  • For remaining (return empty string)

Ex. 4: /users/test/login?q=san&u=post#top

  • The path without the file (/users/test/)
  • The file (login)
  • The file Extension (return empty string)
  • The file without Extension (login)
  • The query (q=san&u=post)
  • The fragment (top)
  • For remaining (return empty string)

Ex. 5: login?q=san&u=post#top

  • The file (login)
  • The file Extension (return empty string)
  • The file without Extension (login)
  • The query (q=san&u=post)
  • The fragment (top)
  • For remaining (return empty string)

Ex. 6: ?q=san&u=post

  • The query (q=san&u=post)
  • For remaining (return empty string)

I checked the parse_url function, but it doesn't return what I need. Since I'm a beginner in PHP, this is difficult for me. If you have any ideas, please answer.

Thanks in advance.

Alan Moore
San Ka Ran
  • Have you tried anything? – Sougata Bose Nov 16 '15 at 10:24
  • `login?q=san&u=post#top` -> The file (`login.php`) - Where is `.php` in the input? – Wiktor Stribiżew Nov 16 '15 at 10:27
  • What did `parse_url` not return? – hjpotter92 Nov 16 '15 at 10:29
  • Yes, Sougata. I have tried the parse_url function, but it only works for Ex. 1 (mentioned above). For the remaining cases it doesn't give a proper answer. – San Ka Ran Nov 16 '15 at 10:29
  • @SanKaRan - So what do you need that parse_url() doesn't give you? Does using parse_url() with pathinfo() give you everything? – Mark Baker Nov 16 '15 at 10:29
  • @stribizhev. Oops. I have edited. thanks :) – San Ka Ran Nov 16 '15 at 10:31
  • File and extension don't *really* have anything to do with the URL, especially with `mod_rewrite` and everything else is covered by `parse_url()` isn't it? – CD001 Nov 16 '15 at 10:31
  • For my second case (Ex. 2) it returns the domain as empty and the path as stackoverflow.com/users/test/. – San Ka Ran Nov 16 '15 at 10:35
  • The problem with your second example is that it's quite possible to have a directory called *stackoverflow.com* rather than that being a domain, e.g. `http://example.com/stackoverflow.com/` would be a perfectly valid URL... there's no way you can guarantee you're parsing that correctly without the scheme... or at least a leading slash or two. `parse_url()` is, I presume, treating `stackoverflow.com` as part of the path in that example. – CD001 Nov 16 '15 at 10:45

2 Answers

PHP provides a parse_url function.

This function parses a URL and returns an associative array containing any of the various components of the URL that are present.

This function is not meant to validate the given URL, it only breaks it up into the above listed parts. Partial URLs are also accepted, parse_url() tries its best to parse them correctly.


You can see the test cases executed here.

$urls = array(
  "https://stackoverflow.com/users/test/login.php?q=san&u=post#top",
  "/users/test/login.php?q=san&u=post#top",
  "?q=san&u=post#top",
  "login.php?q=san&u=post#top",
  "/users/test/login?q=san&u=post#top",
  "login?q=san&u=post#top"
);
foreach( $urls as $x ) {
  echo $x . "\n";
  var_dump( parse_url($x) );
}
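
As suggested in the comments on the question, parse_url() can be combined with pathinfo() to get the remaining parts. The following is only a sketch, assuming PHP 7 or later; the function name splitUrl() and the array keys are made up for illustration and are not part of PHP. Note that parse_url() returns the scheme without the trailing colon (https, not https:), and that Ex. 2 still ends up with the domain inside the path, for the reason given in the comments.

```php
<?php
/**
 * Hypothetical helper (not a PHP built-in): split a URL into the parts
 * listed in the question, using '' for every part that is missing.
 */
function splitUrl(string $url): array
{
    $parts = parse_url($url);
    if ($parts === false) {
        $parts = [];   // parse_url() gives up on seriously malformed input
    }

    $path = $parts['path'] ?? '';

    // Everything up to and including the last "/" is the directory part,
    // whatever follows it is treated as the file.
    $slash = strrpos($path, '/');
    $dir   = $slash === false ? '' : substr($path, 0, $slash + 1);
    $file  = $slash === false ? $path : substr($path, $slash + 1);

    // pathinfo() splits "login.php" into filename "login" and extension "php".
    $info = pathinfo($file);

    return [
        'scheme'    => $parts['scheme'] ?? '',
        'host'      => $parts['host'] ?? '',
        'port'      => isset($parts['port']) ? (string) $parts['port'] : '',
        'directory' => $dir,
        'file'      => $file,
        'extension' => isset($info['extension']) ? '.' . $info['extension'] : '',
        'filename'  => $info['filename'] ?? '',
        'query'     => $parts['query'] ?? '',
        'fragment'  => $parts['fragment'] ?? '',
    ];
}

print_r(splitUrl('https://stackoverflow.com/users/test/login.php?q=san&u=post#top'));
```

Whether the last path segment really is a file is a guess: with mod_rewrite or similar, /users/test/login may not correspond to any file on disk.
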
hjpotter92
  • *I checked parse_url function, but doesn't return what I need.* – Wiktor Stribiżew Nov 16 '15 at 10:27
  • @hjpotter92 Thanks, but for my second case it's not working fine. It returns ["path"]=> string(38) "stackoverflow.com/users/test/login.php". – San Ka Ran Nov 16 '15 at 10:41
  • @SanKaRan How'll you differentiate b/w `website.info/page.php?query` and `page.php/website.info?query`? Both of them are valid as `path` but only one qualifies as actual webhost. – hjpotter92 Nov 16 '15 at 10:46
  • @hjpotter92 Yes, I agree with your point. But is there any other way to differentiate between these two? – San Ka Ran Nov 16 '15 at 10:50
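
One possible way around the ambiguity discussed in these comments: parse_url() cannot decide by itself, but if the calling code already knows that a scheme-less string starts with a host name, it can prepend `//` so that parse_url() treats the first segment as the host. This is only a sketch under that assumption; parseUrlAssumingHost() is a hypothetical name, not a PHP function, and it will wrongly report `login.php` as a host if it is given input like Ex. 5.

```php
<?php
/**
 * Hypothetical helper: parse a string that the CALLER knows begins with a
 * host name when it has no scheme. parse_url() cannot infer this itself.
 */
function parseUrlAssumingHost(string $url)
{
    // Already has a scheme such as "https://"?
    $hasScheme = preg_match('~^[a-z][a-z0-9+.-]*://~i', $url) === 1;

    // Starts with "/", "?" or "#": clearly a path, query or fragment.
    $isRelative = $url === '' || strpos('/?#', $url[0]) !== false;

    if (!$hasScheme && !$isRelative) {
        // "//host/path" is a scheme-relative URL, so the host is recognised.
        $url = '//' . $url;
    }

    return parse_url($url);
}

print_r(parseUrlAssumingHost('stackoverflow.com/users/test/login.php?q=san&u=post#top'));
```
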

I'm using this to locate the root and the webroot:

<?php

/**
 * @brief Get paths relative to the file it is called from.
 */
class PathHelper {
    /**
     * @brief Tries to determine the filesystem root of the application.
     *        (Needs to be called from a file in the application root.)
     * @param string $file Path of the calling file (for example __FILE__).
     * @return string
     */
    public static function locateRoot($file) {
        // Windows fix: turn backslashes into forward slashes.
        return str_replace('\\', '/', dirname($file));
    }

    /**
     * @brief Tries to determine the webroot, i.e. the application directory
     *        relative to the document root.
     *        (Needs to be called from a file in the application root.)
     * @param string $file Path of the calling file (for example __FILE__).
     * @return string
     */
    public static function locateWebroot($file) {
        // Windows fix: normalize both paths before comparing them.
        $docroot = str_replace('\\', '/', $_SERVER['DOCUMENT_ROOT']);
        $dir = str_replace('\\', '/', dirname($file));
        if ($dir == $docroot) {
            return "";
        }
        // Cut the document root off the front of the directory.
        return substr_replace($dir, '', 0, strlen($docroot));
    }
}

I set this as a constant so I can use it throughout my application.

For example, for a menu you can do something like this:

    // the requested URL with the webroot removed
    // (WEBROOT is the constant mentioned above)
    $requestedUrl = str_replace(WEBROOT, "", $_SERVER['REQUEST_URI']);

    // strip any slashes that still remain at the beginning
    $requestedUrl = ltrim($requestedUrl, "/");

Then I get this: index.php?controller=User&action=overview

This matches the URL of one of my menu items. You could use explode on this last URL to find all the other values you want.
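
A minimal sketch of that last step, using the example URL from above. The variable names are only illustrative, and parse_str() is swapped in for a second explode() on the query part because it already returns the parameters as an array; `??` assumes PHP 7 or later.

```php
<?php
$requestedUrl = 'index.php?controller=User&action=overview';

// Split once at the first "?" into the script name and the query string.
$pieces = explode('?', $requestedUrl, 2);
$script = $pieces[0];
$query  = $pieces[1] ?? '';

// parse_str() decodes "a=1&b=2" into an associative array.
$params = [];
parse_str($query, $params);

echo $script, "\n";                 // index.php
echo $params['controller'], "\n";   // User
echo $params['action'], "\n";       // overview
```
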

Edit: It's probably better to use parse_url(). I am not familiar with all the functions in PHP, but if nothing else works, this is at least a fallback.

jermey