4

I've a routing mechanism that dispatches requests by relying on the file system structure:

function Route($root) {
  $root = realpath($root) . '/';
  $segments = array_filter(explode('/',
    substr($_SERVER['PHP_SELF'], strlen($_SERVER['SCRIPT_NAME']))
  ), 'strlen');

  if ((count($segments) == 0) || (is_dir($root) === false)) {
    return true; // serve index
  }

  $controller = null;
  $segments = array_values($segments);

  while ((is_null($segment = array_shift($segments)) !== true)
    && (is_dir($root . $controller . $segment . '/'))) {
      $controller .= $segment . '/';
  }

  if ((is_file($controller = $root . $controller . $segment . '.php')) {
    $class = basename($controller . '.php');
    $method = array_shift($segments) ?: $_SERVER['REQUEST_METHOD'];

    require($controller);

    if (method_exists($class = new $class(), $method)) {
      return call_user_func_array(array($class, $method), $segments);
    }
  }

  throw new Exception('/' . implode('/', self::Segment()), 404); // serve 404
}

Basically, it tries to map as many URL segments to directories as it can, matching the following segment to the actual controller (.php file with the same name). If more segments are provided, the first defines the action to call (falling back to the HTTP method), and the remaining as the action arguments.

The problem is that (depending on the file system structure) there are some ambiguities. Consider this:

- /controllers
  - /admin
    - /company
      - /edit.php   (has get() & post() methods)
    - /company.php  (has get($id = null) method)

Now the ambiguity - when I access domain.tld/admin/company/edit/ the edit.php controller serves the request (as it should), however accessing domain.tld/admin/company/ via GET or domain.tld/admin/company/get/ directly throws a 404 error because the company segment was mapped to the corresponding directory, even though the remaining segments have no mapping in the file system. How can I solve this issue? Preferably without putting too much effort in the disk.

There are already a lot of similar questions in SO regarding this problem, I looked at some of them but I couldn't find a single answer that provides a reliable and efficient solution.

Alix Axel
  • 151,645
  • 95
  • 393
  • 500

3 Answers3

3

For critical stuff like this it's really important to write test, with a test Framework like PHPUnit.

Install it like described here (You need pear): https://github.com/sebastianbergmann/phpunit/

I also use a virtual file system so your test folder doesn't get cluttered: https://github.com/mikey179/vfsStream/wiki/Install

I simply dropped your Route function into a file called Route.php. In the same directory I now created a test.php file with the following content:

<?php

require_once 'Route.php';

class RouteTest extends PHPUnit_Framework_TestCase {
}

To check if it all works open the command line and do the following:

$ cd path/to/directory
$ phpunit test.php
PHPUnit 3.7.13 by Sebastian Bergmann.

F

Time: 0 seconds, Memory: 1.50Mb

There was 1 failure:

1) Warning
No tests found in class "RouteTest".


FAILURES!
Tests: 1, Assertions: 0, Failures: 1.

If this appears PHPUnit is correctly installed and you're ready to write tests.

In order make the Route function better testable and less coupled to server and the file system I modified it slightly:

// new parameter $request instead of relying on server variables
function Route($root, $request_uri, $request_method) {
  // vfsStream doesn't support realpath(). This will do.
  $root .= '/';
  // replaced server variable with $request_uri
  $segments = array_filter(explode('/', $request_uri), 'strlen');

  if ((count($segments) == 0) || (is_dir($root) === false)) {
    return true; // serve index
  }

  $controller = null;
  $all_segments = array_values($segments);
  $segments = $all_segments;

  while ((is_null($segment = array_shift($segments)) !== true)
    && (is_dir($root . $controller . $segment . '/'))) {
      $controller .= $segment . '/';
  }

  if (is_file($controller = $root . $controller . $segment . '.php')) {
    $class = basename($controller . '.php');
    // replaced server variable with $request_method
    $method = array_shift($segments) ?: $request_method;

    require($controller);

    if (method_exists($class = new $class(), $method)) {
      return call_user_func_array(array($class, $method), $segments);
    }
  }
  // $all_segments variable instead of a call to self::
  throw new Exception('/' . implode('/', $all_segments), 404); // serve 404
}

Lets add a test to check wheter the function returns true if the index route is requested:

public function testIndexRoute() {
    $this->assertTrue(Route('.', '', 'get'));
    $this->assertTrue(Route('.', '/', 'get'));
}

Because your test class extends PHPUnit_Framework_TestCase you can now use methods like $this->assertTrue to check wheter a certain statement evaluates to true. Lets run it again:

$ phpunit test.php
PHPUnit 3.7.13 by Sebastian Bergmann.

.

Time: 0 seconds, Memory: 1.75Mb

OK (1 test, 2 assertions)

To this test passed! Lets test if array_filter removes empty segments correctly:

public function testEmptySegments() {
    $this->assertTrue(Route('.', '//', 'get'));
    $this->assertTrue(Route('.', '//////////', 'get'));
}

Lets also test if the index route is requested if the $root directory for the routes doesn't exist.

public function testInexistentRoot() {
    $this->assertTrue(Route('./inexistent', '/', 'get'));
    $this->assertTrue(Route('./does-not-exist', '/some/random/route', 'get'));
}

To test more stuff than this we now need files containing classes with methods. So let's use our virtual file system to setup a directory structure with files before running each test.

require_once 'Route.php';
require_once 'vfsStream/vfsStream.php';

class RouteTest extends PHPUnit_Framework_TestCase {

    public function setUp() {
        // intiialize stuff before each test
    }

    public function tearDown() {
        // clean up ...
    }

PHPUnit has some special methods for this kind of thing. The setUp method gets executed before every test method in this test class. And the tearDown method after a test method as been executed.

Now I create a directory Structure using vfsStream. (If you are looking for a tutorial to do this: https://github.com/mikey179/vfsStream/wiki is a pretty good resource)

    public function setUp() {
        $edit_php = <<<EDIT_PHP
<?php
class edit {
    public function get() {
        return __METHOD__ . "()";
    }
    public function post() {
        return __METHOD__ . "()";
    }
}
EDIT_PHP;

        $company_php = <<<COMPANY_PHP
<?php
class company {
    public function get(\$id = null) {
        return __METHOD__ . "(\$id)";
    }
}
COMPANY_PHP;

        $this->root = vfsStream::setup('controllers', null, Array(
            'admin' => Array(
                'company' => Array(
                    'edit.php' => $edit_php
                ),
                'company.php' => $company_php
            )
        ));
    }

    public function tearDown() {
        unset($this->root);
    }

vfsStream::setup() now creates a virtual directory with the given file structure and the given file contents. And as you can see I let my controllers return the name of the method and the parameters as string.

Now we can add a few more tests to our test suite:

public function testSimpleDirectMethodAccess() {
    $this->assertEquals("edit::get()", Route(vfsStream::url('controllers'), '/controllers/admin/company/edit/get', 'get'));
}

But this time the test fails:

$ phpunit test.php
PHPUnit 3.7.13 by Sebastian Bergmann.

...
Fatal error: Class 'edit.php.php' not found in C:\xampp\htdocs\r\Route.php on line 27

So there is something wrong with the $class variable. If we now inspect the following line in the Route function with a debugger (or some echos).

$class = basename($controller . '.php');

We can see the that the $controller variable holds the correct filename, but why is there a .php appended? This seams to be a typing mistake. I think it should be:

$class = basename($controller, '.php');

Because this removes the .php extension. And we get the correct classname edit.

Now let's test if an exception gets thrown if we request an random path which doesn't exist in our directory structure.

/**
 * @expectedException Exception
 * @expectedMessage /random-route-to-the/void
 */
public function testForInexistentRoute() {
    Route(vfsStream::url('controllers'), '/random-route-to-the/void', 'get');
}

PHPUnit automaticly reads this comments and checks if an Exception of type Exception is thrown when executing this method and if the message of the Exception was /random-route-to-the/void

This seams to work. Lets check if the $request_method parameter works properly.

public function testMethodAccessByHTTPMethod() {
    $this->assertEquals("edit::get()", Route(vfsStream::url('controllers'), '/admin/company/edit', 'get'));
    $this->assertEquals("edit::post()", Route(vfsStream::url('controllers'), '/admin/company/edit', 'post'));
}

If we execute this test we run into an other issue:

$ phpunit test.php
PHPUnit 3.7.13 by Sebastian Bergmann.

....
Fatal error: Cannot redeclare class edit in vfs://controllers/admin/company/edit.php on line 2

Looks like we use an include/require multiple times for the same file.

require($controller);

Lets change that to

require_once($controller);

Now let's face your issue and write a test to check that the directory company and the file company.php do not interfere with each other.

$this->assertEquals("company::get()", Route(vfsStream::url('controllers'), '/admin/company', 'get'));
$this->assertEquals("company::get()", Route(vfsStream::url('controllers'), '/admin/company/get', 'get'));

And here we get the 404 Exception, as you stated in your question:

$ phpunit test.php
PHPUnit 3.7.13 by Sebastian Bergmann.

.....E.

Time: 0 seconds, Memory: 2.00Mb

There was 1 error:

1) RouteTest::testControllerWithSubControllers
Exception: /admin/company

C:\xampp\htdocs\r\Route.php:32
C:\xampp\htdocs\r\test.php:69

FAILURES!
Tests: 7, Assertions: 10, Errors: 1.

The problem right here is, we don't know excatly when to enter the subdirectory and when to use the controller in the .php file. So we need to specify what exactly you want to happen. And I assume the following, because it makes sense.

  • Only enter a subdirectory if the controller doesn't contain the method requested.
  • If neither the controller nor the subdirectory contains the method requested throw a 404

So instead of searching directories like here:

while ((is_null($segment = array_shift($segments)) !== true)
  && (is_dir($root . $controller . $segment . '/'))) {
    $controller .= $segment . '/';
}

We need to search for files. And if we find a file which doesn't contain the method requested, then we search for a directory.

function Route($root, $request_uri, $request_method) {
  $segments = array_filter(explode('/', $request_uri), 'strlen');

  if ((count($segments) == 0) || (is_dir($root) === false)) {
    return true; // serve index
  }

  $all_segments = array_values($segments);
  $segments = $all_segments;

  $directory = $root . '/';
  do {
    $segment = array_shift($segments);
    if(is_file($controller = $directory . $segment . ".php")) {
      $class = basename($controller, '.php');
      $method = isset($segments[0]) ? $segments[0] : $request_method;

      require_once($controller);
      if (method_exists($class = new $class(), $method)) {
        return call_user_func_array(array($class, $method), array_slice($segments, 1));
      }
    }
    $directory .= $segment . '/';
  } while(is_dir($directory));

  throw new Exception('/' . implode('/', $all_segments), 404); // serve 404
}

This method works now as expected.

We could now add a lot more test cases, but I don't want to stretch this more. As you can see it's very useful to run a set of automated tests to ensure that some things in your function work. It's also very helpful for debugging, because you get to know where exactly the error occured. I just wanted to give you a start on how to do TDD and how to use PHPUnit, so you can debug your code yourself.

"Give a man a fish and you feed him for a day. Teach a man to fish and you feed him for a lifetime."

Of course you should write the tests before you write the code.

Here a few more links which could be interesting:

MarcDefiant
  • 6,649
  • 6
  • 29
  • 49
  • You deserve the bounty just for the effort you put into the answer. Anyway, you seem to focus excessively on the TDD and I can't reproduce the `Fatal error: Class 'edit.php.php'` error with my live code, but either way, the problem I'm facing can be summed up after your *"The problem right here is, we don't know excatly when to enter the subdirectory and when to use the controller in the .php file."* statement (let me continue this in a follow-up comment because I'm reaching the maximum allowed length). – Alix Axel Feb 19 '13 at 19:20
  • If files have precedence over directories. Consider I have the controller `admin` and the sub-admin controller `posts`. When I access `domain.com/admin/posts/` instead of being presented with the `posts::get()` method that would list the newest posts for instance, I will be served the `admin::posts()` methods, which may or may not exist. Sure, the problem exists both ways but if files have precedence then the hierarchical routing will only work when no collisions exist. I was looking for a way to make the routing smarter so it deduce which level it makes sense to serve. – Alix Axel Feb 19 '13 at 19:26
  • One simple solution would be to pre-fetch all the paths of all controllers and require each one so that their methods could be inspected with Reflection. This of course, would make it as trivial as matching a regular expression but the performance wouldn't be the best, specially regarding disk hits. Another (although more complex) solution would be to go back and forth trying to match directories then files then existent methods, *iff* none is found it would go up a level and try another option until all segments / filesystem paths are exhausted. – Alix Axel Feb 19 '13 at 19:30
  • I guess a mix between those two solutions would be the most effective, i.e., preload all the controller files (possibly caching their attributes so the disk doesn't get hit too much) and method / argument information via reflection that directly match the *n*, *n-1*, *n-2*, ..., *1*, *0* levels. I'm still not sure about this, hence the question / bounty. – Alix Axel Feb 19 '13 at 19:34
2

Although your method of magic HVMC is convenient for developers.. it could become a bit of a performance killer (all the stats/lstats). I once used a similar method of mapping FS to routes but later gave up on the magic and replaced it with some good old fashion hard coded config:

$controller_map = array(
  '/some/route/' => '/some/route.php',
  '/anouther/route/' => 'another/route.php',
  # etc, etc, ...
);

Perhaps it's not as elegant as what you have in place and will require some config changes everytime you add/remove a controller (srsly, this shouldn't be a common task..) but it is faster, removes all ambiguities, and gets rid of all of the useless disk/page-cache lookups.

smassey
  • 5,875
  • 24
  • 37
  • Yeah, that's a good suggestion and I do it already but surely you can see the benefit of having such a FS router, specially during development. The idea would be to optimize disk lookups later on, with a memory cache like APC. If you look at https://gist.github.com/alixaxel/4692575 you can see that you can provide much more convenient HTTP status codes with FS that with a "blind" regex-based route. – Alix Axel Feb 19 '13 at 19:41
  • Although this doesn't provide a solution to the specific problem, it's the only answer that delivers a non-ambiguous routing strategy, bounty awarded. – Alix Axel Feb 20 '13 at 22:28
1

Sorry, I haven't had the time to test my solution, but here is my suggestion:

while ((is_null($segment = array_shift($segments)) !== true)
    && (is_dir($root . $controller . $segment . '/'))
    && ( (is_file($controller = $root . $controller . $segment . '.php') 
        && (!in_array(array_shift(array_values($segments)), ['get','post']) || count($segments)!=0 ) ) ) {
      $controller .= $segment . '/';
  }

A simple explanation to the above code would be that if a route that is both a file and a directory is encountered, check if it is succeeded by get / post or if it is the last segment in the $segments array. If it is, treat it as a file, otherwise, keep adding segments to the $controller variable.

Although the code sample I gave is simply what I had in mind, it has not been tested. However, if you use this workflow in the comparison, you should be able to pull it off. I suggest you follow smassey's answer and keep to declaring routes for each controller.

Note: I am using *array_shift* on *array_values* so I only pull the next segment's value without tampering with the $segments array. [edit]

Octav O
  • 503
  • 2
  • 6
  • I can see what you are going for but the `is_dir` / `is_file` combo wouldn't as it is because: 1) you're only calling `array_shift` once - so `/admin/posts/` would wrongly try to point to `/admin/admin.php[/posts/posts.php]` and, 2) a controller might be directly in the root (i.e. not in a sub-directory). – Alix Axel Feb 20 '13 at 13:21