1

I'm trying to view images in different network folders. Everything works, but as soon as a folder has "&" in its name, I can't view its content. The path in the URL is cut off at the "&", so the folder in question can't be found.

I've looked at other questions similar to this one, but none of the answers worked for me. Many of them use str_replace() on a single URL, which doesn't work in my case because I have hundreds of folders.

Running on Windows Server 2022 with IIS.

My code:

<?php 

$selectedDir = isset($_GET["dir"]) ? $_GET["dir"] : "";
//echo $selectedDir;
echo '<h1> '.$selectedDir.' </h1>';
$dirs = "\\\\server\images\\moreimages/$selectedDir";


$dir = new DirectoryIterator($dirs);
foreach ($dir as $fileinfo) {
    if ($fileinfo->isDir() && !$fileinfo->isDot()) {
        $directorypath = $selectedDir."/".$fileinfo->getFilename();
        echo "<a href='?dir=".$directorypath."'>".$fileinfo->getFilename()."</a><br><br>";
        
    }
        elseif (stripos($fileinfo, '.jpg') !== false || stripos($fileinfo, '.png') !== false) {

        
        ?>
        <?php  echo '<a target="_blank" href="'."\Intraimages\VD_images/$selectedDir/".$fileinfo.'"/>'; ?>
       <?php echo '<img src="'."\Intraimages\VD_images/$selectedDir/".$fileinfo.'"/> </a>'; ?>
         
         <?php 
    
    }

}

?>

(For clarification, in case it helps: "\\server\images\moreimages/" is the UNC path, and "\Intraimages\VD_images/" is the virtual directory path in IIS.)

I will try to provide as much info as I can in advance.

The folder name is: 35144 MAN T&B T2-L62

PHP_errors log:

PHP Fatal error: Uncaught UnexpectedValueException: DirectoryIterator::__construct(\server\images\moreimages//35000-35999/35144 MAN T): The system cannot find the file specifi (code: 2) in C:\inetpub\wwwroot\Intraimages\index.php:38

Tried var_dump $selectedDir which says: string(28) "%2F35000-35999%2F35144+MAN+T"

URL in my web browser says: .../?dir=/35000-35999/35144 MAN T&B T2-L62

HTTP Error

What I've tried

urlencode the $selectedDir

Change some settings in php.ini to %26:

arg_separator.output = "&"

arg_separator.input = ";&"

Do I need to insert "%26" in the URL instead of "&"? If so, I have no idea how.

EDIT

I tried with: str_replace("&","%26",$selectedDir); but the URL and $selectedDir don't contain "&" since it stops right before "&" so I guess there's nothing to replace...?

SOLVED. This is my new code, thanks to all the help:

if ($fileinfo->isDir() && !$fileinfo->isDot()) {
        $directorypath = http_build_query(array($selectedDir."/".$fileinfo->getFilename()));
        $url = parse_url($directorypath);
        $newurl = str_replace('0=','',$url['path']);
        echo "<a href='?dir=".$newurl."'>".$fileinfo->getFilename()."</a><br><br>";        
    }

NOTE: This may not be the best solution since I'm using str_replace(), but it proves that it works with http_build_query.
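The stray `0=` appears because http_build_query() numbers array entries that have no string key. A minimal sketch of the same link without parse_url() or str_replace(), assuming the parameter name stays `dir`; the two sample values stand in for `$selectedDir` and `$fileinfo->getFilename()` from the loop:

```php
<?php
// Sample values standing in for $selectedDir and $fileinfo->getFilename().
$selectedDir = '/35000-35999';
$name        = '35144 MAN T&B T2-L62';

// A list without string keys is numbered from 0, hence the "0=" prefix.
$unkeyed = http_build_query(array($selectedDir . '/' . $name));

// Naming the key produces the complete "dir=..." pair in one call.
$query = http_build_query(array('dir' => $selectedDir . '/' . $name));

// Escape for HTML as the last step, for both the attribute and the link text.
$link = '<a href="?' . htmlspecialchars($query, ENT_QUOTES, 'UTF-8') . '">'
      . htmlspecialchars($name, ENT_QUOTES, 'UTF-8') . '</a>';

echo $unkeyed, "\n";
echo $query, "\n";
echo $link, "\n";
```

URL encoding and HTML escaping are two separate steps here: the first protects the value inside the query string, the second protects the finished string inside the HTML page.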

For those who are interested, the new URL is now: .../?dir=%2F35000-35999%2F35144+MAN+T%26B+T2-L62

Gummi
  • Of course you should use "%26" instead of "&". – shingo Jun 29 '23 at 07:31
  • Then the question becomes: How? @shingo – Gummi Jun 29 '23 at 07:35
  • You use `str_replace()` – Alessandro Jun 29 '23 at 07:36
  • 1
    Type %26 in the address bar of your browser. – shingo Jun 29 '23 at 07:36
  • @shingo Well if I understand correct, you mean I should insert %26 everytime I view a folder that contains "&"??? That's not a good solution in that case... – Gummi Jun 29 '23 at 07:39
  • @Alessandro And how would I go about that when the URL changes everytime a new folder is viewed. Something like foreach "&" str_replace "%26"? – Gummi Jun 29 '23 at 07:40
  • `$dirs = "\\\\server\images\\moreimages/$selectedDir";` ??? 1 backslash, 2 backslashes, 1 forward slash... this is a quoted string, so to have one backslash you have escape it! – Markus Zeller Jun 29 '23 at 07:40
  • @MarkusZeller It works. Forward slash is there so it understands is a variable. As soon as I change any back slash it doesn't work but that's beside the point of the main problem. It works. – Gummi Jun 29 '23 at 07:42
  • @MarkusZeller Well I don't understand it, as soon as I change from 4 backslash to 1, the page becomes blank, as if it doesn't find the path. – Gummi Jun 29 '23 at 07:46
  • Tested locally on regular windoze pc, apache and network share with folder and sub-folder both containing ampersands - no issues when the path is correctly escaped. UNC paths do not use the forward slash as per the above - only backslash and each of those should be escaped - failing to do so causes the `"Uncaught UnexpectedValueException: DirectoryIterator::__construct....."` error mentioned – Professor Abronsius Jun 29 '23 at 08:20
  • @ProfessorAbronsius I see, I understand it better now, thanks! So how could I write it in the correct way? Because I think I need the '$selectedDir' – Gummi Jun 29 '23 at 08:34
  • When you have figured that out -- I left an answer as well, HTH -- , then understand _path or directory traversal_ and how it can be used to exploit a PHP script. https://owasp.org/www-community/attacks/Path_Traversal -- it is again an encoding issue, but on a higher level. – hakre Jun 29 '23 at 10:01
  • @hakre Thank you for your answer and time! Yes, this is high-level encoding, it will probably take some time for me to both understand and try everything out. One question: could I not simply use `http_build_query` in the same line of code as href? I'm just wondering where it would be best to try this out. – Gummi Jun 29 '23 at 10:13
  • You'd best play with it on a development server and figure out each part on its own. If you compare with the example given, you can see that the same line of code as href is with _http_build_query()_. The syntax highlighting is not specifically good at that line to spot it (also just fixed a mistake I had on that line, just FYI). – hakre Jun 29 '23 at 10:17
  • @hakre Ahh yes, totally missed it, thanks! Just a side note: I didn't think this "small" problem needed so much effort to fix. Thanks again! – Gummi Jun 29 '23 at 10:21
  • @Odin: Well, it seems easy to miss, and simple to fix. But then it's often such details that make a difference. The safe practice is to check the input against a known list of all valid inputs before actually using it. That part is entirely missing in your script. If you use the filesystem itself, you can at least make use of _realpath()_ and then compare if the input path is rooted in the base directory as a minimum. The _realpath()_ function also already returns _false_ if the path does not exist. – hakre Jun 29 '23 at 10:34
  • Look at quoted string: [backslash](https://www.php.net/manual/en/language.types.string.php#language.types.string.syntax.double) – Markus Zeller Jun 30 '23 at 07:55
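
The realpath() check suggested in the comments could look roughly like the following. This is only a sketch under assumptions: `resolveInsideBase()` is a hypothetical helper, not part of the question's script, and the demo uses a throwaway temp directory with made-up folder names instead of the UNC share.

```php
<?php
/**
 * Hypothetical helper (not from the question): resolve $requested below $base
 * and return the real path, or null if it does not exist or escapes $base.
 */
function resolveInsideBase(string $base, string $requested): ?string
{
    $realBase = realpath($base);
    $realPath = realpath($base . DIRECTORY_SEPARATOR . $requested);

    // realpath() returns false when the path does not exist.
    if ($realBase === false || $realPath === false) {
        return null;
    }

    // Accept the base itself or anything below it; the trailing separator
    // keeps a sibling such as "images-private" from matching "images".
    $prefix = rtrim($realBase, DIRECTORY_SEPARATOR) . DIRECTORY_SEPARATOR;
    if ($realPath === $realBase || strncmp($realPath, $prefix, strlen($prefix)) === 0) {
        return $realPath;
    }

    return null;
}

// Demo in a throwaway temp directory (the folder names are made up).
$root = sys_get_temp_dir() . DIRECTORY_SEPARATOR . 'rp_demo_' . uniqid();
$base = $root . DIRECTORY_SEPARATOR . 'images';
mkdir($base . DIRECTORY_SEPARATOR . 'T&B', 0777, true);

$inside  = resolveInsideBase($base, 'T&B');     // a folder below the base
$outside = resolveInsideBase($base, '..');      // resolves to the parent of the base
$missing = resolveInsideBase($base, 'missing'); // does not exist

var_dump($inside !== null, $outside, $missing);

// Clean up the demo folders.
rmdir($base . DIRECTORY_SEPARATOR . 'T&B');
rmdir($base);
rmdir($root);
```

In the question's script, the idea would be to run `$_GET["dir"]` through such a check before handing the path to DirectoryIterator and to stop with an error page when the result is null.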

2 Answers

1

It requires you to first understand what is happening here, as otherwise it is easy to miss.

It starts here:

$_GET["dir"]

The contents of that array member are not what you think: everything from the first ampersand (&) onward, including the ampersand itself, has been removed from the string.

If you then look into all the array keys of $_GET, you will notice that after the "dir" entry the next key is the continuation of the pathname you'd like to use.

This is because of how PHP parses the incoming request URL, specifically the query-info part. That is the part at the end of a URL starting with the first question mark.

?dir=why&why-not?

array (
  'dir' => 'why',
  'why-not?' => '',
)

So this is basically missing URL encoding, and as you can imagine, PHP has functions for that.

So when you build the URL to be clicked, with the pathname of the directory as the value of the dir parameter, encode that value properly.

Your script needs to speak the same language as the browser, otherwise you may get unexpected results. This is also why it is easy to miss.

Build the query of the URL (http_build_query) to have a proper href attribute value.
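
To see the difference without a browser, the parsing can be reproduced with parse_str(), which splits a query string much like PHP does when it fills $_GET. A small sketch using the same sample value as above:

```php
<?php
// Unencoded: the "&" inside the value is taken as a parameter separator.
parse_str('dir=why&why-not?', $raw);

// Encoded with http_build_query(): "&" becomes %26 and "?" becomes %3F.
$query = http_build_query(array('dir' => 'why&why-not?'));

// Parsing the encoded query string gives the original value back.
parse_str($query, $roundTrip);

var_export($raw);
echo "\n", $query, "\n";
var_export($roundTrip);
echo "\n";
```

The first array has two keys, `dir` and `why-not?`, while the second has the single key `dir` with the complete value.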

And for debugging purposes

echo '<h1> '.$selectedDir.' </h1>';

outputs $selectedDir as raw HTML, so wrap it in htmlspecialchars() to see the actual value.
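
A minimal sketch of that debugging line with escaping applied. The sample value is an assumption chosen to include characters that HTML treats specially; in the real script it would come from `$_GET["dir"]`:

```php
<?php
// Hypothetical request value; anything can arrive in $_GET, including "<" and ">".
$selectedDir = '/35000-35999/35144 MAN T&B <T2>';

// Escape &, <, > and quotes so the browser displays the value literally.
$heading = '<h1>' . htmlspecialchars($selectedDir, ENT_QUOTES, 'UTF-8') . '</h1>';

echo $heading, "\n";
```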

Both suggestions should be a good example of how data is processed on its way from your PHP script to the browser and back to your PHP script.

Example Script

<?php
/*
 * index.php - example of http_build_query() and htmlspecialchars()
 *
 * @link https://www.php.net/manual/en/function.http-build-query.php
 * @link https://www.php.net/manual/en/function.htmlspecialchars.php
 * @link https://stackoverflow.com/a/76579728/367456
 */
?>

<ul>

  <li>
      <a href="?dir=why&why-not?">A</a>

  <li>
      <a href="?<?= http_build_query(array('dir' => 'why&why-not?')) ?>">B</a>

</ul>

<hr>

<pre><?=

    htmlspecialchars(var_export($_GET, true))

?></pre>

If you have PHP on your own computer, you can save this file as index.php in a new directory, then open cmd.exe, change into that directory and run the PHP development server:

php -S 127.0.0.1:8080

It will display an HTTP URL you can use to run the example. In the terminal you will then see how the browser interacts with your PHP script:

[Thu Jun 29 11:44:50 2023] PHP 8.2.7 Development Server (http://127.0.0.1:8080) started
[Thu Jun 29 11:44:55 2023] 127.0.0.1:42426 Accepted
[Thu Jun 29 11:44:55 2023] 127.0.0.1:42426 [200]: GET /
[Thu Jun 29 11:44:55 2023] 127.0.0.1:42426 Closing
[Thu Jun 29 11:44:58 2023] 127.0.0.1:42440 Accepted
[Thu Jun 29 11:44:58 2023] 127.0.0.1:42440 [200]: GET /?dir=why&why-not?
[Thu Jun 29 11:44:58 2023] 127.0.0.1:42440 Closing
[Thu Jun 29 11:45:00 2023] 127.0.0.1:49138 Accepted
[Thu Jun 29 11:45:00 2023] 127.0.0.1:49138 [200]: GET /?dir=dir%3Dwhy%26why-not%3F
[Thu Jun 29 11:45:00 2023] 127.0.0.1:49138 Closing

You stop the PHP development web server by pressing Ctrl+C.

hakre
  • I'm so close: This is my code `$directorypath = http_build_query(array($selectedDir."/".$fileinfo->getFilename()));` And the URL says now: `/?dir=0=%2F35000-35999%2F35142+MAN+T%26B+T2-L62` I need to get rid of the `"=0"` in `"dir=0="` . I'm not so sure where my slashes went (I guess it's the `%2F`) and if they are needed when I use `http_build_query`. I hope i'm heading in the right direction. – Gummi Jun 29 '23 at 12:55
  • check the manual page, it shows an example how you call the function, check the array format. – hakre Jun 29 '23 at 13:45
  • Thank You for your help! I got it to work. I updated the question to see the result. I'm not really at the endgoal since i'm using `parse_url` and `str_replace` but it proves it works for me. Going to change the code so I only need `http_build_query`. Thanks! – Gummi Jun 30 '23 at 05:54
  • A key benefit of PHP is $_GET, so you don't need parse_url or other string handling specifically. The way PHP parses the URL is also pretty common, if not the de facto standard, although the URL specs make no strong judgement here. Since any output requires proper encoding anyway, keeping its counterpart, the data coming in with the request, simple (standard) just closes that circle and completes the round trip. – hakre Jul 01 '23 at 14:22
0

Rather than the single DirectoryIterator used above, the following uses recursive iterators to illustrate that ampersands need not be an issue in UNC paths if backslash characters are escaped properly.

Given some test folders on a local network share that have files and sub-folders where the names contain the ampersand...

<?php

    $depth=5;
    $path='\\\\buffalo-1\\share\\temp\\test folder';
    $selectedDir='files & folders';
    
    function isDot( $dir=true ){
        return basename( $dir )=='.' or basename( $dir )=='..';
    }

    # where $selectedDir emulates your GET request variable.
    $dir=sprintf( '%s\\%s', $path, $selectedDir );
    printf(
        '
        <h1>
        Root directory for searches: %s<br />
        Selected Directory: %s<br />
        Working path: %s<br /><br />
        </h1>',
        $path,
        $selectedDir,
        $dir
    );
    
    $dirItr=new RecursiveDirectoryIterator( $dir, RecursiveDirectoryIterator::KEY_AS_PATHNAME );
    $recItr=new RecursiveIteratorIterator( $dirItr, RecursiveIteratorIterator::CHILD_FIRST );
    $recItr->setMaxDepth( $depth );
    
    foreach( $recItr as $obj => $info ) {
        if( $info->isDir() && !isDot( $info->getPathname() ) ) printf('directory:%s<br />', $info->getPathName() );
        elseif( $info->isFile() ) printf('File:%s<br />', $info->getFileName() );   
    }
?>

Example output from the above:

<h1>
    Root directory for searches: \\buffalo-1\share\temp\test folder<br />
    Selected Directory: files & folders<br />
    Working path: \\buffalo-1\share\temp\test folder\files & folders<br /><br />
</h1>
directory:\\buffalo-1\share\temp\test folder\files & folders\hot & cold<br />
directory:\\buffalo-1\share\temp\test folder\files & folders\sweet & sour<br />
directory:\\buffalo-1\share\temp\test folder\files & folders\up & down<br />
File:left & right.txt<br />

The ampersand characters do not hinder the scanning or traversal of files and folders. Whether the practice of using said characters in paths is advisable or not is another matter.
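
As for the backslash escaping itself, a small sketch may help show why the mixed literal from the question happens to work and when it would not. The server and folder names are only examples; nothing here touches the network:

```php
<?php
// The question's literal: "\i" is not an escape sequence in double quotes,
// so that single backslash survives unchanged.
$mixed = "\\\\server\images\\moreimages/";

// Every backslash escaped, in double and in single quotes.
$escaped = "\\\\server\\images\\moreimages\\";
$single  = '\\\\server\\images\\moreimages\\';

// A folder starting with "t" exposes the problem: "\t" becomes a TAB character.
$broken = "\\\\server\temp";
$fixed  = "\\\\server\\temp";

echo $mixed, "\n";                          // \\server\images\moreimages/
var_dump($escaped === $single);             // bool(true)
var_dump(strpos($broken, "\t") !== false);  // bool(true)
var_dump(strpos($fixed, "\t") !== false);   // bool(false)
```

Escaping every backslash removes the dependency on which letter happens to follow it.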

Professor Abronsius
  • That's interesting, thank you for your answer! I always thought the ampersand was the problem. It probably has something to do with my href attribute then. Thank you again – Gummi Jun 29 '23 at 10:15