Here's my code:

$url = "https://de.wikipedia.org/wiki/…_und_wenn_der_letzte_Reifen_platzt";

$base = basename($url);
echo $base . "<br>";

$url2 = urlencode($base);
echo $url2 . "<br>";

$url = dirname($url) . "/" . $url2;

echo $url;
$aHeader = @get_headers($url);

echo "<pre>" . print_r($aHeader,true) . "</pre>";

It works fine on my local machine (running XAMPP with PHP v7.3.12): $base encodes as %E2%80%A6_und_wenn_der_letzte_Reifen_platzt

But on my server (running PHP 7.2.24), $base encodes as _und_wenn_der_letzte_Reifen_platzt. The leading "…" is dropped, which is wrong and results in a 404 error.

Any ideas what is causing this behaviour? Both scripts are encoded in UTF-8.

Fuxi

2 Answers

It could be a bug related to the basename function, because if you mix the "…" character with letters in the und_wenn_der_letzte_Reifen_platzt part, it works as expected. You can try upgrading PHP on your server to match your local version, if possible.

If you can't do that, you can extract the last path segment with a regular expression instead.

$re = '/.+\/(.*)/m';
$str = 'https://de.wikipedia.org/wiki/…_und_wenn_der_letzte_Reifen_platzt';

preg_match_all($re, $str, $matches, PREG_SET_ORDER, 0);

$base = $matches[0][1];
echo $base . "<br>";

$url2 = rawurlencode($base);
echo $url2 . "<br>";
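
For comparison, here is a minimal sketch of the same idea without a regular expression. It assumes the URL has no query string or fragment, and `last_path_segment` is a hypothetical helper name, not a built-in. It uses only byte-wise string functions, so the locale should not affect the result.

```php
<?php
// Hypothetical helper: returns everything after the last "/" using
// byte-wise functions, which do not consult the locale.
function last_path_segment(string $url): string
{
    $pos = strrpos($url, '/');
    // No slash at all: treat the whole string as the segment.
    return $pos === false ? $url : (string) substr($url, $pos + 1);
}

$url  = "https://de.wikipedia.org/wiki/…_und_wenn_der_letzte_Reifen_platzt";
$base = last_path_segment($url);

// Percent-encode only the last segment, then put it back on the prefix.
$encoded = rawurlencode($base);
$full    = substr($url, 0, strlen($url) - strlen($base)) . $encoded;

echo $encoded, "\n"; // %E2%80%A6_und_wenn_der_letzte_Reifen_platzt
echo $full, "\n";
```

The multibyte "…" (bytes E2 80 A6 in UTF-8) is kept because the string is only ever split at the ASCII "/" byte.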
Emre Aydin

I just ran into the same problem while processing some MP3 files of French songs I listen to. I set up a webpage where I can download an M3U playlist filtered according to what I want to listen to on my phone. I simply download the playlist and it finds the songs in an MP3 folder on my phone. The problem was that the base filenames were truncated. Frustrated, I tracked it down to the "basename" function in PHP. I found a simple solution by writing a new basename function once I realized that paths as well as URLs use "/" as a separator, and that the final "/" defines where the base name starts ...

function basename_x($url, $ext = NULL) {
    // Split on "/"; the last segment is the base name.
    $url = explode("/", $url);
    $Array_Check = is_array($url);
    $key = ( $Array_Check ? count($url) - 1 : NULL );
    if ( $ext !== NULL && $ext !== '' ) {
        // Escape the suffix and anchor it so only a trailing match is removed.
        $pattern = '/' . preg_quote($ext, '/') . '$/';
        if ( $Array_Check ) {
            $url[$key] = preg_replace( $pattern, '', $url[$key] );
        } else {
            $url = preg_replace( $pattern, '', $url );
        }
    }
    $base_name = ( $Array_Check ? $url[$key] : $url );
    return $base_name;
}

$sample = "./MP3s/À_ton_nom_-_Collectif_Cieux_Ouverts.mp3";
$this_doesnt_work = basename($sample);
$will_this_work  = basename_x($sample);

var_dump($this_doesnt_work,$will_this_work);

From the command line, this is the output ...

string(40) "À_ton_nom_-_Collectif_Cieux_Ouverts.mp3"
string(40) "À_ton_nom_-_Collectif_Cieux_Ouverts.mp3"

But, when I ran this on my Apache Server, I got this instead ...

string(38) "_ton_nom_-_Collectif_Cieux_Ouverts.mp3" 
string(40) "À_ton_nom_-_Collectif_Cieux_Ouverts.mp3"

I find it interesting that "À" accounts for two characters in the reported length, not one (it is two bytes in UTF-8, and var_dump reports the byte length). Anyway, this approach solved my problem without having to play with my locale settings in PHP. Of course, I added the feature of removing the extension as well as ensuring the URL is exploded into a true array. But it was a quick workaround with a simple solution.
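
For anyone who would rather try the locale route: the PHP manual notes that basename() is locale-aware, so setting a UTF-8 locale for LC_CTYPE before the call may be enough. This is only a sketch. `basename_utf8` is a hypothetical wrapper name, and the locale names are assumptions that depend on what is installed on the server.

```php
<?php
// Hypothetical wrapper: tries to switch LC_CTYPE to a UTF-8 locale
// before calling basename().
// Assumption: at least one of these locale names exists on the system.
function basename_utf8(string $path): string
{
    // setlocale() tries each name in turn and returns false if none applies.
    $applied = setlocale(LC_CTYPE, 'C.UTF-8', 'en_US.UTF-8', 'fr_FR.UTF-8');
    if ($applied === false) {
        // No candidate locale is installed, so basename() may still truncate.
        trigger_error('No UTF-8 locale available', E_USER_NOTICE);
    }
    return basename($path);
}

var_dump(basename_utf8("./MP3s/À_ton_nom_-_Collectif_Cieux_Ouverts.mp3"));
```

Note that setlocale() changes the setting for the whole process, which is one reason a standalone function like basename_x can be the less intrusive choice.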

Hope this helps someone with the same problem.

MrBoo99