A URL should be percent-encoded before it is passed to file_get_contents so that spaces are handled correctly. I accidentally discovered a strange inconsistency in behaviour when the space is left unencoded, and I'm curious about it.

Here we see a 400 bad request when the space is not encoded:

$ php -r "echo file_get_contents('https://api.postcodes.io/postcodes/Willoughby Hedge/validate');"

PHP Warning:  file_get_contents(https://api.postcodes.io/postcodes/Willoughby Hedge/validate): Failed to open stream: HTTP request failed! HTTP/1.1 400 Bad Request
 in Command line code on line 1
PHP Stack trace:
PHP   1. {main}() Command line code:0
PHP   2. file_get_contents($filename = 'https://api.postcodes.io/postcodes/Willoughby Hedge/validate') Command line code:1

If we encode it, it works as you would expect:

$ php -r "echo file_get_contents('https://api.postcodes.io/postcodes/Willoughby%20Hedge/validate');"

{"status":200,"result":false}

So why does this also work, without the encoding of the space?

$ php -r "echo file_get_contents('https://api.postcodes.io/postcodes/Willoughby hedge/validate');"

{"status":200,"result":false}

Note that the "H" in "Hedge" is now a lowercase "h", but the space is still not encoded. Curiously, if you replace the "H" with another letter such as "Z", the request also works.

Is there something specifically going on with a space and then a "H" in the function?

This was tested with PHP 8.1.
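
For reference, a minimal sketch of the encoded form used in the working example above. The helper name `buildValidateUrl` is hypothetical and only for illustration; `rawurlencode()` is the standard PHP function that encodes a space as `%20` (whereas `urlencode()` would produce `+`, which is meant for query strings rather than path segments).

```php
<?php
// Hypothetical helper (not part of the original question): percent-encode a
// single path segment and place it in the URL used in the examples above.
function buildValidateUrl(string $postcode): string
{
    // rawurlencode() follows RFC 3986, so a space becomes "%20".
    return 'https://api.postcodes.io/postcodes/' . rawurlencode($postcode) . '/validate';
}

echo buildValidateUrl('Willoughby Hedge'), PHP_EOL;
// https://api.postcodes.io/postcodes/Willoughby%20Hedge/validate
```

The resulting string can then be passed to file_get_contents as in the second command above.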

jamieburchell
  • file_get_contents is notoriously hard to troubleshoot and use for URLs. I'd recommend switching to [cURL](https://www.php.net/manual/en/book.curl.php) instead. – aynber Oct 04 '22 at 18:21
  • It's possible (perhaps likely) that you're seeing a bug on the server, and that your code is fine. – Tangentially Perpendicular Oct 04 '22 at 18:23
  • You have to encode it before sending. At least that's what the docs say https://www.php.net/manual/en/function.file-get-contents.php – nice_dev Oct 04 '22 at 18:25
  • @TangentiallyPerpendicular I wondered that, but a cURL request outside of PHP doesn't exhibit the same behaviour unless it is silently encoding spaces. – jamieburchell Oct 04 '22 at 19:01
  • @nice_dev Absolutely, but why does one example without the encoding work? – jamieburchell Oct 04 '22 at 19:02
  • @aynber Yes, but that is not the question :) – jamieburchell Oct 04 '22 at 19:03
  • @jamieburchell Weird behaviour from it. I wouldn't ponder on this much and simply move on. – nice_dev Oct 04 '22 at 19:08
  • I have to agree with @nice_dev. The behavior is indeed curious; however, as the documentation explicitly mentions that spaces need to be manually encoded, this falls into the "undefined behavior" category. – Chris Haas Oct 04 '22 at 19:29
  • I'll move on but leave the question open in case someone knows a technical reason for it or confirms a bug in PHP. – jamieburchell Oct 04 '22 at 21:28
  • @jamieburchell, I tried this on my servers out of curiosity and my Nginx logs show `GET /Willoughby Hedge HTTP/1.1`. My one and only guess is that the parser sees a single unencoded space followed by a capital `H` and is expecting that to continue as `HTTP`. Every other sequence appears to work, just not a capital `H`. My server didn't freak out no matter what, but maybe it is a Cloudflare thing, or their origin server. – Chris Haas Oct 05 '22 at 13:14
