
I'm passing a URL that should generate a 404 error to PHP's get_headers(). If I use the URL as a link, I get a 404 error in my browser, and if I use the URL (which points to an image file) as an img src, the "Network" tab of my browser shows a 404 status. But when I print_r the results of @get_headers( $uri ), the response is HTTP/1.0 200 OK! What's up with that?

Is this something on the web server itself? If so, what (if anything) should I communicate to the server support to get them to address the issue?

Update

The URL I am testing against is a gravatar URL: http://0.gravatar.com/avatar/4d445fd58bf07d406345bac336c3b836?s=96&d=404&r=G

Tom Auger
  • It's possible that your browser is following a redirect while get_headers() is not. – JRL Apr 15 '13 at 21:28
  • @JRL in this case, a redirect will be a 3XX (302 for example), not a 200 – MatRt Apr 15 '13 at 21:30
  • Can you give us the URI? – MatRt Apr 15 '13 at 21:33
  • @MatRT it's a gravatar URL, deliberately pointing to a user that does not exist, and passing the 404 default parameter. When you follow this link you will see a 404 error in your browser. – Tom Auger Apr 16 '13 at 13:38

2 Answers


I ran a test and got a 404, not a 200.

  $url = 'http://0.gravatar.com/avatar/4d445fd58bf07d406345bac336c3b836?s=96&d=404&r=G';
  var_dump(get_headers($url, 0));
  /* array (size=11)
  0 => string 'HTTP/1.0 404 Not Found' (length=22)
  1 => string 'Cache-Control: max-age=300' (length=26)
  2 => string 'Content-Type: text/html; charset=utf-8' (length=38)
  3 => string 'Date: Tue, 16 Apr 2013 13:46:12 GMT' (length=35)
  4 => string 'Expires: Tue, 16 Apr 2013 13:51:12 GMT' (length=38)
  5 => string 'Last-Modified: Fri, 28 Sep 2012 05:18:58 GMT' (length=44)
  6 => string 'Server: nginx' (length=13)
  7 => string 'Via: 1.1 varnish' (length=16)
  8 => string 'X-Varnish: 3241507148 3241041069' (length=32)
  9 => string 'Content-Length: 13' (length=18)
  10 => string 'Connection: close' (length=17)
  */

A quick search suggests that the behaviour of get_headers() depends primarily on the PHP version.

However, it can be adjusted through the HTTP context options (see: HTTP context options).
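As a rough illustration of what adjusting the context options can look like, here is a minimal sketch. The option values are only an assumption about what might be useful here (a HEAD request that does not follow redirects), and final_status_code() is a hypothetical helper, not part of PHP. Passing a context as the third argument of get_headers() requires PHP 7.1 or later.

```php
<?php
// Hypothetical helper: return the last HTTP status code found in the array
// that get_headers() returns, or null if there is no status line.
function final_status_code(array $headers): ?int
{
    $code = null;
    foreach ($headers as $key => $line) {
        // When redirects are followed, get_headers() lists one status line
        // per hop; overwriting on each match means the last one wins.
        if (is_int($key) && is_string($line)
            && preg_match('#^HTTP/\S+\s+(\d{3})#', $line, $m)) {
            $code = (int) $m[1];
        }
    }
    return $code;
}

// Assumed options: send HEAD instead of GET, do not follow redirects,
// and give up after 5 seconds.
$context = stream_context_create([
    'http' => [
        'method'          => 'HEAD',
        'follow_location' => 0,
        'timeout'         => 5,
    ],
]);

// Usage (needs network access, so it is left commented out):
// $headers = get_headers($url, 0, $context);
// if ($headers !== false) {
//     var_dump(final_status_code($headers));
// }
```

Reading the status from the last status line rather than from index 0 keeps the check meaningful even if redirects are followed.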

Edit

Here's a very similar problem: PHP get_headers() reports different headers than CURL

MythThrazz
  • thanks for checking. I'm running from a local XAMPP installation. I wonder whether this is related? – Tom Auger Apr 16 '13 at 14:07
  • Definitely, but indirectly. Each XAMPP, WAMP etc. comes with some PHP version and there were a couple of those lately. Check what version of the PHP you have on your XAMPP, try to check the link from my answer. There are at least two settings that may be interesting "follow_location" and "max_redirects". – MythThrazz Apr 16 '13 at 14:20
  • thanks for the follow-up! I attempted to set the `stream_context_set_default()` as recommended in the linked SE question, but the results were identical. I'm having difficulty grokking the HTTP context options. – Tom Auger Apr 16 '13 at 15:24
  • Can you post what you have tried (I mean the relevant part of the actual code)? And possibly your PHP version? – MythThrazz Apr 16 '13 at 18:43
  • Thanks for all your advice. The issue ended up being tangential, but your request to post the actual code led me to try out a few things as I was doing so, and that led to the solution. Thanks again! – Tom Auger Apr 16 '13 at 20:03
  • I'm glad that I could help, but I think that, for the sake of other people who may have a similar issue, you could post an answer to your own question. :) PS: I'm also curious what it was. – MythThrazz Apr 16 '13 at 20:11
  • Okay, I've taken your advice and posted the results in an answer. – Tom Auger Apr 18 '13 at 13:29

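The URL that was being sent to get_headers() had HTML-escaped entities, specifically &amp; rather than a literal & ampersand. This makes a difference to get_headers(), which requests the URL exactly as given, so the d=404 parameter presumably never reached Gravatar under that name and a default image came back with a 200. Its original use (<img src='{$url}'... />) didn't mind the HTML-entity-encoded version, because the browser decodes the entities before making the request. The solution was simply to use & when building the URL.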
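To see why the escaped form matters, here is a small sketch that parses the query string the way a server splitting on & would. The query string is taken from the Gravatar URL in the question; the variable names are only illustrative, and the result assumes PHP's default arg_separator.input of "&".

```php
<?php
// The query string as it looks after HTML escaping.
$escaped = 's=96&amp;d=404&amp;r=G';

// Splitting on "&" leaves the keys 's', 'amp;d' and 'amp;r',
// so no parameter named 'd' exists.
parse_str($escaped, $params);
$has_d_when_escaped = array_key_exists('d', $params);

// Decoding the entities first restores the intended parameters.
parse_str(html_entity_decode($escaped), $decoded_params);
$has_d_when_decoded = array_key_exists('d', $decoded_params);

var_dump($has_d_when_escaped, $has_d_when_decoded);
```

If a URL has already been escaped for HTML output, running it through html_entity_decode() before handing it to get_headers() is one way to recover the raw form.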

Specific Application for Checking the Validity of a Gravatar

Since I encountered this while checking the validity of a Gravatar, and the code that I was using came from some more-or-less "official" documentation, I'm posting this in case anyone else runs into the same issue and wants a cut-and-paste solution.

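// $host, $email_hash, $size and $safe_alt are assumed to be defined earlier.
// Build the URL with literal & characters, not &amp;.
$url = "$host/avatar/";
$url .= $email_hash;
$url .= '?s='.$size;
$url .= '&d=404';

// Reuse a cached response code for this hash if there is one.
$gravatar_response_code = wp_cache_get( $email_hash );
if ( false === $gravatar_response_code ){
    // HEAD request: only the status is needed, not the image itself.
    $response = wp_remote_head( $url );
    if ( is_wp_error( $response ) ){
        $gravatar_response_code = "error";
    } else {
        $gravatar_response_code = $response['response']['code'];
    }

    // Cache the result for 300 seconds.
    wp_cache_set( $email_hash, $gravatar_response_code, '', 300 );
}

// Only build the avatar markup when Gravatar confirmed the image exists.
if ( '200' == $gravatar_response_code )
    $avatar = "<img alt='{$safe_alt}' src='{$url}' class='avatar avatar-{$size} photo' height='{$size}' width='{$size}' />";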

Note that wp_cache_get(), wp_remote_head() and wp_cache_set() are WordPress-specific functions. The wp_remote_head() method of the HTTP API will call curl, get_headers() or even fopen() depending on what's available, so it's 100% relevant and exhibits the same behaviour as what's documented here.
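Outside WordPress, one way to sidestep the problem is to build the query string with http_build_query() and an explicit & separator, keep that raw URL for HTTP requests, and escape a copy only when writing HTML. The sketch below assumes the host and parameters from the URL in the question; gravatar_url() is a hypothetical helper name.

```php
<?php
// Hypothetical helper: build a Gravatar URL for an already-computed email hash.
function gravatar_url(string $email_hash, int $size): string
{
    $query = http_build_query(
        ['s' => $size, 'd' => '404', 'r' => 'G'],
        '',
        '&' // explicit separator, independent of the arg_separator.output setting
    );
    return 'http://0.gravatar.com/avatar/' . $email_hash . '?' . $query;
}

// Raw URL: this is the form to pass to get_headers() or any HTTP client.
$url = gravatar_url('4d445fd58bf07d406345bac336c3b836', 96);

// Escaped copy: only for embedding in HTML, e.g. an img src attribute.
$src = htmlspecialchars($url, ENT_QUOTES);
```

Keeping the two forms in separate variables makes it harder to pass the escaped one to an HTTP request by accident.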

Tom Auger