3

I wrote some simple PHP code to fetch the content of a URL, but it doesn't work. It returns this error:

file_get_contents(http://www.nature.com/nature/journal/v508/n7496/full/nature13001.html) [function.file-get-contents]: failed to open stream: HTTP request failed! HTTP/1.0 401 Unauthorized

Here is my code. Please help me, thanks.

$content=file_get_contents('http://www.nature.com/nature/journal/v508/n7496/full/nature13001.html');
echo $content;
saboteur
  • "HTTP/1.0 401 Unauthorized" Do you know what that means? http://en.wikipedia.org/wiki/List_of_HTTP_status_codes Moreover, the http wrapper for file_get_contents does not send the normal HTTP headers and can be recognized on the server side... – Cheery Sep 25 '14 at 00:06
  • Read the updated comment. If you are able to reach this URL with a browser but file_get_contents gives you that error, then you have to use cURL, for example, to send a full set of HTTP headers. BTW, you may get access to the article from a college/university network, but if the script runs on a server outside the campus network it can give this 401 error, too. – Cheery Sep 25 '14 at 00:09
  • You mean *here is my code*: `$content=file_get_contents("http://www.nature.com/nature/journal/v508/n7496/full/nature13001.html");` – Funk Forty Niner Sep 25 '14 at 00:13
  • I just need the content of this URL, the same as the browser shows it, for pattern matching and data mining. – saboteur Sep 25 '14 at 00:15
  • Thank you. I also tried cURL, but it did not work :( – saboteur Sep 25 '14 at 00:17
  • My test reveals `Warning: file_get_contents(http://www.nature.com/nature/journal/v508/n7496/full/nature13001.html): failed to open stream: HTTP request failed! HTTP/1.0 401 Unauthorized in...` so I think they may not be letting anyone scrape their site. You can always try an ` – Funk Forty Niner Sep 25 '14 at 00:17
  • I really don't want to show it in the browser. The echo here is just for testing. – saboteur Sep 25 '14 at 00:20

2 Answers

4

Here's an alternative to file_get_contents using cURL:

$url = 'http://www.example.com';
$curl = curl_init();
curl_setopt($curl, CURLOPT_URL, $url);
curl_setopt($curl, CURLOPT_RETURNTRANSFER, true);
curl_setopt($curl, CURLOPT_HEADER, false);
$data = curl_exec($curl);
curl_close($curl);

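You might want to add curl_setopt($curl, CURLOPT_ENCODING, ""); if you run into encoding problems. An empty string tells cURL to accept every encoding it supports and to decode the response automatically.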
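Building on the comments about sending a fuller set of HTTP headers, here is a minimal sketch that also reports the HTTP status code, so a 401 can be told apart from a network failure. The function name fetch_with_curl, the User-Agent string, the Accept header, and the timeout are illustrative assumptions, not values the site is known to require, and nothing here guarantees the server will grant access.

```php
<?php
// Hypothetical helper: fetch a URL with browser-like headers and
// return the status code, body, and any cURL error message.
function fetch_with_curl(string $url): array
{
    $curl = curl_init($url);
    curl_setopt_array($curl, [
        CURLOPT_RETURNTRANSFER => true,   // return the body instead of printing it
        CURLOPT_HEADER         => false,  // keep response headers out of the body
        CURLOPT_FOLLOWLOCATION => true,   // follow redirects
        CURLOPT_ENCODING       => '',     // accept and decode any supported encoding
        CURLOPT_USERAGENT      => 'Mozilla/5.0 (compatible; ExampleFetcher/1.0)', // assumed value
        CURLOPT_HTTPHEADER     => ['Accept: text/html,application/xhtml+xml'],    // assumed value
        CURLOPT_TIMEOUT        => 15,     // assumed value, in seconds
    ]);

    $body   = curl_exec($curl);                          // false on transport failure
    $status = curl_getinfo($curl, CURLINFO_HTTP_CODE);   // 0 if no HTTP response arrived
    $error  = curl_error($curl);                         // empty string when there was no error

    return [
        'status' => $status,
        'body'   => $body === false ? null : $body,
        'error'  => $error,
    ];
}
```

A call such as `$result = fetch_with_curl($url);` followed by a check of `$result['status']` shows whether the server answered 200, 401, or did not answer at all. If the status is still 401 with these headers, the restriction is probably based on something other than headers, such as the network the request comes from.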

Adam Sinclair
0
  1. Open the page in your browser with the developer console open, and see that the server does indeed send a 401 even though the page is delivered and viewable.

  2. In PHP, open the URL with a stream context that ignores the HTTP error (see http://php.net/manual/en/context.http.php).

  3. You'll also notice that the response is gzip-encoded; see http://php.net/manual/en/function.gzinflate.php
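The steps above can be sketched roughly as follows. This is an illustration under assumptions, not a tested recipe for this particular site: the function names, the User-Agent string, and the timeout are made up for the example. It uses the `ignore_errors` option of the http context so that the body is returned even on a 401. It also decodes with gzdecode() rather than gzinflate(), because gzinflate() expects raw DEFLATE data while a gzip-encoded body carries a gzip header.

```php
<?php
// Hypothetical helper: decode the body only if it starts with the
// gzip magic bytes (0x1f 0x8b); otherwise return it unchanged.
function maybe_gunzip(string $body): string
{
    if (strncmp($body, "\x1f\x8b", 2) === 0) {
        $decoded = @gzdecode($body);               // false if the data is corrupt
        return $decoded === false ? $body : $decoded;
    }
    return $body;
}

// Hypothetical helper: fetch a URL and return the body even when the
// server answers with an error status such as 401. Returns null if
// no response could be read at all.
function fetch_ignoring_errors(string $url): ?string
{
    $context = stream_context_create([
        'http' => [
            'method'        => 'GET',
            'header'        => "Accept-Encoding: gzip\r\n"
                             . "User-Agent: ExampleFetcher/1.0\r\n", // assumed value
            'ignore_errors' => true,   // hand back the body on 4xx/5xx responses
            'timeout'       => 15,     // assumed value, in seconds
        ],
    ]);

    $body = @file_get_contents($url, false, $context);
    return $body === false ? null : maybe_gunzip($body);
}
```

Usage would look like `$content = fetch_ignoring_errors($url);`, after which `$content` can be passed to whatever pattern matching is needed. Checking the magic bytes instead of the response headers keeps the decoding step independent of how the headers are read.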

Happy hacking!

Agi