2

Trying to retrieve a page from the XML language. However, retrieval is unreliable because the response is sent with chunked transfer encoding. How do I download this page correctly so that I can process it further?

I cannot use PHP stream filters because my PHP version is only 5.2.

Lightness Races in Orbit
trymerson2010
  • How are you trying to retrieve the file? – zneak Jul 04 '11 at 13:43
  • "retrieve a page from the xml language"? – Lightness Races in Orbit Jul 04 '11 at 13:44
  • @zneak I will post it as a new comment. function readfile_chunked ($filename,$type='array'){ $chunk_array=array(); $chunksize = 1*(1024*1024); $buffer = ''; $handle = fopen($filename, 'rb'); if ($handle === false) { return false; } while (!feof($handle)) { switch($type) { case'array': $lines[] = fgets($handle, $chunksize); break; case'string': $lines = fread($handle, $chunksize); break; } } fclose($handle); return $lines; } foreach ( readfile_chunked('http://api.justin.tv/stream/list.xml?language=en&stream_type=live') as $key=>$value) {$res .= $value; } file_put_contents('file.xml', $res); – trymerson2010 Jul 04 '11 at 14:11
  • @Tomalak Geret'kal yes it is xml file with television. – trymerson2010 Jul 04 '11 at 14:26
  • @trymerson2010: "xml file with television"?!?!?! – Lightness Races in Orbit Jul 04 '11 at 14:36

3 Answers

1

I'd recommend using cURL. It supports HTTP/1.1, which is necessary to reliably receive chunked data. PHP core functions such as file_get_contents do not support chunked data before PHP 5.3.0.
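
For context, here is a rough sketch of what a client has to do with a chunked body, which cURL handles for you. The helper name decode_chunked() is hypothetical (it is not a PHP function), and the sketch assumes the raw body is already in a string and ignores trailer headers. It sticks to PHP 5.2 syntax.

```php
<?php
// Hypothetical helper (not part of PHP): decode a chunked HTTP body by hand.
// Returns the decoded string, or false if the body is malformed or truncated.
function decode_chunked($raw)
{
    $decoded = '';
    $offset  = 0;
    $length  = strlen($raw);

    while ($offset < $length) {
        // Each chunk starts with its size in hexadecimal, terminated by CRLF.
        $lineEnd = strpos($raw, "\r\n", $offset);
        if ($lineEnd === false) {
            return false; // size line is incomplete
        }
        $sizeLine = substr($raw, $offset, $lineEnd - $offset);

        // Optional chunk extensions follow a ';' and are ignored here.
        $parts = explode(';', $sizeLine, 2);
        $hex   = trim($parts[0]);
        if ($hex === '' || !ctype_xdigit($hex)) {
            return false; // not a valid chunk size
        }

        $size = hexdec($hex);
        if ($size === 0) {
            return $decoded; // zero-size chunk marks the end of the body
        }

        $dataStart = $lineEnd + 2;
        if ($dataStart + $size > $length) {
            return false; // body ended in the middle of a chunk
        }

        $decoded .= substr($raw, $dataStart, $size);
        $offset   = $dataStart + $size + 2; // skip the CRLF after the chunk data
    }

    return false; // never saw the terminating zero-size chunk
}

var_dump(decode_chunked("4\r\nWiki\r\n5\r\npedia\r\n0\r\n\r\n")); // string(9) "Wikipedia"
```

A body that stops halfway through a chunk makes this sketch return false, which is the kind of truncation a plain read loop will not notice on its own.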

EDIT

Rephrased to clarify. Thank you, @troelskn.

EDIT

Example using cURL:

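$rCURL = curl_init();

// cURL speaks HTTP/1.1 and decodes chunked responses itself.
curl_setopt($rCURL, CURLOPT_URL, 'http://www.example.com/file_to_retrieve.xml');
curl_setopt($rCURL, CURLOPT_HEADER, 0);         // keep response headers out of the result
curl_setopt($rCURL, CURLOPT_RETURNTRANSFER, 1); // return the body as a string instead of printing it

$aData = curl_exec($rCURL); // the body as a string, or false on failure

curl_close($rCURL);

var_dump($aData);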
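
If you also want to know whether the transfer failed or stopped partway, a minimal sketch along these lines may help. The function name fetch_url(), the timeout values and the shape of the returned array are assumptions made for illustration; the cURL calls themselves (curl_errno, curl_error, curl_getinfo) are standard. An interrupted transfer should surface as a non-zero cURL error code.

```php
<?php
// Hypothetical helper; the name fetch_url() and the returned array layout
// are assumptions for this sketch, not part of the cURL extension.
function fetch_url($url)
{
    $ch = curl_init();
    curl_setopt($ch, CURLOPT_URL, $url);
    curl_setopt($ch, CURLOPT_HEADER, 0);
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
    curl_setopt($ch, CURLOPT_CONNECTTIMEOUT, 10); // assumed limit, in seconds
    curl_setopt($ch, CURLOPT_TIMEOUT, 60);        // assumed limit, in seconds

    $body   = curl_exec($ch);
    $errno  = curl_errno($ch);                      // 0 means no transport error
    $error  = curl_error($ch);
    $status = curl_getinfo($ch, CURLINFO_HTTP_CODE);
    curl_close($ch);

    if ($body === false || $errno !== 0) {
        return array('ok' => false, 'error' => "cURL error $errno: $error", 'body' => null);
    }
    if ($status < 200 || $status >= 300) {
        return array('ok' => false, 'error' => "HTTP status $status", 'body' => null);
    }
    return array('ok' => true, 'error' => null, 'body' => $body);
}

$result = fetch_url('http://www.example.com/file_to_retrieve.xml');
if ($result['ok']) {
    file_put_contents('file.xml', $result['body']);
} else {
    echo $result['error'], "\n";
}
```

Writing the file only when 'ok' is true avoids saving a half-downloaded document.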
Jürgen Thelen
  • Do you have a reference for the claim that http/1.1 isn't supported before `5.3`? – troelskn Jul 04 '11 at 13:45
  • @troelskn: argl, I should've been more clear about that. I don't mean that HTTP/1.1 isn't supported before 5.3.0, but [chunked data isn't](http://www.php.net/manual/en/context.http.php) (see Changelog). – Jürgen Thelen Jul 04 '11 at 13:59
  • You might be on to something. The stream wrapper's `protocol_version` (or context setting) might be misconfigured. Or OP's server is not fully compliant and replies with /1.1 even if the PHP user agent requested with /1.0 (and chunked is indeed a requirement for 1.1, not 1.0) – mario Jul 04 '11 at 14:09
  • any idea how to get this file using cURL or any other method? – trymerson2010 Jul 04 '11 at 14:22
  • Yes, but it is still breaking the content. Look at the source of http://www.waszapolska.tv/justintv/justin.php The source of the PHP file is at http://www.waszapolska.tv/justintv/justin.html – trymerson2010 Jul 04 '11 at 14:43
  • @trymerson2010: Looking at the first link I'd say the download is completely ok and works properly. The problem is most likely that you are downloading an HTML (not an XML) file, which has set `` but uses UTF-8 characters inside (first one found at line 2, pos 120). But this could also be my editor getting it wrong. Anyhow, the download is ok. Beyond this I'm out^^ – Jürgen Thelen Jul 04 '11 at 15:00
  • what if it stops in the middle of it? how do you determine the error then? – Timo Huovinen Sep 17 '14 at 10:36
0

Use the curl extension.

Or, for simple GET requests, just use the built-in wrappers (e.g. file_get_contents).
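
As a sketch of the wrapper route, you can pass a stream context that asks for HTTP/1.0 explicitly, since chunked encoding is an HTTP/1.1 feature. The helper name build_http10_context() is hypothetical, and this only helps if the server honours the requested version; a server that replies with chunked data anyway will still cause trouble on PHP 5.2.

```php
<?php
// Hypothetical helper; the name and the chosen options are assumptions
// for this sketch. The 'http' context options themselves are standard.
function build_http10_context()
{
    return stream_context_create(array(
        'http' => array(
            'method'           => 'GET',
            'protocol_version' => 1.0,                     // HTTP/1.0 has no chunked encoding
            'header'           => "Connection: close\r\n", // ask the server to close after the reply
        ),
    ));
}

$context = build_http10_context();
```

The context goes in as the third argument, for example file_get_contents('http://www.example.com/file_to_retrieve.xml', false, $context), and a return value of false means the request failed.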

troelskn
0

Use ext/curl to download the file.

wonk0
  • What is the proper command to use curl? Because even wget breaks while downloading the file. – trymerson2010 Jul 04 '11 at 14:04
  • I do not recommend to use curl command line but the php extension; that's why I added a link to the appropriate manual section – wonk0 Jul 04 '11 at 18:20