2

I have the following php code which I found here:

function download_xml()
{
    $url = 'http://tv.sygko.net/tv.xml';

    $ch = curl_init($url);
    $timeout = 5;

    curl_setopt($ch, CURLOPT_URL, $url);
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
    curl_setopt($ch, CURLOPT_CONNECTTIMEOUT, $timeout);

    $data = curl_exec($ch);

    echo("curl_exec was succesful"); //This never gets called

    curl_close($ch);
    return $data;
}

$my_file = 'tvdata.xml';
$handle = fopen($my_file, 'w');
$data = download_xml();
fwrite($handle, $data);

What I'm trying to do is to download the xml at the specified url and save it to the disk. However, it stops once about 80% finished and never reaches the echo call after the curl_exec call. I'm not sure why, but I believe this is because it runs out of memory. Therefore I would like to ask if it is possible to make curl write the data to the file every time it has downloaded say 4kb. If this is not possible, do anybody know a way to get the xml file stored at the url downloaded and stored on my disk using php?

Thank you very much, BEN.

EDIT: This is the code now, it doesnt work. It writes the data to the file but still only about 80% of the document. Maybe it isn't because it exceeds memory but some other reason? I really can't believe it is this hard to copy a file from a URL to the disc...

    <?

$url = 'http://tv.sygko.net/tv.xml';
$my_file = fopen('tvdata.xml', 'w');

$ch = curl_init($url);
$timeout = 300;

curl_setopt($ch, CURLOPT_URL, $url);
curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
curl_setopt($ch, CURLOPT_FILE, $my_file);
curl_setopt($ch, CURLOPT_FAILONERROR, 1);
curl_setopt($ch, CURLOPT_FOLLOWLOCATION, 1);
curl_setopt($ch, CURLOPT_CONNECTTIMEOUT, $timeout);
    curl_setopt($ch, CURLOPT_BUFFERSIZE, 4096);

curl_exec($ch) OR die("Error in curl_exec()");

echo("got to after curl exec");

fclose($my_file);
curl_close($ch);

    ?>
  • try to 1: add ";" after fopen 2: fclose the file – Fluffy Oct 05 '09 at 18:11
  • and 3: increase the timeout to like 300 seconds because the page you linked to is really big – Fluffy Oct 05 '09 at 18:12
  • Just saw the ";" problem, and I've updated the code. It still stops at the same place (around 80% in) and I'm now trying to run the script with a timeout of 300 – Benjamin Egelund-Müller Oct 05 '09 at 18:15
  • I set the timeout to 300, but it is still stopping at the exact same place. – Benjamin Egelund-Müller Oct 05 '09 at 18:17
  • Well I've been playing with this and other possibilities and nothing work. I find it incredible I cant download a file from a URL. Nevertheless thanks for all the answers here. I'll keep fighting and of course if anybody finds a solution, please post it! When I find the solution I will of course post it here. – Benjamin Egelund-Müller Oct 05 '09 at 19:03

4 Answers4

5

Here comes a fully working example:

public function saveFile($url, $dest) {

        if (!file_exists($dest))
                touch($dest);

        $file = fopen($dest, 'w');
        $ch = curl_init();

        curl_setopt($ch, CURLOPT_URL, $url);
        curl_setopt($ch, CURLOPT_PROGRESSFUNCTION, 'progressCallback');
        curl_setopt($ch, CURLOPT_BUFFERSIZE, (1024*1024*512));
        curl_setopt($ch, CURLOPT_NOPROGRESS, FALSE);
        curl_setopt($ch, CURLOPT_FAILONERROR, 1);
        curl_setopt($ch, CURLOPT_FOLLOWLOCATION, 1);
        curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
        curl_setopt($ch, CURLOPT_TIMEOUT, 15);
        curl_setopt($ch, CURLOPT_USERAGENT, 'Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.8.1.13) Gecko/20080311 Firefox/2.0.0.13');
        curl_setopt($ch, CURLOPT_FILE, $file);

        curl_exec($ch);
        curl_close($ch);

        fclose($file);
}
?>

The secret lies withing setting CURLOPT_NOPROGRESS to FALSE, and then, CURLOPT_BUFFERSIZE will make the callback report for every CURLOPT_BUFFERSIZE bytes reached. The smaller value, the more frequently it will report. This also depends on your download speed, etc, so don't count on it to report every X seconds, since it will report for every X bytes received/transferred.

Joe
  • 921
  • 1
  • 10
  • 16
3

Your timeout is set to 5 seconds which might be too short depending on the file size of the document. Try increasing it to 10-15 just to make sure it has enough time to complete the transfer.

Jesse Dearing
  • 2,251
  • 18
  • 20
2

There's an option called CURELOPT_FILE that allows you to specify a file handler that curl should write to. I'm pretty sure it will do "right" thing and "write" as it reads, avoiding your memory problem

$file = fopen('test.txt', 'w'); //<--------- file handler
$ch = curl_init();
curl_setopt($ch, CURLOPT_URL,'http://example.com');
curl_setopt($ch, CURLOPT_FAILONERROR,1);
curl_setopt($ch, CURLOPT_FOLLOWLOCATION,1);
curl_setopt($ch, CURLOPT_RETURNTRANSFER,1);
curl_setopt($ch, CURLOPT_TIMEOUT, 15);
curl_setopt($ch, CURLOPT_FILE, $file);   //<------- this is your magic line
curl_exec($ch); 
curl_close($ch);
fclose($file);
Alana Storm
  • 164,128
  • 91
  • 395
  • 599
1

curl_setopt the CURLOPT_FILE - The file that the transfer should be written to. The default is STDOUT (the browser window)

http://us2.php.net/manual/en/function.curl-setopt.php

Fluffy
  • 27,504
  • 41
  • 151
  • 234