
I have a function that calls 3 different APIs using cURL multiple times. Each API's result is passed to the next API called in nested loops, so cURL is currently opened and closed over 500 times.

Should I leave cURL open for the entire function or is it OK to open and close it so many times in one function?
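To illustrate, the structure is roughly like the sketch below (the endpoint URLs, field names, and the `apiCall` helper are made up for illustration; only the open-close-per-request pattern matches my code):

```php
<?php
// Rough sketch of the current structure: nested loops where each API's
// result feeds the next call, and a curl handle is created and destroyed
// for every single request (500+ times in total).
function apiCall($url) {
    $ch = curl_init($url);                          // opened...
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
    $result = curl_exec($ch);
    curl_close($ch);                                // ...and closed on every call
    return $result;
}

foreach ($items as $item) {
    $a = json_decode(apiCall('https://api-one.example.com/?q=' . urlencode($item)), true);
    foreach ($a['results'] as $r) {
        $b = json_decode(apiCall('https://api-two.example.com/?id=' . $r['id']), true);
        foreach ($b['results'] as $s) {
            apiCall('https://api-three.example.com/?id=' . $s['id']);
        }
    }
}
```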

makenoiz
    Pretty vague question without seeing the usage and how the code is being handled. – Jim Aug 04 '13 at 19:17
  • I tend to err on the reliability side: fresh handles seem less problematic, because leftover state from one request is less likely to pollute later requests. That said, I have a process that regularly runs for weeks, making nearly a million HTTP requests on the same curl handle. They are very plain HTTP requests to a single API on a single domain, and I've experienced no issues. – goat Aug 04 '13 at 20:36
  • Possible duplicate of [When to use cURLs function curl\_close?](http://stackoverflow.com/questions/3849857/when-to-use-curls-function-curl-close) – T.Todua Aug 21 '16 at 10:07
  • Be careful using this for requests that change data (POST, PUT, etc.): the reused handle can carry over old data in the background. See my answer: https://stackoverflow.com/a/67266458/4699609 – flexJoly Apr 30 '21 at 08:39
  • Does this answer your question? [Reusing the same curl handle. Big performance increase?](https://stackoverflow.com/questions/3787002/reusing-the-same-curl-handle-big-performance-increase) – flexJoly Apr 30 '21 at 08:45

1 Answer


There's a performance increase from reusing the same handle. See: Reusing the same curl handle. Big performance increase?

If the requests don't need to run sequentially, consider using the curl_multi_* functions (e.g. curl_multi_init, curl_multi_exec, etc.), which run requests in parallel and can provide a big performance boost.
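A minimal sketch of the curl_multi approach (the URLs are just examples): all handles are added to one multi handle, driven until every transfer finishes, and the bodies collected afterwards. Note this only helps for requests that don't depend on each other's results; in a nested-loop setup like the question's, only the calls at the same nesting level could run in parallel.

```php
<?php
// Minimal curl_multi sketch: run several requests concurrently.
$urls = ['http://www.google.com/', 'http://www.bing.com/', 'http://www.yahoo.com/'];

$mh = curl_multi_init();
$handles = [];
foreach ($urls as $url) {
    $ch = curl_init($url);
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, true); // collect the body instead of printing it
    curl_multi_add_handle($mh, $ch);
    $handles[$url] = $ch;
}

// Drive all transfers until every request has finished.
do {
    $status = curl_multi_exec($mh, $active);
    if ($active) {
        curl_multi_select($mh); // wait for socket activity instead of busy-looping
    }
} while ($active && $status == CURLM_OK);

foreach ($handles as $url => $ch) {
    $body = curl_multi_getcontent($ch);
    echo $url . ': ' . strlen($body) . " bytes\n";
    curl_multi_remove_handle($mh, $ch);
    curl_close($ch);
}
curl_multi_close($mh);
```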

UPDATE:

I benchmarked cURL, comparing a new handle for each request against reusing a single handle, with the following code:

ob_start(); // Swallow output instead of setting CURLOPT_RETURNTRANSFER, to keep curl options minimal
$start_time = microtime(true);
for ($i = 0; $i < 100; ++$i) {
    $rand = rand();
    $ch = curl_init();
    curl_setopt($ch, CURLOPT_URL, "http://www.google.com/?rand=" . $rand);
    curl_exec($ch);
    curl_close($ch);
}
$end_time = microtime(true);
ob_end_clean();
echo 'Curl without handle reuse: ' . ($end_time - $start_time) . '<br>';

ob_start(); // Swallow output instead of setting CURLOPT_RETURNTRANSFER, to keep curl options minimal
$start_time = microtime(true);
$ch = curl_init();
for ($i = 0; $i < 100; ++$i) {
    $rand = rand();
    curl_setopt($ch, CURLOPT_URL, "http://www.google.com/?rand=" . $rand);
    curl_exec($ch);
}
curl_close($ch);
$end_time = microtime(true);
ob_end_clean();
echo 'Curl with handle reuse: ' . ($end_time - $start_time) . '<br>';

and got the following results:

Curl without handle reuse: 8.5690529346466
Curl with handle reuse: 5.3703031539917

So reusing the same handle actually provides a substantial performance increase when connecting to the same server multiple times. I tried connecting to different servers:

$url_arr = array(
    'http://www.google.com/',
    'http://www.bing.com/',
    'http://www.yahoo.com/',
    'http://www.slashdot.org/',
    'http://www.stackoverflow.com/',
    'http://github.com/',
    'http://www.harvard.edu/',
    'http://www.gamefaqs.com/',
    'http://www.mangaupdates.com/',
    'http://www.cnn.com/'
);
ob_start(); // Swallow output instead of setting CURLOPT_RETURNTRANSFER, to keep curl options minimal
$start_time = microtime(true);
foreach ($url_arr as $url) {
    $ch = curl_init();
    curl_setopt($ch, CURLOPT_URL, $url);
    curl_exec($ch);
    curl_close($ch);
}
$end_time = microtime(true);
ob_end_clean();
echo 'Curl without handle reuse: ' . ($end_time - $start_time) . '<br>';

ob_start(); // Swallow output instead of setting CURLOPT_RETURNTRANSFER, to keep curl options minimal
$start_time = microtime(true);
$ch = curl_init();
foreach ($url_arr as $url) {
    curl_setopt($ch, CURLOPT_URL, $url);
    curl_exec($ch);
}
curl_close($ch);
$end_time = microtime(true);
ob_end_clean();
echo 'Curl with handle reuse: ' . ($end_time - $start_time) . '<br>';

And got the following result:

Curl without handle reuse: 3.7672290802002
Curl with handle reuse: 3.0146431922913

Still quite a substantial performance increase.
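One caveat when reusing a handle, echoed in the comments: options set for one request (e.g. CURLOPT_POST, CURLOPT_POSTFIELDS) stay on the handle and leak into later requests unless you change them. Since PHP 5.5, curl_reset() clears all options back to defaults while keeping the handle, and with it the live connections and DNS cache, alive. A sketch (the example.com URLs are placeholders):

```php
<?php
// Reuse one handle but reset its options between requests (PHP >= 5.5).
$ch = curl_init();

// Request 1: a POST.
curl_setopt($ch, CURLOPT_URL, 'http://www.example.com/submit');
curl_setopt($ch, CURLOPT_POST, true);
curl_setopt($ch, CURLOPT_POSTFIELDS, ['name' => 'value']);
curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
curl_exec($ch);

// Without this, CURLOPT_POST/CURLOPT_POSTFIELDS would still apply below.
curl_reset($ch); // options back to defaults; live connections and DNS cache are kept

// Request 2: a plain GET on the same handle.
curl_setopt($ch, CURLOPT_URL, 'http://www.example.com/');
curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
curl_exec($ch);

curl_close($ch);
```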

AlliterativeAlice
  • I wonder if curl is using keep-alive connections. That alone could account for most of the performance boost. – goat Aug 04 '13 at 20:25
  • I believe cURL uses keep-alive, but each call to curl_exec() initiates a fresh request (because options may have changed, etc.) Especially when connecting to a different server, this would have to be the case. – AlliterativeAlice Aug 04 '13 at 21:08
  • Thanks everyone. While I am connecting to the same server (just different URLs), I'm amazed by the benchmarks Otome posted. However, I really like the reliability point that Chris posted. – makenoiz Aug 05 '13 at 02:17
  • While I realize this is an old post, one thing to keep in mind with the different-servers benchmark is that the DNS lookup has to occur within the first run's timeframe, skewing the results. If you simply duplicate the test within the same file (just copy and paste it a couple of times), you'll notice the performance ends up roughly on par. – Aaron F. Mar 12 '18 at 22:42
  • I love this answer, and it is on par with my assumption: somebody once upon a time wrote a curl tutorial that closed the handle after each operation, and over time so many copy-paste tutorials kept to this strange "convention". KUDOS – clockw0rk Oct 07 '22 at 14:31