Here is an example/test script written with GuzzleHttp:

use GuzzleHttp\Client;
use GuzzleHttp\Handler\CurlHandler;
use GuzzleHttp\HandlerStack;
use GuzzleHttp\Middleware;
use GuzzleHttp\Pool;
use Psr\Http\Message\ResponseInterface;

require __DIR__ . '/vendor/autoload.php';

$handler = new CurlHandler();

$stack = new HandlerStack($handler);
$stack->push(Middleware::httpErrors(), 'http_errors');
$stack->push(Middleware::redirect(), 'allow_redirects');
$stack->push(Middleware::cookies(), 'cookies');
$stack->push(Middleware::prepareBody(), 'prepare_body');
$interval = 100;
$concurrency = 50;
$counter = 0; // shared by reference between the pool callbacks below
$client = new Client(['handler' => $stack]);
echo sprintf("Using Guzzle handler %s\n", get_class($handler));
echo sprintf("Printing memory usage every %d requests\n", $interval);
echo "Fetching package list... ";

$packageNames = json_decode(
    $client->get('https://packagist.org/packages/list.json')
           ->getBody()
           ->getContents()
)->packageNames;

if (empty($packageNames)) {
    echo "Empty result. No reason to continue.";
    return;
}

echo 'done. (' . count($packageNames) . " packages)\n\n";

$requests = function($packageNames) {
    foreach ($packageNames as $packageVendorPair) {
        yield new GuzzleHttp\Psr7\Request('GET', "https://packagist.org/p/{$packageVendorPair}.json");
    }
};

$pool = new Pool($client, $requests($packageNames), [
    'concurrency' => $concurrency,
    'fulfilled' => function (ResponseInterface $response, $index) use (&$counter, $interval) {
        $counter++;
        if ($counter % $interval === 0) {
            echo sprintf(
                "Processed %s requests. Memory used: %s MB\n",
                number_format($counter),
                number_format(memory_get_peak_usage()/1024/1024, 3)
            );
        }
    },
    'rejected' => function ($reason, $index) use (&$counter, $interval) {
        // Failed requests count toward progress as well
        $counter++;
        if ($counter % $interval === 0) {
            echo sprintf(
                "Processed %s requests. Memory used: %s MB\n",
                number_format($counter),
                number_format(memory_get_peak_usage()/1024/1024, 3)
            );
        }
    }
]);

$promise = $pool->promise();
$response = $promise->wait();

How can I do something similar with Amphp or Artax? I searched the Amp docs and Stack Overflow, but couldn't find anything similar.

By the way, I also found that Amp doesn't use Curl as a handler, and I don't understand why no such option is available. Can it be added manually, or is there something better that replaces Curl's functionality (custom headers, debug/verbose output, etc.)?

The specific points where I need help:

  1. Can someone point me to a pool-equivalent example built with the Amp framework or any of its libraries, or show one here, even in a simpler form?
  2. Where is the Curl handler in Amp? Can I use it, and how?

The Amphp website says:

The Stack Overflow Community can answer your question if it's generic enough. Use the amphp tag so the right people find your question.

Since I've provided a simple (and working) example, I thought it would be easy to understand exactly what I need.

With all due respect.

iorsa

1 Answer

There's no pool equivalent, but it can be written using a semaphore and async coroutines.

<?php

use Amp\Artax\DefaultClient;
use Amp\Loop;
use Amp\Sync\LocalSemaphore;

require __DIR__ . "/vendor/autoload.php";

Loop::run(function () {
    $concurrency = 10;
    $client = new DefaultClient;
    // At most $concurrency requests are in flight at any time
    $semaphore = new LocalSemaphore($concurrency);

    $packageResponse = yield $client->request("https://packagist.org/packages/list.json");
    $packageNames = json_decode(yield $packageResponse->getBody())->packageNames;

    $requestHandler = Amp\coroutine(function ($package) use ($client) {
        $url = "https://packagist.org/p/{$package}.json";

        $response = yield $client->request($url);
        $body = yield $response->getBody();

        return $body;
    });

    $counter = 0;

    foreach ($packageNames as $package) {
        $lock = yield $semaphore->acquire();

        $promise = $requestHandler($package);
        $promise->onResolve(function ($error, $body) use (&$counter, $lock) {
            $lock->release();

            if (++$counter % 50 === 0) {
                echo sprintf(
                    "Processed %s requests. Memory used: %s MB\n",
                    number_format($counter),
                    number_format(memory_get_peak_usage()/1024/1024, 3)
                );
            }
        });
    }
});

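This example uses LocalSemaphore, an implementation of Amp\Sync\Semaphore. The semaphore limits the concurrency: each request must acquire a lock before it starts, and the lock is released when the request resolves, so at most $concurrency requests are in flight at once.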
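To see the semaphore idea in isolation, here is a minimal sketch in plain PHP that needs no Amp install. It is a toy model, not Amp's implementation: `simulatedRequest()` and `runWithPermits()` are hypothetical names, each "request" is a generator that yields once per simulated I/O wait, and a simple round-robin loop stands in for the event loop. A job may only start while fewer than `$permits` jobs are in flight, and finishing a job frees its permit.

```php
<?php

// Toy stand-in for an HTTP request: yields once per simulated I/O wait,
// then returns the number of waits as its "response".
function simulatedRequest(int $waits): Generator
{
    for ($i = 0; $i < $waits; $i++) {
        yield;
    }

    return $waits;
}

// Runs every job to completion while keeping at most $permits in flight.
// Returns the collected results plus the peak number of concurrent jobs.
function runWithPermits(array $jobs, int $permits): array
{
    $waiting = $jobs;   // jobs that have not acquired a permit yet
    $inFlight = [];     // jobs currently holding a permit
    $results = [];
    $peak = 0;

    while ($waiting || $inFlight) {
        // "acquire": start waiting jobs while permits are free
        while ($waiting && count($inFlight) < $permits) {
            $key = array_key_first($waiting);
            $inFlight[$key] = $waiting[$key];
            unset($waiting[$key]);
            $inFlight[$key]->current(); // run the job up to its first wait
        }

        $peak = max($peak, count($inFlight));

        // Advance each in-flight job by one simulated I/O step
        foreach ($inFlight as $key => $job) {
            if ($job->valid()) {
                $job->next();
            }

            if (!$job->valid()) {
                $results[$key] = $job->getReturn();
                unset($inFlight[$key]); // "release": the permit is free again
            }
        }
    }

    ksort($results);

    return ['results' => $results, 'peak' => $peak];
}

// Five simulated requests of different lengths, two permits
$jobs = array_map('simulatedRequest', [2, 1, 3, 1, 2]);
$outcome = runWithPermits($jobs, 2);

echo "Finished {$outcome['peak']} at a time at most, ", count($outcome['results']), " jobs in total\n";
```

All five jobs complete, but the peak never exceeds the permit count, which is the same guarantee the semaphore gives the request loop in the answer.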

There's no Curl handler in Amp because Curl doesn't work well with event loops. Curl has its own event loop, but that loop only handles multiple concurrent HTTP requests, not other non-blocking I/O. That's why Artax implements HTTP on raw PHP sockets, without any dependency on Curl.
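As a rough illustration of what non-blocking I/O on raw PHP streams looks like, the sketch below waits on several streams at once with `stream_select()`. It is not Artax's code: `readConcurrently()` is a hypothetical helper, and local socket pairs (created with `STREAM_PF_UNIX`, so a Unix-like system is assumed) stand in for network connections. The point is that one loop can watch any kind of stream, not only HTTP transfers.

```php
<?php

// Reads from every stream until each reaches EOF, using one stream_select() loop.
// $streams maps a caller-chosen label to a non-blocking stream resource.
function readConcurrently(array $streams, int $timeoutSeconds = 1): array
{
    $buffers = array_fill_keys(array_keys($streams), '');

    while ($streams) {
        $read = $streams;
        $write = null;
        $except = null;

        // Block until at least one stream is readable or the timeout expires
        $ready = stream_select($read, $write, $except, $timeoutSeconds);
        if ($ready === false || $ready === 0) {
            break; // error or timeout: give up on the remaining streams
        }

        foreach ($read as $stream) {
            $label = array_search($stream, $streams, true);
            $chunk = fread($stream, 8192);

            if ($chunk !== false && $chunk !== '') {
                $buffers[$label] .= $chunk;
            } elseif (feof($stream)) {
                fclose($stream);
                unset($streams[$label]); // this stream is done
            }
        }
    }

    return $buffers;
}

// Socket pairs stand in for three open connections with pending data
$payloads = ['a' => 'first', 'b' => 'second', 'c' => 'third'];
$readers = [];

foreach ($payloads as $label => $payload) {
    [$reader, $writer] = stream_socket_pair(STREAM_PF_UNIX, STREAM_SOCK_STREAM, STREAM_IPPROTO_IP);
    stream_set_blocking($reader, false);
    fwrite($writer, $payload);
    fclose($writer); // closing the writer lets the reader see EOF
    $readers[$label] = $reader;
}

$buffers = readConcurrently($readers);

foreach ($buffers as $label => $data) {
    echo "{$label}: {$data}\n";
}
```

An event loop generalizes this pattern by registering callbacks per stream and adding timers, which is why database queries and HTTP requests can share one loop.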

kelunik
  • Thank you for such an in-depth answer to the first and main question! I've checked the libraries and functions/methods used in both examples, and I understand that both projects share some core fundamentals (which is no surprise): promises, invocations, and iterations/coroutines/generators. Though, to be honest, Guzzle's project and design are easier to understand and use than Amp's, at least for me. – iorsa Oct 05 '17 at 21:49
  • @iorsa: I fully agree that Guzzle might be easier to understand, but that's not really surprising given the reduced feature set. Guzzle is just an HTTP client while Artax is just one library based on Amp's event loop. Other I/O can happen concurrently like MySQL queries with `amphp/mysql` etc, not only HTTP requests. – kelunik Oct 05 '17 at 21:59
  • As for curl and PHP sockets, I'm going to check both of them in real-world situations too, and Artax should be interesting to try as well. – iorsa Oct 05 '17 at 21:59
  • Indeed, I mean the whole Amphp project design with its core classes. To understand the provided example I had to look in many places, and the descriptions/comments, method names and various implementations didn't make the logic easy for me to follow. It seems aimed at advanced programmers, which I'm not at the moment. Still, it could be good to learn from. Hopefully the object-oriented modeling is done well and the related principles fit together. I've only recently started to study all of this in depth. – iorsa Oct 05 '17 at 22:18
  • Don't you think worker pools from Amp Parallel can handle the same task too? [https://github.com/amphp/parallel/blob/master/examples/worker-pool.php](https://github.com/amphp/parallel/blob/master/examples/worker-pool.php) – iorsa Oct 06 '17 at 10:21
  • @iorsa: They can handle the same task, but you don't need multiple processes / threads, non-blocking I/O is enough. – kelunik Oct 06 '17 at 13:32
  • Adjusted the answer now that `amphp/sync` is out. – kelunik Dec 26 '17 at 10:29