
I have a URL, $url = "localhost:8000/vehicles", that I want to fetch through a cron job. The page returns HTML, so I want to use the Symfony DomCrawler to get all the vehicles instead of using a regex.

At the top of my file I added:

use Symfony\Component\DomCrawler\Crawler;

To create a new instance i tried:

$crawler = new Crawler($data);

and I tried:

$crawler = Crawler::create($data);

but both give me an error. I also tried adding

Symfony\Component\DomCrawler\Crawler::class,

to the service provider, but when I execute the command:

composer dump-autoload

it gives me the following error:

In Crawler.php line 66:

  Symfony\Component\DomCrawler\Crawler::__construct(): Argument #1 ($node) must be of type DOMNodeList|DOMNode|array|string|null, Illuminate\Foundation\Application given, called in C:\xampp\htdocs\DrostMachinehandel\DrostMachinehandel\vendor\laravel\framework\src\Illuminate\Foundation\ProviderRepository.php on line 208


Script @php artisan package:discover --ansi handling the post-autoload-dump event returned with error code 1

I have no idea how to fix this.

The function for fetching the URL is below:

    public function handle()
    {
        $url = SettingsController::fetchSetting("fetch:vehicles");

        $ch = curl_init();

        curl_setopt($ch, CURLOPT_URL, $url);
        curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
        curl_setopt($ch, CURLOPT_TIMEOUT, 10);

        $data = curl_exec($ch);

        $vehicles = $this->scrapeVehicles($data, $url);

        Log::debug($vehicles);

        curl_close($ch);
    }



    private function scrapeVehicles(string $data, string $url): array
    {
        $crawler = Crawler::create($data);
        $vehicles = $crawler->filter(".vehicleTile");

        return $vehicles;
    }

Contents of $data:

https://pastebin.com/GJ300KEv

w3_
  • can you try to `dd($data)` right above `$crawler = Crawler::create($data)`? what's the result? – Gonras Karols Dec 19 '22 at 12:16
  • Try changing `$crawler = Crawler::create($data);` to `$crawler = new Crawler($data)` – Gonras Karols Dec 19 '22 at 12:33
  • already tried that but gives me the same result – w3_ Dec 19 '22 at 12:34
  • `$crawler = new Crawler($data);`, the $data is not of the right type. Try dd-ing that one first, before you send it to the crawler – UnderDog Dec 20 '22 at 04:20
  • @UnderDog please read my question to see what $data is – w3_ Dec 20 '22 at 08:49
  • I took your HTML page from pastebin, saved it as test.html, ran `composer require symfony/dom-crawler` and `composer require symfony/css-selector`, and then wrote a simple script to test it: `require_once __DIR__ . '/vendor/autoload.php'; $html = file_get_contents('test.html'); $crawler = new \Symfony\Component\DomCrawler\Crawler($html); $vehicles = $crawler->filter(".vehicleTile");` The crawler opens your HTML without any errors. Could you share more info about the error you are getting? – Dmitry K. Dec 23 '22 at 05:20

2 Answers


I haven't tested this, so I'm not sure it will work.

Make sure you installed the correct package:

composer require symfony/dom-crawler

To instantiate it, use the fully qualified class name, since it is not a Laravel-specific package:

$crawler = new \Symfony\Component\DomCrawler\Crawler($data);
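
The error message in the question says an Illuminate\Foundation\Application was passed to the Crawler constructor from ProviderRepository.php. That is what happens when a class is listed as a service provider: Laravel instantiates every listed provider with the application instance. Crawler is not a provider, so that line should be removed and the class constructed directly. Below is a minimal standalone sketch of that, assuming symfony/dom-crawler and symfony/css-selector are installed (filter() needs the latter). The free function and the sample markup are made up for illustration; the real HTML is the pastebin from the question. The sketch also returns an array via each(), because filter() itself returns another Crawler, which would not satisfy the `: array` return type declared in the question.

```php
<?php
// Assumes `composer require symfony/dom-crawler symfony/css-selector` was run
// and that this script sits next to the vendor/ directory.
require_once __DIR__ . '/vendor/autoload.php';

use Symfony\Component\DomCrawler\Crawler;

/**
 * Hypothetical standalone version of scrapeVehicles() from the question.
 * Returns the text of every .vehicleTile element as an array of strings.
 */
function scrapeVehicles(string $html): array
{
    // Plain constructor call; nothing has to be registered with Laravel.
    $crawler = new Crawler($html);

    // filter() returns a Crawler, not an array. each() calls the closure
    // once per matched node and collects the return values into an array.
    return $crawler->filter('.vehicleTile')->each(
        function (Crawler $node): string {
            return trim($node->text());
        }
    );
}

// Made-up markup, only to show the call.
$html = '<html><body>'
    . '<div class="vehicleTile">Tractor A</div>'
    . '<div class="vehicleTile">Loader B</div>'
    . '</body></html>';

print_r(scrapeVehicles($html));
```

Inside the command class from the question, the same body would go into the private scrapeVehicles() method, with the `use Symfony\Component\DomCrawler\Crawler;` import kept at the top of the file.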
Abdulla Nilam

composer require --dev "symfony/dom-crawler":"^6.3.x-dev"

Sample crawler method:

namespace App\Http\Controllers;

use GuzzleHttp\Exception\GuzzleException;
use Symfony\Component\DomCrawler\Crawler;

class CrawlerController extends Controller
{

    private $url;

    public function __construct()
    {
        $this->url = "https://www.everything5pounds.com/en/Shoes/c/shoes/results?q=&page=6";
    }

    public function index()
    {
        $client = new \GuzzleHttp\Client();
        try {
            $response = $client->request('GET', $this->url);
            if ($response->getStatusCode() == 200) {
                $res = json_decode($response->getBody());

                $results = $res->results;

                return $results;

                /*$results = (array)json_decode($res);


                $products = array();
                foreach ($results as $result) {
                    $product = [
                        "name" => $result["name"],
                    ];

                    array_push($products, $product);
                }

                return $products;*/

                //return $result;
                //return $this->parseContent($result);


            } else {
                return $response->getReasonPhrase();
            }
        } catch (GuzzleException $e) {
            return $e->getMessage();
        }
    }

Parse content and store:

    public function parseContent($result)
    {
        $crawler = new Crawler($result);

        $elements = $crawler->filter('.productGridItem')->each(function (Crawler $node, $i) {
            return $node;
        });

        $products = array();
        foreach ($elements as $item) {
            $image = $item->filter('.thumb .productMainLink img')->attr('src');
            $title = $item->filter('.productGridItem .details')->text();
            $price = $item->filter('.productGridItem .priceContainer')->text();

            $product = [
                "image" => 'https:' . $image,
                "title" => $title,
                "price" => $price,
            ];

            array_push($products, $product);
        }

        return $products;
    }
}
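
The parsing step above can also be written so that each() builds the result rows directly. The sketch below is only an illustration under a few assumptions: the function name parseProducts and the sample markup in the usage line are hypothetical, the selectors are taken from the method above, and symfony/css-selector is assumed to be installed. It checks count() before calling attr() or text(), because those calls throw an exception when the selector matched nothing.

```php
<?php
// Assumes `composer require symfony/dom-crawler symfony/css-selector` was run
// and that this script sits next to the vendor/ directory.
require_once __DIR__ . '/vendor/autoload.php';

use Symfony\Component\DomCrawler\Crawler;

/**
 * Hypothetical variant of parseContent(): one associative array per
 * .productGridItem element, with null for any part that is missing.
 */
function parseProducts(string $html): array
{
    $crawler = new Crawler($html);

    return $crawler->filter('.productGridItem')->each(
        function (Crawler $item): array {
            // Each filter() call searches only inside the current item.
            $img   = $item->filter('.thumb .productMainLink img');
            $title = $item->filter('.details');
            $price = $item->filter('.priceContainer');

            return [
                // The 'https:' prefix follows the original method, which
                // expects protocol-relative src values.
                'image' => $img->count() > 0 ? 'https:' . $img->attr('src') : null,
                'title' => $title->count() > 0 ? $title->text() : null,
                'price' => $price->count() > 0 ? $price->text() : null,
            ];
        }
    );
}

// Made-up markup, only to show the call; the price element is left out on purpose.
$html = '<div class="productGridItem">'
    . '<div class="thumb"><a class="productMainLink"><img src="//cdn.example/x.jpg"></a></div>'
    . '<div class="details">Shoe</div>'
    . '</div>';

print_r(parseProducts($html));
```

Returning null for a missing part keeps one malformed tile from aborting the whole run; whether that is the right behavior depends on how the results are stored afterwards.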
Kevin Otieno