8

Can I use Scrapy with PHP, or are there similar tools that work with PHP?

I am not a technical person; I am researching the available web scraping tools and their features to support my technical colleagues.

Khaled Shaheen
  • [Scrapy](http://scrapy.org/) is written in Python... so you could use something like [popen](http://php.net/manual/en/function.popen.php) but for a non-technical person - the short answer would be no. – naththedeveloper Jan 20 '14 at 14:00
  • Are you asking if you can write PHP code to utilise Scrapy or if you can use Scrapy to read websites that are written using PHP? – Quentin Jan 20 '14 at 14:28

3 Answers

9

Scrapy is a Python framework, and you can't use it in PHP.

However, in PHP you can use Goutte for this. Behind the scenes it uses the Guzzle HTTP client and Symfony components such as BrowserKit and DomCrawler.

Check this out:

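// Assumes Goutte was installed with Composer
require __DIR__.'/vendor/autoload.php';

use Goutte\Client;

$client = new Client();

// Fetch the Symfony blog page
$crawler = $client->request('GET', 'http://www.symfony.com/blog/');

// Print the text of every link inside an <h2> (the post titles), one per line
$crawler->filter('h2 > a')->each(function ($node) {
    // Double quotes are required for "\n" to be a newline in PHP
    echo $node->text()."\n";
});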

More on usage

PS: Please note that Goutte does not execute JavaScript, so content that a page renders client-side will not be available to it.

kabirbaidhya
1

You can check PHP SimpleTest's ScriptableBrowser...

MarcoS
0

You can't write Scrapy spiders using PHP.

Nevertheless, it's very common to run Scrapy (with the spiders written in Python) and store the extracted data in a database or some other store that your application can access. For example, it's fairly easy to write the extracted items directly to Elasticsearch and have your application query it to search, filter, or aggregate the data.

However, if your colleagues don't know Python, they will need to spend some time learning the language and then the Scrapy framework.
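
To make the hand-off between Scrapy and a PHP application more concrete, here is a minimal sketch of a Scrapy-style item pipeline. It is an illustration under assumptions, not part of the original answer: it swaps Elasticsearch for SQLite so that it needs only the Python standard library, and the class name `SQLitePipeline`, the file name `items.db`, the `posts` table, and the `url`/`title` item fields are all hypothetical.

```python
import sqlite3


class SQLitePipeline:
    """Writes each scraped item to a SQLite table that another app can read."""

    def __init__(self, db_path="items.db"):
        # db_path is a hypothetical default; any location both apps can reach works.
        self.db_path = db_path
        self.conn = None

    def open_spider(self, spider):
        # Called by Scrapy when the spider starts: open the database once.
        self.conn = sqlite3.connect(self.db_path)
        self.conn.execute(
            "CREATE TABLE IF NOT EXISTS posts (url TEXT PRIMARY KEY, title TEXT)"
        )

    def process_item(self, item, spider):
        # Called for every item; re-scraping the same URL overwrites the old row.
        self.conn.execute(
            "INSERT OR REPLACE INTO posts (url, title) VALUES (?, ?)",
            (item["url"], item["title"]),
        )
        self.conn.commit()
        return item  # hand the item on to any later pipeline

    def close_spider(self, spider):
        # Called by Scrapy when the spider finishes.
        self.conn.close()
```

In a real project a class like this would be enabled through Scrapy's `ITEM_PIPELINES` setting. The PHP application never talks to Scrapy at all; it only reads the `posts` table, for example through PDO's SQLite driver, so the Python and PHP sides stay independent.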

R. Max