Scrapy

Scrapy (/ˈskreɪpaɪ/ SKRAY-peye) is a free and open-source web-crawling framework written in Python. Originally designed for web scraping, it can also be used to extract data using APIs or as a general-purpose web crawler. It is currently maintained by Zyte (formerly Scrapinghub), a web-scraping development and services company.

Scrapy
Developer(s): Zyte (formerly Scrapinghub)
Initial release: 26 June 2008
Stable release: 2.11.0 (18 September 2023)
Written in: Python
Operating system: Windows, macOS, Linux
Type: Web crawler
License: BSD License
Website: scrapy.org

Scrapy's project architecture is built around "spiders", self-contained crawlers that are given a set of instructions. In the spirit of other don't-repeat-yourself frameworks such as Django, it makes large crawling projects easier to build and scale by letting developers reuse their code.

Companies and products that use Scrapy include Lyst, Parse.ly, Sayone Technologies, Sciences Po Medialab, and Data.gov.uk's World Government Data site.

This article is issued from Wikipedia. The text is licensed under Creative Commons - Attribution - Sharealike. Additional terms may apply for the media files.