
I am running periodic spiders in Scrapy Cloud and exporting the results to an AWS S3 bucket. I need to update my WordPress tables dynamically with these results. I am currently using the TablePress plugin, which has an "Import tables" option, but it only lets me update the tables every 15 minutes.

Is there any way I could perform these periodic updates every 5 minutes or, better, whenever the AWS S3 file changes?

A WordPress plugin that works with Scrapinghub directly could solve my problem too, but I have searched and haven't found one.

Jorge Garcia

2 Answers


You might be better off using a JSON feed with the JSON Content Importer plugin: https://wordpress.org/plugins/json-content-importer/
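For this approach, the scraped data needs to be reachable as a JSON URL. As a rough sketch (the bucket name, path, and credentials are placeholders), a Scrapy feed export to S3 could look like this in settings.py, assuming a Scrapy version with the FEEDS setting (2.4+ for the overwrite option):

    # settings.py - placeholders, not real credentials
    AWS_ACCESS_KEY_ID = "YOUR_KEY"
    AWS_SECRET_ACCESS_KEY = "YOUR_SECRET"

    FEEDS = {
        "s3://my-bucket/myspider/items.json": {
            "format": "json",
            "overwrite": True,  # replace the file on every run
        },
    }

The plugin would then be pointed at the public URL of that file.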

Thierrydev

From your spider on Scrapinghub, you can either:

  1. Send each item as soon as it is scraped, using the item_scraped signal
  2. Send all items at once when your spider finishes, using the spider_closed signal

Of course, you will need an API endpoint on your website to receive that data.
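A minimal sketch of what this could look like as a Scrapy extension (the WP_IMPORT_ENDPOINT setting name and the endpoint itself are made up for illustration, and the requests library is assumed to be installed):

    import requests  # assumed to be installed in the project
    from scrapy import signals

    class PostItemsExtension:
        """Collect scraped items and POST them when the spider closes."""

        def __init__(self, endpoint):
            self.endpoint = endpoint
            self.items = []

        @classmethod
        def from_crawler(cls, crawler):
            # WP_IMPORT_ENDPOINT is a made-up setting name for this sketch.
            ext = cls(crawler.settings.get("WP_IMPORT_ENDPOINT"))
            crawler.signals.connect(ext.item_scraped, signal=signals.item_scraped)
            crawler.signals.connect(ext.spider_closed, signal=signals.spider_closed)
            return ext

        def item_scraped(self, item, spider):
            # For option 1, POST each item here instead of collecting it.
            self.items.append(dict(item))

        def spider_closed(self, spider):
            # Option 2: send everything in one request when the spider finishes.
            requests.post(self.endpoint, json=self.items, timeout=30)

You would enable it through the EXTENSIONS setting in settings.py.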

Hope that helps

Umair Ayub
  • And how can I build an API in my WordPress website to receive data and show it dynamically? – Jorge Garcia Mar 27 '19 at 18:08
  • You can create a Lambda (AWS Lambda) which will trigger every time a new file is uploaded to the S3 bucket (with the feed export); this Lambda can ping your website at a specific URL with the path of the file, e.g. mywebsite.com/import_from_s3/?path=myspider/myfile_20190328.json – Sewake Mar 28 '19 at 08:35
  • @JorgeGarcia If you don't know how to create a simple API to receive data, you should hire an actual programmer to do that – Umair Ayub Mar 29 '19 at 11:57
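A minimal sketch of the Lambda handler Sewake describes, assuming the bucket is configured to invoke it on ObjectCreated events (the import URL is the hypothetical one from the comment above):

    import urllib.parse
    import urllib.request

    # Hypothetical import endpoint from the comment above.
    IMPORT_URL = "https://mywebsite.com/import_from_s3/"

    def lambda_handler(event, context):
        # S3 puts one or more records in each event; object keys are URL-encoded.
        for record in event["Records"]:
            key = urllib.parse.unquote_plus(record["s3"]["object"]["key"])
            url = IMPORT_URL + "?path=" + urllib.parse.quote_plus(key)
            with urllib.request.urlopen(url, timeout=30) as resp:
                print("Pinged", url, "status", resp.status)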