
It took me a while to make sense of python-scrapinghub's logic and the way it interacts with Scrapinghub's API, but I've made progress in my current troubleshooting...

Using Scrapy, I have multiple web scrapers whose sole function is to create m3u playlists. From various video hosting websites I scrape the title, the video source stream URL, and, if the particular website calls for it, the categories. Each scraper is then deployed to Scrapinghub.

On Scrapinghub, each website lives in its own project, and among those projects are various other, unrelated Scrapy projects (this is relevant to the problem).

Using ScrapinghubClient, I first iterate through the projects to obtain all job keys:

from hubstorage import HubstorageClient
from scrapinghub import ScrapinghubClient, Connection

hc = HubstorageClient(auth='APIKEY')
client = ScrapinghubClient('APIKEY')
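print(client)
# List the IDs of every project this API key can access
ls = client.projects.list()

for j in ls:
    project = client.get_project(j)
    # Metadata for each job in the project, including its 'key'
    jobs_metadata = project.jobs.list()
#....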

Then I use a for loop to get each project's job key:

...
    for j in jobs_metadata:
        print(j['key'])
        key = j['key']
        job = project.jobs.get(j['key'])
        print(job) 

Then I read the scraped content by passing the appropriate dict keys, so I can write it out to a file (in this case, print it):

for item in job.items.iter():
    print('#EXTINF:0, ' + str(item['title']) + '\n' + str(item['vidsrc']) + '\n')

Here is where the issue begins. I need to handle the error raised when the requested key does not exist in the item dictionary, and then skip the dictionary being iterated. As it stands in the snippet above, if a dictionary does not contain the key, the script simply stops:

 print('#EXTINF:0, ' + str(item['title']) + '\n' + str(item['vidsrc']) + '\n')
KeyError: 'title'

Process finished with exit code 1

Should I use pass, or check against None to see whether the key exists first? How should I deal with this?

scriptso

1 Answer


Got it! By using try/except with pass, I can iterate through all the dictionaries and skip any dict that raises a KeyError:

    for item in job.items.iter():
        try:
            # A missing 'title' or 'vidsrc' raises KeyError here
            print('#EXTINF:0, ' + str(item['title']) + '\n' + str(item['vidsrc']) + '\n')
        except KeyError:
            # Skip items that lack a required key
            pass
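
For comparison, here is a minimal sketch of the "check whether the key exists first" approach the question asks about, as an alternative to try/except. The helper name `m3u_entries`, the `required` parameter, and the sample data are hypothetical and only for illustration. Plain dicts stand in for the items that `job.items.iter()` yields, on the assumption that those items behave like dictionaries, as the KeyError above suggests.

```python
def m3u_entries(items, required=("title", "vidsrc")):
    """Yield one #EXTINF entry per item that has every required key."""
    for item in items:
        # Skip items missing any required field instead of raising KeyError.
        if any(key not in item for key in required):
            continue
        yield "#EXTINF:0, {}\n{}\n".format(item["title"], item["vidsrc"])


# Hypothetical sample data standing in for job.items.iter().
sample_items = [
    {"title": "Clip A", "vidsrc": "http://example.com/a.m3u8"},
    {"vidsrc": "http://example.com/b.m3u8"},  # no title: skipped
    {"title": "Clip C", "vidsrc": "http://example.com/c.m3u8"},
]

entries = list(m3u_entries(sample_items))
for entry in entries:
    print(entry)
```

In the real script the call would presumably be `m3u_entries(job.items.iter())`. Checking membership up front makes it explicit which fields are required, whereas the try/except version also silently skips any other KeyError raised inside the block.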
scriptso