I am a newbie in Python. I have tried to find a solution everywhere but couldn't get through.
I have made a Scrapy project, and because of the project structure, the spiders are by default stored in the /spiders directory.
Problem: We generally run the crawlers from the /project directory, which contains /spiders inside it. The problem started after including this piece of code:
def implicit():
    from google.cloud import storage
    # If you don't specify credentials when constructing the client, the
    # client library will look for credentials in the environment.
    storage_client = storage.Client()
    # Make an authenticated API request
    buckets = list(storage_client.list_buckets())

import os
os.environ["GOOGLE_APPLICATION_CREDENTIALS"] = "remote url of the json file"
The code above doesn't catch the remote file if I crawl the spider from the /project directory; it throws the error
"File json_filename.json was not found."
But when I crawl the spider from the /project/spiders directory, it runs smoothly without any error.
I guess I am missing some fundamentals here, probably something to do with the directory I crawl from or the environment variables. Thanks, people.
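
In case it is relevant, this is the direction I was thinking of trying, though I am not sure it is the right fix (the file name service-account.json is just a placeholder for my credentials file):

import os

# Guess: build the credentials path relative to this file rather than the
# current working directory, so crawling from /project or /project/spiders
# should not matter.
# "service-account.json" is a placeholder name for my credentials file.
CREDENTIALS_PATH = os.path.join(
    os.path.dirname(os.path.abspath(__file__)),
    "service-account.json",
)
os.environ["GOOGLE_APPLICATION_CREDENTIALS"] = CREDENTIALS_PATH

Is something like this the right way to handle it, or should the environment variable be set differently?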