I'm trying to improve at Scrapy, and I'm facing a new kind of issue with query strings and variables.

1) It seems that the query string needs two inputs (storeInRadiusQuery & cache). Here are the request headers with the API URL.

2) When I go to the Params tab, I see the 2 query strings, which are grouped in a JSON format. Inside this JSON, there are 3 keys (operationName, query and variables).

In other Scrapy projects, the queries were much easier to format, but here I don't know how to handle the variables.

I tried Scrapy's FormRequest with no success:

data = {
        "operationName":"storeInRadiusQuery",
        "variables":{"currentLocation":"50.4376478855132,2.82123986359978","service":[],"storeChain":[],"deliveryTypes":[],"date":[],"__typename":"storeLocatorFilters"},
        "query":"query storeInRadiusQuery($currentLocation: String!, $service: [String], $storeChain: [String], $deliveryTypes: [String], $date: [String]) {\n  viewer {\n    storesInRadius(currentLocation: $currentLocation, services: $service, storeChaine: $storeChain, deliveryTypes: $deliveryTypes, date: $date, radius: 20, isStoreLocator: true) {\n      source {\n        ...StoresMapStoreItemType\n        ...StoreLocatorList\n        store_location\n        sort\n        __typename\n      }\n      __typename\n    }\n    __typename\n  }\n}\n\nfragment StoreLocatorList on StoreItemType {\n  store_id\n  store_name\n  street\n  zip_code\n  city\n  seo_url\n  day_0\n  day_0_morning_open_time\n  day_0_morning_close_time\n  day_0_afternoon_open_time\n  day_0_afternoon_close_time\n  day_1\n  day_1_morning_open_time\n  day_1_morning_close_time\n  day_1_afternoon_open_time\n  day_1_afternoon_close_time\n  day_2\n  day_2_morning_open_time\n  day_2_morning_close_time\n  day_2_afternoon_open_time\n  day_2_afternoon_close_time\n  day_3\n  day_3_morning_open_time\n  day_3_morning_close_time\n  day_3_afternoon_open_time\n  day_3_afternoon_close_time\n  day_4\n  day_4_morning_open_time\n  day_4_morning_close_time\n  day_4_afternoon_open_time\n  day_4_afternoon_close_time\n  day_5\n  day_5_morning_open_time\n  day_5_morning_close_time\n  day_5_afternoon_open_time\n  day_5_afternoon_close_time\n  day_6\n  day_6_morning_open_time\n  day_6_morning_close_time\n  day_6_afternoon_open_time\n  day_6_afternoon_close_time\n  __typename\n}\n\nfragment StoresMapStoreItemType on StoreItemType {\n  store_id\n  store_name\n  store_location\n  zip_code\n  street\n  city\n  seo_url\n  __typename\n}\n"}

url = "https://www.monoprix.fr/api/graphql?storeInRadiusQuery&cache"

yield scrapy.FormRequest(url,
                         method='POST',
                         body=json.dumps(data),
                         headers={'Content-Type': 'application/json'},
                         callback=self.parse)

I've seen this post on how to handle query strings, but I don't know how to pass the query string dictionaries properly.

Here I would like to modify the current location and radius parameters to find a list of shops.

If you have any idea... Thanks!

Edouard_W

1 Answer


The following link shows how to properly replicate a GraphQL request: https://scrapfly.io/blog/web-scraping-graphql-with-python/
Doing this in Scrapy is similar to what is shown in the link above.

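import json

import scrapy

# Paste the query copied from the browser developer tools between the
# triple quotes, turning each escaped "\n" into a real line break.
query = """
PASTE THE QUERY HERE
"""

json_data = {
    "query": query,
    "variables": {
        # Placeholder names and values: use the variables your query declares.
        "variable1": "abc",
        "variable2": "abc",
        "variable3": "abc",
    },
}

# Inside a spider method such as start_requests(); url is the API endpoint.
yield scrapy.Request(
    url=url,
    method="POST",
    body=json.dumps(json_data),
    headers={"Content-Type": "application/json"},
)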
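
To change the location and the radius, as asked in the question, something along these lines might work. It is only a sketch under a few assumptions: the query is a trimmed version of the one in the question (whether the server accepts the shortened field selection is untested), `build_store_payload` and `RADIUS_PLACEHOLDER` are hypothetical names introduced here, and the radius is substituted into the query text because the original query hardcodes it as `radius: 20` instead of declaring it as a variable.

```python
import json

# Trimmed version of the query from the question (field list shortened).
# RADIUS_PLACEHOLDER is a hypothetical marker, not part of the real API.
QUERY_TEMPLATE = """
query storeInRadiusQuery($currentLocation: String!, $service: [String], $storeChain: [String], $deliveryTypes: [String], $date: [String]) {
  viewer {
    storesInRadius(currentLocation: $currentLocation, services: $service, storeChaine: $storeChain, deliveryTypes: $deliveryTypes, date: $date, radius: RADIUS_PLACEHOLDER, isStoreLocator: true) {
      source {
        store_id
        store_name
        street
        zip_code
        city
      }
    }
  }
}
"""


def build_store_payload(latitude, longitude, radius=20):
    """Build the JSON body for storeInRadiusQuery (hypothetical helper)."""
    # The radius is a literal in the query text, so patch it in as a string.
    query = QUERY_TEMPLATE.replace("RADIUS_PLACEHOLDER", str(int(radius)))
    return {
        "operationName": "storeInRadiusQuery",
        "variables": {
            # Same "lat,lon" string format as in the browser request.
            "currentLocation": f"{latitude},{longitude}",
            "service": [],
            "storeChain": [],
            "deliveryTypes": [],
            "date": [],
            "__typename": "storeLocatorFilters",
        },
        "query": query,
    }


payload = build_store_payload(50.4376478855132, 2.82123986359978, radius=10)
body = json.dumps(payload)  # pass this as body= to scrapy.Request
```

The resulting `body` can then be sent with the same `scrapy.Request(..., method='POST', body=body, headers={'content-type': 'application/json'})` call shown above, once per location you want to query.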