I am writing a small script that would save me and my colleagues countless hours. I need to fetch data about my clients from a web page based on their client number (CLIENT_NO). The whole page is behind a login page, so I sign in manually in the browser and copy the Bearer and X-Auth tokens. That should be enough to authorize these requests, right?
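A minimal sketch of how the two copied tokens could be turned into request headers. The helper name `build_auth_headers` and the placeholder token values are hypothetical, and whether these two headers alone are sufficient depends on the server, which may also expect session cookies.

```python
def build_auth_headers(bearer_token: str, x_auth_token: str) -> dict:
    """Return headers carrying both tokens, with stray whitespace removed."""
    return {
        "Accept": "application/json",
        # The scheme name and the token are separated by exactly one space.
        "Authorization": f"Bearer {bearer_token.strip()}",
        "X-Auth-Token": x_auth_token.strip(),
    }


# Placeholder values only; real tokens would be pasted from the browser.
headers = build_auth_headers(" d2ba2XXXXXX ", "eyAidHlwXXXXX")
print(headers["Authorization"])
```

Stripping whitespace here guards against values pasted with a leading space, which would otherwise be sent as part of the header value.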
I then use the URL "https://moje.csobstavebni-oz.cz/group/nel/vysledky-vyhledavani?searchText=CLIENT_NO", which mimics a search request from the search bar. This takes me to the desired page. I am looking for fields such as "birthNumberIco" and others, as highlighted in the screenshot.
One problem I see is that the Request URL differs from the URL above. However, I cannot use the Request URL, because it contains CLIENT_ID rather than CLIENT_NO, and I don't know the CLIENT_ID.
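The two-step lookup this implies (resolve CLIENT_NO to CLIENT_ID first, then request the detail) could be sketched as follows. Everything specific here is an assumption: the JSON shape with `clientNo` and `clientId` keys, the `API_BASE` address, and the `/clients/{id}` path are hypothetical placeholders, since I don't know the real API.

```python
import json
from typing import Optional
from urllib.parse import quote

API_BASE = "https://api.example.invalid"  # hypothetical, not the real host


def find_client_id(search_json: str, client_no: str) -> Optional[str]:
    """Return the id of the search result whose number matches, else None."""
    for item in json.loads(search_json):
        if str(item.get("clientNo")) == str(client_no):
            return str(item["clientId"])
    return None


def detail_url(client_id: str) -> str:
    """Build the (hypothetical) detail URL for a resolved client id."""
    return f"{API_BASE}/clients/{quote(client_id, safe='')}"


# Made-up search response, used only to exercise the helpers.
sample = '[{"clientNo": "12345", "clientId": "abc-1"}, {"clientNo": "67890", "clientId": "abc-2"}]'
client_id = find_client_id(sample, "67890")
print(detail_url(client_id))
```

The real field names would have to be read from the search response in the browser's network tab.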
Unfortunately, I can't get anything out of it: Python always returns an empty list []. I suspect it is because of the authorization keys and tokens (as you can see in my headers, they are deliberately truncated).
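One way to narrow this down would be to check what the server actually returns before parsing it. The sketch below assumes the response may be an HTML page (for example a login page or the rendered search results) rather than JSON. The helper `describe_response` is hypothetical.

```python
import json


def describe_response(status: int, content_type: str, body: bytes) -> str:
    """Classify a response so that auth and format problems can be told apart."""
    if status in (401, 403):
        return "unauthorized"
    if "json" not in content_type.lower():
        return "not-json"
    try:
        json.loads(body)
    except ValueError:  # also covers json.JSONDecodeError
        return "invalid-json"
    return "json"


print(describe_response(200, "text/html; charset=UTF-8", b"<html></html>"))
print(describe_response(200, "application/json", b'{"birthNumberIco": "X"}'))
```

Inside a Scrapy callback, the arguments could presumably be taken from `response.status`, the `Content-Type` entry of `response.headers` (decoded from bytes), and `response.body`.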
I have tried several approaches I found on YouTube, but right now I am completely stuck and don't know what else to try. Maybe there is just a small mistake somewhere whose fix will make the whole thing work.
[Screenshots: screenshot 1, screenshot 2, screenshot 3]
Thank you so much in advance!
import json

import scrapy


class KlientUdaje(scrapy.Spider):
    name = 'klient_udaje'
    start_urls = ['https://moje.csobstavebni-oz.cz/group/nel']

    # Headers copied from the browser request; token values are truncated on purpose.
    headers = {
        "Accept": "*/*",
        "Accept-Encoding": "gzip, deflate, br",
        "Accept-Language": "en-US,en;q=0.9,cs;q=0.8",
        "Authorization": "Bearer d2ba2XXXXXX",
        "Cache-Control": "no-cache",
        "Connection": "keep-alive",
        "Host": "moje.csobstavebni.cz",
        "Origin": "https://moje.csobstavebni-oz.cz",
        "Pragma": "no-cache",
        "Referer": "https://moje.csobstavebni-oz.cz/",
        "RequestId": "cklydjuq000073q679q5kd2tb",
        "Sec-Fetch-Dest": "empty",
        "Sec-Fetch-Mode": "cors",
        "Sec-Fetch-Site": "cross-site",
        "SystemId": "47",
        "User-Agent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/89.0.4389.72 Safari/537.36 Edg/89.0.774.45",
        "X-Auth-Token": "eyAidHlwIjogIkpXVCIsICJraWQiOiAiT2pDY3ErdklKTXXXXX",
    }

    def parse(self, response):
        # CLIENT_NO stands in for the actual client number.
        url = 'https://moje.csobstavebni-oz.cz/group/nel/vysledky-vyhledavani?searchText=CLIENT_NO'
        yield scrapy.Request(url,
                             callback=self.parse_api,
                             headers=self.headers)

    def parse_api(self, response):
        # Parse the response body as JSON and print the requested field.
        raw_data = response.body
        data = json.loads(raw_data)
        rodne_cislo = data['birthNumberIco']
        print(rodne_cislo)