I want to scrape images from midjourney.com. I had a perfectly working script that did this, but now my requests get blocked: I get a 403 (Forbidden) response. To validate my code, I copied the request for the main page out of my browser and converted it into a script that makes the same request.
My guess is that this is a new Content Security Policy to prevent loading the site outside of a browser. Does anyone have an idea how to get around this?
I would really appreciate some hints or ideas. Btw, this is the test script:
import requests

# Cookies copied from the browser session. The values are URL-encoded;
# the stray "^" characters left over from the Windows cURL export are removed.
cookies = {
    'imageSize': 'medium',
    'imageLayout_2': 'hover',
    'getImageAspect': '2',
    'fullWidth': 'false',
    'showHoverIcons': 'true',
    '_dd_s': 'rum=0&expire=1687926962555',
    '__Host-next-auth.csrf-token': 'c19e3aa92427d9ade40721425bf5affb4955e52f766d0c2a8ca064d3ffef6d9c%7Cda294da6acc19312b0399c98e0148de068f41999f76145e15717bdcd3ee9f5c9',
    '__Secure-next-auth.callback-url': 'https%3A%2F%2Fwww.midjourney.com',
}

# Headers copied from the browser (Firefox 114) request for the main page.
headers = {
    'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:109.0) Gecko/20100101 Firefox/114.0',
    'Accept': 'text/html,application/xhtml+xml,application/xml;q=0.9,image/avif,image/webp,*/*;q=0.8',
    'Accept-Language': 'en-US,en;q=0.5',
    # 'Accept-Encoding': 'gzip, deflate, br',
    'Alt-Used': 'www.midjourney.com',
    'Connection': 'keep-alive',
    # 'Cookie': 'imageSize=medium; imageLayout_2=hover; getImageAspect=2; fullWidth=false; showHoverIcons=true; _dd_s=rum=0&expire=1687926962555; __Host-next-auth.csrf-token=c19e3aa92427d9ade40721425bf5affb4955e52f766d0c2a8ca064d3ffef6d9c%7Cda294da6acc19312b0399c98e0148de068f41999f76145e15717bdcd3ee9f5c9; __Secure-next-auth.callback-url=https%3A%2F%2Fwww.midjourney.com',
    'Upgrade-Insecure-Requests': '1',
    'Sec-Fetch-Dest': 'document',
    'Sec-Fetch-Mode': 'navigate',
    'Sec-Fetch-Site': 'none',
    'Sec-Fetch-User': '?1',
    # Requests doesn't support trailers
    # 'TE': 'trailers',
}

# Same request the browser makes for the main page.
response = requests.get('https://www.midjourney.com/home/?callbackUrl=%2Fapp%2F', cookies=cookies, headers=headers)
print(response.status_code)  # currently prints 403
print(response.text)
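To check the Content Security Policy guess, it might help to look at the headers of the 403 response rather than only its body. The helper below is a minimal sketch: the name `summarize_response` and the list of headers it picks out are my own assumptions for illustration, not anything taken from the site. It only reports whether a `Content-Security-Policy` header is present and which server answered.

```python
from typing import Mapping

# Headers worth a look when a request is refused. This selection is an
# illustrative assumption, not something specific to midjourney.com.
INTERESTING_HEADERS = ("server", "content-security-policy", "retry-after")


def summarize_response(status_code: int, headers: Mapping[str, str]) -> dict:
    """Collect the status and a few diagnostic headers from an HTTP response."""
    # Header names are case-insensitive, so normalize them before the lookup.
    lowered = {name.lower(): value for name, value in headers.items()}
    found = {name: lowered[name] for name in INTERESTING_HEADERS if name in lowered}
    return {
        "status": status_code,
        "blocked": status_code == 403,
        "has_csp": "content-security-policy" in lowered,
        "headers": found,
    }
```

With the test script, it could be called as `summarize_response(response.status_code, response.headers)` and the result printed next to the status code.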
Load website with CSP