I have been using Scrapy with Splash to scrape images from different websites. Some pages load their images dynamically, and I can't get them to load fully: in the rendered HTML, the 'src' attribute of those images is missing.
I started by calling Splash from Scrapy, then switched to the Splash web interface to isolate the problem.
I have tried everything in https://splash.readthedocs.io/en/latest/faq.html#website-is-not-rendered-correctly, but the images still don't load.
I ran into this on https://decathlon.es, and I don't know whether I'll hit the same problem on other sites later.
This is the script that I used to render the page:
function main(splash, args)
  -- Browser settings, applied before the page is loaded.
  splash.private_mode_enabled = false
  splash.images_enabled = true
  splash:set_user_agent("Different User Agent")
  splash.plugins_enabled = true
  splash.html5_media_enabled = true

  -- Load the page and give its scripts time to run.
  assert(splash:go(args.url))
  assert(splash:wait(3.5))

  -- Grow the viewport to the full page size, then wait again.
  local width, height = splash:set_viewport_full()
  assert(splash:wait(3.5))

  return {
    html = splash:html(),
    png = splash:png(),
    har = splash:har(),
  }
end
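
For anyone who wants to reproduce this outside the Splash web interface, here is a minimal sketch of how a script like this could be submitted to Splash's `execute` HTTP endpoint using only the Python standard library. The address `http://localhost:8050` (a local Splash instance on its default port), the shortened Lua script, and the helper names `build_execute_request` and `render` are assumptions made for illustration, not part of my actual setup.

```python
import json
import urllib.request

# Assumption: a Splash instance is running locally on its default port.
SPLASH_EXECUTE_URL = "http://localhost:8050/execute"

# Shortened version of the script above; it returns only the rendered HTML.
LUA_SOURCE = """
function main(splash, args)
  assert(splash:go(args.url))
  assert(splash:wait(3.5))
  return {html = splash:html()}
end
"""


def build_execute_request(page_url, lua_source=LUA_SOURCE,
                          endpoint=SPLASH_EXECUTE_URL):
    """Build (but do not send) a POST request for the execute endpoint.

    Hypothetical helper: the JSON body carries the script as `lua_source`;
    the remaining keys (here `url`) are exposed to the script as `args`.
    """
    payload = {"lua_source": lua_source, "url": page_url}
    return urllib.request.Request(
        endpoint,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def render(page_url, timeout=60):
    """Send the request and decode the JSON table returned by the script."""
    request = build_execute_request(page_url)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))
```

With Splash running, `render("https://decathlon.es")["html"]` would give the rendered markup, which can then be checked for `img` tags that still lack a 'src' attribute.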