I am looking for ways to extract the main image from a product page on a retailer's website. The problem is that a product page contains multiple images (related images). One approach I thought might work is to extract all the image links, download each image, compare their sizes, and treat the one with the largest size in bytes as the main product image.
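To make the brute-force idea concrete, here is a rough Python sketch using only the standard library. The names `largest_image`, `download_size` and the `size_of` parameter are made up for illustration, and it assumes every image is referenced through a plain `<img src="...">` tag, which will not hold for lazy-loaded or CSS background images.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin
from urllib.request import urlopen


class _ImgCollector(HTMLParser):
    """Collects the src attribute of every <img> tag in the page."""

    def __init__(self):
        super().__init__()
        self.srcs = []

    def handle_starttag(self, tag, attrs):
        if tag == "img":
            src = dict(attrs).get("src")
            if src:
                self.srcs.append(src)


def download_size(url):
    """Download the image and return its size in bytes (one request per image)."""
    with urlopen(url, timeout=10) as resp:
        return len(resp.read())


def largest_image(html, base_url, size_of=download_size):
    """Return the absolute URL of the image with the most bytes, or None.

    size_of is injectable so the selection logic can be tried without
    touching the network.
    """
    collector = _ImgCollector()
    collector.feed(html)
    # Resolve relative links against the page URL.
    urls = [urljoin(base_url, src) for src in collector.srcs]
    if not urls:
        return None
    return max(urls, key=size_of)
```

Calling `largest_image(page_html, page_url)` downloads every image on the page once, which is exactly the cost I would like to avoid.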
Obviously, downloading every image would be very inefficient. We know that most retailers use one of a handful of major ecommerce platforms, such as Magento or BigCommerce. Is it possible to detect the ecommerce platform and leverage the template each one provides to precisely extract the main product image?
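Roughly, what I have in mind looks like the sketch below. Everything in the two lookup tables is an assumption for illustration: the marker substrings are guesses at what pages on each platform tend to contain, and the CSS class names are placeholders for whatever each platform's default template actually uses, so both would need to be verified against real pages.

```python
from html.parser import HTMLParser

# Assumed fingerprints (lowercase substrings); not verified against real stores.
PLATFORM_MARKERS = {
    "magento": ("text/x-magento-init",),
    "shopify": ("cdn.shopify.com",),
    "bigcommerce": ("bigcommerce.com",),
}

# Hypothetical class of the main product <img> in each platform's template.
MAIN_IMAGE_CLASS = {
    "magento": "gallery-placeholder__image",
    "shopify": "product__image",
    "bigcommerce": "productView-image--default",
}


def detect_platform(html):
    """Return the first platform whose marker appears in the page, else None."""
    lowered = html.lower()
    for platform, markers in PLATFORM_MARKERS.items():
        if any(marker in lowered for marker in markers):
            return platform
    return None


class _ClassImageFinder(HTMLParser):
    """Remembers the src of the first <img> carrying the wanted class."""

    def __init__(self, wanted_class):
        super().__init__()
        self.wanted_class = wanted_class
        self.src = None

    def handle_starttag(self, tag, attrs):
        if tag != "img" or self.src is not None:
            return
        attr = dict(attrs)
        classes = (attr.get("class") or "").split()
        if self.wanted_class in classes and attr.get("src"):
            self.src = attr["src"]


def main_image_for(html):
    """Detect the platform, then apply that platform's image rule."""
    platform = detect_platform(html)
    if platform is None:
        return None  # unknown platform: would need some other fallback
    finder = _ClassImageFinder(MAIN_IMAGE_CLASS[platform])
    finder.feed(html)
    return finder.src
```

Unlike the download-and-compare approach, this only needs the page HTML, but it returns `None` for any store that customizes its theme or runs on a platform not in the table.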
I know the approach would never be perfect, but I am looking for an algorithm that is right most of the time, about 80% or so. Is that doable?