I'm currently using PhantomJS and Selenium to load web pages because the host website is behind CloudFlare DDoS protection, so I can't use anything that doesn't run JavaScript. This has worked well for a while, but the website recently started serving its images from its own CDN, and this causes problems when setting PictureBox.ImageLocation to the src.

Is there any way to get an <img> tag's src and convert it to a Bitmap or Image, so that I can use the image directly from PhantomJS in my PictureBox? That would be awesome.

Thanks for the help.

Sasha
  • @JeffC Sharing code would not be useful, since I have no code that works in this situation. I asked the question because I've found nothing online, in the hope that someone has done something similar or understands Selenium and PhantomJS better than I do. – Sasha Jul 29 '17 at 01:02
  • Obviously you aren't going to have working code but you need to share code attempts... something to show that you've done some investigation and the results of that investigation. You could also share a link to the site. – JeffC Jul 29 '17 at 02:12

1 Answer

For those who are in the same situation as me:

It turns out that it wasn't easy to set up appropriate caching for PhantomJS and Selenium, so I took an alternative route, which ended up working.

When PhantomJS accesses a website that is locked behind a JS wall (such as CloudFlare DDoS protection), it will most likely store a cookie holding an auth token of sorts, which says that your browser has passed the test.

At first, reusing that cookie outside PhantomJS didn't work for me, because it seems CloudFlare also records which user agent authenticated for that token, and any mismatch causes the token to be rejected.

I managed to solve this using the following piece of code:

private Image GetImage(string ImageLocation)
{
    byte[] data;
    using (CustomWebClient WC = new CustomWebClient())
    {
        // Must be the same user agent string that the PhantomJS driver is using.
        WC.Headers.Add(System.Net.HttpRequestHeader.UserAgent, "Mozilla/5.0 (iPhone; CPU iPhone OS 10_0_1 like Mac OS X) AppleWebKit/601.1 (KHTML, like Gecko) CriOS/53.0.2785.109 Mobile/14A403 Safari/601.1.46");
        // Reuse the CloudFlare clearance cookie that PhantomJS obtained.
        WC.Headers.Add(System.Net.HttpRequestHeader.Cookie, "cf_clearance=" + PhantomObject.Manage().Cookies.GetCookieNamed("cf_clearance").Value);
        data = WC.DownloadData(ImageLocation);
    }
    // The stream is deliberately left open: a Bitmap created from a stream
    // needs that stream to stay open for the lifetime of the Bitmap.
    return new Bitmap(new System.IO.MemoryStream(data));
}

In this code, PhantomObject is my PhantomJS driver object, and CustomWebClient is just a normal WebClient with a few adjustments for the website I was using.
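The adjustments inside CustomWebClient are not shown here, so the following is only a rough sketch of what such a subclass could look like. The timeout property and the compression setting are assumptions made for illustration, not the actual tweaks used in this answer.

```csharp
using System;
using System.Net;

// Hypothetical stand-in for CustomWebClient. The real adjustments are not
// shown in the answer; the timeout and decompression settings are assumptions.
public class CustomWebClient : WebClient
{
    // Assumed default of 30 seconds; change it to suit the target site.
    public int TimeoutMilliseconds { get; set; } = 30000;

    protected override WebRequest GetWebRequest(Uri address)
    {
        WebRequest request = base.GetWebRequest(address);
        if (request is HttpWebRequest http)
        {
            http.Timeout = TimeoutMilliseconds;
            // Let the framework transparently decode gzip/deflate responses.
            http.AutomaticDecompression =
                DecompressionMethods.GZip | DecompressionMethods.Deflate;
        }
        return request;
    }
}
```

Overriding GetWebRequest is the usual hook for per-request settings that WebClient itself does not expose.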

I essentially use the same faked user agent that my PhantomJS driver was using and pass the CloudFlare clearance cookie in the headers. With those two in place, my WebClient was able to access the website and download the image's data, which I then turned into a bitmap and returned.
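Because a user agent mismatch invalidates the token, one way to avoid hard-coding the string is to ask the driver for it at run time. This is a minimal sketch under that assumption; the class and method names (BrowserSessionHeaders, Apply, CopyFromDriver) are hypothetical, and it assumes the driver supports IJavaScriptExecutor, as PhantomJSDriver does.

```csharp
using System;
using System.Net;
using OpenQA.Selenium;

// Hypothetical helper: copies the driver's identity onto a WebClient so that
// both present the same user agent and clearance cookie.
public static class BrowserSessionHeaders
{
    // Pure part: set the two headers from already-known values.
    public static void Apply(WebClient client, string userAgent, string clearance)
    {
        client.Headers[HttpRequestHeader.UserAgent] = userAgent;
        client.Headers[HttpRequestHeader.Cookie] = "cf_clearance=" + clearance;
    }

    // Selenium part: read the values from the live browser session.
    public static void CopyFromDriver(IWebDriver driver, WebClient client)
    {
        // Ask the browser itself which user agent it is sending.
        string userAgent = (string)((IJavaScriptExecutor)driver)
            .ExecuteScript("return navigator.userAgent;");

        // Fully qualified because System.Net also defines a Cookie type.
        OpenQA.Selenium.Cookie cookie =
            driver.Manage().Cookies.GetCookieNamed("cf_clearance");
        if (cookie == null)
        {
            throw new InvalidOperationException(
                "No cf_clearance cookie found; load a page in the driver first.");
        }

        Apply(client, userAgent, cookie.Value);
    }
}
```

With this in place, the two Headers.Add calls in GetImage could be replaced by BrowserSessionHeaders.CopyFromDriver(PhantomObject, WC).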

Sasha