
I'm using C# to scrape information from a Chinese government site, but I can't retrieve anything. The site in question is http://www.ajxxgk.jcy.gov.cn/html/zdajxx/index.html.

However, when I try to pull the page's HTML, I keep getting this error. Is there a way around it?

using System.Net;

class Program
{
    static void Main(string[] args)
    {
        var program = new Program();
        // Request one of the listing pages; this is the call that fails.
        var data = program.htmlReturn("http://www.ajxxgk.jcy.gov.cn/html/zdajxx/4.html");
    }

    // Downloads the raw HTML of the given URL and returns it as a string.
    public string htmlReturn(string link)
    {
        using (WebClient client = new WebClient())
        {
            string htmlCode = client.DownloadString(link);
            return htmlCode;
        }
    }
}

Also, I can access the site normally and navigate it.

Neuron
cybera
  • Not sure if I got spammed or hacked clicking on that link. Also, show your code. – TheGeneral Mar 20 '18 at 10:02
  • Can you access it normally? – user202729 Mar 20 '18 at 10:03
  • Keep trying. 521 appears to be a [custom Cloudflare - Server is Down](https://en.wikipedia.org/wiki/List_of_HTTP_status_codes#5xx_Server_errors) error – phuzi Mar 20 '18 at 10:03
  • @user202729 Yes I can access it normally. – cybera Mar 20 '18 at 10:05
  • Apparently they run some DDoS protection and your client doesn't pass it. Maybe fake a browser call. – H H Mar 20 '18 at 10:07
  • @phuzi I saw results online saying that, but it's been showing nothing but this error while the site is obviously up. It makes no sense unless they are deliberately trying to block people. However, Postman manages to get a response, and I'm not sure why. – cybera Mar 20 '18 at 10:08
  • @Henk Holterman This seems like it might be the answer; I'll try that. – cybera Mar 20 '18 at 10:12
  • Here is a clue: https://stackoverflow.com/questions/44076962/how-do-i-set-a-default-user-agent-on-an-httpclient , and grab the string from a Chrome or Firefox request. – H H Mar 20 '18 at 10:13
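
The suggestion in the comments (send a browser-like User-Agent) might look roughly like the sketch below. It keeps the question's `WebClient` approach. The `BrowserLikeFetcher` class, its method names, and the User-Agent value are assumptions made for illustration; the string should be replaced with one copied from a real Chrome or Firefox request. Whether this header alone gets past the site's protection is not confirmed anywhere in this thread, and the server may also expect cookies or other headers.

```csharp
using System;
using System.Net;

static class BrowserLikeFetcher
{
    // Illustrative value only (an assumption): copy the real string from
    // your own browser's network tab.
    public const string UserAgent =
        "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 " +
        "(KHTML, like Gecko) Chrome/65.0.3325.181 Safari/537.36";

    // Builds a WebClient whose requests carry the browser-like User-Agent.
    public static WebClient CreateClient()
    {
        var client = new WebClient();
        client.Headers.Add("User-Agent", UserAgent);
        return client;
    }

    // Downloads the page; on an HTTP error, logs the numeric status code
    // (for example 521) before rethrowing the original exception.
    public static string DownloadHtml(string link)
    {
        using (var client = CreateClient())
        {
            try
            {
                return client.DownloadString(link);
            }
            catch (WebException ex)
            {
                var response = ex.Response as HttpWebResponse;
                if (response != null)
                {
                    Console.Error.WriteLine("Server replied with HTTP " + (int)response.StatusCode);
                }
                throw;
            }
        }
    }
}
```

With this in place, `htmlReturn` from the question could simply return `BrowserLikeFetcher.DownloadHtml(link)`. Logging the status code makes it easier to tell whether the server still rejects the request after the header is added.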

0 Answers