
This is the code I use to get the content of a website:

private string GetContent(string url)
{
    var request = (HttpWebRequest)WebRequest.Create(url);
    request.Method = "GET";
    var content = String.Empty;
    HttpStatusCode statusCode;
    using (var response = (HttpWebResponse)request.GetResponse())
    using (var stream = response.GetResponseStream())
    {
        // Pick the encoding from the Content-Type charset, falling back to UTF-8.
        var contentType = response.ContentType;
        Encoding encoding = null;
        if (contentType != null)
        {
            // Capture only the charset token, without quotes or trailing parameters.
            var match = Regex.Match(contentType, @"charset=[""']?([^""';\s]+)", RegexOptions.IgnoreCase);
            if (match.Success)
                encoding = Encoding.GetEncoding(match.Groups[1].Value);
        }

        encoding = encoding ?? Encoding.UTF8;

        statusCode = response.StatusCode;
        using (var reader = new StreamReader(stream, encoding))
            content = reader.ReadToEnd();
    }
    return content;
}

I tried running this code with the link http://google.com and it worked. But when I run it with the link http://batdongsan.com.vn/, it doesn't work and displays "sorry! something went wrong.". I don't know why this happens. How can I get the content of the second link?

SBI
Hau Le

1 Answer


It looks like the site checks the User-Agent header, and since HttpWebRequest doesn't set one by default, the site returns an error message. I added the value my browser sends and was able to get the contents of that link. Just add the line that sets UserAgent, as shown below:

// ...
var request = (HttpWebRequest)WebRequest.Create(url);
request.Method = "GET";
request.UserAgent = "Mozilla/5.0 (Windows NT 6.1; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/47.0.2526.106 Safari/537.36";

var content = String.Empty;
HttpStatusCode statusCode;
// ...
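
For reference, here is a minimal self-contained sketch of the same idea, in case it helps to see the pieces together. The names PageFetcher, BrowserUserAgent, CreateRequest and ReadBody are made up for illustration, and the body is read as UTF-8 for brevity instead of parsing the charset as the question's code does. It also catches WebException so that an HTTP error status and the error page can be inspected, which may help when diagnosing a site that rejects the request.

```csharp
using System;
using System.IO;
using System.Net;
using System.Text;

public static class PageFetcher
{
    // Assumption: any common browser User-Agent string should do; this one is the value from the answer.
    public const string BrowserUserAgent =
        "Mozilla/5.0 (Windows NT 6.1; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/47.0.2526.106 Safari/537.36";

    // Builds the request without sending it, so the headers can be checked first.
    public static HttpWebRequest CreateRequest(string url, string userAgent = BrowserUserAgent)
    {
        var request = (HttpWebRequest)WebRequest.Create(url);
        request.Method = "GET";
        request.UserAgent = userAgent;
        return request;
    }

    public static string GetContent(string url)
    {
        var request = CreateRequest(url);
        try
        {
            using (var response = (HttpWebResponse)request.GetResponse())
                return ReadBody(response);
        }
        catch (WebException ex) when (ex.Response is HttpWebResponse)
        {
            // GetResponse throws on 4xx/5xx statuses; the error page body is still readable.
            using (var error = (HttpWebResponse)ex.Response)
                return "HTTP " + (int)error.StatusCode + ": " + ReadBody(error);
        }
    }

    // Assumption: the body is UTF-8; the question's charset detection could be used here instead.
    private static string ReadBody(HttpWebResponse response)
    {
        using (var stream = response.GetResponseStream())
        using (var reader = new StreamReader(stream, Encoding.UTF8))
            return reader.ReadToEnd();
    }
}
```

Calling PageFetcher.GetContent("http://batdongsan.com.vn/") would then send the request with the browser-like User-Agent. Whether a given site accepts that particular string is up to the site, so treat the value as a starting point.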
Volkan Paksoy