I am trying to get the source code of a web page. The WebBrowser control gives me the information I am looking for, but I want to use HttpWebRequest instead, and it returns different source code than WebBrowser's DocumentText.

Can anyone tell me how to get the same source code with HttpWebRequest that I get from WebBrowser?

WebBrowser Code:

' Note: Navigate is asynchronous; DocumentText is only complete once DocumentCompleted has fired.
WebBrowser1.Navigate("http://www.networksolutions.com/whois/results.jsp?domain=" & txtUrl.Text)
textbox1.Text = WebBrowser1.DocumentText

WebBrowser Result:

http://textbin.com/f4368

HttpWebRequest Code:

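' url holds the address to fetch (the same whois results URL as above)
Dim request As System.Net.HttpWebRequest = DirectCast(System.Net.WebRequest.Create(url), System.Net.HttpWebRequest)
request.KeepAlive = False
request.Timeout = 10000 ' milliseconds

Dim sourcecode As String
' Using blocks close the response and reader even if reading fails
Using response As System.Net.HttpWebResponse = DirectCast(request.GetResponse(), System.Net.HttpWebResponse)
    Using sr As New System.IO.StreamReader(response.GetResponseStream())
        sourcecode = sr.ReadToEnd()
    End Using
End Using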

HttpWebRequest Result:

http://textbin.com/2h445

Cody Gray - on strike
Nasim

2 Answers


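Some sites inspect the User-Agent header or other details of the request and vary the content they return accordingly. HttpWebRequest sends no User-Agent header unless you set one, so the server may not treat it the way it treats a browser. I've written a number of projects that download web pages and have run into this a few times.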
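A minimal sketch of that idea, assuming the site only keys on the User-Agent header. The names `PageFetcher`, `CreateBrowserLikeRequest`, and `DownloadSource`, and the default user-agent string, are illustrative and not from the original post; in practice you would copy the string your WebBrowser actually sends (a tool such as Fiddler will show it).

```vbnet
Imports System.IO
Imports System.Net

Module PageFetcher
    ' Hypothetical default: an IE-style user agent. Replace it with the
    ' string your own browser sends.
    Public Const BrowserUserAgent As String = "Mozilla/4.0 (compatible; MSIE 8.0; Windows NT 6.1)"

    ' Builds a request that identifies itself like a browser.
    ' No network traffic happens until GetResponse is called.
    Public Function CreateBrowserLikeRequest(url As String,
                                             Optional userAgent As String = BrowserUserAgent) As HttpWebRequest
        Dim request As HttpWebRequest = DirectCast(WebRequest.Create(url), HttpWebRequest)
        request.UserAgent = userAgent
        request.KeepAlive = False
        request.Timeout = 10000 ' milliseconds
        Return request
    End Function

    ' Downloads the page and returns the raw HTML exactly as the server sent it.
    Public Function DownloadSource(url As String) As String
        Dim request As HttpWebRequest = CreateBrowserLikeRequest(url)
        Using response As WebResponse = request.GetResponse()
            Using reader As New StreamReader(response.GetResponseStream())
                Return reader.ReadToEnd()
            End Using
        End Using
    End Function
End Module
```

If the output still differs after matching the user agent, the remaining difference may come from other headers (cookies, referrer) or from script that runs inside the WebBrowser control, neither of which this sketch addresses.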

Jonathan Wood
  • Also, WebBrowser executes JavaScript that may modify the DOM. – Shurdoof Jan 01 '11 at 06:23
  • To solve a problem with this root cause, you could run a tool like Fiddler to examine the difference between your requests. Then you can try to tweak your user-agent, headers, referrer, etc. to match. – Merlyn Morgan-Graham Jan 01 '11 at 06:26
  • Hi, thanks a lot for the reply. I have added the User-Agent as you said and it's working now. Here is the code I used: Http.UserAgent = "Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1; .NET CLR 1.0.3705)" – Nasim Jan 01 '11 at 06:38

This is an old-ish question, but the reason this happens is that MSHTML, the Windows HTML rendering engine, modifies the incoming HTML before it renders it. You can change the rendering mode of the .NET WebBrowser control to use the IE7, IE8, or IE9 rendering engine, and you'll see huge differences in the HTML each one returns from the browser. IE9's output is going to be the most similar to what comes down through HttpWebRequest.
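One common way to switch the rendering mode is the per-application FEATURE_BROWSER_EMULATION registry value. The sketch below is an illustration under that assumption, not necessarily the method this answer had in mind; the module and function names are hypothetical, and only the standards-mode values (major version times 1000, e.g. 9000 for IE9) are covered.

```vbnet
Imports System
Imports Microsoft.Win32

Module BrowserEmulation
    Private Const EmulationKey As String =
        "Software\Microsoft\Internet Explorer\Main\FeatureControl\FEATURE_BROWSER_EMULATION"

    ' Maps an IE major version to its emulation value:
    ' 7 -> 7000, 8 -> 8000, 9 -> 9000, and so on up to 11.
    Public Function EmulationValueFor(ieMajorVersion As Integer) As Integer
        If ieMajorVersion < 7 OrElse ieMajorVersion > 11 Then
            Throw New ArgumentOutOfRangeException("ieMajorVersion")
        End If
        Return ieMajorVersion * 1000
    End Function

    ' Writes the value for the given executable under HKEY_CURRENT_USER.
    ' exeName is the file name of the hosting process, e.g. "MyApp.exe".
    Public Sub SetEmulation(exeName As String, ieMajorVersion As Integer)
        Using key As RegistryKey = Registry.CurrentUser.CreateSubKey(EmulationKey)
            key.SetValue(exeName, EmulationValueFor(ieMajorVersion), RegistryValueKind.DWord)
        End Using
    End Sub
End Module
```

A call such as SetEmulation("MyApp.exe", 9) would need to run before the WebBrowser control is created. When running under the Visual Studio debugger the process name may differ (for example "MyApp.vshost.exe"), so the entry has to match whichever executable is actually hosting the control.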

Patrick Sears