
I have a method to get ids and xpaths if given a particular url. How do I pass in the username and password with the request so that I can scrape a url that requires a username and password?

using HtmlAgilityPack;

_web = new HtmlWeb();

internal Dictionary<string, string> GetidsAndXPaths(string url)
{
    var webidsAndXPaths = new Dictionary<string, string>();
    var doc = _web.Load(url);
    var nodes = doc.DocumentNode.SelectNodes("//*[@id]");
    if (nodes == null) return webidsAndXPaths;
    // code to get all the xpaths and ids
    return webidsAndXPaths;
}

Should I use a web request to get the page source and then pass that file into the method above?

var wc = new WebClient();
wc.Credentials = new NetworkCredential("UserName", "Password");
wc.DownloadFile("http://somewebsite.com/page.aspx", @"C:\localfile.html");
jessehouwing
Jonathan Kittell
  • I would first paste any errors that you are getting. Second, try using `System.Net.Http.HttpClient` instead, as it's more clear how to set authentication details. – Michael J. Gray Apr 25 '14 at 16:36

1 Answer


HtmlWeb.Load has several overloads. Two of them accept credentials: one takes a NetworkCredential instance, and the other takes a username and password directly.

Name                                                  Description
Load(String)                                          Gets an HTML document from an Internet resource.
Load(String, String)                                  Loads an HTML document from an Internet resource.
Load(String, String, WebProxy, NetworkCredential)     Loads an HTML document from an Internet resource.
Load(String, String, Int32, String, String)           Loads an HTML document from an Internet resource.

You do not have to set up a WebProxy instance of your own; alternatively, you can pass in the system default one.
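As a rough illustration of the NetworkCredential overload, here is a minimal sketch. The URL and the "UserName"/"Password" values are the placeholders from the question, the class name is made up for this example, and passing an empty WebProxy is an assumption about how to satisfy the proxy parameter without configuring a real proxy. Note that NetworkCredential is used for HTTP authentication challenges, so this will not help with a site that logs in through an HTML form.

```csharp
using System;
using System.Net;
using HtmlAgilityPack;

// Hypothetical class name, used only for this sketch.
internal static class CredentialOverloadSketch
{
    private static void Main()
    {
        var web = new HtmlWeb();

        // Placeholder values taken from the question; replace with real ones.
        var credentials = new NetworkCredential("UserName", "Password");

        // Assumption: an empty WebProxy (no address set) sends requests directly.
        var proxy = new WebProxy();

        // Overload used: Load(url, HTTP method, WebProxy, NetworkCredential).
        HtmlDocument doc = web.Load(
            "http://somewebsite.com/page.aspx", "GET", proxy, credentials);

        // Same query as in the question: every element that has an id attribute.
        var nodes = doc.DocumentNode.SelectNodes("//*[@id]");
        Console.WriteLine(nodes == null ? 0 : nodes.Count);
    }
}
```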

Alternatively, you can attach a handler to HtmlWeb.PreRequest and set up the credentials on the request there.

htmlWeb.PreRequest += (request) => {
    request.Credentials = new NetworkCredential(...);
    return true;
};
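Putting this together with the method from the question might look roughly like the sketch below. The PageScraper class name, its constructor parameters, and the loop that fills the dictionary from each node's Id and XPath are assumptions made for illustration; the question does not show how the ids and XPaths are actually collected.

```csharp
using System.Collections.Generic;
using System.Net;
using HtmlAgilityPack;

// Hypothetical wrapper class, used only for this sketch.
internal class PageScraper
{
    private readonly HtmlWeb _web = new HtmlWeb();

    internal PageScraper(string userName, string password)
    {
        // Called before each request is sent; returning true lets it proceed.
        _web.PreRequest += request =>
        {
            request.Credentials = new NetworkCredential(userName, password);
            return true;
        };
    }

    internal Dictionary<string, string> GetidsAndXPaths(string url)
    {
        var webidsAndXPaths = new Dictionary<string, string>();
        var doc = _web.Load(url);
        var nodes = doc.DocumentNode.SelectNodes("//*[@id]");
        if (nodes == null) return webidsAndXPaths;

        foreach (var node in nodes)
        {
            // Assumption: map each id to its XPath; a repeated id keeps the last XPath seen.
            webidsAndXPaths[node.Id] = node.XPath;
        }
        return webidsAndXPaths;
    }
}
```

With this arrangement the existing call site stays the same: GetidsAndXPaths still takes only a url, and the credentials are supplied once when the scraper is constructed.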
Striter Alfa
jessehouwing