I am working on software that extracts images from a web page. I have created this function:

 public static void GetAllImages()
        {

            WebClient x = new WebClient();
            string source = x.DownloadString(@"http://www.bbc.com");

            var document = new HtmlWeb().Load(source);
            var urls = document.DocumentNode.Descendants("img")
                                .Select(e => e.GetAttributeValue("src", null))
                                .Where(s => !String.IsNullOrEmpty(s));

            document.Load(source);


        }

It fails with the error "Uri is too long".

I tried using Uri.EscapeDataString, but I can't work out where to put it.

Any help would be appreciated.

Neeraj Verma

1 Answer

HtmlWeb.Load takes a URL as its source and downloads the content itself. You don't need a separate WebClient for this; HtmlWeb takes care of it.

What you are doing is downloading the content, then attempting to use the downloaded content (HTML) as a URL (probably under the assumption that Load means Parse).

So remove

WebClient x = new WebClient();
string source = x.DownloadString(@"http://www.bbc.com");

then change the next line to

var document = new HtmlWeb().Load(@"http://www.bbc.com");

and you'll be good to go.
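Putting it together, here is a minimal sketch of the corrected function, assuming the HtmlAgilityPack NuGet package is referenced. The class name `ImageExtractor`, the `pageUrl` parameter, and the helpers `GetAllImagesFromHtml` and `ExtractImageUrls` are hypothetical names added for illustration; they are not in the original code. The second method shows the alternative if you would rather keep your own download step: `HtmlDocument.LoadHtml` parses an HTML string that is already in memory, which is what the original code was trying to do with `Load`.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using HtmlAgilityPack;

public static class ImageExtractor
{
    // Downloads the page at pageUrl and returns the src value of every <img>.
    public static List<string> GetAllImages(string pageUrl)
    {
        // HtmlWeb.Load expects a URL and performs the download itself.
        HtmlDocument document = new HtmlWeb().Load(pageUrl);
        return ExtractImageUrls(document);
    }

    // Alternative (hypothetical helper): parse HTML that is already in memory,
    // for example a string returned by WebClient.DownloadString.
    public static List<string> GetAllImagesFromHtml(string html)
    {
        var document = new HtmlDocument();
        document.LoadHtml(html); // LoadHtml parses a string; it does not fetch anything
        return ExtractImageUrls(document);
    }

    // Collects the non-empty src attributes of all <img> elements.
    private static List<string> ExtractImageUrls(HtmlDocument document)
    {
        return document.DocumentNode.Descendants("img")
            .Select(e => e.GetAttributeValue("src", string.Empty))
            .Where(s => !string.IsNullOrEmpty(s))
            .ToList();
    }
}
```

You could then call it with `foreach (var url in ImageExtractor.GetAllImages(@"http://www.bbc.com")) Console.WriteLine(url);`. Note that the returned `src` values are whatever the page contains, so some may be relative URLs that you would need to resolve against the page address before downloading.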

spender