16

Looking for something similar to Mechanize for .NET...

If you don't know what Mechanize is, see http://search.cpan.org/dist/WWW-Mechanize/

I will maintain a list of suggestions here: anything for browsing, posting, or screen scraping (other than WebRequest and the WebBrowser control).

Parsing

Web App Testing

Tools

  • Firebug for Firefox
  • Internet Explorer Developer Toolbar for IE
  • Chrome has one too

Note

WatiN is close to what I am looking for, except that it opens up a browser, which is annoying and awesome at the same time, depending on what you are doing.

Jason
  • 1
    In WatiN, just set the `IE.Settings.MakeNewIeInstanceVisible` property to false, and take a look at `IE.Settings.AutoStartDialogWatcher`. More info at http://www.watin.net – Oscar Mederos Mar 30 '10 at 08:03
  • Just call Perl from C#; there's nothing like WWW::Mechanize in .NET – Matthew Lock Feb 26 '14 at 23:56

7 Answers

3

You can use the WebBrowser control, which can be automated to an extent.

John Saunders
2

You need to use the HTML Agility Pack, which can parse tag soup from real websites into a DOM structure.

SLaks
  • 2
    That's good for parsing the pages once you get them, but what about everything else? Like, logging into sites, managing cookies, handling redirection etc? – Jason Jan 27 '10 at 20:14
  • Use the `WebClient` class, or HAP's `HtmlWeb` class, or the `HttpWebRequest` class with a `CookieContainer`. – SLaks Jan 27 '10 at 20:15
  • I know you can do it that way in .NET, but I'm looking for something a little higher level, like Mechanize. It's OK if there isn't one; I was just curious whether there is a library that does what I have done using WebClient etc. – Jason Jan 27 '10 at 20:20
  • 2
    A friend of mine wrote a program that does what my C# program does using Mechanize, and it is 13 lines. Mine is WAY more. 13 LINES! =P – Jason Jan 27 '10 at 20:22
2

I've been using WatiN to great effect. It's an easy way to 1) automate user input with IE and 2) navigate the DOM.

PhilChuang
  • 2
    WatiN requires actually launching browser windows. Mechanize is in-memory only – kenwarner Jan 28 '10 at 04:28
  • It is a bit slow, but it is fun to watch the browser be automated. Even more fun is to call .Highlight on whichever part of the DOM you're processing, and you can watch the processing happen. – PhilChuang Jan 28 '10 at 15:02
  • Another benefit of WatiN: since it interacts with IE, it can process the *live* DOM, so there's no problem if the page has been built by JavaScript. I don't believe the HTML Agility Pack can do that. – PhilChuang Jan 30 '10 at 04:48
  • @qntmfred: You can hide those windows: "IE.Settings.MakeNewIeInstanceVisible = false". There is another property which handles dialogs: "IE.Settings.AutoStartDialogWatcher". – Oscar Mederos Mar 30 '10 at 08:05
2

You can also use Selenium. It's for unit testing web sites. It has a Java application that drives the browser and a C# interface that you can write your code against. It also has the downside of showing the browser, but it's pretty full-featured in terms of control, waiting on responses, and getting the results.

Pete McKinney
1

Design Canvas is the best tool out there for this type of thing. It works with IE, Firefox, Safari, or an in-memory browser, and it allows you to record and then play back any kind of web interaction.

jaws
0

You want HttpWebRequest for automating web requests and HtmlAgilityPack for processing the resulting HTML.
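A minimal sketch of the request half of that combination, assuming the classic `HttpWebRequest` API with one shared `CookieContainer` so that login cookies persist between calls. `BrowserSession`, `EncodeForm`, and the URLs mentioned afterwards are hypothetical names for illustration, not part of any library. On .NET 6 and later `WebRequest.Create` is marked obsolete and compiles with a warning.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Linq;
using System.Net;
using System.Text;

// Hypothetical helper (not from any library): a tiny "browser session"
// that keeps cookies across requests, a small part of what Mechanize does.
public sealed class BrowserSession
{
    // One shared jar, so a cookie set by one response is sent with the next request.
    private readonly CookieContainer _cookies = new CookieContainer();

    // Builds an application/x-www-form-urlencoded body from ordered form fields.
    public static string EncodeForm(IEnumerable<KeyValuePair<string, string>> fields)
    {
        return string.Join("&", fields.Select(f =>
            Uri.EscapeDataString(f.Key) + "=" + Uri.EscapeDataString(f.Value)));
    }

    // Fetches a page and returns the response body as text.
    public string Get(string url)
    {
        return Send(CreateRequest(url, "GET"));
    }

    // Submits form fields and returns the response body as text.
    public string Post(string url, IEnumerable<KeyValuePair<string, string>> fields)
    {
        HttpWebRequest request = CreateRequest(url, "POST");
        byte[] body = Encoding.UTF8.GetBytes(EncodeForm(fields));
        request.ContentType = "application/x-www-form-urlencoded";
        request.ContentLength = body.Length;
        using (Stream stream = request.GetRequestStream())
        {
            stream.Write(body, 0, body.Length);
        }
        return Send(request);
    }

    private HttpWebRequest CreateRequest(string url, string method)
    {
        var request = (HttpWebRequest)WebRequest.Create(url);
        request.Method = method;
        request.CookieContainer = _cookies;   // attach the shared cookie jar
        request.AllowAutoRedirect = true;     // already the default; stated for clarity
        return request;
    }

    private static string Send(HttpWebRequest request)
    {
        using (var response = (HttpWebResponse)request.GetResponse())
        using (var reader = new StreamReader(response.GetResponseStream()))
        {
            return reader.ReadToEnd();
        }
    }
}
```

Usage might look like `var session = new BrowserSession();`, then `session.Post("https://example.com/login", fields)` followed by `string html = session.Get("https://example.com/account")`. The returned HTML can then be handed to HtmlAgilityPack, for example through `HtmlDocument.LoadHtml`, for the parsing half.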

kenwarner
0

I have reverse-engineered Python-Mechanize and recreated it in C# as Mechanize.NET.

https://github.com/WilliamABradley/Mechanize.NET

This should hopefully cover all use cases of Mechanize. If it does not, or if you discover a bug, please create an issue.

It uses .NET Standard, so it should be usable from other .NET languages as well, such as F# and VB.

It utilises HtmlAgilityPack internally, and you can even collect the HtmlDocument for each loaded page.

William Bradley