I'm developing a web crawler in C# (.NET) that works like this:
Step 1: Visit the main page of the site (let's call this page Main.aspx).
Step 2: Use HttpWebRequest to get the form page (let's call this page Form.aspx).
Step 3: Post the form to another page and get the results (let's call this page Results.aspx).
It's pretty straightforward in terms of web crawling.
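To make the flow concrete, here is a simplified sketch of the three steps, not my actual code. The base URL, the form field, and the class and method names are placeholders I made up for illustration; the only real parts are the page names and the fact that every request shares a single CookieContainer.

```csharp
using System;
using System.IO;
using System.Net;
using System.Text;

public static class CrawlSketch
{
    // Placeholder base address (assumption); the real site is not named here.
    private const string BaseUrl = "http://example.com/";

    // Builds a request that shares one CookieContainer across all steps,
    // so server-set cookies carry over from page to page.
    public static HttpWebRequest CreateRequest(string page, CookieContainer cookies)
    {
        var request = (HttpWebRequest)WebRequest.Create(BaseUrl + page);
        request.CookieContainer = cookies;
        return request;
    }

    // Sends the request and returns the response body as text.
    private static string ReadBody(HttpWebRequest request)
    {
        using (var response = (HttpWebResponse)request.GetResponse())
        using (var reader = new StreamReader(response.GetResponseStream()))
        {
            return reader.ReadToEnd();
        }
    }

    public static string Get(string page, CookieContainer cookies)
    {
        return ReadBody(CreateRequest(page, cookies));
    }

    public static string Post(string page, string formData, CookieContainer cookies)
    {
        var request = CreateRequest(page, cookies);
        request.Method = "POST";
        request.ContentType = "application/x-www-form-urlencoded";
        byte[] body = Encoding.UTF8.GetBytes(formData);
        request.ContentLength = body.Length;
        using (var stream = request.GetRequestStream())
        {
            stream.Write(body, 0, body.Length);
        }
        return ReadBody(request);
    }

    public static string Run()
    {
        var cookies = new CookieContainer();
        Get("Main.aspx", cookies);                            // Step 1
        Get("Form.aspx", cookies);                            // Step 2
        return Post("Results.aspx", "field=value", cookies);  // Step 3 ("field=value" is hypothetical)
    }
}
```

Note that HttpWebRequest only stores cookies sent by the server in response headers; it never runs the page's scripts, so cookies created by JavaScript never end up in the container.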
The current problem is that I can't access Form.aspx unless I set a bunch of cookies first. All of these cookies are generated by JavaScript on Main.aspx.
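For reference, this is roughly what setting the cookies by hand would look like if I could compute their values. It is only a sketch: the cookie names, values, and domain below are invented placeholders, since the real ones come out of the site's script.

```csharp
using System;
using System.Net;

public static class ManualCookies
{
    // Fills a container with hand-made cookies.
    // "token", "visit", their values, and "example.com" are all hypothetical.
    public static CookieContainer Build()
    {
        var cookies = new CookieContainer();
        cookies.Add(new Cookie("token", "abc123", "/", "example.com"));
        cookies.Add(new Cookie("visit", "1", "/", "example.com"));
        return cookies;
    }

    // Creates (but does not send) a request for the form page with those cookies attached.
    public static HttpWebRequest CreateFormRequest(CookieContainer cookies)
    {
        var request = (HttpWebRequest)WebRequest.Create("http://example.com/Form.aspx");
        request.CookieContainer = cookies;
        return request;
    }
}
```

The hard part is not attaching the cookies but producing the right values, which is exactly what the script on Main.aspx does.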
Whenever I try to get Form.aspx directly, I get redirected to the main page. The script that generates the cookies is more than 20 KB and is absolutely messy and insane. It also uses a lot of "document." references, which rules out simply running it through Jint or Javascript.NET.
After a lot of research, I found that a headless browser is probably what I'm looking for. I tried several of them, but they all seem overly complicated. I already have a class library project containing all my web crawlers, and I'd just like to add another DLL to make this work. Any suggestions?
I've tried to be as clear as possible. If anything is unclear, please ask in the comments before downvoting.