
I was having a look at DRKSpider to find problems with a website on our production server, but its export feature seems to generate different output (with different content) on each run.

My goal is to find a good tool that reports every HTTP status code that might indicate an error: 404, 500, 403, etc.

Could you please suggest some open source tools to crawl a website and list all server error codes?

masegaloeh
Junior Mayhé
  • `man wget` and its spider option. Or HTTrack or cURL or anything else found on Google. Even Google itself with the Webmaster tools. – mailq Aug 15 '11 at 18:26
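As a rough sketch of the wget approach from the comment above: spider mode crawls without saving pages and writes a broken-link summary at the end of its log. The host name and log sample below are placeholders, and the exact summary wording can vary by wget version.

```shell
# Crawl recursively in spider mode, writing the report to crawl.log:
#   wget --spider -r -nd -o crawl.log http://example.com/
#
# Simulate the tail of a wget log so the parsing step below is runnable:
cat > crawl.log <<'EOF'
Found 2 broken links.

http://example.com/missing-page
http://example.com/old/image.png
EOF

# Print the URLs listed after the broken-links summary line.
sed -n '/Found .* broken links/,$p' crawl.log | grep '^http'
```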

1 Answer


I think the hardest part of this is that most open source tools won't implement a full DOM with a JavaScript and CSS engine, so even wget won't expose broken JavaScript on your site. If you are trying to figure out what errors your site might be generating for users, you should look at a spider that does support JS/CSS/etc. Something like:

http://atomz.com/ (free up to 10,000 pages)

You can also use Google Webmaster Tools like @mailq mentioned; here are more details on their Crawl Errors section:

http://www.google.com/support/webmasters/bin/answer.py?hl=en&answer=35120&ctx=cb&src=cb&cbid=g2fqlm56h5t&cbrank=0

Lastly, if you aren't already, you should be watching your logs for these errors and tracking referrer information so you can investigate them there as well.
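A minimal sketch of that log-watching step, assuming the Apache/nginx "combined" log format (the sample entries and host names below are made up): field 9 is the status code, field 7 the request path, and field 11 the referrer, so a one-line awk filter pulls out every error response together with the page that linked to it.

```shell
# Sample access log in combined format (fabricated entries for illustration):
cat > access.log <<'EOF'
1.2.3.4 - - [15/Aug/2011:18:26:01 +0000] "GET /ok HTTP/1.1" 200 512 "-" "Mozilla"
1.2.3.4 - - [15/Aug/2011:18:26:02 +0000] "GET /missing HTTP/1.1" 404 208 "http://example.com/" "Mozilla"
5.6.7.8 - - [15/Aug/2011:18:26:03 +0000] "GET /broken HTTP/1.1" 500 0 "http://example.com/page" "Mozilla"
EOF

# Print status, path, and referrer for every 4xx/5xx response.
awk '$9 >= 400 {print $9, $7, "referrer:", $11}' access.log
```

The referrer column is what lets you trace a 404 back to the page carrying the broken link, which a crawler alone can't always tell you.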

polynomial