I'm trying to verify that all my page links are valid, and also something similar: that every page has a specified link, such as Contact. I use Python unit testing and Selenium IDE to record the actions that need to be tested.
So my question is: can I verify the links in a loop, or do I need to try every link on my own?
I tried to do this with __iter__, but it didn't get anywhere close. That may be because I'm poor at OOP, but I still think there must be another way of testing links than clicking and recording them one by one.
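Here is a rough sketch of the kind of loop I mean, assuming the Selenium Python bindings (WebDriver) rather than Selenium IDE recordings; START_URL, the Firefox driver, and the exact checks are only assumptions to illustrate the idea:

```python
# A sketch, not a recorded Selenium IDE test: collect every href on a page
# into a list, then verify each one in a loop with a plain HTTP request.
# START_URL is a placeholder for the page under test.
import unittest
import urllib.error
import urllib.request

from selenium import webdriver
from selenium.webdriver.common.by import By

START_URL = "http://yourdomain.com/"  # placeholder


class LinkTest(unittest.TestCase):
    def setUp(self):
        self.driver = webdriver.Firefox()

    def tearDown(self):
        self.driver.quit()

    def test_all_links_resolve(self):
        self.driver.get(START_URL)
        # Put all hrefs on the page into a list, then loop over it.
        hrefs = [a.get_attribute("href")
                 for a in self.driver.find_elements(By.TAG_NAME, "a")
                 if a.get_attribute("href")]
        for href in hrefs:
            with self.subTest(href=href):
                try:
                    urllib.request.urlopen(href, timeout=10)
                except urllib.error.HTTPError as e:
                    self.fail("%s returned %s" % (href, e.code))
                except urllib.error.URLError as e:
                    self.fail("%s failed: %s" % (href, e.reason))

    def test_contact_link_present(self):
        self.driver.get(START_URL)
        # Fails if the page has no link whose visible text is "Contact".
        contacts = self.driver.find_elements(By.LINK_TEXT, "Contact")
        self.assertTrue(contacts, "no Contact link on %s" % START_URL)


if __name__ == "__main__":
    unittest.main()
```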

4 Answers
Though the tool is written in Perl, have you checked out linklint? It should fit your needs exactly: it parses the links in an HTML document and tells you which ones are broken.
If you're trying to automate this from a Python script, you'd need to run it as a subprocess and collect the results (a sketch of that is below), but I think it would get you what you're looking for.
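A minimal sketch of that subprocess approach, assuming linklint is installed and on your PATH; the flags used here (-http, -host, -doc and the /@ linkset) are my recollection of linklint's options, so verify them against linklint -help before relying on this:

```python
# A sketch of driving linklint from Python via subprocess.
# Assumed flags (check `linklint -help` on your install):
#   -http -host yourdomain.com  -> check a live site over HTTP
#   -doc linklint-report        -> write the report files into that directory
#   /@                          -> linkset meaning "the whole site"
import subprocess


def run_linklint(host, report_dir="linklint-report"):
    result = subprocess.run(
        ["linklint", "-http", "-host", host, "-doc", report_dir, "/@"],
        capture_output=True, text=True,
    )
    # linklint prints a summary to stdout; the detailed lists of broken
    # links end up in report_dir. Here we just surface the summary.
    print(result.stdout)
    return result.returncode


if __name__ == "__main__":
    run_linklint("yourdomain.com")
```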

I need to do more than just verify the links; rather, I thought of putting all the links on a page in a list and then using that list to verify all the elements of a page – Decebal Aug 04 '10 at 08:32
I would just use standard shell commands for this:
- You can use wget to detect broken links.
- If you use wget to download the pages, you can then scan the resulting files with grep --files-without-match to find those that don't have a contact link.
If you're on Windows, you can install Cygwin or the Win32 ports of these tools.
EDIT: Embedded info from the "use wget to detect broken links" link above:
Whenever we release a public site, it's always a good idea to run a spider on it; this way we can check for broken pages and bad URLs. wget has a recursive download command, and combined with the --spider option it will just crawl the site.
1) Download wget. Mac: http://www.statusq.org/archives/2008/07/30/1954/ or use MacPorts to install wget. Windows: http://gnuwin32.sourceforge.net/packages/wget.htm Linux: comes built in.
2) In your console / terminal, run (without the $): $ wget --spider -r -o log.txt http://yourdomain.com
3) After that, just locate your "log.txt" file; at the very bottom of the file will be a list of broken links, how many links there are, etc.
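If you would rather do the second step (the grep --files-without-match check) from Python, here is a rough equivalent; it assumes the pages were already mirrored into a local directory by wget -r, and both the directory name and the "contact" pattern are placeholders:

```python
# A sketch of the grep --files-without-match step in Python: report every
# downloaded HTML file that contains no link matching "contact".
import pathlib
import re

MIRROR_DIR = "yourdomain.com"  # placeholder: the directory wget -r created
CONTACT_RE = re.compile(r'href="[^"]*contact', re.IGNORECASE)  # assumed pattern


def pages_without_contact(mirror_dir=MIRROR_DIR):
    missing = []
    for path in pathlib.Path(mirror_dir).rglob("*.html"):
        text = path.read_text(errors="replace")
        if not CONTACT_RE.search(text):
            missing.append(path)
    return missing


if __name__ == "__main__":
    for path in pages_without_contact():
        print("no contact link:", path)
```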

What exactly is "testing links"?
If it means checking that they lead to non-4xx URIs, I'm afraid you must visit them.
As for the existence of given links (like "Contact"), you may look for them using XPath.
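For example, a minimal sketch of that XPath check with the Selenium Python bindings; the URL and the exact link text "Contact" are assumptions:

```python
# A sketch of checking for a "Contact" link via XPath with the Selenium
# Python bindings; the page URL and link text are placeholders.
from selenium import webdriver
from selenium.webdriver.common.by import By

driver = webdriver.Firefox()
try:
    driver.get("http://yourdomain.com/")  # placeholder URL
    # Matches any anchor whose visible text is exactly "Contact".
    contact_links = driver.find_elements(By.XPATH, '//a[text()="Contact"]')
    assert contact_links, "no Contact link found on the page"
finally:
    driver.quit()
```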

You could (as yet another alternative) use BeautifulSoup to parse the links on your page and try to retrieve them via urllib2.
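A rough sketch of that approach, written with the Python 3 equivalents (the bs4 package for BeautifulSoup, urllib.request in place of urllib2); the page URL is a placeholder:

```python
# A sketch: parse every <a href> with BeautifulSoup, resolve it against the
# page URL, and report the links that fail to load.
import urllib.error
import urllib.parse
import urllib.request

from bs4 import BeautifulSoup

PAGE_URL = "http://yourdomain.com/"  # placeholder


def broken_links(page_url=PAGE_URL):
    html = urllib.request.urlopen(page_url).read()
    soup = BeautifulSoup(html, "html.parser")
    broken = []
    for a in soup.find_all("a", href=True):
        url = urllib.parse.urljoin(page_url, a["href"])
        try:
            urllib.request.urlopen(url, timeout=10)
        except (urllib.error.HTTPError, urllib.error.URLError) as e:
            broken.append((url, e))
    return broken


if __name__ == "__main__":
    for url, err in broken_links():
        print("broken:", url, err)
```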
