
I'm trying to verify that all of my page links are valid, and also, similarly, whether all the pages contain a specified link such as "Contact". I use Python unit testing and the Selenium IDE to record the actions that need to be tested. So my question is: can I verify the links in a loop, or do I need to try every link on my own? I tried to do this with __iter__ but it didn't get anywhere close; that may be because I'm poor at OOP, but I still think there must be another way of testing links than clicking them and recording them one by one.

Decebal
  • I need to do more than just verify the link; rather, I thought of putting all the links on a page in a list and then using that list to verify all the elements of the page – Decebal Aug 04 '10 at 08:32

4 Answers


Though the tool is written in Perl, have you checked out linklint? It should fit your needs exactly: it parses the links in an HTML document and tells you which ones are broken.

If you're trying to automate this from a Python script, you'd need to run it as a subprocess and get the results, but I think it would get you what you're looking for.
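If you go that route, a rough sketch of the subprocess call might look like the following. It assumes linklint is installed and on your PATH; the flags shown (-root for the local site copy, -doc for the report directory, /@ for "all files") are my best reading of linklint's options and may need adjusting for your version:

    # Hedged sketch: drive linklint from Python and inspect its output.
    import subprocess

    def run_linklint(site_root, report_dir="linklint-report"):
        result = subprocess.run(
            ["linklint", "-root", site_root, "-doc", report_dir, "/@"],
            capture_output=True,
            text=True,
        )
        # linklint prints a summary on stdout; a non-zero exit code or the
        # word "ERROR" in the output is a reasonable signal of broken links.
        print(result.stdout)
        return result.returncode == 0 and "ERROR" not in result.stdout

    if __name__ == "__main__":
        ok = run_linklint("/var/www/mysite")
        print("links OK" if ok else "broken links found - see the report directory")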

bedwyr

I would just use standard shell commands for this:

  • You can use wget to detect broken links
  • If you use wget to download the pages, you can then scan the resulting files with grep --files-without-match to find those that don't have a contact link (a Python equivalent is sketched below).

If you're on Windows, you can install Cygwin or the Win32 ports of these tools.
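If you'd rather keep that last check in Python, here is a rough equivalent of the grep --files-without-match step. It assumes the site has already been mirrored to a local directory (e.g. with wget -r) and that the contact link can be recognised by a case-insensitive "contact" in an href; adjust the pattern to your markup:

    # Hedged sketch: Python stand-in for `grep --files-without-match`.
    import re
    from pathlib import Path

    CONTACT_PATTERN = re.compile(r'href="[^"]*contact[^"]*"', re.IGNORECASE)

    def pages_without_contact_link(mirror_dir):
        missing = []
        for page in Path(mirror_dir).rglob("*.html"):
            html = page.read_text(errors="ignore")
            if not CONTACT_PATTERN.search(html):
                missing.append(page)
        return missing

    if __name__ == "__main__":
        for page in pages_without_contact_link("yourdomain.com"):
            print("no contact link:", page)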

EDIT: Here is the relevant info from the "use wget to detect broken links" link above:

Whenever we release a public site, it's always a good idea to run a spider on it; this way we can check for broken pages and bad URLs. wget has a recursive download option, and combined with the --spider option it will just crawl the site.

1) Download WGET

    Mac:
    http://www.statusq.org/archives/2008/07/30/1954/
    Or use MacPorts to install wget.

    Windows:
    http://gnuwin32.sourceforge.net/packages/wget.htm

    Linux:
    Comes built in
    ----------------------------------------

2) In your console / terminal, run (without the $):

    $ wget --spider -r -o log.txt http://yourdomain.com

3) After that, just locate your "log.txt" file; at the very bottom of the file will be a list of broken links, how many links there are, etc.
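If you want to fold that result back into a Python test, a rough sketch for pulling the broken-link section out of log.txt could look like this. The "Found N broken links" marker is an assumption about how wget formats the end of its spider log and may differ between wget versions:

    # Hedged sketch: extract the broken links listed at the end of wget's log.
    import re

    def broken_links_from_log(log_path="log.txt"):
        with open(log_path, errors="ignore") as f:
            lines = f.read().splitlines()
        broken = []
        in_broken_section = False
        for line in lines:
            if re.match(r"Found \d+ broken link", line):
                in_broken_section = True
                continue
            if in_broken_section and line.startswith("http"):
                broken.append(line.strip())
        return broken

    if __name__ == "__main__":
        for url in broken_links_from_log():
            print("broken:", url)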
Wim Coenen

What exactly is "Testing links"?

If it means checking that they lead to non-4xx URIs, I'm afraid you must visit them.

As for the existence of a given link (like "Contact"), you may look for it using XPath.
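A minimal sketch of both ideas, using Selenium WebDriver (rather than IDE recordings) plus urllib for the status codes, since Selenium itself doesn't expose them; the XPath for the contact link is an assumption about your markup:

    # Hedged sketch: check each page for a "Contact" link and visit every
    # anchor's href, flagging requests that fail or return 4xx/5xx.
    import urllib.error
    import urllib.request

    from selenium import webdriver
    from selenium.webdriver.common.by import By

    def check_page(driver, page_url):
        driver.get(page_url)

        # Existence check: is there a link whose text contains "Contact"?
        if not driver.find_elements(By.XPATH, "//a[contains(text(), 'Contact')]"):
            print("missing contact link on", page_url)

        # Status check: fetch each href separately; urlopen raises HTTPError
        # for 4xx/5xx responses and URLError for unreachable hosts.
        for anchor in driver.find_elements(By.XPATH, "//a[@href]"):
            href = anchor.get_attribute("href")
            try:
                urllib.request.urlopen(href, timeout=10)
            except (urllib.error.HTTPError, urllib.error.URLError) as exc:
                print("broken link:", href, "->", exc)

    if __name__ == "__main__":
        driver = webdriver.Firefox()
        try:
            check_page(driver, "http://yourdomain.com")
        finally:
            driver.quit()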

Almad

You could, as yet another alternative, use BeautifulSoup to parse the links on your page and try to retrieve them via urllib2.
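A minimal sketch of that approach, using urllib.request (the Python 3 counterpart of urllib2) and assuming beautifulsoup4 is installed:

    # Hedged sketch: parse the page's anchors with BeautifulSoup and try to
    # fetch each one, reporting those that fail.
    import urllib.error
    import urllib.parse
    import urllib.request

    from bs4 import BeautifulSoup

    def check_links(page_url):
        html = urllib.request.urlopen(page_url).read()
        soup = BeautifulSoup(html, "html.parser")
        broken = []
        for anchor in soup.find_all("a", href=True):
            # Resolve relative hrefs against the page we started from.
            url = urllib.parse.urljoin(page_url, anchor["href"])
            try:
                urllib.request.urlopen(url, timeout=10)
            except (urllib.error.HTTPError, urllib.error.URLError) as exc:
                broken.append((url, exc))
        return broken

    if __name__ == "__main__":
        for url, exc in check_links("http://yourdomain.com"):
            print("broken:", url, "->", exc)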

Wayne Werner