
I have been trying for the past few days to figure out how to do this without a ridiculous amount of code, and I cannot find anything on it on Google, Stack Overflow, or elsewhere.

I am building a fairly advanced web scraper and I would like the output to be in a tree-style layout. For example, the code currently looks like this:

for aurl in aurls:
    print(aurl)
    # Scrape the current top-level URL to get its child links.
    burls = urlScraper(aurl, scrape, savePgs)
    for burl in burls:
        print(burl)
        curls = urlScraper(burl, scrape, savePgs)
        # ...and this nesting would keep repeating many more levels.

The planned output would be like this:

link.example.com/
    link.example.com/
        link.example.com/
           link.example.com/
           link.example.com/
           link.example.com/
               link.example.com/
               link.example.com/
           link.example.com/
           link.example.com/
               link.example.com/
               link.example.com/
               link.example.com/
    link.example.com/
link.example.com/
    link.example.com/
        link.example.com/
        link.example.com/
    link.example.com/

I need this to continue until the scraper has reached the end of the tree. I feel that I am overthinking this and that the answer is going to be something like a while loop. I have already built the web scraping API so that it returns the depth of the URL it is currently scraping, the URL itself, and other values that do not matter here.
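One way to get this kind of output without nesting a loop per level is a recursive depth-first walk that passes the depth down with each call. The following is only a minimal sketch in Python 3 syntax: `FAKE_SITE`, `get_links`, and `crawl` are hypothetical names, and `get_links` stands in for a call such as `urlScraper(url, scrape, savePgs)`, which is assumed to return a list of child URLs.

```python
# Hypothetical link graph standing in for real scraping results.
FAKE_SITE = {
    "a.example.com/": ["b.example.com/", "c.example.com/"],
    "b.example.com/": ["d.example.com/", "a.example.com/"],
    "c.example.com/": [],
    "d.example.com/": [],
}


def get_links(url):
    """Stand-in for the real scraper call; returns the child URLs of url."""
    return FAKE_SITE.get(url, [])


def crawl(url, depth=0, visited=None, lines=None, max_depth=10):
    """Depth-first walk that records each URL indented by its depth."""
    if visited is None:
        visited = set()
    if lines is None:
        lines = []
    # Skip pages already seen (pages can link back to each other)
    # and stop descending past max_depth.
    if url in visited or depth > max_depth:
        return lines
    visited.add(url)
    lines.append("    " * depth + url)
    for child in get_links(url):
        crawl(child, depth + 1, visited, lines, max_depth)
    return lines


tree_lines = crawl("a.example.com/")
print("\n".join(tree_lines))
```

The `visited` set and `max_depth` guard are assumptions added here because real sites contain cycles; without them the walk might never reach "the end of the tree".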

I have already made a small function that builds the indent prefix for a given depth:

def depthIndent(depth):
    # Depth 1 is marked with ">"; any other depth gets four spaces
    # per level followed by "-".
    if depth == 1:
        return ">"
    return "    " * depth + "-"

I just need to be able to run the for loop so that it does not end until it hits the end of the tree. Any help is highly appreciated. Example code would be nice, but a brief explanation would be good too. It's annoying working on one error all day!
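A loop that "does not end until it hits the end of the tree" can be sketched as a while loop over an explicit stack, which also sidesteps Python's recursion limit on very deep trees. This is a sketch under assumptions, in Python 3 syntax: `PAGES`, `children_of`, and `walk_iterative` are hypothetical names, and `children_of` stands in for the real scraper call.

```python
# Hypothetical link graph; "five/" links back to "one/" to show cycle handling.
PAGES = {
    "one/": ["two/", "three/"],
    "two/": ["four/"],
    "three/": [],
    "four/": [],
    "five/": ["one/"],
}


def children_of(url):
    """Stand-in for the real scraper call; returns the child URLs of url."""
    return PAGES.get(url, [])


def walk_iterative(roots, get_children):
    """Depth-first walk driven by a while loop and an explicit stack."""
    # Push in reverse so the first root (and first child) is popped first.
    stack = [(url, 0) for url in reversed(roots)]
    seen = set()
    out = []
    while stack:  # runs until no unvisited pages remain
        url, depth = stack.pop()
        if url in seen:
            continue
        seen.add(url)
        out.append("    " * depth + url)
        for child in reversed(get_children(url)):
            stack.append((child, depth + 1))
    return out


indented = walk_iterative(["one/", "five/"], children_of)
print("\n".join(indented))
```

Each stack entry carries its own depth, so the indent never has to be detected after the fact.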

Summary: I need to display text at the given depth, but I am not able to detect the depth. I need to print the output shown above until the end of the tree.

Thank you

Windows65
  • You're doing DFS which is why you're not getting the "tree structure" you want. Implement BFS to achieve the "hierarchy (distance) level" – Nir Alfasi Jun 22 '14 at 07:08
  • Hey alfasin, I have been researching BFS quite a bit and can not find a suitable example for what I am working on, mind giving me a bit more explanation? – Windows65 Jul 13 '14 at 20:02
  • I'll rephrase: you *can* detect depth with DFS, but, if you want to print the tree in "levels" (depths) like you showed in the example above, you'll need [*BFS*](http://en.wikipedia.org/wiki/Breadth-first_search). – Nir Alfasi Jul 13 '14 at 22:41
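For reference, breadth-first search as mentioned in the comments can be sketched with a queue that stores each URL together with its depth. It visits every depth-1 page before any depth-2 page, so the result comes out grouped by level. This is a hedged Python 3 sketch: `LINKS`, `links_from`, and `bfs_levels` are hypothetical names, and `links_from` stands in for the real scraper call.

```python
from collections import deque

# Hypothetical link graph standing in for real scraping results.
LINKS = {
    "root/": ["x/", "y/"],
    "x/": ["z/"],
    "y/": [],
    "z/": [],
}


def links_from(url):
    """Stand-in for the real scraper call; returns the child URLs of url."""
    return LINKS.get(url, [])


def bfs_levels(start, get_children):
    """Breadth-first walk returning (depth, url) pairs in visit order."""
    visited = {start}
    queue = deque([(start, 0)])
    order = []
    while queue:
        url, depth = queue.popleft()
        order.append((depth, url))
        for child in get_children(url):
            if child not in visited:
                visited.add(child)
                queue.append((child, depth + 1))
    return order


level_order = bfs_levels("root/", links_from)
for depth, url in level_order:
    print("    " * depth + url)
```

With this ordering, `z/` is listed after `y/` even though it is a child of `x/`, which is the practical difference from a depth-first walk.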
