
I know how to download one file by pointing urllib at that specific file.

How can I download all of the files at a URL, when the page contains both relative and non-relative links?

# Code to download 1 file, which I'm using at the moment (Python 2)
import os
import urllib

os.chdir('/directory/to/save/the/file/to')
url = 'http://urltosite/myfile.txt'  # I want to download all files at this URL
urllib.urlretrieve(url, 'myfile.txt')
  • Is looping what you want? – Santosh Kumar Nov 27 '15 at 18:05
  • Yeah, something that will go through the site and download each file associated with .jpg, .gif, .docx – Scott Nov 27 '15 at 18:08
  • @Scott It seems you are looking for so called web crawlers (or spiders). One example is here: http://www.netinstructions.com/how-to-make-a-web-crawler-in-under-50-lines-of-python-code/ – MartyIX Nov 27 '15 at 18:56
  • @MartinVseticka I'm looking to download the files found on websites; I'm just not able to implement a method to download all the .doc and .gif/.jpg files on a site, as opposed to downloading one file with the method in the original post. – Scott Nov 27 '15 at 19:01
  • @Scott ... and it won't be easy without some effort - i.e. reading and learning. :-) If you need some advice, it's OK but "download files found in websites" is too general to answer *correctly*. There are many existing implementations for Python so maybe you can just try some of them or modify them for your needs. – MartyIX Nov 27 '15 at 19:10
  • I know it's difficult, I've been at it for a while hah! Can you give me some advice on how to get started then? :) – Scott Nov 27 '15 at 19:28
  • 2
    This should get you all links on a webpage - http://stackoverflow.com/a/3075568/99256 (for links that are not generated by JavaScript) – MartyIX Nov 27 '15 at 20:04
  • This actually helped!! Thanks! – Scott Nov 27 '15 at 21:25
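Following the link-extraction suggestion in the comments, here is a minimal sketch of one way to do it. Note the question's `urllib.urlretrieve` is Python 2; this sketch assumes Python 3, where the equivalent lives in `urllib.request`. The URL, file extensions, and function names below are illustrative, not part of the original post.

```python
# Sketch: collect links from a page's HTML, resolve relative ones against
# the page URL, and keep only those with the wanted file extensions.
from html.parser import HTMLParser
from urllib.parse import urljoin


class LinkCollector(HTMLParser):
    """Collect href/src attributes, resolving relative links against base_url."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.links = []

    def handle_starttag(self, tag, attrs):
        for name, value in attrs:
            if name in ('href', 'src') and value:
                # urljoin handles both relative and absolute links
                self.links.append(urljoin(self.base_url, value))


def find_downloads(html, base_url, extensions=('.jpg', '.gif', '.docx')):
    """Return links in `html` whose URL ends with one of `extensions`."""
    parser = LinkCollector(base_url)
    parser.feed(html)
    return [link for link in parser.links
            if link.lower().endswith(extensions)]


# To actually download each match (requires network access):
# import os
# from urllib.request import urlopen, urlretrieve
#
# base_url = 'http://urltosite/'          # hypothetical page
# html = urlopen(base_url).read().decode('utf-8', errors='replace')
# for link in find_downloads(html, base_url):
#     urlretrieve(link, os.path.basename(link))
```

This only follows links found in the static HTML; as MartyIX notes, links generated by JavaScript won't appear, and a full crawler would also need to recurse into linked pages.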

0 Answers