
I recently started writing Python for a project I am working on. I wrote a script that takes a list of image URLs (one per line in a txt file) and downloads them all. However, some of the URLs in the list are old and no longer work, which causes an error. In addition, if a link takes too long to load, it also causes an error.

Code:

import urllib.request
import random


def downloadImageFromURL(url):
    name = random.randrange(1, 10000)
    full_name = str(name) + ".jpg"
    urllib.request.urlretrieve(url, full_name)


f = open('url.txt', 'r')
for row in range(0, 10):
    line = f.readline()
    try:
        downloadImageFromURL(line)
    except ConnectionError:
        print("Failed to open url.")
    print(line)
f.close()
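For reference, the urllib approach above can be made robust without switching libraries: `urlretrieve` does not take a timeout, but `urlopen` does, and dead links raise `urllib.error.URLError` rather than `ConnectionError`. A minimal sketch, assuming the same one-URL-per-line `url.txt` layout:

```python
import random
import socket
import urllib.error
import urllib.request

def download_image_from_url(url, timeout=5):
    """Fetch url into a randomly named .jpg; return True on success, False on failure."""
    try:
        # urlopen accepts a timeout (urlretrieve does not), so read the bytes manually
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            data = resp.read()
    except (urllib.error.URLError, socket.timeout, ValueError):
        # URLError: dead links; socket.timeout: slow links; ValueError: malformed lines
        return False
    full_name = str(random.randrange(1, 10000)) + ".jpg"
    with open(full_name, "wb") as out:
        out.write(data)
    return True
```

Because failures return `False` instead of raising, the calling loop can simply check the return value.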

NEW CODE:

import requests

def sendRequest(url):
    try:
        page = requests.get(url, stream=True, timeout=5)
    except Exception:
        return False
    else:
        if page.status_code == 200:
            return page
        else:
            return False

f = open('url.txt','r')
for row in range(0, 10):
    line = f.readline()
    try:
        sendRequest(line)
    except ConnectionError:
        print("Failed to open url.")
    print(line)
f.close()
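Note that `sendRequest` above already catches every exception and returns `False`, so the `try`/`except ConnectionError` around the call can never fire; also, `readline()` keeps the trailing newline, which can break the request. A sketch of a loop that checks the return value instead (`send_request` mirrors the function above; the `url.txt` layout is assumed to be one URL per line):

```python
import requests

def send_request(url):
    """Return the Response on HTTP 200, else False (never raises)."""
    try:
        page = requests.get(url, stream=True, timeout=5)
    except requests.RequestException:
        return False
    return page if page.status_code == 200 else False

def fetch_all(path="url.txt", limit=10):
    """Read up to limit URLs from path and report (url, success) pairs."""
    results = []
    with open(path) as f:
        for line in f:
            url = line.strip()  # file iteration keeps the newline; strip it
            if not url:
                continue
            results.append((url, send_request(url) is not False))
            if len(results) >= limit:
                break
    return results
```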

Thank you!

  • 1
    What's your question? – Paolo Jul 14 '18 at 17:53
  • 1
    Please give a [mcve]. – jonrsharpe Jul 14 '18 at 17:54
  • My question is: how can I stop these errors from occurring? I will add the code above, thanks for the quick response. –  Jul 14 '18 at 17:55
  • Please show us the full traceback for the error and a particular input that causes that error. Read and follow [How to create a Minimal, Complete, and Verifiable example](http://stackoverflow.com/help/mcve). – Rory Daulton Jul 14 '18 at 18:56
  • A URL that fails to load _should_ throw an error, right? If you have a reasonable case to deviate from that pattern, skip printing the 'Failed to open url.' message. To deal with timeouts specifically, try adding a handler for the `TimeoutError` exception class alongside the `ConnectionError` handler. – collapsar Jul 14 '18 at 21:46

1 Answer

import os
import requests
import shutil

outputDirectory = r"C:\Users\Joshua\Documents\Downloaded Media"

def sendRequest(url):
    try:
        page = requests.get(url, stream=True, timeout=5)
    except Exception:
        pass
    else:
        if page.status_code == 200:
            return page

    return False

def downloadImage(imageUrl: str, filePath: str):
    img = sendRequest(imageUrl)

    if img is False:
        return False

    with open(filePath, "wb") as f:
        img.raw.decode_content = True

        try:
            shutil.copyfileobj(img.raw, f)
        except Exception:
            return False

    return True


URL = "https://upload.wikimedia.org/wikipedia/commons/b/b6/Image_created_with_a_mobile_phone.png"

imageName = URL.split("/")[-1] # Image_created_with_a_mobile_phone.png

# C:\Users\Joshua\Documents\Downloaded Media\Image_created_with_a_mobile_phone.png
imagePath = os.path.join(outputDirectory, imageName)

downloadImage(URL, imagePath)
Joshua Nixon
  • I am very new to python, but it seems that this code does not download the image, just prints it. My new code is above. –  Jul 14 '18 at 18:19
  • I will update my answer in a sec – Joshua Nixon Jul 14 '18 at 18:20
  • Actually, I encountered a problem. All of my images are being named the same thing, therefore being replaced one after another. Is there a way around this? –  Jul 14 '18 at 22:31
  • That's because you must be giving the function the same path, nothing to do with my code. If you update your answer and show me I can tell you what you are doing wrong – Joshua Nixon Jul 15 '18 at 09:30
  • It's all good. I figured it out, thanks for all the help! –  Jul 17 '18 at 19:46
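For anyone hitting the duplicate-filename issue from the comments: `random.randrange(1, 10000)` can collide, and deriving the name from the URL collides when many URLs end with the same filename. One sketch of a collision-resistant alternative uses `uuid4` for the base name while keeping the URL's extension (the helper name and the `.jpg` fallback are illustrative choices, not part of the answer above):

```python
import os
import uuid

def unique_image_path(output_directory, url):
    """Build a collision-resistant file path: keep the URL's extension,
    but use a uuid4 hex string as the base name."""
    ext = os.path.splitext(url.split("/")[-1])[1] or ".jpg"  # fall back to .jpg
    return os.path.join(output_directory, uuid.uuid4().hex + ext)
```

Each call yields a fresh name, so repeated downloads no longer overwrite each other.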