Python write function is not saving all images

Question

I am trying to download images from hyperlinks (example). To accomplish this I am using the following function:

import os
import requests

def download_logos(lst):
    # lst is one row: [team name, image URL, competition]
    image_url = lst[1]
    img_data = requests.get(image_url).content
    # df is defined elsewhere (not shown in the question)
    df.append([lst[0], img_data, lst[2]])
    filename = 'logos/{}/{}.png'.format(lst[2], lst[0])
    os.makedirs(os.path.dirname(filename), exist_ok=True)
    with open(filename, 'wb') as f:
        f.write(img_data)

The variable lst is a row in a matrix that contains the team name, the link to the image, and the competition in which the team plays. When I run this function for all of my data (543 teams), it seems to skip a lot of images: only 200-300 images are downloaded.

To see whether the script was failing to access the links and download the image data, I tried performing the action in two steps: first download the image data for all teams, then save the data to disk. Image data was present for all 543 teams, so I expected all images to be saved. To my surprise, only around 500 images were saved this time, which was still an improvement.
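A minimal sketch of the save step of such a two-step check, assuming the downloaded data is held as a list of [team, image bytes, competition] entries (the same layout that download_logos appends to df). The names save_images and count_report are hypothetical helpers, not part of the original script. Comparing the number of rows with the number of distinct file paths shows whether several rows map to the same file and overwrite each other.

```python
import os
import tempfile


def save_images(records, root="logos"):
    """Write each (team, image_bytes, competition) record to disk.

    Returns the list of file paths written, one per record, so the
    number of distinct files can be compared with the number of rows.
    """
    paths = []
    for team, img_data, competition in records:
        filename = os.path.join(root, competition, "{}.png".format(team))
        os.makedirs(os.path.dirname(filename), exist_ok=True)
        with open(filename, "wb") as f:
            f.write(img_data)
        paths.append(filename)
    return paths


def count_report(records, paths):
    """Return (number of rows, distinct paths, files present on disk)."""
    distinct = set(paths)
    on_disk = sum(1 for p in distinct if os.path.isfile(p))
    return len(records), len(distinct), on_disk
```

If the first number is larger than the second, some rows share a team name and competition, so later writes replace earlier files instead of adding new ones.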

I am unable to find out what could be causing this problem, so I am hoping someone can point out where I made a mistake and/or how I can fix it.

Your code is working. Maybe your server is restricting too many connections? — Viktor Ilienko, Oct 04 '18 at 19:57
@ViktorIlyenko I figured out the fix to the problem I was having, see my answer below. — Oxbowerce, Oct 05 '18 at 21:17

score 0 · Accepted Answer · answered Oct 05 '18 at 21:17

I managed to find the problem, which was not related to my download function. The way I retrieved the download links from a webpage was incorrect, which left me with duplicates. Since I was limiting the image links to the first x entries, I was missing the image links after that. Rewriting the function that retrieved the links to get rid of the duplicates and retrieve the links correctly fixed the problem, allowing me to use the function defined above to download and save the images.
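The link-retrieval function itself is not shown here, so the following is only a sketch of the general idea, assuming each row is [team, image URL, competition] and that duplicates can be recognised by their URL. The name dedupe_rows and the sample data are hypothetical. It illustrates why truncating to the first x entries before removing duplicates drops links.

```python
def dedupe_rows(rows):
    """Drop rows whose image URL was already seen, keeping the
    first occurrence of each URL in its original order."""
    seen = set()
    unique = []
    for row in rows:
        url = row[1]
        if url not in seen:
            seen.add(url)
            unique.append(row)
    return unique


# Hypothetical scraped rows containing one duplicate.
scraped = [
    ["Ajax", "https://example.com/ajax.png", "Eredivisie"],
    ["Ajax", "https://example.com/ajax.png", "Eredivisie"],
    ["PSV", "https://example.com/psv.png", "Eredivisie"],
    ["Feyenoord", "https://example.com/feyenoord.png", "Eredivisie"],
]
limit = 3  # stands in for the "first x entries" cut-off

truncated_first = scraped[:limit]            # Feyenoord is lost
deduped_first = dedupe_rows(scraped)[:limit]  # all three teams kept
```

Removing duplicates before applying the limit keeps every distinct link, whereas applying the limit first lets duplicates use up slots.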
