
Is it possible to download only images larger than a given size in kB?

I have this now: wget -r -P download/location -U Mozilla -A jpeg,jpg,bmp,gif,png http://www.website.com

Kind regards, n00bly


1 Answer


wget has no size filter for recursive downloads, but you can spider the site to collect a list of image URLs, check each one's Content-Length, and download only the files that fit. A bash script can do this:

# Retrieve image URLs from the site
image_urls=$(wget --spider --force-html -r -l2 "http://www.website.com" 2>&1 \
  | grep '^--' | awk '{ print $3 }' \
  | grep '\.\(jpeg\|jpg\|bmp\|gif\|png\)$')
for image_url in $image_urls
do
  # --spider -S requests only the response headers instead of the whole file
  size=$(wget --spider -S "$image_url" 2>&1 | grep -i 'Content-Length' | awk '{print $2}')
  # Download only images smaller than 100,000 bytes
  # (-lt is a numeric test; < inside [[ ]] compares strings)
  if [[ $size -lt 100000 ]]; then
    wget "$image_url"
  fi
done
Shawn Conn
  • Thank you, but when I try to run the script (on Mac OS X) I get these messages: /Users/Name/Desktop/ImageScript4: command substitution: line 3: unexpected EOF while looking for matching `"' /Users/Name/Desktop/ImageScript4: command substitution: line 4: syntax error: unexpected end of file – n00bly Sep 05 '15 at 10:25
  • Looks like I got it working... Can I just change the "if [[ $size < 100000 ]] ;then" to "if [[ $size > 100000 ]] ;then" to get only images above 100 kB? – n00bly Sep 06 '15 at 13:19
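Regarding the follow-up in the comments: flipping `<` to `>` may appear to work, but inside `[[ ]]` both operators compare strings lexicographically, not numerically; the numeric tests `-lt`/`-gt` are what you want. A minimal sketch of the difference (the `size` value here is a made-up example, not taken from the site):

```shell
#!/bin/bash
size=99999   # hypothetical Content-Length in bytes

# String comparison: "99999" sorts after "100000" because '9' > '1',
# even though 99999 < 100000 numerically.
if [[ $size > 100000 ]]; then
  echo "string test says larger"
fi

# Numeric comparison gives the correct result.
if [[ $size -gt 100000 ]]; then
  echo "numeric test says larger"
else
  echo "numeric test says smaller"
fi
```

So to keep only images above 100 kB, use `if [[ $size -gt 100000 ]]; then` in the loop.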