
I have a text file C:\folder\filelist.txt containing a list of numbers, for example:


345651
342679
344000
349080

I want to build the full URL around each number, download only the files that are larger than 1000 KB, and strip the parameters after "-a1" from the resulting filename.

This is the code I currently have. It works for downloading the files and appending the .jpeg extension, provided the full URLs are in the text file, but it does not filter out the smaller images or strip the parameters following "-a1".

cd C:\folder\
wget --adjust-extension --content-disposition -i C:\folder\filelist.txt

I'm running Windows and I'm a beginner at writing batch scripts. The most important thing I'm trying to accomplish is avoiding the download of images smaller than 1000 KB; it would be acceptable if I had to manually append the URL in the text file and rename the files after the fact. Is it possible to do what I'm trying to do? I've tried modifying the script by referencing the posts below, but I can't seem to get it to work. Thanks in advance!

Wget images larger than x kb

Downloading pdf files with wget. (characters after file extension?)

Spider a Website and Return URLs Only

Matt D
  • Thanks! I've got a working code below, but it takes a very long time to run considering I have tens of thousands of files to loop through. do you know any way I could make it go faster? I think it's slow because I call wget for every iteration so I can't batch-process a list of files to determine the size. – Matt D Sep 19 '21 at 21:06

1 Answer

#change working directory
cd /c/folder/

#convert the input file list to Unix line endings
dos2unix filelist.txt

while read -r image; do
  imageURL="https://some.thing.com/gab/abc-$image-def-a1?scl=1&fmt=jpeg"

  #pull the Content-Length header (size in bytes) out of wget's debug output
  size=$(wget -d -qO- "$imageURL" 2>&1 | grep 'Content-Length' | awk '{print $2}')

  #1000 KB = 1024000 bytes
  if [[ $size -gt 1024000 ]]; then
    imgname="/c/folder/abc-$image-def-a1.jpeg"
    #quote the variables so the ? and & in the URL are not mangled by the shell
    wget -O "$imgname" "$imageURL"
  fi
done < filelist.txt
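Regarding the speed concern in the comment: the size check above fetches the whole file (`-qO-` streams the body) just to read one header, so every large file is effectively downloaded twice. A sketch of a faster variant uses `wget --spider -S`, which issues a HEAD-style request and prints only the response headers, so no body is transferred during the check. The URL pattern is the hypothetical one from the answer; note that some servers do not report `Content-Length` for HEAD requests, in which case the check silently skips the file.

```shell
#!/bin/bash
# Parse the Content-Length value (bytes) out of a block of HTTP headers.
# tail -n 1 keeps the last value if redirects produce several headers.
parse_content_length() {
  grep -i 'content-length' | awk '{print $2}' | tail -n 1
}

# Ask the server for a file's size without downloading it.
# --spider tells wget not to fetch the body; -S prints the response
# headers, which go to stderr (hence 2>&1).
head_size() {
  wget --spider -S "$1" 2>&1 | parse_content_length
}

# Usage, mirroring the loop in the answer:
# while read -r image; do
#   imageURL="https://some.thing.com/gab/abc-$image-def-a1?scl=1&fmt=jpeg"
#   if [[ $(head_size "$imageURL") -gt 1024000 ]]; then
#     wget -O "/c/folder/abc-$image-def-a1.jpeg" "$imageURL"
#   fi
# done < filelist.txt
```

This still makes one request per file for the check, so for tens of thousands of files you may also want to run several checks in parallel, e.g. by feeding the list through `xargs -P`.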
Matt D