4

I managed (with the help of SO) to make perfect PNG snippets from a PDF file with GraphicsMagick. My PDF contains text and formulas, with each "snippet" on its own page. My command trims each page down to its content and then scales the result up to a width of 2000 pixels.

Until now, I have had to repeat that command for every single page of every PDF. I am wondering how to automate this. I think I could use a loop that repeats the command for every page i up to the last page.

Assume file1.pdf is in my current working directory.

gm convert -density 300x300 file1.pdf[0] -trim -resize 2000x file1_page1.png
gm convert -density 300x300 file1.pdf[1] -trim -resize 2000x file1_page2.png
gm convert -density 300x300 file1.pdf[2] -trim -resize 2000x file1_page3.png
...

How can I set a counter and run a loop for every page in my document?

Marco

2 Answers

4

You are in luck. GraphicsMagick knows how to do that for you:

gm convert -density 300x300 input.pdf -trim -resize 2000x +adjoin output-%d.png

If you are OK with using ImageMagick instead, you can start the output file numbering at 1 instead of 0, and you don't need the +adjoin:

convert -density 300x300 input.pdf -scene 1 -trim -resize 2000x output-%d.png

Or, if you want them all done in parallel, use GNU Parallel:

parallel gm convert -density 300x300 {} -trim -resize 2000x output-{#}.png  ::: $(identify input.pdf | awk '{print $1}')
Mark Setchell
2
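for file in *.pdf
do
    # identify (from ImageMagick) prints one line per page
    pages=$(identify "$file" | wc -l)
    for (( i=0; i<pages; i++ ))
    do
        # file1.pdf, page index 0 -> file1_page1.png (avoids clashes such as file110.png)
        name="${file%.pdf}_page$((i+1)).png"
        gm convert -density 300x300 "${file}[$i]" -trim -resize 2000x "$name"
    done
done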

Try this one. It will convert every page in every *.pdf file to .png.
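
If ImageMagick's identify is not available, the same loop can be sketched with GraphicsMagick alone. This is only a sketch: it assumes that gm identify prints one line per page (worth checking with `gm identify file1.pdf | wc -l`), and the helper names page_png_name and convert_pdf are made up for illustration. Output names follow the file1_page1.png scheme from the question.

```shell
# Build the output name for a zero-based page index,
# e.g. file1.pdf and index 0 give file1_page1.png.
# (page_png_name is a hypothetical helper, not part of GraphicsMagick.)
page_png_name() {
    local file index
    file=$1
    index=$2
    printf '%s_page%d.png\n' "${file%.pdf}" "$((index + 1))"
}

# Convert every page of one PDF with gm only.
# Assumption: `gm identify` prints one line per page.
convert_pdf() {
    local file pages i
    file=$1
    pages=$(gm identify "$file" | wc -l | tr -d ' ')
    i=0
    while [ "$i" -lt "$pages" ]; do
        gm convert -density 300x300 "${file}[$i]" -trim -resize 2000x \
            "$(page_png_name "$file" "$i")"
        i=$((i + 1))
    done
}

# Usage, from the directory that holds the PDFs:
#   for f in *.pdf; do convert_pdf "$f"; done
```

Keeping the name logic in its own function makes it easy to check the naming without running any conversion.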

JUSHJUSH
  • Hi Jushjush, how should I execute this loop? I saved it as loop.sh and tried to run it in the terminal. It did nothing. Should I add some #bash line? I'm new to shell scripting. – Marco Aug 13 '19 at 11:52
  • Just use `bash loop.sh`. You have to be in the same directory as your pdf files. – JUSHJUSH Aug 13 '19 at 12:01
  • @MarcoDoe Please check whether `identify file1.pdf | wc -l` returns the number of pages in your shell. – JUSHJUSH Aug 13 '19 at 12:03
  • Phew, I'm really trying, but I cannot use the `identify` command. `bash loop.sh` gives errors, it says e.g. "line 1 $'\r': command not found$." – Marco Aug 13 '19 at 12:34
  • @MarcoDoe The problem with 'line 1 $\r' is probably caused by Windows line endings. Try running `dos2unix loop.sh`, then run the script. Also, do you have any other program to check the number of pages in your pdf files? I think the 'identify' program is part of the imagemagick package. – JUSHJUSH Aug 13 '19 at 12:38
  • Oh yes, I have the Ubuntu shell on Windows 10. I created the .sh file with a text editor. That probably broke something. I converted it with `dos2unix`, and line 1 is now OK. But running the script still says "line 4: identify not found". It's a command from gm, right? Do I have to put gm in front of it? – Marco Aug 13 '19 at 12:44
  • @MarcoDoe Try installing imagemagick. If I remember correctly, `sudo apt-get install imagemagick` should do the job. – JUSHJUSH Aug 13 '19 at 12:46
  • It gives new errors each time: "identify -im6.q16: not authorized @ error/constitute.c/ReadImage/412", more cryptic than ever. I will opt for the pure gm solution. Thanks Jush for the loop structure, I will get it to run one day :) – Marco Aug 13 '19 at 13:09
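
For anyone hitting the same "$'\r': command not found" error without dos2unix at hand, here is a minimal sketch of an alternative. It assumes the only problem is Windows line endings (CR+LF) in the script; strip_cr and loop_unix.sh are hypothetical names used for illustration.

```shell
# Remove carriage returns so bash no longer sees a stray \r at the end of each line.
# Reads from stdin, writes to stdout. (strip_cr is a hypothetical helper name.)
strip_cr() {
    tr -d '\r'
}

# Usage:
#   strip_cr < loop.sh > loop_unix.sh
#   bash loop_unix.sh
```

Writing to a new file rather than overwriting loop.sh keeps the original around in case something else goes wrong.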