6

I have a sequence of JPG images. Each scan is already cropped to the exact size of one page. They are sequential pages of a valuable, out-of-print book. The publishing application requires that these pages be submitted as a single PDF file.

I could take each of these images and just paste them into a word processor (e.g. OpenOffice). Unfortunately, it's a very big book and I've got quite a few of these books to get through, so that would obviously be time-consuming. This is volunteer work!

My second idea was to use LaTeX (actually pdflatex): I could make a very simple document that consists of nothing more than a series of in-line image includes. I'm sure this approach could be made to work, but it's a little on the complex side for something that seems like a very simple job.

It occurred to me that there must be a simpler way - so any suggestions?

I'm on Ubuntu 9.10, my primary programming language is Python, but if the solution is super-simple I'd happily adopt any technology that works.


UPDATE: Can somebody explain what's going wrong here?

sal@bobnit:/media/NIKON D200/DCIM/100HPAIO/bat$ convert '*.jpg' bat.pdf
convert: unable to open image `*.jpg': No such file or directory @ blob.c/OpenBlob/2439.
convert: missing an image filename `bat.pdf' @ convert.c/ConvertImageCommand/2775.

Is there a way in the convert command syntax to specify that bat.pdf is the output?

Thanks

Salim Fadhley

3 Answers

12

It occurred to me that there must be a simpler way - so any suggestions?

You're right, there is! Try this:

sudo apt-get install imagemagick
cd ~/rare-book-images
convert "*.jpg" rare-book.pdf

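Note: depending on your shell and your ImageMagick build, the quoted "*.jpg" might not work as expected. With the quotes, the shell passes the pattern through literally and convert has to expand it itself; without them, the shell expands the pattern into a list of filenames before convert runs. If the quoted form fails, try omitting the quotes and see whether that gets you the results you expect.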
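If neither form of the pattern works, another option is to build the file list yourself and hand convert explicit filenames, with the output file as the last argument. Here is a minimal Python sketch of that idea, assuming convert is on your PATH. The helper names (find_pages, build_convert_command, make_pdf) are made up for illustration. It matches the extension case-insensitively, in case the files turn out to be named .JPG rather than .jpg, which I can't tell from the error output.

```python
import subprocess
from pathlib import Path


def find_pages(directory):
    """Return the .jpg/.jpeg files in `directory`, sorted by name.

    The extension check ignores case, so page.JPG is picked up too.
    """
    return sorted(
        p for p in Path(directory).iterdir()
        if p.is_file() and p.suffix.lower() in {".jpg", ".jpeg"}
    )


def build_convert_command(directory, output):
    """Build the argument list: convert <page> <page> ... <output>."""
    pages = find_pages(directory)
    if not pages:
        raise FileNotFoundError(f"no JPEG files found in {directory}")
    return ["convert", *[str(p) for p in pages], str(output)]


def make_pdf(directory, output):
    """Run convert. Passing a list means no shell globbing or quoting is involved."""
    subprocess.run(build_convert_command(directory, output), check=True)
```

Calling make_pdf(".", "rare-book.pdf") from the image directory should then be equivalent to listing every page by hand on the command line.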

John Feminella
6

If you're interested in a Python solution, you can use the ReportLab library. For example:

from reportlab.platypus import SimpleDocTemplate, Image
from reportlab.lib.pagesizes import letter
from glob import glob

doc = SimpleDocTemplate('image-collection.pdf', pagesize=letter)
# glob() returns names in arbitrary order, so sort them to keep the pages in sequence
parts = [Image(filename) for filename in sorted(glob('*.jpg'))]
doc.build(parts)

This will take all the jpg files in your current directory and produce a file called "image-collection.pdf".
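One thing to watch with sequential pages: glob returns filenames in arbitrary order, and a plain alphabetical sort puts page10.jpg before page2.jpg. If your scans aren't zero-padded, a "natural" sort key helps. This is a small sketch using only the standard library; natural_key and pages_in_order are names I've made up for illustration.

```python
import re
from glob import glob


def natural_key(filename):
    """Split a name into text and number chunks so 'page2' sorts before 'page10'.

    re.split with a capturing group alternates text, digits, text, ...,
    so the odd positions are always the digit runs.
    """
    parts = re.split(r"(\d+)", filename)
    return [int(p) if i % 2 else p.lower() for i, p in enumerate(parts)]


def pages_in_order(pattern="*.jpg"):
    """Return the files matching `pattern`, ordered by the numbers in their names."""
    return sorted(glob(pattern), key=natural_key)
```

In the ReportLab snippet above, the list comprehension would then iterate over pages_in_order() instead of the raw glob result. If your files are already zero-padded (page001.jpg, page002.jpg, ...), a plain sorted() does the same job.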

ars
0

I wonder if you could do it inside a LaTeX file: a for loop wrapping an \includegraphics command, plus a suitably consistent naming scheme for the image files. This might have the advantage of allowing title pages, page numbering and so on. (I'm not sure whether either of the other solutions does this, and I haven't checked. I'm just pondering out loud here, really.)
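Since the asker works in Python, one way to get the loop without writing it in LaTeX itself is to generate the .tex file from a script. This is only a sketch: build_latex and write_latex are hypothetical names, the zero-margin geometry setup is my assumption about what a page-sized scan needs, and filenames containing spaces or extra dots may need special handling in \includegraphics.

```python
from pathlib import Path

# Assumed preamble: zero margins and no headers/footers, so each scan fills its page.
PREAMBLE = "\n".join([
    r"\documentclass{article}",
    r"\usepackage{graphicx}",
    r"\usepackage[margin=0pt]{geometry}",
    r"\pagestyle{empty}",
    r"\begin{document}",
    "",
])

PAGE = (
    r"\noindent\includegraphics"
    r"[width=\paperwidth,height=\paperheight,keepaspectratio]{%s}"
    "\n" r"\clearpage" "\n"
)


def build_latex(image_names):
    """Return LaTeX source that puts each image on its own page, in the given order."""
    body = "".join(PAGE % name for name in image_names)
    return PREAMBLE + body + "\\end{document}\n"


def write_latex(directory, tex_name="book.tex"):
    """Write a .tex file next to the images, listing every .jpg in sorted order."""
    directory = Path(directory)
    names = sorted(p.name for p in directory.glob("*.jpg"))
    tex_path = directory / tex_name
    tex_path.write_text(build_latex(names))
    return tex_path
```

Running pdflatex on the generated file from the image directory should then produce the PDF. A title page would go between \begin{document} and the first image; page numbers would need the \pagestyle{empty} line removed and some margin left for them.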

Seamus