
As I understand it,

1. .eps images are vector images.
2. When we draw something in Word (such as a flowchart), it is stored as a vector image.

I am almost sure about the first point, but not about the second. Please correct me if I am wrong.

Assuming these two things hold, when a LaTeX file (with .eps images inserted) or a Word file (containing vector images) is converted to PDF, do the images get converted into raster images?

Also, I think PDFBox/xpdf can only extract raster images from a PDF (as they are embedded as XObjects), not vector images. Is that understanding correct? This question on Stack Overflow is related, but has not been answered yet.

rivu

1 Answer


Your point 1 is incorrect: EPS files are PostScript programs. They may contain vector information, text, or image data, or all of the above.

As for point 2, PDF has no such thing as a 'vector image': an image means a bitmap and therefore cannot be vector.

If you convert a PostScript program to a PDF file, the result depends entirely on the conversion program you use. In general, vectors will be retained as vectors and text as text. However, it is entirely possible for an application to render the whole PostScript program and insert the result into the PDF as an image.

So the answer to your first question ("do the images get converted into raster images") is 'maybe, but probably not'.

I'm afraid I have no idea about the capabilities of PDFBox/xpdf, but since collections of vectors may not be arranged as 'images' in any atomic fashion (they could be held as Form XObjects, or Patterns), there isn't any obvious way to know when to stop extracting. And what format would you store the result in anyway?
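To make the point about vectors not being atomic objects concrete, here is a minimal Python sketch that scans an already decoded page content stream and counts path-painting operators, text-showing operators, `Do` (XObject) invocations and inline images. It is only an illustration under stated assumptions: the function names `operators` and `classify` are made up for this example, the stream is assumed to be decompressed text, the tokenizer is deliberately simplified, and nothing here is PDFBox or xpdf API. It also does not look inside Form XObjects or Patterns, which is exactly where the "when to stop" problem begins.

```python
import re
from collections import Counter

# Operator groups taken from the PDF content-stream operator set.
PATH_PAINT = {"S", "s", "f", "F", "f*", "B", "B*", "b", "b*"}
TEXT_SHOW = {"Tj", "TJ", "'", '"'}

_DELIMS = " \t\r\n\f\x00()<>[]{}/%"          # whitespace plus PDF delimiters
_END_INLINE = re.compile(r"\sEI(?=\s|$)")    # end marker of inline image data


def _is_number(token):
    try:
        float(token)
        return True
    except ValueError:
        return False


def operators(content):
    """Yield operator tokens from a decoded content stream (simplified)."""
    i, n = 0, len(content)
    while i < n:
        ch = content[i]
        if ch == "(":                         # literal string: skip it,
            depth = 1                         # honouring nesting and escapes
            i += 1
            while i < n and depth:
                if content[i] == "\\":
                    i += 1                    # skip the escaped character
                elif content[i] == "(":
                    depth += 1
                elif content[i] == ")":
                    depth -= 1
                i += 1
        elif ch == "%":                       # comment: skip to end of line
            while i < n and content[i] not in "\r\n":
                i += 1
        elif ch == "<" and content[i + 1:i + 2] != "<":
            end = content.find(">", i)        # hex string: skip to '>'
            i = n if end < 0 else end + 1
        elif ch == "/":                       # name operand: skip it
            i += 1
            while i < n and content[i] not in _DELIMS:
                i += 1
        elif ch in _DELIMS:                   # whitespace or other delimiter
            i += 1
        else:
            j = i
            while j < n and content[j] not in _DELIMS:
                j += 1
            token, i = content[i:j], j
            if token == "ID":                 # skip raw inline image data
                m = _END_INLINE.search(content, i)
                i = m.end() if m else n
            elif not _is_number(token):
                yield token


def classify(content):
    """Count the drawing-related operators found in a content stream."""
    counts = Counter()
    for op in operators(content):
        if op in PATH_PAINT:
            counts["path_paint"] += 1         # vector drawing
        elif op in TEXT_SHOW:
            counts["text_show"] += 1
        elif op == "Do":
            counts["xobject"] += 1            # image OR form XObject
        elif op == "BI":
            counts["inline_image"] += 1
    return counts


# Hand-written sample stream, not taken from any real file.
sample = """
q 1 0 0 1 72 72 cm
0 0 100 50 re f            % filled rectangle
10 10 m 90 40 l S          % stroked line
BT /F1 12 Tf (a (nested) f S string) Tj ET
/Im0 Do                    % image or form XObject, cannot tell from here
Q
"""

if __name__ == "__main__":
    print(dict(classify(sample)))
```

A non-zero `path_paint` count says the page uses vector drawing operators, but not whether they form anything a person would call a single drawing. Telling an Image XObject from a Form XObject behind `Do` would need the page's resource dictionary, which this sketch does not read.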

KenS
  • Thanks for your answer. It answers my first question, but since my second question is still unanswered, I am keeping this open by not accepting it yet. By the way, is there a way to know whether a PDF contains a vector image? Inkscape can do that, but I need a batch tool like PDFBox/xpdf. – rivu Feb 13 '13 at 18:08
  • We quickly run into definition problems with these sorts of questions. If a page is blank, does it contain vector drawing operations (not images, please; that term has a quite specific, different meaning)? Now what if I draw a white rectangle on it? What if I draw a coloured rectangle, but outside the media box, or inside the media box but outside the crop box? I wouldn't accept my answer above because it only really addresses half your problem at most. You do need to think about what you want to do with vector drawing ops and how you want them stored after extraction. – KenS Feb 14 '13 at 08:22
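The media box and crop box cases in the last comment can be sketched as a small geometric check. This is a hedged illustration only: `Box`, `intersect` and `rect_is_visible` are hypothetical names, the rectangle is assumed to be given in default user space with no transformation or clipping path in effect, and the white-rectangle case is not covered because colour is ignored entirely.

```python
from typing import NamedTuple, Optional


class Box(NamedTuple):
    """Axis-aligned rectangle: lower-left and upper-right corners."""
    llx: float
    lly: float
    urx: float
    ury: float


def intersect(a: Box, b: Box) -> Optional[Box]:
    """Return the overlap of two boxes, or None if they do not overlap."""
    llx, lly = max(a.llx, b.llx), max(a.lly, b.lly)
    urx, ury = min(a.urx, b.urx), min(a.ury, b.ury)
    if llx >= urx or lly >= ury:
        return None
    return Box(llx, lly, urx, ury)


def rect_is_visible(rect: Box, media_box: Box,
                    crop_box: Optional[Box] = None) -> bool:
    """True if any part of rect falls inside the displayed page region.

    The crop box defaults to the media box, and is clipped to it.
    """
    region = media_box if crop_box is None else intersect(media_box, crop_box)
    return region is not None and intersect(rect, region) is not None


# Illustrative values: US Letter media box with a half-inch crop margin.
media = Box(0, 0, 612, 792)
crop = Box(36, 36, 576, 756)

inside_crop = Box(100, 100, 200, 200)
inside_media_only = Box(0, 0, 20, 20)
outside_media = Box(700, 0, 800, 100)
```

With these values, `inside_media_only` is drawn by a perfectly valid vector operation yet contributes nothing to the displayed page, so whether it counts as vector content depends on the definition chosen.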