
Is there a way to specify components to remove from Microsoft Office or OpenOffice documents via Ruby? I'm talking about removing macros and metadata, and also removing or replacing images. I've looked at a number of conversion programs with a view to converting from and to the same file format, but I can't find any that let you specify such options.

I've looked at:

Alexis Pigeon
Simmo

1 Answer


A .docx file is really a zip archive. You can unzip it into a directory, delete or change the parts you need, update any references to those parts, and zip it back up. Most of the parts inside the archive are XML text, so you can edit them with LibXML-Ruby or Nokogiri.
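As a rough illustration of the XML-editing step, here is a minimal sketch that blanks author-type metadata in `docProps/core.xml`, the part where a .docx normally keeps its core properties. It assumes the archive has already been unpacked (for example with the rubyzip gem or a command-line unzip). It uses REXML only to avoid extra dependencies; Nokogiri or LibXML-Ruby would work along the same lines. `scrub_core_properties` and `SCRUBBED_FIELDS` are hypothetical names, and the choice of fields to clear is an assumption.

```ruby
require "rexml/document"

# Element names as they usually appear in docProps/core.xml.
# Which fields count as sensitive is an assumption; adjust to taste.
SCRUBBED_FIELDS = %w[
  dc:creator cp:lastModifiedBy dc:title
  dc:subject cp:keywords dc:description
].freeze

# Hypothetical helper: takes the XML text of docProps/core.xml and
# returns it with the listed fields emptied. All other elements are
# left untouched.
def scrub_core_properties(xml)
  doc = REXML::Document.new(xml)
  doc.root.elements.each do |el|
    # expanded_name is "prefix:name", e.g. "dc:creator"
    el.text = nil if SCRUBBED_FIELDS.include?(el.expanded_name)
  end
  doc.to_s
end
```

You would read `docProps/core.xml` from the unpacked directory, write the result back, and re-zip. The sketch matches on the conventional `dc:` and `cp:` prefixes rather than on namespace URIs, so a document that declares different prefixes would need a namespace-aware lookup. Removing macros or images is more involved, because deleting a part also means cleaning up the relationship and content-type entries that point to it.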

Chloe
    I was hoping for something that covered older versions of Word as well, but in the absence of a better answer, I'll mark yours as correct. Obviously what I'm looking for doesn't currently exist, publicly at least. – Simmo Nov 19 '13 at 11:22