The following question discusses how to convert an .html file with base64 images to a .docx file (the issue has also been discussed and resolved in a few other places):
Possible to use pandoc with HTML containing base64 inline images?

I want to go the other way: convert a .docx file that contains images into a standalone .html file in which the images are embedded as base64 data (not necessarily at the same quality as in the .docx file). For starters, I tried:

pandoc -s -o chapter1.html cc.docx

as well as

pandoc -o chapter1.html cc.docx

In both cases the generated .html file contains lines like img src="media/image1.png", which suggest that Pandoc created (or thinks it has created) a folder named media holding the figures from the .docx file. But Pandoc creates no such folder. In any case, I want the .html file to be a standalone document (just like the .docx file), so I don't need the folder.
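To make the symptom described above easy to check, here is a minimal sketch that lists the image sources in an HTML file that are not embedded as data: URIs. The function name list_external_images is hypothetical, and the pattern assumes double-quoted src attributes, which is what the output quoted above shows.

```shell
# Hypothetical helper (not part of pandoc): print every <img> src value
# in the given HTML file that is NOT an embedded data: URI.
list_external_images() {
  # $1: path to the HTML file to inspect
  grep -o '<img[^>]*src="[^"]*"' "$1" \
    | sed 's/.*src="//; s/"$//' \
    | grep -v '^data:' || true   # no output simply means nothing external was found
}
```

Running list_external_images chapter1.html on the output of the commands above would be expected to print paths such as media/image1.png; an empty result would mean every image is embedded.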

I tried looking this up on the web, but the only solutions I found pertain to converting base64 images in .html to .docx, not the other way round.

Kurt Pfeifle
Shashank Sawant

1 Answer

Maybe it didn't work two years ago (March 2013) when you asked. Now it does, with the latest version of Pandoc (v1.13.2.1):

pandoc -o out.html --self-contained in.docx
Kurt Pfeifle
  • Improvement: --self-contained is now deprecated (`[WARNING] Deprecated: --self-contained. use --embed-resources --standalone`). By the way, I used it to extract images; see this [Thread](https://stackoverflow.com/questions/63220021/can-one-extract-images-from-pandocs-self-contained-html-files). Good luck – andreas-supersmart Aug 17 '23 at 15:50
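
Since the right flag depends on the installed Pandoc version, a small wrapper can pick it at run time. This is only a sketch: the function name docx_to_single_html is hypothetical, and it assumes that pandoc --help lists --embed-resources on versions that support it.

```shell
# Hypothetical wrapper: convert a .docx file to a single HTML file,
# using whichever embedding flag the installed pandoc supports.
docx_to_single_html() {
  # $1: input .docx file, $2: output .html file
  if pandoc --help | grep -q -- '--embed-resources'; then
    # Newer pandoc: the replacement named in the deprecation warning
    pandoc --embed-resources --standalone -o "$2" "$1"
  else
    # Older pandoc: the flag from the answer above
    pandoc --self-contained -o "$2" "$1"
  fi
}
```

Usage would be docx_to_single_html in.docx out.html. Checking the help text rather than parsing a version number keeps the sketch short, at the cost of relying on the help output format.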