I'm trying to convert .adoc files to .docx.

Currently I'm using:

asciidoctor file.adoc -o file.html
pandoc -s -S file.html -o output.docx

The math equations and symbols in my .adoc file are written like this:

latexmath:[$\phi$] and more text as Inline test latexmath:[$\sin(x)$]

After conversion, the .docx contains these strange lines instead of rendered math:

\($\phi$\) and more text as Inline test \($\sin(x)$\)

Any hint?

$ pandoc --version
pandoc 1.13.2.1
Compiled with texmath 0.8.2, highlighting-kate 0.5.15.
arnaldo

1 Answer

Pandoc's HTML reader does not treat $...$ as math by default, so you need to explicitly enable the tex_math_dollars extension for HTML input:

pandoc -s -S file.html -f html+tex_math_dollars -o output.docx

sed can then deal with the remaining escaped parentheses (\( and \)):

asciidoctor file.adoc -o file.html
sed -r 's/\\\(/\(/g' < file.html | sed -r 's/\\\)/\)/g' > file2.html
pandoc -s -S file2.html -f html+tex_math_dollars -o output.docx
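
As a minimal sketch of the same preprocessing step (not part of the original answer), the two sed calls can be folded into one pass that captures either parenthesis and drops the backslash in front of it. The function names `strip_math_escapes` and `adoc_to_docx` and the intermediate file names are hypothetical; the asciidoctor and pandoc flags are the ones used above.

```shell
# Hypothetical helper, not part of the original answer: drop the backslash
# from every "\(" and "\)" read on stdin, in one sed pass.
# -E is the same extended-regex mode as -r, accepted by GNU and BSD sed.
strip_math_escapes() {
    sed -E 's/\\([()])/\1/g'
}

# Hypothetical wrapper around the three steps from the answer.
# $1 = input .adoc, $2 = output .docx; the intermediate file names are made up.
adoc_to_docx() {
    asciidoctor "$1" -o "$1.html" &&
    strip_math_escapes < "$1.html" > "$1.clean.html" &&
    pandoc -s -S "$1.clean.html" -f html+tex_math_dollars -o "$2"
}

# Quick check on the string from the question:
printf '%s\n' '\($\phi$\) and more text as Inline test \($\sin(x)$\)' | strip_math_escapes
# prints: ($\phi$) and more text as Inline test ($\sin(x)$)
```

Like the two-pass command, this only removes the backslashes, so plain parentheses remain around each formula in the result.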
scoa
  • Nice, that's progress! But it still renders the formulas with "\(" before and "\)" after the LaTeX. – arnaldo May 04 '15 at 23:32
  • @arnaldo I don't use asciidoc but I might be able to help on the html to docx conversion if you add the output of `asciidoctor file.adoc -o file.html` – scoa May 05 '15 at 06:13
  • Thanks @scoa! Here is the HTML generated by asciidoc: http://pastebin.com/9t1RD9U1 – arnaldo May 05 '15 at 13:36
  • @arnaldo isn't there something in your original asciidoc that causes this? If I compile this file : `($C_y / C_x = \frac{Z_x - G}{Z_x}$) (Eq. 1)` with ` asciidoctor input.adoc -o temp.html ; pandoc -s -S temp.html -o output_from_adoc.docx -f html+tex_math_dollars`, it works fine. – scoa May 05 '15 at 16:56
  • Hi @scoa, you are right: if I put the equation directly inside dollar signs, it renders correctly in my docx, but it then fails to produce the right equation symbols in the .tex and .pdf output. I was writing equations as described in the AsciiDoc manual (e.g. `latexmath:[$C_y / C_x = \frac{Z_x - G}{Z_x}$]`). By the way, my toolchain generates .html from asciidoc, then .xml (DocBook), and uses dblatex to convert to pdf. Any trick to fix both converters? – arnaldo May 05 '15 at 18:19
  • @arnaldo In that case, I would just preprocess the HTML file between the asciidoctor and pandoc commands to remove the escaped parentheses: `sed -r 's/\\\(/\(/g' < file.html | sed -r 's/\\\)/\)/g' > file2.html`, unless you really need escaped parentheses somewhere in the document. – scoa May 05 '15 at 19:29
  • Thanks a lot @scoa!! Sometimes I think it is really tricky to write with AsciiDoctor, or to be obliged to deal with files with multiple extensions. – arnaldo May 05 '15 at 22:25
  • Let us [continue this discussion in chat](http://chat.stackoverflow.com/rooms/77048/discussion-between-arnaldo-and-scoa). – arnaldo May 05 '15 at 22:59