Is there a tool that turns an HTML document with MathJax into a valid LaTeX document? I understand that MathJax content is already LaTeX, but when it is mixed with text, simply saving the text representation of the HTML document is not going to work. For example, an underscore should be turned into `\_` when it appears in a text section, while it should be left alone inside a math environment. My question is whether there is a way to do this automatically. I would prefer a JavaScript solution, but if that is not possible, I could live with a tool (e.g., in Python) that I can call from the command line.

Thanks,

v923z
  • [This question](http://stackoverflow.com/questions/11338049/how-to-convert-html-with-mathjax-into-latex-using-pandoc) might help. – Lars Kotthoff Apr 12 '13 at 08:08
  • Thanks! I have already seen that, but I think that solution is a bit convoluted: there is Perl, Haskell, and pandoc involved. – v923z Apr 12 '13 at 17:49
  • @v923z - Pandoc is a Haskell library - you need something to parse the HTML. John MacFarlane has written a simpler alternative that should work with little adaptation at http://stackoverflow.com/questions/16014717/convert-html-mathjax-to-markdown-with-pandoc/16022396#16022396 – Charles Stewart Apr 19 '13 at 07:47
  • @charles-stewart pandoc is also an application: http://code.google.com/p/pandoc/downloads/list – Cfr Apr 23 '13 at 11:28
  • Thanks for the comments! Since I wanted to convert to LaTeX only, I ended up with a small script that strips the formulae from the HTML document. I see that pandoc is quite capable, though. – v923z Apr 23 '13 at 11:44
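
For readers who want to try the approach described in the question (escape characters such as `_` in text, but leave math alone), here is a minimal Python sketch. It is not the script mentioned in the comment above. It assumes the HTML tags have already been stripped, and that math is delimited by `$$...$$`, `\[...\]`, `\(...\)`, or `$...$`. The names `MATH_RE`, `LATEX_SPECIALS`, `escape_text`, and `escape_outside_math` are hypothetical, chosen for illustration only.

```python
import re

# Assumed math delimiters: $$...$$, \[...\], \(...\), and $...$.
# The single capturing group makes re.split() keep the math spans.
MATH_RE = re.compile(
    r"(\$\$.*?\$\$|\\\[.*?\\\]|\\\(.*?\\\)|\$.*?\$)",
    re.DOTALL,
)

# Characters that are special in LaTeX text mode, with their escapes.
LATEX_SPECIALS = {
    "\\": r"\textbackslash{}",
    "_": r"\_",
    "%": r"\%",
    "&": r"\&",
    "#": r"\#",
    "$": r"\$",
    "{": r"\{",
    "}": r"\}",
    "~": r"\textasciitilde{}",
    "^": r"\textasciicircum{}",
}


def escape_text(text):
    """Escape LaTeX special characters one at a time (no double escaping)."""
    return "".join(LATEX_SPECIALS.get(ch, ch) for ch in text)


def escape_outside_math(source):
    """Escape text segments and copy math segments through unchanged."""
    parts = MATH_RE.split(source)
    # With one capturing group, odd indices hold the matched math spans.
    return "".join(
        part if i % 2 else escape_text(part)
        for i, part in enumerate(parts)
    )


if __name__ == "__main__":
    print(escape_outside_math(r"file_name has \(a_i\) and 5% of $x_1$"))
```

This only handles the escaping step. Nested or unbalanced delimiters and any HTML-to-text conversion are outside its scope, so a parser-based tool such as pandoc is likely the more robust choice for real documents.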

0 Answers