-1

I've used python-docx to create some tables using a specified style format in my docx file. I now need to use these tables with this same formatting. Is there a way I can either convert the table including all of the formatting and styles, colours etc. to html? Or failing that a simple (automated) way of making the table into a figure which could be used?

oben
  • 31
  • 2
  • 5

2 Answers2

4

To covert Docx to HTML use below code:

Below code do not identify the tables and images from docx.It convert docx to html but not preserve tables and images.

import mammoth

Docx = open("docx_file.docx", 'rb')
html = open('html_filename.html', 'wb')
document = mammoth.convert_to_html(Docx )
html.write(document.value.encode('utf8'))
Docx.close()
html.close()

To keep formatting and images use win32 package for converting docx to html.

import win32com.client

doc = win32com.client.GetObject ("docx_InputFile.docx")
doc.SaveAs (FileName="Html_FileName.html", FileFormat=8)
doc.Close ()
Anonymous
  • 659
  • 6
  • 16
  • Thanks Suraj M. This will be very useful. I am however getting a few errors and am stuck with: pywintypes.com_error: (-2147221020,'Invalid syntax', None,None) in Moniker of win32. Do you happen to know how to fix it? My searches haven't proved very useful and any advice found so far hasn't worked – oben Apr 04 '19 at 12:51
  • @oben which version of python you are using? – Anonymous Apr 04 '19 at 13:03
  • the version I've got is 2.7 – oben Apr 04 '19 at 13:46
  • @oben I tested my code on python 3.6 It works fine for me. Are you passing docx path as a absolute path.Don't pass relative path. – Anonymous Apr 04 '19 at 14:13
  • Yep using absolute paths – oben Apr 05 '19 at 08:37
  • hey guys can we append different word document in html file ...i don't want to create new html file i just wanted to append the data using win32 is it possible? – snehil singh Jul 27 '21 at 06:17
  • @oben - did you solve the com_error error? I have the same – gtomer Jun 12 '23 at 14:15
0

I can't find suitable solution, that supports conversion with formatting and styles. But you may try to convert docx to jpg by using this: DOCX to JPG API. Python library and snippets for this service are here: ConvertAPI/convertapi-python

SmartWaddles
  • 131
  • 2
  • 10
  • Thanks SmartWaddles, do you know if it is possible to automate that it only converts the table and not the whole document? I can't view the first link unfortunately – oben Apr 03 '19 at 16:17
  • @oben, it looks like it converts the whole document into jpg-file. You may use something like [VBA](https://en.wikipedia.org/wiki/Visual_Basic_for_Applications) or some Python modules to extract tables from your initial docx file into docx only with tables, that you need to convert. – SmartWaddles Apr 04 '19 at 07:09
  • Thanks @SmartWaddles I'm not sure I can do that for this work as the formatting is tied to the particular style template of the document if copied to another I think the formatting will change. – oben Apr 04 '19 at 12:56