I've used python-docx to create some tables using a specified style format in my docx file. I now need to use these tables with this same formatting. Is there a way I can either convert the table including all of the formatting and styles, colours etc. to html? Or failing that a simple (automated) way of making the table into a figure which could be used?
Asked
Active
Viewed 2,914 times
-1
-
Save the docx file as a .html form using python code. I tried manually it works fine. – Anonymous Apr 03 '19 at 13:20
-
Thanks Suraj M, that sounds very useful. Do you have an example or a link which shows how to do this? Does it preserve all of the formatting, for exanple cell colouring , font size etc.? – oben Apr 03 '19 at 16:08
-
I added sample code below to convert docx to html – Anonymous Apr 04 '19 at 06:57
-
Below I posted two codes use `win32` module to keep formatting and images. – Anonymous Apr 04 '19 at 07:23
-
If below code sample is usefull can you vote me. – Anonymous Apr 04 '19 at 10:17
2 Answers
4
To covert Docx to HTML use below code:
Below code do not identify the tables and images from docx.It convert docx to html but not preserve tables and images.
import mammoth
Docx = open("docx_file.docx", 'rb')
html = open('html_filename.html', 'wb')
document = mammoth.convert_to_html(Docx )
html.write(document.value.encode('utf8'))
Docx.close()
html.close()
To keep formatting and images use win32 package for converting docx to html.
import win32com.client
doc = win32com.client.GetObject ("docx_InputFile.docx")
doc.SaveAs (FileName="Html_FileName.html", FileFormat=8)
doc.Close ()

Anonymous
- 659
- 6
- 16
-
Thanks Suraj M. This will be very useful. I am however getting a few errors and am stuck with: pywintypes.com_error: (-2147221020,'Invalid syntax', None,None) in Moniker of win32. Do you happen to know how to fix it? My searches haven't proved very useful and any advice found so far hasn't worked – oben Apr 04 '19 at 12:51
-
-
-
@oben I tested my code on python 3.6 It works fine for me. Are you passing docx path as a absolute path.Don't pass relative path. – Anonymous Apr 04 '19 at 14:13
-
-
hey guys can we append different word document in html file ...i don't want to create new html file i just wanted to append the data using win32 is it possible? – snehil singh Jul 27 '21 at 06:17
-
0
I can't find suitable solution, that supports conversion with formatting and styles. But you may try to convert docx to jpg by using this: DOCX to JPG API. Python library and snippets for this service are here: ConvertAPI/convertapi-python

SmartWaddles
- 131
- 2
- 10
-
Thanks SmartWaddles, do you know if it is possible to automate that it only converts the table and not the whole document? I can't view the first link unfortunately – oben Apr 03 '19 at 16:17
-
@oben, it looks like it converts the whole document into jpg-file. You may use something like [VBA](https://en.wikipedia.org/wiki/Visual_Basic_for_Applications) or some Python modules to extract tables from your initial docx file into docx only with tables, that you need to convert. – SmartWaddles Apr 04 '19 at 07:09
-
Thanks @SmartWaddles I'm not sure I can do that for this work as the formatting is tied to the particular style template of the document if copied to another I think the formatting will change. – oben Apr 04 '19 at 12:56