python docx delete table from document

Question

I want to remove some tables from a document based on the contents of the upper left cell.

I tried:

allTables = document.tables
for activeTable in allTables:
    if activeTable.cell(0,0).paragraphs[0].text == 'some text':
        allTables.remove(activeTable)

I expected to have removed all tables containing 'some text' in cell(0,0), but they are still in the document.

Execution does reach the line "allTables.remove(activeTable)" as expected: adding indexToDelete = allTables.index(activeTable) inside the if statement gives the indexes of the tables I'm looking for.

The script ends with the message "Process finished with exit code 0".

score 6 · Answer 1 · answered Dec 11 '17 at 07:19


The solution is:

allTables = document.tables

for activeTable in allTables:
    if activeTable.cell(0,0).paragraphs[0].text == 'some text':
        activeTable._element.getparent().remove(activeTable._element)

Thanks to scanny.

answered Dec 11 '17 at 07:19

Bartli


score 3 · Accepted Answer · answered Dec 10 '17 at 21:40


It sounds like your test if activeTable...text == 'some text' is not succeeding for any of the tables. In that case, the .remove() call is never executed but the script still returns an exit code of 0 (success).

Start by validating your test, maybe something like:

for table in document.tables:
    print("'%s'" % table.cell(0, 0).paragraphs[0].text)

and make sure the paragraph text is what you think it is. This should print out something like:

'some text but also some other text'
...

Once you have that determined, you may want to test on something other than the entire string, perhaps using .startswith():

text = table.cell(0, 0).paragraphs[0].text
if text.startswith('some text'):
    print('found one')

Once you have that working you can move on to the next problem.

answered Dec 10 '17 at 21:40

scanny


This is not the problem. The code `for activeTable in allTables: if activeTable.cell(0,0).paragraphs[0].text == 'some text': indexOfTableToDelete = allTables.index(activeTable) print("Table to delete: ", 1+ indexOfTableToDelete) del allTables[indexOfTableToDelete] ` – Bartli Dec 11 '17 at 06:35
Sorry, my comment was sent before it was finished. The code `for activeTable in allTables: if activeTable.cell(0,0).paragraphs[0].text == 'some text': indexOfTableToDelete = allTables.index(activeTable) print("Table to delete: ", 1+ indexOfTableToDelete) del allTables[indexOfTableToDelete] ` prints: Table to delete: 3 Table to delete: 8 Table to delete: 10. Those are the tables that I'm looking for. – Bartli Dec 11 '17 at 06:43

Ah, okay. I see what's happening now. You're removing the table from the list that `document.tables` returns (`allTables`) but that doesn't remove the table from the underlying XML. Try `activeTable._element.getparent().remove(activeTable._element)` instead. – scanny Dec 11 '17 at 06:54
Thank you. This is the solution. I thought that allTables and document.tables are two names of the same object. But they are two different objects. – Bartli Dec 11 '17 at 07:17

The `Table._element` objects will be the same in all instances (one for each table). But the `Table` object itself is just a proxy object (wrapper) around the element and the "tables" object is simply a list. So each time you call `Document.tables` you get a new one (`.tables` is a property, so the result you get is the return value of a method, not a "static" attribute of the document). Don't neglect to accept this answer. That's how you "pay" for someone taking the time to answer :) – scanny Dec 11 '17 at 07:21

score 0 · Answer 3 · answered Jan 25 '22 at 10:50


You can use this function, which deletes a table by its index in document.tables:

from docx import Document

document = Document('YOUR_DOCX')

def Delete_table(table):
    # 'table' is the index of the table in document.tables
    tbl = document.tables[table]._element
    tbl.getparent().remove(tbl)

Delete_table(0)

document.save('OUT.docx')

answered Jan 25 '22 at 10:50

Liam-Nothing

