-1

I'm fairly new to Python and I'm struggling to extract bold text from tables in .docx files.

I know how to collect all the text from tables into a list:

from docx import Document

document = Document('path_to_the_.docx_ file')
tables = []
# Collect the text of every paragraph in every table cell.
for table in document.tables:
    for row in table.rows:
        for cell in row.cells:
            for para in cell.paragraphs:
                tables.append(para.text)
tables

And I know how to get bold text that is not inside tables:

from docx import Document

document = Document('path_to_the_.docx_ file')
for paragraph in document.paragraphs:
    for run in paragraph.runs:
        if run.bold:
            print(run.text)

Please help me to get bold text from tables.

Thanks in advance!

Here's an example of the content stored in one of the tables in my .docx files:


Bla1 bla1 bla1 – co-owner, president, reported chairman of the board of directors

Mr Bla1 is a high-profile Russian entrepreneur, whose business interests and career has been primarily associated with the IT, marketing, advertising and consulting services sectors.

Bla2 bla2 – general director, reported chief executive officer

Mr bla2 - is a medium-profile German individual, whose career has been primarily associated with the marketing, as well as car, consumers goods, and food manufacturing and trading sectors. According to publicly available sources, in 1994–2005 he was a senior engineer...

Bla3 bla3 bla3 – financial director

Bla4 bla4 – chief accountant

As provided by the requestor of this report, the Target’s chief accountant is also in charge for functions attributed in general to chief financial officer, e.g. managing the finances and financial risks as well as financial planning.


So I want to get only the words Bla1 bla1 bla1, Bla2 bla2, Bla3 bla3 bla3 and Bla4 bla4, since they are the only bold ones.

salehelas

2 Answers

1
from docx import Document

document = Document('test.docx')

# Walk every table, row, cell, paragraph and run,
# and print the runs that are explicitly marked bold.
for table in document.tables:
    for row in table.rows:
        for cell in row.cells:
            for paragraph in cell.paragraphs:
                for run in paragraph.runs:
                    if run.bold:
                        print(run.text)
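
One caveat with the loop above: in python-docx, `run.bold` is tri-state (`True`, `False` or `None`), and `None` means the setting is inherited from a style, so `if run.bold:` only catches runs where bold was applied directly. Word also tends to split one visually bold phrase into several runs. Below is a minimal sketch that tries to handle both cases. The helper names `is_bold` and `bold_phrases` are made up for illustration, and the fallback only checks the run's and the paragraph's own styles; it assumes base styles and table styles do not affect boldness in your files, which may not hold.

```python
def is_bold(run, paragraph):
    """Best-effort check: direct formatting, then run style, then paragraph style."""
    if run.bold is not None:
        return run.bold
    for style in (run.style, paragraph.style):
        font = getattr(style, "font", None)
        if font is not None and font.bold is not None:
            return font.bold
    return False


def bold_phrases(document):
    """Return bold text from all tables, joining adjacent bold runs into one phrase."""
    phrases = []
    for table in document.tables:
        for row in table.rows:
            for cell in row.cells:
                for paragraph in cell.paragraphs:
                    current = []  # bold runs collected since the last non-bold run
                    for run in paragraph.runs:
                        if is_bold(run, paragraph):
                            current.append(run.text)
                        elif current:
                            phrases.append("".join(current).strip())
                            current = []
                    if current:
                        phrases.append("".join(current).strip())
    # Drop phrases that were only whitespace.
    return [phrase for phrase in phrases if phrase]


# Usage (assumes python-docx is installed and 'test.docx' exists):
# from docx import Document
# print(bold_phrases(Document('test.docx')))
```

Because merged cells can show up more than once in `row.cells`, a document with merged cells may yield repeated phrases; deduplicating the result is left out of this sketch.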
0
from docx import Document

document = Document()  # start from a new, empty document; it is saved as test.docx below

p1 = document.add_paragraph()
p1.add_run("Bla1 bla1 bla1").bold=True 
p1.add_run(' - co-owner, president, reported chairman of the board of directors') 

p2 = document.add_paragraph('Mr Bla1 is a high-profile Russian entrepreneur, whose business interests and career has been primarily associated with the IT, marketing, advertising and consulting services sectors.')

p1 = document.add_paragraph()
p1.add_run('Bla2 bla2').bold=True 
p1.add_run(' - general director, reported chief executive officer') 

p3 = document.add_paragraph('Mr bla2 - is a medium-profile German individual, whose career has been primarily associated with the marketing, as well as car, consumers goods, and food manufacturing and trading sectors. According to publicly available sources, in 1994-2005 he was a senior engineer...')

p4 = document.add_paragraph()
p4.add_run('Bla3 bla3 bla3').bold=True
p4.add_run(' - financial director')

p5 = document.add_paragraph()
p5.add_run('Bla4 bla4 ').bold=True
p5.add_run('- chief accountant')

p6 = document.add_paragraph("As provided by the requestor of this report, the Target's chief accountant is also in charge for functions attributed in general to chief financial officer, e.g. managing the finances and financial risks as well as financial planning.")

document.save('test.docx')
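
Note that the script above writes the sample text as ordinary body paragraphs, not into a table. To try table extraction against it, the same content could be placed in a table cell instead. Here is a rough sketch, assuming a single-cell table is representative enough of the real files; `add_people_table` and its `entries` argument are hypothetical names, not part of python-docx.

```python
def add_people_table(document, entries):
    """Add a 1x1 table whose cell holds one paragraph per (bold_name, rest) pair."""
    table = document.add_table(rows=1, cols=1)
    cell = table.cell(0, 0)
    for index, (name, rest) in enumerate(entries):
        # A new cell already contains one empty paragraph; reuse it for the first entry.
        paragraph = cell.paragraphs[0] if index == 0 else cell.add_paragraph()
        paragraph.add_run(name).bold = True  # the name is the bold part
        paragraph.add_run(rest)              # the rest keeps default formatting
    return table


# Usage (assumes python-docx is installed):
# from docx import Document
# document = Document()
# add_people_table(document, [
#     ("Bla1 bla1 bla1", " - co-owner, president"),
#     ("Bla4 bla4", " - chief accountant"),
# ])
# document.save("test.docx")
```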
Robin Sage