How can I search a Word document using Python and extract the paragraph text that follows a matching paragraph heading, e.g. "1.2 Summary of Broadspectrum Offer"?
See below for an example document. I would basically like to get the following text: "A summary of our Offer to deliver the Scope of Work as outlined in the tender documents is provided below. Please refer to the various terms and conditions of our Offer as detailed herein. Please also find the cost breakdown"
1. Executive Summary
1.1 Summary of Services
Energy Savings (Carbon Emissions and Intensity Reduction)
Upgrade Economy Cycle on Level 2,5,6,7 & 8, replace Chilled Water Valves on Level 6 & 8 and install lighting controls on L5 & 6..
1.2 Summary of Broadspectrum Offer
A summary of our Offer to deliver the Scope of Work as outlined in the tender documents is provided below. Please refer to the various terms and conditions of our Offer as detailed herein.
Please also find the cost breakdown
Note that the heading numbers change from document to document, so I do not want to rely on them; I would rather rely on matching the search text in the heading.
So far I can search the documents, but this is just a start:
from docx import Document

filename1 = "North Sydney TE SP30062590-1 HVAC - Project Offer - Rev1.docx"
document = Document(filename1)

# Print every paragraph whose text contains the word 'Summary'
for paragraph in document.paragraphs:
    if 'Summary' in paragraph.text:
        print(paragraph.text)
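
To make the goal concrete, here is a minimal sketch of the behaviour I am after. It assumes the headings use Word's built-in "Heading N" styles, which I have not confirmed for these documents, and the names is_heading, extract_section and extract_from_file are just placeholders I made up. It matches on the heading text only, so the heading number does not matter, and it collects the paragraphs up to the next heading.

```python
def is_heading(paragraph):
    # Assumption: headings carry a built-in "Heading 1", "Heading 2", ... style.
    style = getattr(paragraph, "style", None)
    name = getattr(style, "name", None) or ""
    return name.startswith("Heading")


def extract_section(paragraphs, heading_text):
    """Return the body text under the first heading containing heading_text."""
    collected = []
    in_section = False
    for paragraph in paragraphs:
        if is_heading(paragraph):
            if in_section:
                break  # the next heading ends the section
            # Case-insensitive substring match on the heading text only
            in_section = heading_text.lower() in paragraph.text.lower()
            continue
        if in_section and paragraph.text.strip():
            collected.append(paragraph.text.strip())
    return " ".join(collected)


def extract_from_file(path, heading_text):
    # python-docx is imported here so extract_section itself has no dependency
    from docx import Document
    return extract_section(Document(path).paragraphs, heading_text)
```

With the file above, the call would be extract_from_file(filename1, "Summary of Broadspectrum Offer"). If the headings turn out to be plain bold text rather than styled headings, is_heading would need a different test.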