Context
I have been working for some time on a Python script that uses the docxtpl package (and Jinja2 for managing tags and templates) to automate the creation of MS Word reports.
My script (see below) is located in a base directory, along with an Excel document used to fill the tags and the Word template document that is referenced. The base directory also contains a sub-directory (Image_loop) with one further directory for each placeholder image that must be replaced. Each placeholder image is replaced using the alt text assigned to it in the template document, which matches the name of the corresponding directory within Image_loop (Image1, Image2, etc.). My directory setup can be seen in the photos below.
My Code
import glob
import json
import os
import sys
from pathlib import Path

import jinja2
import numpy as np
import pandas as pd
from docx.shared import Cm, Inches, Mm, Emu  # pip install python-docx
from docxtpl import DocxTemplate, InlineImage  # pip install docxtpl
base_dir = Path('//mnt//c//Users//XXX//Desktop//AUTOMATED_REPORTING')  # set this to your own working directory, in Ubuntu path format
word_template_path = base_dir / "Template1.docx"  # Word template document
excel_path = base_dir / "Book1.xlsx"  # reference Excel document
output_dir = base_dir / "OUTPUT"  # directory for all outputs
output_dir.mkdir(exist_ok=True)  # create the directory if it does not already exist
df = pd.read_excel(excel_path, sheet_name="Sheet1", dtype=str)  # read the Excel document as a DataFrame; dtype=str avoids formatting issues
df2 = df.fillna('')  # empty cells are read as NaN; turn them into blank strings so that nothing is displayed for them
doc = DocxTemplate(word_template_path)
context = {}
image_filepath = Path('//mnt//c//Users//XXX//Desktop//AUTOMATED_REPORTING//Image_loop')
for record in df2.to_dict(orient="records"):  # one report per row of the Excel spreadsheet
    output_path = output_dir / f"{record['Catchment']}-Test_document.docx"
    # walk image_filepath to find the sub-directories and the images within them
    # that replace the placeholder images in the template Word document
    for address, dirs, files in os.walk(image_filepath):
        i = 0
        while i < len(dirs):
            dir_int = [*dirs[i][-1]]
            directory = str(dirs[i])
            if os.path.exists(image_filepath / f"{directory}/{record['Catchment']}.png"):
                doc.replace_pic(f"{directory}", image_filepath / f"{directory}/{record['Catchment']}.png")
            i += 1
    doc.render(record)
    doc.save(output_path)
Problem (help please)
My problem is that some of my reports have no image for some of the placeholders. That is, for some of the sub-directories within Image_loop (Image1, Image2, etc.), there is no image corresponding to that placeholder for a specific report.
So whilst the sub-directory 'Image_1' may contain, for reports A, B, C and D:
- Map_A.png (for report A)
- Map_B.png (for report B)
- Map_C.png (for report C)
- Map_D.png (for report D)
i.e. a map for every report
The sub-directory 'Image_2' only contains, for reports A, B, C and D:
- Graph_A.png (for report A)
- Graph_B.png (for report B)
- Graph_D.png (for report D)
i.e. there is to be no graph for report C
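To make the gap concrete, here is a minimal sketch that lists, for each report, the placeholder directories that have no matching image. It assumes the layout described above and the file naming my script uses (one '<report name>.png' per sub-directory, rather than the Map_/Graph_ names in the illustration); find_missing_images is a hypothetical helper name, not part of docxtpl.

```python
from pathlib import Path


def find_missing_images(image_root, report_names, suffix=".png"):
    """Return {report name: [placeholder directories lacking an image]}.

    Assumes each sub-directory of image_root is named after a placeholder's
    alt text and holds one '<report name><suffix>' file per report.
    """
    image_root = Path(image_root)
    # Each sub-directory name doubles as the alt text of a placeholder image.
    placeholders = sorted(p.name for p in image_root.iterdir() if p.is_dir())
    missing = {}
    for report in report_names:
        gaps = [
            name
            for name in placeholders
            if not (image_root / name / f"{report}{suffix}").is_file()
        ]
        if gaps:  # only record reports that actually lack something
            missing[report] = gaps
    return missing
```

Called with the Image_loop directory and the values of the 'Catchment' column, this should return only the outlier reports, each with the placeholders that would be left unreplaced.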
I am able to stop bullet points or tables in the template document from being printed when the Excel document has no corresponding value for a specific report. This is done directly in the template document, using a paragraph-level if statement (the '{%p ... %}' tag form) in Jinja2 (https://jinja.palletsprojects.com/en/3.0.x/templates/). It looks something like this:
{%p if <TEMPLATE_VALUE> != '' %} {%p endif %}
(i.e. don't print the bullet points, table, etc., if there is no value to fill them with)
BUT if I wrap a template image in the same if statement within the template document, I get an error when running the code in Linux Ubuntu: ValueError: Picture ImageXYZ not found in the docx template
The error is attributed to the last line of my code: doc.save(output_path). I assume this is because the Jinja2 '{%p if %}' statement removes the placeholder image when there is no replacement image to be found, and this creates a problem when saving the 'outlier' reports (those with no actual image to replace the placeholder image). When the code is run, reports are generated for the records that have images for all placeholders, but not for the outlier.
I'm sure there is a way to modify my code to generate the outlier reports, even though the placeholder image is not going to be replaced. Perhaps with a 'try: ... except: ...' statement?
But I'm a bit stuck...
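For what it is worth, here is a rough sketch of the direction I am considering. It rests on an assumption I have not confirmed: that replacements registered with replace_pic stay on the shared doc object, so a picture registered for an earlier report is still expected when a later report's placeholder has been removed. The sketch therefore builds a fresh template for every record, registers only the images that exist on disk, and wraps save() in try/except so that one failing report does not stop the rest. generate_reports and make_template are hypothetical names; make_template would be something like lambda: DocxTemplate(word_template_path).

```python
from pathlib import Path


def generate_reports(make_template, records, image_root, output_dir):
    """Render one report per record; return the names of reports that failed.

    make_template: zero-argument callable returning a new template object
    that offers replace_pic(), render() and save(), for example
    lambda: DocxTemplate(word_template_path).
    """
    image_root, output_dir = Path(image_root), Path(output_dir)
    placeholders = sorted(p.name for p in image_root.iterdir() if p.is_dir())
    failed = []
    for record in records:
        name = record["Catchment"]
        doc = make_template()  # fresh object, so no replacements carry over
        for placeholder in placeholders:
            image = image_root / placeholder / f"{name}.png"
            if image.is_file():  # skip placeholders that have no image
                doc.replace_pic(placeholder, image)
        doc.render(record)
        try:
            doc.save(output_dir / f"{name}-Test_document.docx")
        except ValueError as err:  # e.g. "Picture ... not found in the docx template"
            print(f"Skipped {name}: {err}")
            failed.append(name)
    return failed
```

With my setup the call would look like generate_reports(lambda: DocxTemplate(word_template_path), df2.to_dict(orient="records"), image_filepath, output_dir). The try/except only reports which documents could not be saved; it does not by itself produce the outlier report, which is why the sketch also avoids registering images that do not exist.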