Some time ago, I needed to programmatically convert Jupyter Notebook presentations to PDF slides. After some research, I found that you can automate the process with pyppeteer, the Python port of Puppeteer. A short Python script is enough:

import asyncio
import concurrent.futures

from pyppeteer import launch


async def html_to_pdf(html_file, pdf_file, pyppeteer_args=None):
    """Convert an HTML file to a PDF."""
    # Signal handlers are disabled because this coroutine runs in a worker
    # thread, and Python only allows installing them from the main thread.
    browser = await launch(
        handleSIGINT=False,
        handleSIGTERM=False,
        handleSIGHUP=False,
        headless=True,
        # Extra Chromium flags can be passed through pyppeteer_args.
        args=["--no-sandbox", *(pyppeteer_args or [])],
    )
    try:
        page = await browser.newPage()
        await page.setViewport(dict(width=994, height=768))
        # Render with the screen stylesheet rather than the print one.
        await page.emulateMedia("screen")
        await page.goto(f"file://{html_file}", {"waitUntil": ["networkidle2"]})

        page_margins = {
            "left": "20px",
            "right": "20px",
            "top": "30px",
            "bottom": "30px",
        }

        # Size of the rendered document as reported by the browser.
        # The fixed A4 export below does not use these values.
        dimensions = await page.evaluate(
            """() => {
                return {
                    width: document.body.scrollWidth,
                    height: document.body.scrollHeight,
                    offsetWidth: document.body.offsetWidth,
                    offsetHeight: document.body.offsetHeight,
                    deviceScaleFactor: window.devicePixelRatio,
                }
            }"""
        )
        width = dimensions["width"]
        height = dimensions["height"]

        await page.pdf(
            {
                "path": pdf_file,
                "format": "A4",
                "printBackground": True,
                "margin": page_margins,
            }
        )
    finally:
        # Always shut the browser down, even if the conversion fails.
        await browser.close()


if __name__ == "__main__":
    html_input_file = "/you/need/full/path/here/presentation.slides.html?print-pdf"
    pdf_output_file = "slides.pdf"

    # Run the coroutine in a worker thread with its own event loop, which
    # also works when the calling thread already has a loop running.
    with concurrent.futures.ThreadPoolExecutor() as pool:
        pool.submit(
            asyncio.run,
            html_to_pdf(html_input_file, pdf_output_file),
        ).result()

The script takes the HTML slides as input and produces the PDF slides as output. Please note that you need to provide the full path to the HTML file, followed by the ?print-pdf suffix shown above. I wrote an article on how to convert notebook presentations to PDF slides. If you would like to apply styling, here is a longer version of the script.
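
Since the script needs the full path, it may help to build it from a relative one instead of typing it by hand. Here is a minimal sketch; `slides_input_path` is a hypothetical helper name, not part of the original script, and it assumes a POSIX-style path, because the script simply prefixes the value with `file://`:

```python
from pathlib import Path


def slides_input_path(html_path: str, print_pdf: bool = True) -> str:
    """Return the absolute path to the slides, optionally ending in '?print-pdf'.

    Hypothetical helper for illustration; the result is meant to be passed
    as the html_file argument of html_to_pdf.
    """
    # Expand "~" and resolve relative segments against the current directory.
    absolute = Path(html_path).expanduser().resolve()
    # The query suffix is appended as plain text, matching the script's input.
    suffix = "?print-pdf" if print_pdf else ""
    return f"{absolute}{suffix}"
```

You could then write `html_input_file = slides_input_path("presentation.slides.html")` in the main block and leave the rest of the script unchanged.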