I have a directory of .md documents that each contain a YAML header specifying the document title, author, date, categories, tags, etc. The directory contains journal entries, and the filenames are simply the date of the entry.

I have no trouble using pandoc to generate a PDF for each .md file. However, I'm looking for a way to generate a single PDF in book or memoir format, with each .md document's title field as a chapter in the table of contents, arranged by the date value. Ideally, the date would also appear in the table of contents, but that's not critical if the individual chapters also display that information.

I haven't been able to find a way to do this, as pandoc seems to ignore all but the first YAML header when concatenating multiple documents. One possible solution I can think of is to convert the relevant YAML header info to markdown headings and then demote the existing headings in each .md document. But I'm not sure how to do this, or whether it is even the best approach. I was also looking at the R bookdown package, but it also uses markdown headers for chapters, and I'm not sure whether it can be adapted to use YAML header info.

What is the easiest way to accomplish what I need? Thanks.

clau

1 Answer

The approach you outline in your question is a good way to go.

Turning the title into a heading and demoting the existing headings can be done with a filter, e.g. a Lua filter if you are using pandoc 2.0 or later. The following assumes that you are using the current version, 2.0.6:

demote.lua:

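-- List is available since pandoc 2.0.4
local List = require 'pandoc.List'

-- Demote every existing heading by one level.
function Header (h)
  h.level = h.level + 1
  return h
end

-- Turn the document title into a new level-1 heading at the top.
-- This runs after Header, so the new heading is not demoted.
function Pandoc (doc)
  local title = doc.meta.title
  if not title then return doc end  -- nothing to add without a title
  local header = pandoc.Header(1, title)
  -- Plain Lua tables cannot be joined with `..`; wrap the heading in a List.
  doc.blocks = List:new{header} .. doc.blocks
  return doc
end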

Now run the following command to create your PDF:

# The glob expands in sorted order, so date-named files come out chronologically.
for f in /path/to/docs/*.md; do
    pandoc --lua-filter=demote.lua -t markdown "$f"
    printf "\n" # insert empty line between articles
done | pandoc -o combined.pdf
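
If Lua filters are not an option (they need pandoc 2.0 or later), the same idea can be roughly sketched in plain shell with awk. This is an alternative technique, not the filter approach above. `entry_to_chapter` is a hypothetical helper name, and the sketch assumes that each file starts with a YAML block delimited by `---` lines, that the title is a single-line `title:` entry, and that headings use the ATX (`#`) style. It does not parse YAML properly, and it will also demote lines starting with `# ` inside code blocks.

```shell
#!/bin/sh
# Hypothetical helper (not part of pandoc): print one entry with its YAML
# "title:" field turned into a level-1 heading and all "#" headings demoted.
# Assumes a leading "---" delimited YAML block and a single-line title.
entry_to_chapter() {
  awk '
    # Opening delimiter of the YAML block (only valid on the first line).
    NR == 1 && /^---[ \t]*$/ { in_yaml = 1; next }

    # Closing delimiter: emit the collected title as a level-1 heading.
    in_yaml && /^(---|\.\.\.)[ \t]*$/ {
      in_yaml = 0
      if (title != "") { print "# " title; print "" }
      next
    }

    # Inside the YAML block: remember the title, drop everything else.
    in_yaml {
      if ($0 ~ /^title:/) {
        title = $0
        sub(/^title:[ \t]*/, "", title)
        gsub(/^"|"$/, "", title)   # strip surrounding double quotes
      }
      next
    }

    # Body: demote existing ATX headings by one level, print every line.
    /^#+ / { $0 = "#" $0 }
    { print }
  ' "$1"
}
```

With this helper defined, calling `entry_to_chapter "$f"` inside the loop in place of the `pandoc --lua-filter=demote.lua -t markdown` step should give the final pandoc call the same kind of input.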
tarleb