I am using the officer package to fill the placeholders in a PowerPoint template with values calculated in R.

Is there a way to search and replace text in a slide?

BurninLeo

1 Answer

To my knowledge, officer only provides the function body_replace_all_text for Word files (docx), but no such function for PowerPoint.

Fortunately, the slide$get() function gives access to the XML document underlying the respective slide. The xml2 package then allows you to modify its nodes. I hope this snippet saves others the frustrating search through the packages.

library("officer")
library("xml2")

# Function to replace text in all tags matching xpath
xml_replace_text = function(xml, search, replace, xpath = "//a:t") {
    
    # Function to replace in a single text node
    replace_in_node = function(node, search, replace) {
        # Literal (non-regex) replacement; TRUE is spelled out because T can be reassigned
        xml_text(node) = gsub(pattern = search, replacement = replace, fixed = TRUE, x = xml_text(node))
        return()
    }
    
    text_nodes = xml_find_all(xml, xpath = xpath)
    lapply(text_nodes, FUN=replace_in_node, search=search, replace=replace)
    return()
}

# Function to search and replace text in a slide in a pptx file
replace_in_slide = function(ppt, slide_index=1, search, replace) {
    xml_replace_text(ppt$slide$get_slide(slide_index)$get(), search, replace)
    return()
}

# Overview slide
ppt = read_pptx("template.pptx")
replace_in_slide(ppt, 2, "%placeholder%", "Test THREE")
print(ppt, target = "example.pptx")
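
When many values have to be filled in, the same idea can be applied to a whole set of placeholders in one pass. The following is only a sketch: the helper name xml_replace_many, the placeholder names, and the inline XML fragment are made up for illustration and are not part of officer or xml2.

```r
library(xml2)

# Hypothetical helper: apply several literal search/replace pairs to all nodes
# matching xpath. `values` is a named character vector whose names are the
# placeholders and whose values are the replacement texts.
xml_replace_many <- function(xml, values, xpath = "//a:t") {
  text_nodes <- xml_find_all(xml, xpath)
  for (i in seq_along(text_nodes)) {
    node <- text_nodes[[i]]
    old <- xml_text(node)
    new <- old
    for (key in names(values)) {
      new <- gsub(key, values[[key]], new, fixed = TRUE)
    }
    # Only touch nodes whose text actually changed
    if (!identical(new, old)) {
      xml_text(node) <- new
    }
  }
  invisible(xml)
}

# Made-up DrawingML fragment standing in for the XML of a slide
doc <- read_xml(paste0(
  '<p:txBody xmlns:p="http://schemas.openxmlformats.org/presentationml/2006/main" ',
  'xmlns:a="http://schemas.openxmlformats.org/drawingml/2006/main">',
  '<a:p><a:r><a:t>Mean: %mean% (n = %n%)</a:t></a:r></a:p>',
  '<a:p><a:r><a:t>No placeholder here</a:t></a:r></a:p>',
  '</p:txBody>'
))

xml_replace_many(doc, c("%mean%" = "4.2", "%n%" = "128"))
replaced_text <- xml_text(xml_find_all(doc, "//a:t"))
print(replaced_text)
```

With a real presentation, the first argument would be the slide XML obtained as in replace_in_slide above, for example ppt$slide$get_slide(2)$get().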

Note that the replacement only works if the placeholder is stored in a single XML text node (a:t). PowerPoint may split a text into several runs when it is edited, so avoid editing placeholder names after placing them in the text.
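
To illustrate this limitation, the sketch below compares the text of individual runs with the text of whole paragraphs. The XML fragment is made up for illustration: its second paragraph has the placeholder split across two runs, as can happen after editing it in PowerPoint.

```r
library(xml2)

# Made-up DrawingML fragment: paragraph 1 holds the placeholder in one run,
# paragraph 2 has it split across two runs.
doc <- read_xml(paste0(
  '<p:txBody xmlns:p="http://schemas.openxmlformats.org/presentationml/2006/main" ',
  'xmlns:a="http://schemas.openxmlformats.org/drawingml/2006/main">',
  '<a:p><a:r><a:t>Value: %placeholder%</a:t></a:r></a:p>',
  '<a:p><a:r><a:t>Value: %place</a:t></a:r><a:r><a:t>holder%</a:t></a:r></a:p>',
  '</p:txBody>'
))

# Text of each single run (<a:t>) and of each whole paragraph (<a:p>)
run_text <- xml_text(xml_find_all(doc, "//a:t"))
par_text <- xml_text(xml_find_all(doc, "//a:p"))

# Only runs containing the complete placeholder can be replaced
in_runs <- grepl("%placeholder%", run_text, fixed = TRUE)
# Paragraphs in which the placeholder is visible on the slide
in_pars <- grepl("%placeholder%", par_text, fixed = TRUE)

print(in_runs)
print(in_pars)
```

A paragraph that contains the placeholder while none of its runs does is a sign that the placeholder has been split. Retyping it in one go in the template usually brings it back into a single run.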

BurninLeo
  • Until now, text replacement cases have always been avoided by a simple call to `ph_with()`, possibly preceded by `ph_remove()`. We don't recommend using the internals (`slide$get()`) and will not maintain any code using these calls; they are internals and can change, as opposed to exported functions. Basically, we don't understand the point of replacing one word with another when it's perfectly possible to simply replace the sentence containing it and control its formatting. Also, your code does not assign the result; this is risky, as it relies on the assumption that the internals will never change. – David Gohel Jun 11 '23 at 11:01
  • Please note also that we don't recommend `body_replace_all_text()`, as explained in the documentation here: https://ardata-fr.github.io/officeverse/officer-for-word.html#replacement. – David Gohel Jun 11 '23 at 11:04
  • Thank you for the advice. The use case, to make the point of replacing single words clearer, is this: I have a template with some formatting, including formatted blocks where different values from an analysis have to be added. Not only did I experience some frustration trying to exactly replace a block of text (including the position, font size, ...), but creating formatted text that still flows smoothly is quite a pain. Even more so if there are a lot of values to (re)place in the text. – BurninLeo Jun 11 '23 at 15:21
  • I agree, replacing something in the middle of a set of paragraphs in a PowerPoint is painful because there is no mechanism to mark/identify a chunk in PowerPoint (if you type 'he', pause, then 'llo', there could be two chunks and a regexp won't be able to help...). That's why we recommend using computed fields in Word, as opposed to pptx, where nothing like that exists. For PPT, we think replacing the whole block is better/simpler. We build formatted paragraphs with https://ardata-fr.github.io/officeverse/paragraphs-chunks.html. You said 'it does not flow smoothly', could you explain what that means? – David Gohel Jun 11 '23 at 18:13
  • Thank you. Concerning the "flow" issues ... let's assume I simply did not get the concept of how to build complex paragraphs right. With the link you posted in your comment, I should be able to solve that. Nonetheless, in my use case, where only some values are to be modified on a complex page (the template might also change over time), I still like the idea of just replacing placeholders ;) I will dive into the chunk stuff a bit and try to merge "same-style" chunks to get more reliable replacing. – BurninLeo Jun 16 '23 at 11:51
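
For comparison, the approach recommended in the comments (replacing the whole block with a formatted paragraph via exported functions only) might look roughly like the sketch below. It assumes the default template shipped with officer, with its "Title and Content" layout and "Office Theme" master; the value, the wording, and the output file name are made up for illustration.

```r
library(officer)

# Hypothetical computed value
mean_score <- 42

# Build one paragraph from separately formatted chunks
value_par <- fpar(
  ftext("Mean score: ", prop = fp_text(font.size = 20)),
  ftext(format(mean_score), prop = fp_text(font.size = 20, bold = TRUE)),
  ftext(" points", prop = fp_text(font.size = 20))
)

# Default officer template; a custom template would be passed to read_pptx()
ppt <- read_pptx()
ppt <- add_slide(ppt, layout = "Title and Content", master = "Office Theme")
ppt <- ph_with(ppt, value = value_par, location = ph_location_type(type = "body"))

# Written to a temporary directory to keep the example self-contained
out_file <- file.path(tempdir(), "example_fpar.pptx")
print(ppt, target = out_file)
```

Here the formatting is set explicitly in R rather than inherited from the text in the template, which is the trade-off discussed in the comments.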