Background and Objective

I am generating a formatted collection of itemized details (similar to a catalog) in MSWord using the officer package.

Each item in the collection has a header followed by a line for each defined detail.

Each line (header or detail) has a label, followed by a tab, then the value.

For example: Item #001 <w:tab/> The Name of the First Item

The tab stop is defined by the paragraph style in an existing MSWord document that is used as a template. The template file is empty, but contains the style definitions we need. (Note: I cannot post a Word file; see the end of the post for steps to create a minimal reproducible MSWord template document. However, the challenge appears to be how tabs are handled by officer.)

In R, I generate the content as follows:

library(officer)
library(magrittr)  # provides the %>% pipe used below

# Open an MSWord document containing the style definitions
doc <- read_docx("my_template.docx")

# Add the Header
doc <- doc %>% body_add_par("Item #001: The first item", style = "Equip Header")

# Add some details
doc <- doc %>% body_add_par("QUANTITY:<w:tab/>One (1)", style = "Equip Detail")
doc <- doc %>% body_add_par("PROVIDED BY:  K.E.C.", style = "Equip Detail")
doc <- doc %>% body_add_par("PROVIDED BY:  &#9; K.E.C.", style = "Equip Detail")
#... and so on ...

# save the file
print(doc, target = "test.docx")

Note that in the first detail line I inserted the Word XML tag for a tab (<w:tab/>), in the second I typed a literal tab character (with the Tab key) inside the string, and in the third I used the HTML character entity for a horizontal tab (&#9;).

The script works as expected except for the tabs. Here is the content saved to test.docx.

[Screenshot: contents of test.docx]

As shown, the tab stop still exists in the paragraph style, but the value part of each label/value pair is not tabbed over. The image shows that the first and third tabs were written out as escaped literal text, and that in the second item the tab was rendered as two spaces.

This is by design: body_add_par() escapes special characters, so "<" becomes "&lt;" and "&#9;" becomes "&amp;#9;".
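To make the escaping concrete, here is a minimal base-R sketch. The function xml_escape() is a hypothetical stand-in written for illustration; it is not officer's internal code, and officer's actual escaping may differ in detail.

```r
# Hypothetical helper for illustration only (not part of officer):
# escapes the characters that are special in XML text.
xml_escape <- function(x) {
  x <- gsub("&", "&amp;", x, fixed = TRUE)  # must run first
  x <- gsub("<", "&lt;", x, fixed = TRUE)
  x <- gsub(">", "&gt;", x, fixed = TRUE)
  x
}

xml_escape("QUANTITY:<w:tab/>One (1)")
# [1] "QUANTITY:&lt;w:tab/&gt;One (1)"

xml_escape("PROVIDED BY:&#9;K.E.C.")
# [1] "PROVIDED BY:&amp;#9;K.E.C."
```

Once escaped like this, Word displays the tag and the entity as literal text instead of interpreting them as tabs.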

Here is what it should look like if tabs / tab stops are used.

[Screenshot: expected output, with the values aligned at the tab stop]

Question:

How can I generate content in MSWord that uses MSWord paragraph styles, respects tab stops and preserves the use of tab within a string?

I am open to other R packages / solutions.


Creating a Simple Reproducible Template for Testing

  • Open a new MSWord Document
  • Type some content (your label)
  • With your cursor on the same line, set a tab stop (See here for steps to create a tab stop)
  • Type some more content at the tab stop (your value)
  • Select the line and define the style
  • From the Home ribbon, expand the Styles menu and select "Create a Style", then enter a name for your style (this is the name passed as style in body_add_par("your string", style = "your style name"))
  • Click "OK"
  • Delete all content from the file and save it in the working directory.
AWaddington

2 Answers

This can be done with '\t'. The way you defined the template is correct. You can also set the tab stop positions in the Paragraph properties dialog.

define tabs

I posted the template here: https://github.com/davidgohel/officer/files/8581914/template.docx

library(officer)
library(dplyr)

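# Open the template that defines the "NormalPlusTabs" paragraph style
doc <- read_docx(path = "template.docx")

# Check that the style is available in the template
styles_info(doc, type = "paragraph") %>%
  filter(style_name %in% "NormalPlusTabs")

# Each "\t" in the string becomes a tab that follows the style's tab stops
doc %>%
  body_add_fpar(fpar("Name:\tdoudou\tbleu"), style = "NormalPlusTabs") %>%
  print(target = "test.docx")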

[Screenshot: resulting test.docx]

David Gohel

The answer is to use \t to denote the tab.
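As a minimal sketch, this is how "\t" might be applied to the code from the question, following the body_add_fpar(fpar(...)) pattern from the other answer. The helper add_detail() is hypothetical (it is not part of officer), and the sketch assumes that my_template.docx exists and defines the "Equip Header" and "Equip Detail" styles.

```r
library(officer)

# Hypothetical helper (not part of officer): adds one "LABEL:<tab>value" line.
add_detail <- function(doc, label, value, style = "Equip Detail") {
  body_add_fpar(doc, fpar(paste0(label, ":\t", value)), style = style)
}

# Assumes my_template.docx exists and defines both styles from the question.
if (file.exists("my_template.docx")) {
  doc <- read_docx("my_template.docx")
  doc <- body_add_par(doc, "Item #001: The first item", style = "Equip Header")
  doc <- add_detail(doc, "QUANTITY", "One (1)")
  doc <- add_detail(doc, "PROVIDED BY", "K.E.C.")
  print(doc, target = "test.docx")
}
```

Keeping the label/value formatting in one helper means the tab is written the same way on every detail line.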

AWaddington