I am applying for junior data analyst positions and have come to the realization that I will be sending out a lot of cover letters.

To (somewhat) ease the pain and suffering that this will entail, I want to automate the parts of the cover letter that are suited for automation, and I will be using R Markdown to (hopefully) achieve this.

For the purposes of this question, let's say that the parts I am looking to automate are the position applied for and the company hiring for that position, both to be used in the header of the cover letter.

These are the steps I envision in my mind's eye:

  • Gather the positions of interest and the corresponding companies in an Excel spreadsheet. This gives an Excel sheet with two columns holding the variables position and company, respectively.
  • Read the Excel file into the R Markdown document as a data frame/tibble (let's call this jobs).
  • Define two parameters in the YAML header of the .Rmd file to look something like this:
---
output: pdf_document
params:
 position: jobs$position[i]
 company: jobs$company[i]
---
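A minimal sketch of the first two steps, assuming the spreadsheet has been exported to CSV so that only base R is needed. The sample rows and the file name "jobs.xlsx" mentioned in the comment are hypothetical placeholders, not part of the original setup.

```r
# Hypothetical sample data written to a temporary CSV so the sketch is self-contained.
csv_path <- tempfile(fileext = ".csv")
writeLines(c(
  "position,company",
  "Position 1,Company 1",
  "Position 2,Company 2",
  "Position 3,Company 3"
), csv_path)

# For a real .xlsx file, readxl::read_excel("jobs.xlsx") would be the analogous call
# ("jobs.xlsx" is an assumed file name).
jobs <- read.csv(csv_path, stringsAsFactors = FALSE)

# Fail early if the expected columns are missing.
stopifnot(all(c("position", "company") %in% names(jobs)))

print(jobs)
```

Each row of jobs then holds one position/company pair, which is the unit a single cover letter should be built from.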

The heading of the cover letter would then look something like this:

"Application for the position as `r params$position` at `r params$company`"

To summarize: In order to not have to change the values of the parameters manually for each cover letter, I would like to read an Excel file with the position titles and company names, loop these through the parameters in the YAML header, and then have R Markdown output a PDF for each pair of position and company (and ideally have the name of each PDF include the position title and company name for easier identification when sending the letters out). Is that possible? (Note: the position titles and company names do not necessarily have to be stored in an Excel file; that's just how I've collected them.)
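For the file-naming part, one possible sketch is a small helper that turns a position and a company into a file-system-friendly PDF name. The function name make_pdf_name and the choice to replace everything except letters, digits, and hyphens with underscores are assumptions made for illustration.

```r
# Hypothetical helper: build a file-system-friendly PDF name from a position and a company.
make_pdf_name <- function(position, company) {
  raw <- paste(position, company, sep = "-")
  # Replace every run of characters other than letters, digits, and hyphens with "_".
  safe <- gsub("[^A-Za-z0-9-]+", "_", raw)
  # Drop any underscores left dangling at the end (e.g. from a trailing ".").
  safe <- sub("_+$", "", safe)
  paste0(safe, ".pdf")
}

print(make_pdf_name("Position 1", "Company 1"))
```

The result could be passed as the output_file argument of rmarkdown::render, so that spaces or punctuation in a company name do not end up in the file name.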

Hopefully, the above makes clear what I am trying to achieve.

Any nudges in the right direction are greatly appreciated!

EDIT (11 July 2021):

I have partly arrived at an answer to this.

The trick is to define a function that includes the rmarkdown::render function. This function can then be included in a nested for-loop to produce the desired PDF files.

Again, assuming that I want to automate the position and the company, I defined the rendering function as follows (in a script separate from the "main" .Rmd file containing the text [named "loop_test.Rmd" here]):

render_function <- function(position, company){
  rmarkdown::render( 
    # Name of the 'main' .Rmd file
    'loop_test.Rmd',
    # What should the output PDF files be called?
    output_file = paste0(position, '-', company, '.pdf'),
    # Define the parameters that are used in the 'main' .Rmd file
    params = list(position = position, company = company),
    # Evaluate the document in the calling environment
    envir = parent.frame()
  )
}

Then, use the function in a for-loop:

for (position in positions$position) {
  for (company in positions$company) {
    render_function(position, company)
  }
}

Here, positions is the data frame read from the Excel file containing the relevant positions, with two variables called position and company.

I tested this method using three "observations" each for position and company ("Position 1", "Position 2" and "Position 3", and "Company 1", "Company 2" and "Company 3"). One problem with the above method is that it produces 3^2 = 9 reports, because the inner loop runs through every company for each position. For example, Position 1 is used in letters for Company 1, Company 2 and Company 3. I obviously only want a letter that matches Position 1 with Company 1. Does anyone have any idea how to achieve this? Nine reports are manageable for two variables with only three observations, but my intent is to use several additional parameters. The number of companies (i.e. "observations") is, unfortunately, also likely to grow large before I can end my search... With, say, 5-6 parameters and 20 companies, the number of reports output will obviously become ridiculous.
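The blow-up can be reproduced without rendering anything. The sketch below rebuilds the three test observations as a data frame and uses expand.grid to list what two nested loops visit; the variable name all_combinations is only illustrative.

```r
positions <- data.frame(
  position = c("Position 1", "Position 2", "Position 3"),
  company  = c("Company 1", "Company 2", "Company 3"),
  stringsAsFactors = FALSE
)

# What the nested loops visit: every position combined with every company.
all_combinations <- expand.grid(
  position = positions$position,
  company  = positions$company,
  stringsAsFactors = FALSE
)
print(nrow(all_combinations))  # number of reports the nested loops would produce

# What is actually wanted: one position/company pair per spreadsheet row.
print(nrow(positions))
```

The nested loops treat the two columns as independent sets, whereas the spreadsheet pairs them row by row.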

As said, I am almost there, but any nudges in the right direction for how to restrict the output to only "match" the company with the position would be highly appreciated.

R.W.

1 Answer

You can iterate over the rows instead, so that each position is only paired with the company on the same row:

# seq_len() also behaves correctly when positions has zero rows
for (i in seq_len(nrow(positions))) {
  render_function(positions$position[i], positions$company[i])
}
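With more than two parameters, the same row-wise idea might be extended by turning each row into a named list and passing it on as params. The sketch below is only an illustration: render_stub is a hypothetical stand-in for rmarkdown::render so that the code runs without a .Rmd file, and the city column is an invented third parameter.

```r
positions <- data.frame(
  position = c("Position 1", "Position 2"),
  company  = c("Company 1", "Company 2"),
  city     = c("Oslo", "Bergen"),  # hypothetical third parameter
  stringsAsFactors = FALSE
)

# Hypothetical stand-in for rmarkdown::render. In real use, the body would call
# rmarkdown::render("loop_test.Rmd", output_file = output_file, params = params).
render_stub <- function(output_file, params) {
  message("Would render ", output_file)
  invisible(output_file)
}

rendered <- character(0)
for (i in seq_len(nrow(positions))) {
  # One row becomes one named list: list(position = ..., company = ..., city = ...).
  row_params <- as.list(positions[i, , drop = FALSE])
  out <- paste0(row_params$position, "-", row_params$company, ".pdf")
  rendered <- c(rendered, render_stub(out, row_params))
}

print(rendered)
```

As far as I know, every column passed this way has to be declared under params in the YAML header of the .Rmd file, so the spreadsheet columns and the declared parameters would need to match.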
aimi