
This is my first question on Stack Overflow, so please let me know if I'm doing anything wrong.

I'm using R to generate a lot of very large PDF documents. My data is about 580,000 observations and breaks down into 32 categories, with each category containing 70 answers to between 20 and 300 questions. Currently I use two for loops (I try to avoid for loops, but for creating these PDFs it was the only way that worked). The first goes through and creates a PDF for the category with a title page, then the second adds a page for each graph showing the results of that question. I'm using ggplot2 and the "pdf" function.

The script works great, creating 32 pdfs (one for each category) with a custom title page and pages for all the questions in that category. I would like to add a Table of Contents after the title page. I know how to add a page with labels and page numbers, but I need one that links to each question.

I've searched this site and Google, but haven't found any way to do this in R. This question, "Adding a table of contents to PDF with R plots", talks about using RPython. I've also come across sources mentioning "hyperref", LaTeX, Pandoc, and knitr. I know how to use knitr in an R Markdown doc, but that doesn't work for what I'm trying to do. I'm not really sure how to work with any of the others, so solutions using them went over my head.

Is there not a way to work with creating a Table of Contents or just hyperlinks to PDF pages inside R, without going to those other languages?

kyle7day
  • Have you tried xtable? http://stackoverflow.com/questions/24132503/how-can-i-include-hyperlinks-in-a-table-within-an-sweave-document – Brad D Aug 19 '15 at 17:26
  • I had come across it before, but didn't try it. After looking at your link and checking the xtable documentation, I think I understand how to add links to web pages in a PDF doc. I'm still not sure how to reference other pages of the same PDF document. If there is a way to do that, then xtable would work well. – kyle7day Aug 20 '15 at 15:04

2 Answers


Have you tried just clicking on the section names in the table of contents? By default, these seem to be hyperlinked, although there isn't any colouration that hints at it.

To help you see what might be happening, add the following to your YAML header (or change it to match):

output:
  pdf_document:
    keep_tex: true
    toc: true
    toc_depth: 3

That keeps the intermediate .tex file. If you open it after knitting, you should already see references to hyperref in it.

I then find my table of contents being defined as:

{
\hypersetup{linkcolor=black}
\setcounter{tocdepth}{3}
\tableofcontents
}

which produces a hyperlinked TOC, but with "black" hyperlinks!

If you want to change the colour and see them show up, you can open the .tex file in RStudio, change "black" to "blue", and have RStudio run "Compile PDF". You should then see the links showing up.

If you want your page numbers hyperlinked rather than the description, add the following into your YAML:

header-includes:
   - \hypersetup{linktocpage}

Share & Enjoy!

dsz

I just remembered I left this open, so I thought I'd come back and post how I ended up solving it, well, sort of. Instead of an R script, I used an R Markdown file to create a combined PDF that included all sections, with their questions as different header levels. I was also able to create a PDF for each section individually, with a clickable, linked Table of Contents covering all of its questions (pages) and different header levels for the title pages.

The key was pandoc.header (from the pander package), which allowed me to create the headers that show up in the TOC. I don't think the for loops or the ggplot created for each page are relevant. Here is an overview of the .Rmd:

title: 
author: 
output: 
    pdf_document:
        toc: true

```{r results = "asis", message=FALSE, warning=FALSE, echo=FALSE, fig.height = 11, fig.width = 8}

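library(pander)   # provides pandoc.header()
library(ggplot2)

# Overview only: categories, category_num, category_description,
# numberofquestions, question_num and subtitle1 come from the real data.
for (i in seq_along(categories)) {
  # Level 1 header: one TOC entry per category
  pandoc.header(paste0("Category ", category_num, ": ", category_description), level = 1)

  # ... category title page goes here ...

  # j, not i, so the outer index stays usable inside this loop
  for (j in seq_len(numberofquestions)) {
    # Level 2 header: one TOC entry per question
    pandoc.header(paste0("Question ", question_num, ": ", subtitle1), level = 2)

    print(ggplot())  # stands in for the plot of this question
  }
}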

```

The only inconvenient part is that each page must have a header to be linked to and I really didn't like the title pages having one, but it looks like I can manually edit that out with what dsz posted.
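
To make it clearer why the headers end up in the TOC: as far as I can tell, with results = "asis" the chunk output is passed through as raw Markdown, so a header line becomes a real section that Pandoc lists and links. The sketch below reproduces that idea in base R. `md_header` is a hypothetical helper written for illustration, not part of pander, and the header texts are made up.

```r
# Hypothetical helper (not part of pander): builds the Markdown header text
# that a chunk with results = "asis" would need to emit.
md_header <- function(text, level = 1) {
  stopifnot(level >= 1, level <= 6)
  # Blank lines around the header keep it separate from neighbouring output
  paste0("\n", strrep("#", level), " ", text, "\n\n")
}

# Inside an "asis" chunk, cat() writes the header straight into the document
cat(md_header("Category 1: Example category", level = 1))
cat(md_header("Question 1: Example question", level = 2))
```

Since toc: true builds the TOC from these headers, every page that should be linked needs a header of its own.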

kyle7day