24

I am using RStudio to streamline Sweave and R for data analyses that I will share with other analysts. To make the coding of variables crystal clear, it would be great to have something like a help file, so they can call ?myData and get a helpful page when they need one. I like the Rd markup and think it has great potential for documenting analytic datasets, including an overall summary, a variable-by-variable breakdown, and an example of how to run some exploratory analyses.

It's easy to do this if you're creating a package, but I find that confusing, since packages are ultimately collections of functions and they don't integrate Rnw files.

Can I use Roxygen2 to create help files for datasets that aren't a part of any package?

AdamO
  • how would I access this documentation if you don't give me a package to install with the compiled latex? I don't understand why a package with the documentation is not wanted. Install the package, hand off say an rproject file with the analysis files, load the package to get all the docs. that seems to solve all the problems except for the no package stipulation. How can I access `ggplot2` without installing `ggplot2`? – rawr Aug 13 '14 at 17:31
  • @KonradRudolph Did you find a solution? I am facing exactly the same issue. I am using modules with R scripts and thus want to avoid creating a package. I simply want to add documentation for functions inside modules – englealuze Nov 25 '19 at 15:40
  • @englealuze Yes, see my answer. – Konrad Rudolph Nov 25 '19 at 16:00

7 Answers

14

Before I take a crack at this, I would like to reiterate what others are saying. R's package system is exactly what you are looking for. Many people use it successfully to distribute just data and no code. Combined with R's lazy loading of data, you can distribute large datasets as packages without burdening users who don't wish to load them all.

In addition, you will not be able to take advantage of R's help system unless you use packages. The original question explicitly asks about using ?myData and your users will not be able to do that if you do not use a package. This is quite simply a limitation of R's base help function.


Now, to answer the question. You will need to use some non-exported roxygen functions to make this work, but it's not too onerous. In addition, you'll need to put your R file(s) documenting your data into a folder of their own somewhere, and within that folder you will want to create an empty folder called man.

Example directory structure:

# ./
# ./man/
# ./myData.R
# ./otherData.R

myData.R:

#' My dataset
#' 
#' This is data I like.
#' 
#' @name myData
NULL

otherData.R:

#' My other dataset
#' 
#' This is another dataset I like
#' 
#' @name otherData
NULL

Now, the code that will bring it all together (and you can of course wrap this in a function):

library(roxygen2)
mydir <- "path/to/your/data/directory/"
myfiles <- c("myData.R","otherData.R")

# get parsed source into roxygen-friendly format
env <- new.env(parent = globalenv())
rfiles <- sapply(myfiles, function(f) file.path(mydir,f))
blocks <- unlist(lapply(rfiles, roxygen2:::parse_file, env=env), recursive=FALSE)
parsed <- list(env=env, blocks=blocks)

# parse roxygen comments into rd files and output them into the "./man" directory
roc <- roxygen2:::rd_roclet()
results <- roxygen2:::roc_process(roc, parsed, mydir)
roxygen2:::roc_output(roc, results, mydir, options=list(wrap=FALSE), check = FALSE)

You should now have properly formatted myData.Rd and otherData.Rd files in the once-empty man folder.
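To check the result without installing anything, the generated files can be rendered with the tools package that ships with R. The following is a minimal sketch, not part of the workflow above: it writes a small hand-made Rd file (a stand-in for the generated man/myData.Rd) to a temporary location and converts it to plain text. The names `rd_file` and `txt_file` are chosen for illustration only.

```r
# Stand-in for a generated man/myData.Rd (hand-written so the sketch is self-contained).
rd_file <- tempfile(fileext = ".Rd")
writeLines(c(
  "\\name{myData}",
  "\\alias{myData}",
  "\\title{My dataset}",
  "\\description{This is data I like.}"
), rd_file)

# Parse the Rd file and render it as plain text (no overstrike underlining of titles).
rd <- tools::parse_Rd(rd_file)
txt_file <- tempfile(fileext = ".txt")
tools::Rd2txt(rd, out = txt_file, options = list(underline_titles = FALSE))

# Show the rendered page in the pager when running interactively.
if (interactive()) file.show(txt_file, title = "R Help on 'myData'")
```

For a real file, `rd_file` would simply point at one of the .Rd files in the man folder.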

charliebone
  • Your limitation isn’t correct: it’s fairly easy to override `utils::help` or `utils::?`: `devtools` does just that. – Konrad Rudolph Aug 13 '14 at 16:21
  • In anticipation of a question as to why the `man` directory must be used, this is a limitation of the roxygen2 code. If you *really* wanted to avoid the use of the man directory, I would essentially create a new roclet S3 class and borrow the `roxygen2:::roc_process.had` and `roxygen2:::roc_output.had` functions. You will need to modify `roc_output.had` and change that first line where they set the man variable for the Rd output directory. – charliebone Aug 13 '14 at 16:24
  • @KonradRudolph Fair enough. But masking the base help function is not something users would expect a package to do, so if you do it, take care to at least maintain compatibility with the existing help method. I am still correct however in saying that R's base help is limited in the sense that it only can be used with packages. – charliebone Aug 13 '14 at 16:26
  • @KonradRudolph Did this answer the question satisfactorily? I see the bounty is still open.. – charliebone Aug 14 '14 at 20:44
  • Yes. I wanted to keep the bounty open until I was sure I could implement this satisfactorily. I have now done so in the [`modules` project](https://github.com/klmr/modules/tree/help), and [I’ve reported an issue](https://github.com/klutometis/roxygen/issues/273) to `roxygen2` for them to enlarge their public API to make this solution officially supported. Incidentally, the `roc_output` call isn’t necessary at all, since I’m attaching the Rd strings to the module in memory, and pass it directly to `tools::parse_Rd` when the user is calling `?` (or `help`). – Konrad Rudolph Aug 15 '14 at 15:15
8

roxygen2 now supports this natively but, because the relevant functions are marked “internal”, they are not exposed to the documentation index.

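Still, the functions are exported and form part of the official API: roxygen2::env_file, roxygen2::parse_file, roxygen2::roclet_process and roxygen2::rd_roclet.

And, to display the resulting help, you’ll need tools::parse_Rd and tools::Rd2txt.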

The workflow is as follows:

source_env = roxygen2::env_file(sourcefile)
rd_blocks = roxygen2::parse_file(sourcefile, source_env)
help_topics = roxygen2::roclet_process(roxygen2::rd_roclet(), rd_blocks, source_env, dirname(sourcefile))
rd_code = lapply(help_topics, format)

This gives you a list of help topics in a file. To display one of them you need the {tools} package, which is part of base R, but not attached by default.

The following shows how to display text help. Displaying HTML help is a bit more convoluted (I invite you to read and understand the source code of utils:::print.help_files_with_topic, which does the actual displaying of help topics, and which is completely undocumented).

# Display first help topic. In reality you’d want to select a specific one.
topic = names(rd_code)[1L]
help_text = rd_code[[topic]]

rd = tools::parse_Rd(textConnection(help_text))
packagename = tools::file_path_sans_ext(basename(sourcefile))
helpfile = tools::Rd2txt(rd, out = tempfile('Rtxt'), package = packagename)
helptitle = gettextf('R Help on %s', sQuote(sub('\\.Rd$', '', topic)))
file.show(helpfile, title = helptitle, delete.file = TRUE)
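
For readers who want HTML output, one plain alternative is tools::Rd2HTML. The sketch below is an assumption-laden illustration, not a reproduction of what utils:::print.help_files_with_topic does: it renders a small hand-written Rd string (standing in for one element of `rd_code`) to an HTML file and opens it in a browser. The names `rd_path`, `rd_html` and `html_file` are made up for this example.

```r
# A small Rd string standing in for one element of `rd_code` (assumption for illustration).
help_text = c(
  "\\name{myData}",
  "\\alias{myData}",
  "\\title{My dataset}",
  "\\description{This is data I like.}"
)

# Write the Rd source to a temporary file and parse it.
rd_path = tempfile(fileext = ".Rd")
writeLines(help_text, rd_path)
rd_html = tools::parse_Rd(rd_path)

# Render the parsed Rd object as a standalone HTML page.
html_file = tempfile(fileext = ".html")
tools::Rd2HTML(rd_html, out = html_file, package = "myData")

# Open the page in a browser when running interactively.
if (interactive()) utils::browseURL(html_file)
```

The page is written without R's help stylesheet next to it, so it may look plainer than the built-in help.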
Konrad Rudolph
  • file.show(helpfile, title = helptitle, delete.file = TRUE) will create a file and ask me to save somewhere. I tried to save as text and html. The content looks correct, but the function title and titles of description, usage, arguments and examples are wrongly formatted with strange symbols and not readable. Is there a way to view the help nicely within Rstudio instead of creating and saving a file somewhere? like ? or help function usually does? Thanks a lot – englealuze Nov 25 '19 at 16:43
  • @englealuze The function call, as written in this answer, should definitely *not* ask you anything at all. Apart from that, as mentioned in the answer, the code I show uses the standard [troff formatting](https://en.wikipedia.org/wiki/Troff), which RStudio’s pager may not understand. For RStudio you need to go the long and complicated route via HTML. – Konrad Rudolph Nov 25 '19 at 16:55
  • Are there any functions to create a .Rd file under man folder with the rd object created with tools::parse_Rd in your code? The help function can show the document in Rstudio then – englealuze Nov 26 '19 at 16:01
  • @englealuze This only works if the `.Rd` file is in the expected location for an *installed package*. Unfortunately you cannot just create an `.Rd` file and pass it to a function to be displayed. Don’t ask me why it is that way but that’s indeed the way it is. – Konrad Rudolph Nov 26 '19 at 16:05
  • I found the help_text which is simple string actually can be used to write to a .Rd file directly under any specific path. Then I just need to wrap your code to my own version of roxygenise and automatized a bit more. I think this is good solution. Thanks a lot – englealuze Nov 26 '19 at 16:29
  • Add a function based on your code. Tested work with my project – englealuze Nov 27 '19 at 14:48
  • @englealuze Please write your own answer rather than editing my existing one, because it’s substantially different from what’s written so far, and should stand on its own. – Konrad Rudolph Nov 27 '19 at 15:08
2

Another (simpler) way is to use the document package:

> document::document("~/Downloads/tmp.R") # your temporary R file to convert to Rd
# this raises an error, but the documentation is still built correctly in a temporary directory
# (copy its path into the variable below: tmppath)

> tmppath <- "/var/folders/dl/zj51mknn0x17lp376dpx_j3r0000gn/T//RtmpaikYJb/document_8e706d7cd54a/tmp/man"
> rstudioapi::previewRd(paste0(tmppath, "/tmp.Rd")) #to preview 
jgarces
1

Here is a generic function, wrapped around @Konrad Rudolph's code, that generates .Rd files for the R scripts in a specified folder. For a project that uses the modules package and has a "non-standard" folder structure, this can be a way to get documentation without creating an installed package.

moxygenise <- function(codepath, manpath) {

  apply_at_level <- function(l, f, n, ...) {
    ## function to apply a function at specified level of a nested list
    if (n < 0) {
      stop("Invalid parameter - n should be integer >= 0 -- APPLY_AT_LEVEL")
    } else if (n==0) {
      return(l)
    } else if (n == 1) {
      return(lapply(l, f, ...))
    } else {
      return(lapply(l, function(x) {apply_at_level(x, f, n-1, ...)}))
    }
  }

  list.files.paths <- function(path, pattern) {
    ## function to list absolute path of all files under specified path matching certain pattern
    path <- normalizePath(path)
    return(file.path(path, list.files(path=path, pattern=pattern)))
  }

  sourcefiles <- list.files.paths(codepath, "\\.R$")
  source_envs <- lapply(sourcefiles, roxygen2::env_file)
  ## SIMPLIFY = FALSE keeps one list element per source file, whatever the number of blocks
  rd_blockss <- mapply(roxygen2::parse_file, sourcefiles, source_envs, SIMPLIFY = FALSE)

  help_topicss <- mapply(function(rdblock, sourceenv, sourcefile) {
      return(roxygen2::roclet_process(
          roxygen2::rd_roclet(), 
          rdblock, sourceenv, 
          dirname(sourcefile)))},
          rd_blockss, source_envs, sourcefiles,
          SIMPLIFY = FALSE)

  rd_codes <- purrr::flatten(apply_at_level(help_topicss, format, 2))

  mapply(function(text, topic, outpath=manpath) {
    cat("Write", topic, "to", outpath, "\n")
    write(text, file=file.path(outpath, topic))
    }, rd_codes, names(rd_codes))
  return(NULL)
}

Specify the path where your module source files are saved and the path where you want to generate the .Rd files (this should be projecthome/man/ if you want the help function to work with your source package):

moxygenise('path/of/module/source/', 'path/of/output.Rds')
englealuze
  • It's been almost a year, but this answer saved me when my team was not using packages yet. We only used Rstudio's "R Projects", so using your function and then converting the Rd files to Markdown we were able to complete our project's wiki :) Thank you! – eduardokapp Sep 11 '20 at 19:48
  • @englealuze is it possible to make this work for R6 classes also? I always get an error: `R6 class (Person) without sources references` – Bolle Oct 01 '20 at 09:46
  • @Bolle seems the error has something to do with roxygen2. You may try some tricks mentioned here https://github.com/r-lib/R6/issues/3, https://github.com/r-lib/roxygen2/issues/1014 – englealuze Oct 02 '20 at 04:25
  • @englealuze I tried all the solutions but nothing worked because they are based on the documentation of packages. Do you know how Roxygen: list(r6 = FALSE) works in the DESCRIPTION file? Maybe it is possible to build a workaround or integrate the things that happen by Roxygen: list(r6 = FALSE) only for your nice function. I wasn't able to find a solution yet so I posted the question here https://stackoverflow.com/questions/64153144/creating-rd-documentation-files-for-r6-classes-not-in-a-package too – Bolle Oct 03 '20 at 14:30
0

Here's a hacky approach that works. Create a dummy package in a temp directory, use that to generate your Rd files, then extract the Rd files out, and clean up. See code below.

Hope this helps.

Note: Make sure you have the @export tag in the functions you want to generate Rd files for, in order for this to work.

makeRd <- function(rscript, dir.out){
  stopifnot(requireNamespace('devtools', quietly = TRUE))

  # Prepare paths (use a dedicated subdirectory, so that the cleanup below
  # does not delete the session-wide tempdir())
  pkg.path = tempfile('dummyPkg')
  r.path = file.path(pkg.path, 'R')
  man.path = file.path(pkg.path, 'man')
  desc.path = file.path(pkg.path, 'DESCRIPTION')

  # Create directories
  dir.create(pkg.path)
  dir.create(r.path, showWarnings = FALSE)
  dir.create(man.path, showWarnings = FALSE)

  # Write dummy description
  z = c('Package', 'Type', 'Title', 'Version', 'Date', 'Author', 'Maintainer', 'Description', 'License')
  writeLines(paste0(z, ': X'), desc.path)

  # Copy rscript file over to dummy package and generate rd files
  file.copy(rscript, r.path)
  suppressMessages( devtools::document(pkg.path) )

  # Copy generated Rd files to output directory
  f.in = list.files(man.path, full.names = TRUE)
  f.out = file.path(dir.out, basename(f.in))
  file.copy(f.in, f.out, overwrite = TRUE)

  # Remove the dummy package
  unlink(pkg.path, recursive = TRUE, force = TRUE)
  return(f.out)
}

# Example
rd = makeRd(rscript='foo.R', dir.out='~/Desktop')
print(rd)
# [1] "~/Desktop/myFunction.Rd"
Omar Wagih
  • “Create a dummy package in a temp directory” – Unfortunately the overhead of this is prohibitive. In addition, it’s backwards: roxygen2 must solve this problem internally: that is, it parses the individual `.r` source files and generates `.rd` files for them. – Konrad Rudolph Aug 13 '14 at 15:44
  • What overhead are you talking about? The function does what was asked and takes on the order of milliseconds, at most seconds. – Omar Wagih Aug 13 '14 at 15:56
  • Unnecessary (!) runtime overhead, and overhead (even if not much) on the file system. Like you’ve said, it’s a hacky way. I need this for a package, and I cannot in good conscience redistribute such hacky code. For using it in my own code one-off it’s certainly more than adequate. For redistribution – not so much. – Konrad Rudolph Aug 13 '14 at 15:57
0

There is a function called parse_Rd in the tools package. You could generate the .Rd files, run parse_Rd on them, and save the output as objects in the module namespace. You would need a new search function (maybe modHelp) that finds the appropriate Rd object in the namespace and displays it using Rd2txt, a different converter, or a custom solution. I'm not sure whether you can get anything other than the basic text help that Rd2txt spits out, but you might.
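
A rough sketch of that idea, using only the tools package, might look like the following. Everything named here (`rd_store`, `registerHelp`, `modHelp`) is hypothetical and chosen for illustration; a plain environment stands in for the module namespace, and the Rd text is hand-written rather than generated.

```r
# Hypothetical store of parsed Rd objects, keyed by topic name
# (stands in for the module namespace).
rd_store <- new.env()

# Hypothetical helper: parse Rd source text and register it under a topic name.
registerHelp <- function(topic, rd_text) {
  rd_path <- tempfile(fileext = ".Rd")
  writeLines(rd_text, rd_path)
  assign(topic, tools::parse_Rd(rd_path), envir = rd_store)
  invisible(topic)
}

# Hypothetical search function: look up a topic and render it as plain text.
modHelp <- function(topic) {
  if (!exists(topic, envir = rd_store, inherits = FALSE)) {
    stop("No documentation for ", sQuote(topic))
  }
  out <- tempfile(fileext = ".txt")
  tools::Rd2txt(get(topic, envir = rd_store), out = out,
                options = list(underline_titles = FALSE))
  if (interactive()) file.show(out, title = paste("Help on", topic))
  invisible(out)
}

registerHelp("myData", c(
  "\\name{myData}", "\\alias{myData}",
  "\\title{My dataset}", "\\description{This is data I like.}"
))
help_path <- modHelp("myData")
```

This still leaves open where the Rd text comes from, which is the part the question is really about.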

rmflight
  • “You could generate the .Rd files” – that is precisely the part which `roxygen2` is supposed to solve, and which this question is about. – Konrad Rudolph Aug 13 '14 at 17:25
-4

My answer is why don't you put the analysis in a package? This way you get all the bits and pieces of support that come with packages, including documentation (of data and any self-written functions), and having vignettes that automatically know where your data lives (and being able to list the vignettes from within R-help). You want the features of a package, without a package, that's just being needy. Instead, co-opt the package structure for an analysis, and use it to your advantage, like getting your datasets documented.

You comment that packages don't integrate Rnw files, but I rather think you are wrong. The default format for package vignettes is the Rnw or Sweave file. You can easily co-opt the vignette as a way to do the analysis report of the package.

I actually use this approach in my own analyses, and have documented it in a couple of blog posts: why, how, and comparison to project template. I've also used it in both academic analysis projects (more and more, though I can't point to an example yet) and personal projects (e.g. https://github.com/rmflight/timmysDensity, http://rmflight.github.io/posts/2013/06/timmysDensity.html; note that I was not yet using the package mechanism to find data).

BTW, other than putting the data in a package (data-only packages exist; Bioconductor has quite a number), I don't think there is a way to do what you are asking, apart from simply providing the raw roxygen2 tags in the .R file, as was outlined above for a dataset.

rmflight
  • There are good reasons not to do that. I don’t “want the features of a package”, I want documentation. Case in point, I use [modules](https://github.com/klmr/modules) instead of packages. They are superior to packages in several regards, but they don’t (yet) support documentation, which I want to add. – Konrad Rudolph Aug 13 '14 at 15:37
  • ah, well then, I've got no sweet clue, unless you come up with some new way of getting documentation into R outside of Rd files, as that is I'm pretty sure the way R does it. And I don't know of any way to make R do it on the fly. BTW, R's way to provide "documentation", is put it into a package. – rmflight Aug 13 '14 at 15:41