Can someone please explain what "extdata" means in R?

For instance, I was looking at the "cronR" library in R (used for automatically scheduling jobs), and came across the term "extdata":

library(cronR)
f <- system.file(package = "cronR", "extdata", "helloworld.R")
cmd <- cron_rscript(f)
cmd
cron_add(command = cmd, frequency = 'minutely',
         id = 'test1', description = 'My process 1', tags = c('lab', 'xyz'))
cron_add(command = cmd, frequency = 'daily', at = '7AM', id = 'test2')
cron_njobs()
cron_ls()
cron_clear(ask = TRUE)
cron_ls()

Similarly, the "taskscheduleR" package (also used for automatically scheduling jobs) refers to "extdata":

library(taskscheduleR)
myscript <- system.file("extdata", "helloworld.R", package = "taskscheduleR")

## run script once within 62 seconds
taskscheduler_create(taskname = "myfancyscript", rscript = myscript, 
                     schedule = "ONCE", starttime = format(Sys.time() + 62, "%H:%M"))

My Question: Can someone please explain what "extdata" is? Is it just a "formality" that needs to be added to the "system.file()" call, or does it have a specific meaning here?

Thanks!

stats_noob
  • See [this section of the R packages guide](https://r-pkgs.org/data.html#data-extdata). `inst/extdata` is where packages store "external data", e.g. data files used for examples in the package documentation. – neilfws Jan 23 '22 at 22:35
  • @neilfws: thank you so much for your reply! – stats_noob Jan 23 '22 at 22:35

1 Answer

This is a convention, not a formally defined term. (However, it's a convention defined by the package authors and coded in the package structure; it's not something you can change unless you mess around with the package structure yourself.) "extdata" is presumably short for "external data".

However, this doesn't mean that you need to use "extdata" when you are structuring your own code; you only need it when locating files that ship with the package. cron_rscript("~/my_cron_jobs/foo.R") should work fine (provided you actually have a script at that location, and provided that the ~ shortcut for the home directory works across operating systems, which I think it does).
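As a rough sketch of pointing at your own script instead of a packaged one, base R's path.expand() can resolve the ~ shortcut up front so you can check the file before scheduling it. The path ~/my_cron_jobs/foo.R is only the hypothetical location used above, not something cronR creates for you.

```r
# Hypothetical location of your own script; replace with a file you actually have.
my_script <- path.expand("~/my_cron_jobs/foo.R")

# Check that the script exists before handing it to a scheduler.
if (file.exists(my_script)) {
  message("Found script at: ", my_script)
} else {
  message("No script at: ", my_script)
}
```

If the file exists, the expanded path can be passed to cron_rscript() in the same way as the result of system.file().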

system.file() takes a package argument, but otherwise strings its arguments together into a file path; i.e. system.file(package = "cronR", "extdata", "helloworld.R") means

  • look in the system folder that R has set up for the cronR package (in my case that is /usr/local/lib/R/site-library/cronR, but the precise location will vary by OS and configuration)
  • within that folder look in the extdata folder
  • within that folder look for helloworld.R

So this command will refer in my case to /usr/local/lib/R/site-library/cronR/extdata/helloworld.R.

Since "/" works as a path separator (at least when used from within R) for all current operating systems, you would get the same results from system.file(package="cronR", "extdata/helloworld.R")
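A small sketch of this behaviour that runs without cronR installed: it uses the help/AnIndex file from the bundled splines package purely as a stand-in (that choice is an assumption for illustration and has nothing to do with cronR or "extdata"). It also shows that system.file() returns an empty string for a file that is not there, unless mustWork = TRUE is set.

```r
# Two equivalent ways to build the same path inside an installed package.
# "help/AnIndex" in the splines package is only an illustrative stand-in.
p1 <- system.file("help", "AnIndex", package = "splines")
p2 <- system.file("help/AnIndex", package = "splines")
same_path <- identical(p1, p2)
print(same_path)

# A file that is not found gives "" rather than an error ...
missing_path <- system.file("extdata", "no_such_file.R", package = "splines")
print(nzchar(missing_path))

# ... unless mustWork = TRUE, which turns the failure into an error.
res <- tryCatch(
  system.file("extdata", "no_such_file.R", package = "splines", mustWork = TRUE),
  error = function(e) "error raised"
)
print(res)
```

Checking nzchar() on the result (or using mustWork = TRUE) is a cheap way to catch a misspelled folder or file name before passing the path to a scheduler.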

Ben Bolker
  • @Ben Bolker: Thank you so much for your answer! I think this clarifies it. It looks like I will always have to add the "extdata" argument when scheduling R jobs. After a few minutes, I will be able to formally accept your answer! – stats_noob Jan 23 '22 at 22:32
  • If you have time, could you please take a look at this related question? https://stackoverflow.com/questions/70825523/r-scheduling-tasks-in-base-r thank you so much! – stats_noob Jan 23 '22 at 22:32