Questions tagged [reproducible-research]

Reproducible research is the idea that the result of scientific research should be published with data and code in order to make it possible for other researchers to verify the results.

Reproducible research may be especially important if your investigation involves large amounts of data or very complex calculations.

One possible set of tools for reproducible research is R with Sweave or knitr.

227 questions
0 votes, 1 answer

Reproducible splitting of data into training and testing in R

A common way for sampling/splitting data in R is using sample, e.g., on row numbers. For example: require(data.table) set.seed(1) population <- as.character(1e5:(1e6-1)) # some made up ID names N <- 1e4 # sample size sample1 <- data.table(id =…
0 votes, 1 answer

Can set.seed be made to automatically return to the initial seed after each time it is used in R?

I have my seed set to 1981 in R. I have several lines of code that run functions that use the random seed. However, the seed does not stay at 1981. After being used the first time, it is changed, and so each subsequent function has a different…
0 votes, 1 answer

Set adaptive col.names in knitr::kable()

Please consider the following. I started writing reproducible documents with R Markdown and want some output for a report. As I am working with more than one data.frame and their column names are not very informative or pretty, I would like to…
Frederick
0 votes, 1 answer

make for reproducible research

Make is handy for making research and data analysis with dependencies more reproducible, e.g.: # make file R = R CMD BATCH --no-save --no-restore datafiles = *.csv outputfiles = *.{pdf,Rout} .PHONY: all clean all: fig_A.pdf fig_B.pdf clean: …
dzeltzer
0 votes, 1 answer

Preferred way to share data with rmarkdown html document?

I created an R Markdown HTML document to share code from an analysis in R. I'd like to include the data as well, but I am not sure of the most convenient way (for the recipient) to provide the data. I can embed a CSV as a data URI like this:
Skaqqs
0 votes, 0 answers

Missing polygons while using Leaflet in R

I'm trying to reproduce code that displays polygons on a map on another computer; however, on one computer the polygons are not shown. Has anyone had this kind of problem while sharing code? You can download the shapefile from here:…
Heber Trujillo
0 votes, 0 answers

unnumbered special headers in rmarkdown

I'm writing my thesis in R Markdown (and I will export it to PDF). My third chapter should include three unnumbered parts (or some sort of special header) because it makes sense thematically. Is there any way to include them but not as regular…
0 votes, 1 answer

Scrapy: How to reproduce results without downloading the html again?

Having downloaded the HTML to my hard disk with Scrapy (e.g., using the built-in Item Exporters with a field HTML, or storing all HTML files in a folder), how can I use Scrapy to read the data from my hard disk again and execute the next step in the…
David
0 votes, 0 answers

RStudio and git workflow on a closed network

Is anyone aware of an R-based data analysis setup that works well in a research data centre with no internet access? I would like to use good reproducible analysis practices, but I do not have permission to upload files to a repository, for…
0 votes, 1 answer

Image of a data volume using docker

I am very interested in reproducible data science work. To that end, I am now exploring Docker as a platform that enables bundling code, data, and environment settings. My first simple attempt is a Docker image which contains the data it needs…
Dror
0 votes, 1 answer

How do I use Hmisc::latex()?

Hmisc::latex() seems to ignore all the arguments I give it, other than object. It's hard to point to a specific question I need answered, other than "How can I get Hmisc::latex() to recognize the arguments its documentation says it should?" For…
rcorty
0 votes, 1 answer

Reproduce line plot in matplotlib or R

I came across a wonderful figure which summarizes (scientific) author collaboration over the years. The figure is pasted below. Each vertical line refers to a single author. The start of each vertical line corresponds to the year the pertaining author…
Andrej
0 votes, 1 answer

Why does mlr give different results in different runs even when using set.seed()?

To publish reproducible results obtained with the mlr package, one should use the set.seed() function to control the randomness of the code. In testing, however, this practice doesn't seem to lead to the desired results: different runs of the code give…
catastrophic-failure
0 votes, 1 answer

Sampling a graph in R

I want to make a directed graph in R by making two data frames: one for vertices and one for edges. Also, my graph should have these attributes: no circle (therefore no A -> A), and a maximum of 1 edge between 2 nodes. I came up with the code as…
0 votes, 1 answer

Failure to create reproducible example by means of replicate/dput function

I'm trying to use dput() to create a reproducible example with a large database. The database needs to be large as the reproducible example involves moving averages. The way I've found to do this involves the function reproduce, shared here How to…
Krug