Questions tagged [reproducible-research]

Reproducible research is the idea that the results of scientific research should be published together with the data and code behind them, so that other researchers can verify the results.

Reproducible research may be especially important to you if your investigation involves large amounts of data or very complex calculations.

One possible set of tools for reproducible research is R with Sweave or knitr.
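As a rough illustration of what "publishing code so that others can verify the results" can look like in practice, here is a minimal Python sketch. It is not taken from any question under this tag: the function names `run_analysis`, `fingerprint`, and `environment` are hypothetical, and the "analysis" is a toy stand-in. The idea is simply to fix the random seed, record the environment, and emit a digest of the result so that a second run can be compared exactly.

```python
import hashlib
import json
import platform
import random
import sys


def run_analysis(seed: int, n: int = 1000) -> dict:
    """Toy 'analysis' (hypothetical): the mean of n draws from a seeded generator."""
    rng = random.Random(seed)  # local generator, independent of global random state
    draws = [rng.random() for _ in range(n)]
    return {"seed": seed, "n": n, "mean": sum(draws) / n}


def fingerprint(result: dict) -> str:
    """SHA-256 digest of a result, serialized with sorted keys for stability."""
    payload = json.dumps(result, sort_keys=True).encode("utf-8")
    return hashlib.sha256(payload).hexdigest()


def environment() -> dict:
    """Record the interpreter and platform the result was produced on."""
    return {"python": sys.version.split()[0], "platform": platform.platform()}


if __name__ == "__main__":
    result = run_analysis(seed=42)
    report = {
        "result": result,
        "sha256": fingerprint(result),
        "environment": environment(),
    }
    print(json.dumps(report, indent=2))
```

Running the script twice with the same seed should print the same digest. A differing digest on another machine suggests that something in the environment (interpreter version, libraries, hardware) affects the result and is worth recording alongside the code.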

227 questions
2 votes, 0 answers

Why does stacking CNN wreck reproducibility (even with seed & CPU)?

REPRODUCIBLE: ipt = Input(batch_shape=batch_shape) x = Conv2D(6, (8, 8), strides=(2, 2), activation='relu')(ipt) x = Flatten()(x) out = Dense(6, activation='softmax')(x) NOT REPRODUCIBLE: ipt = Input(batch_shape=batch_shape) x = Conv2D(6,…
2 votes, 0 answers

Where to set "os.environ['PYTHONHASHSEED']='0'" for reproducible results in Google Colab

I want to train a CNN in Google Colab and get results that are as reproducible as possible. In the following code I am not sure if the order of the instructions for reproducibility is correct. I wonder if os.environ['PYTHONHASHSEED']='0' is in the right…
2 votes, 0 answers

Extracting Body of Text from Research Articles; Several Attempted Methods

I need to extract the body of texts from my corpus for text mining as my code now includes references, which bias my results. All coding is performed in R using RStudio. I have tried many techniques. I have text mining code (of which only the first…
Hunter • 65 • 7
2 votes, 0 answers

Issue with installing R packages from github with devtools on university computers with network drives

I'm trying to configure a project in R that uses the spotifyr and scrobbler (last.fm API) packages, but am having trouble when downloading from devtools. The code works perfectly from my Mac/Windows desktop at home, but when working on my work…
2 votes, 0 answers

checkpoint package R - loading library not installed

I'm trying to use the checkpoint package, which seems pretty simple. I installed the checkpoint library and used the main function checkpoint("yyyy-mm-dd"). After this, checkpoint installed only dplyr, which is what I expected (it was the only library…
Marco Fumagalli • 2,307 • 3 • 23 • 41
2 votes, 0 answers

Using `/exec` directory in R packages for R scripts for reproducible research

For my research project I want to use a make-based workflow but also deliver the project in the form of a package. Thus, I want to put reusable functions in the /R directory for others to access but also use R scripts executable from the command…
cgmil • 410 • 2 • 18
2 votes, 2 answers

mlr: why does reproducibility of hyperparameter tuning fail using parallelization?

I use code based on the Quickstart example in the mlr cheatsheet. I added parallelization and tried to tune parameters several times. Question: Why does reproducibility fail (why aren't the results identical) even if I call set.seed() every time before…
GegznaV • 4,938 • 4 • 23 • 43
2 votes, 2 answers

Parsing R function names, arguments, and return values

How can I programmatically parse the names of functions, arguments, and their return values? I am interested in generating workplan dataframes for automating R data analysis workflows with the drake package. One can generate such workplan dataframes…
ropolo • 117 • 1 • 1 • 8
2 votes, 2 answers

Reproducibility of random numbers (Python 2/random)

In the Python 2 documentation of the random.seed() function I found a warning: If a hashable object is given, deterministic results are only assured when PYTHONHASHSEED is disabled. From…
abukaj • 2,582 • 1 • 22 • 45
2 votes, 2 answers

Anonymize names in paragraph variable by matching and replacement

I am analyzing a school's student report card database. My dataset consists of around 3000 records structured similarly to the example below. Each observation is one teacher's assessment of one student. Each observation contains a three-sentence…
Anders Swanson • 3,637 • 1 • 18 • 43
2 votes, 1 answer

Can I use virtualization to control for differences in host performance when benchmarking for performance regressions in my application?

Is it possible to set up a virtualized environment---be it a Docker container or a qemu VM---to run benchmarks that would not be much affected by the performance of the virtualization host? For example, that my computation benchmark would always…
user7610 • 25,267 • 15 • 124 • 150
2 votes, 1 answer

Retrieve list of libraries / packages required in a script for reproducibility

This is a question of convenience on code reproducibility. You may write or receive a long script with various custom libraries called at various times (e.g. in various sections of a markdown document). Suppose you have a poorly constructed…
puslet88 • 1,288 • 15 • 25
2 votes, 1 answer

Using knitr to produce complex dynamic documents

The minimal reproducible example (RE) below is my attempt to figure out how I can use knitr to generate complex dynamic documents, where "complex" here refers not to the document's elements and their layout, but to the non-linear logic of the…
Aleksandr Blekh • 2,462 • 4 • 32 • 64
2 votes, 1 answer

Dynamic Greek Letters as Variable Names in RMarkdown Tables

How do I get xtable (though I also have this issue with pander.table) to assign Greek letters to columns of a data frame within the print function without me needing to render the table and then manually type in the LaTeX for the Greek…
bfoste01 • 337 • 1 • 2 • 14
2 votes, 5 answers

Classloader vulnerability reproducing procedure in struts 1.1

In Struts 1, I heard that there is a classloader vulnerability, CVE-2014-0114, but I am unable to reproduce it in my project. Can anyone help me reproduce this issue? I googled but did not find any procedure of…
SkyWalker • 28,384 • 14 • 74 • 132