Questions tagged [reproducible-research]

Reproducible research is the idea that the results of scientific research should be published together with the underlying data and code so that other researchers can verify them.

Reproducible research may be especially important to you if your investigation involves large amounts of data or very complex calculations.

One possible set of tools for reproducible research is R with Sweave or knitr.
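The verification idea above can be illustrated with a minimal Python sketch. It is only an illustration under stated assumptions: `run_analysis` and `fingerprint` are hypothetical helper names invented here, the "analysis" is a toy computation, and the seed value is arbitrary. The sketch fixes the random seed in a local generator and compares a digest of the results from two runs.

```python
import hashlib
import json
import random


def run_analysis(seed: int, n: int = 1000) -> dict:
    """Toy 'analysis' (hypothetical): draw n pseudo-random values and summarise them."""
    rng = random.Random(seed)  # local generator; the global random state is untouched
    sample = [rng.gauss(0.0, 1.0) for _ in range(n)]
    mean = sum(sample) / n
    return {"seed": seed, "n": n, "mean": mean}


def fingerprint(result: dict) -> str:
    """SHA-256 digest of a result dictionary, used to compare two runs."""
    payload = json.dumps(result, sort_keys=True).encode("utf-8")
    return hashlib.sha256(payload).hexdigest()


first = run_analysis(seed=2024)
second = run_analysis(seed=2024)
print(fingerprint(first) == fingerprint(second))  # same seed, same digest
```

Publishing the seed and the digest alongside the code gives other researchers a quick way to check that their rerun matches. This covers only pseudo-random state; library versions, hardware, and parallel execution can still introduce differences.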

automated text for reproducible research

I am using RStudio, R Markdown, LaTeX, and Pandoc to clean data, construct variables, run my analysis, and report the results. I'm new to the concept of reproducible research, but I'm hooked. Makes a lot of sense. Dynamic tables and figures are no…

r latex r-markdown reproducible-research

asked Dec 29 '12 at 19:16

Eric Green


0 answers

Markdown for Reproducible Research in Python

I would like to know whether there is something equivalent to R-markdown in Python which can help me do reproducible research. Please note: I'm not interested in IPython Notebooks as an answer. I want to have the syntactic joy of r-markdown with…

python r r-markdown reproducible-research

asked Mar 17 '16 at 08:19

Naimish Agarwal


2 answers

Parallel processing in R - setting seed with mclapply() vs. pbmclapply()

I'm parallelizing simulations in R (using mclapply() from the parallel package) and wanted to track my progress with each function call. So I instead decided to use pbmclapply() from the pbmcapply package in order to have a progress bar each time I…

r parallel-processing random-seed reproducible-research mclapply

asked May 23 '21 at 02:50

bob


2 answers

How do I assign a random seed to the dplyr sample_n function?

This is the "sample_n" from dplyr in R. https://dplyr.tidyverse.org/reference/sample.html For reproducibility, I should place a seed so that someone else can get my exact results. Is there a built-in way to set the seed for "sample_n"? Is this…

r dplyr random-seed reproducible-research

asked Aug 16 '20 at 20:11

EngrStudent


1 answer

Set random seed for matplotlib plotting backend

I am generating and saving SVG images using matplotlib and would like to make them as reproducible as possible. However, even after setting np.random.seed and random.seed, the various id and xlink:href values in the SVG images still change between…

python matplotlib svg random reproducible-research

asked Jan 05 '18 at 05:45

saladi


1 answer

Trouble with Pandoc installation on Ubuntu 14.04LTS for using with R Markdown

This question is a corollary of my attempts to get some experience with creating reproducible reports from R Markdown documents via knitr and rmarkdown R packages. While it seems that .Rmd => HTML conversion is automated from within RStudio (Knit…

r knitr pandoc r-markdown reproducible-research

asked Jul 21 '14 at 10:42

Aleksandr Blekh


2 answers

Can I write identical xlsx files from the same data frame in R?

Can I make sure that two XLSX files (written with openxlsx::write.xlsx) are identical, when given the same data to write? I think there's a timestamp written to the spreadsheet which means the same data written more than one second apart creates a…

r excel openxlsx reproducible-research

asked Jan 31 '22 at 12:50

Spacedman


1 answer

What is the difference between 'torch.backends.cudnn.deterministic=True' and 'torch.set_deterministic(True)'?

My network includes 'torch.nn.MaxPool3d', which throws a RuntimeError when the cudnn deterministic flag is on, according to the PyTorch docs (version 1.7 - https://pytorch.org/docs/stable/generated/torch.set_deterministic.html#torch.set_deterministic),…

pytorch deterministic reproducible-research

asked Feb 10 '21 at 03:32

chungseok


1 answer

create references in each section in Rmarkdown

I want to use Rmarkdown but what I've read is that when creating a bibliography using pandoc, references go at the end of the document: pandoc/citeproc issues: multiple bibliographies, nocite, citeonly So even if I have a parent document named…

r r-markdown pandoc reproducible-research

asked Dec 06 '16 at 17:13

Juliana Benitez


1 answer

Python sklearn RandomForestClassifier non-reproducible results

I've been using sklearn's random forest, and I've tried to compare several models. Then I noticed that random-forest is giving different results even with the same seed. I tried it both ways: random.seed(1234) as well as use random forest built-in…

python machine-learning random random-forest reproducible-research

asked Nov 22 '17 at 11:46

Ruslan


1 answer

Package for formatting numeric values in reproducible research

Is there a standard way of converting numeric values to character with a particular type of formatting applied? I'm thinking of something like: formatR(32390,"dollars") # returns "$32,390" formatR(1.25,"percent") # returns "125%" Obviously, not so…

r knitr sweave reproducible-research

asked May 13 '13 at 17:45

Ari B. Friedman


2 answers

Does setting the seed in tf.random.set_seed also set the seed used by the glorot_uniform kernel_initializer when using a conv2D layer in keras?

I'm currently training a convolutional neural network using a conv2D layer defined like this: conv1 = tf.keras.layers.Conv2D(filters=64, kernel_size=(3,3), padding='SAME', activation='relu')(inputs) My understanding is that the default…

python tensorflow keras random-seed reproducible-research

asked Apr 22 '20 at 12:10

code_to_joy


4 answers

If Keras results are not reproducible, what's the best practice for comparing models and choosing hyperparameters?

UPDATE: This question was for Tensorflow 1.x. I upgraded to 2.0 and (at least on the simple code below) the reproducibility issue seems fixed on 2.0. So that solves my problem; but I'm still curious about what "best practices" were used for this…

python tensorflow keras reproducible-research

asked Nov 27 '19 at 17:07

user2543623


1 answer

Problem reproducing results from parallelSVM in R

I am not able to set a seed value to get reproducible results from parallelSVM(). library(e1071) library(parallelSVM) data(iris) x <- subset(iris, select = -Species) y <- iris$Species set.seed(1) model <- parallelSVM(x,…

r machine-learning parallel-processing random-seed reproducible-research

asked Sep 24 '19 at 22:48

Mirko


3 answers

Tensorflow-Keras reproducibility problem on Google Colab

I have a simple code to run on Google Colab (I use CPU mode): import numpy as np import pandas as pd ## LOAD DATASET datatrain = pd.read_csv("gdrive/My Drive/iris_train.csv").values xtrain = datatrain[:,:-1] ytrain = datatrain[:,-1] datatest =…

tensorflow google-colaboratory reproducible-research

asked Aug 01 '19 at 09:29

malioboro
