Questions tagged [reproducible-research]

Reproducible research is the idea that the results of scientific research should be published together with the data and code behind them, so that other researchers can verify the results.

Reproducible research may be especially important to you if your investigation involves large amounts of data or very complex calculations.

One possible set of tools for reproducible research is using with or .
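
As a minimal sketch of what "verifiable by others" can look like in practice (not a tool the tag description names): the toy analysis below uses a fixed seed and publishes a fingerprint of its result, so a rerun of the same code can be compared against it. The function names `run_analysis` and `fingerprint` are hypothetical, and the approach assumes the rerun uses the same Python version and platform.

```python
import hashlib
import json
import random


def run_analysis(seed: int, n: int = 1000) -> dict:
    """Toy 'analysis' (hypothetical): draw n pseudo-random values and summarise them."""
    rng = random.Random(seed)  # local generator, independent of global random state
    values = [rng.gauss(0.0, 1.0) for _ in range(n)]
    mean = sum(values) / n
    return {"seed": seed, "n": n, "mean": mean}


def fingerprint(result: dict) -> str:
    """Hash a canonical JSON form of the result so two runs can be compared."""
    payload = json.dumps(result, sort_keys=True).encode("utf-8")
    return hashlib.sha256(payload).hexdigest()


first = run_analysis(seed=42)
second = run_analysis(seed=42)

# Identical code, data, and seed should give an identical fingerprint.
print(fingerprint(first) == fingerprint(second))
```

Publishing the seed and the fingerprint alongside the code gives a reader a quick check that a rerun matched; it does not replace pinning software versions.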

227 questions
1
vote
1 answer

Different results on different machines tensorflow

I'm running TensorFlow 1.12 code (I'm not using a GPU). I have set import os os.environ['TF_DETERMINISTIC_OPS'] = '1' os.environ['TF_CUDNN_DETERMINISTIC'] = '1' os.environ['PYTHONHASHSEED'] = '42' import tensorflow as…
1
vote
1 answer

Cross table in gtsummary with computed data set/weighted count

I would like some more help. I need a cross table in the most publishable format possible (for a scientific paper). For this purpose I have been using gtsummary. The data frame I have is the result of previous counts done by other descriptive routines. I…
Cristiano
  • 233
  • 1
  • 9
1
vote
1 answer

How do I find the public URL of the RStudio connect server?

I am trying to publish an RMarkdown document that I have edited in RStudio, but I am quite confused about how to go about publishing it. I am asked to enter the public URL of the RStudio Connect server. I searched…
Birasafab
  • 152
  • 11
1
vote
1 answer

Issue when re-implementing matrix factorization in PyTorch

I am trying to implement matrix factorization in PyTorch, including the data extractor and the model. The original model is written in MXNet. Here I try to use the same idea in PyTorch. Here is my code; it can be run directly in codelab import torch import…
jason
  • 1,998
  • 3
  • 22
  • 42
1
vote
1 answer

Set seed does not work on my Windows machine in code I copied from different examples using R

I searched and found the example below, from an answer here, as a way of setting the seed for a loop ## Load packages and prepare multicore…
Daniel James
  • 1,381
  • 1
  • 10
  • 28
1
vote
1 answer

How do I use set.seed for a simulation in R to attain reproducibility on Windows?

I have a simulation done with the below function in R: ## Load packages and prepare multicore process library(forecast) library(future.apply) plan(multisession) library(parallel) library(foreach) library(doParallel) n_cores <- detectCores() cl <-…
Daniel James
  • 1,381
  • 1
  • 10
  • 28
1
vote
2 answers

How to trace-back exact software version(s) used to generate result-files in a snakemake workflow

Say I'm following the best-practice workflow suggested for snakemake. Now I'd like to know how (i.e. with which version) a given file, say plots/myplot.pdf, was generated. I found this surprisingly hard, if not impossible, only having the result folder at…
1
vote
0 answers

word2vec() in R produces different embeddings across repeated runs even with the same seed set via set.seed()

word2vec() in R produces different embeddings across repeated runs even with the same seed set via set.seed(). I am using R 3.6.1 and tried RNGversion("3.6.1") before set.seed(), but the results still differ. Need help. Code snippet: word2vec_model <-…
1
vote
0 answers

Is a computation using R-Portable less reproducible than one using a container?

I am investigating ways for my group to improve the reproducibility of our analyses. The aim is that reviewers, or we ourselves in 10 years, are able to recompute our results. My first choice would be containers using Singularity which are basically a SquashFS…
akraf
  • 2,965
  • 20
  • 44
1
vote
1 answer

Generate Reproducible Results in for loop using R

I created two data frames (3 columns with 6000 observations each) with randomly generated variables using a for loop in R. The results must be reproducible afterwards. I tried to implement the set.seed command but failed so far. Any idea how to…
Agge
  • 15
  • 3
1
vote
1 answer

"NameError: name 'session_conf' is not defined" when try to get reproducible results in google colab

I try to get reproducible results in Google Colab by using the following code. But I get the error "NameError: name 'session_conf' is not defined". import numpy as np import tensorflow as tf import random as rn import…
1
vote
1 answer

Failure to reproduce scikit-learn and numpy dependent code when multiprocessing is used

The code below is completely reproducible when n_jobs=1 in the cross_validate function, but not when n_jobs=-1 or 2. import numpy as np from sklearn.tree import DecisionTreeClassifier from sklearn.datasets import load_iris from…
1
vote
2 answers

Logging input and output in Spyder's console

I learnt data manipulation and analysis through Stata, where I used the log command to record all the commands written and the output generated. Doing so, I could reproduce my findings, check previous results and share them with others in pdf or txt. What…
Filippo Sebastio
  • 1,112
  • 1
  • 12
  • 23
1
vote
2 answers

How do I reproduce a data frame in R for a reprex?

I sometimes have to copy data from Excel into R. The workflow goes something like this: # Step 1: Highlight Excel spreadsheet to be copied into R # Step 2: Run this command to get the data into R excelss <- read.delim("clipboard") # for Windows If…
stackinator
  • 5,429
  • 8
  • 43
  • 84
1
vote
0 answers

How can I log the IDs of the training samples with Keras model.fit?

I currently have reproducibility issues, although I set the seeds. I know that the model is initialized the same way (checked via inspection of model.save("initial.h5") with h5dump and meld). The next thing for me to check is if the training samples…
Martin Thoma
  • 124,992
  • 159
  • 614
  • 958