Questions tagged [reproducible-research]

Reproducible research is the idea that the results of scientific research should be published together with the data and code behind them, so that other researchers can verify those results.

Reproducible research may be especially important to you if your investigation involves large amounts of data or very complex calculations.
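As a rough illustration of what "verifiable" can mean in practice, here is a minimal Python sketch. It is not taken from any of the questions below: the `run_analysis` and `fingerprint` helpers and the toy mean-of-random-draws "analysis" are hypothetical stand-ins for a real pipeline. The idea is to fix the seed, record the interpreter version, and publish a hash of the result so that someone else can check their rerun against it.

```python
import hashlib
import json
import platform
import random


def run_analysis(seed: int, n: int = 1000) -> dict:
    """Toy analysis (hypothetical): mean of n draws from a seeded generator."""
    rng = random.Random(seed)  # local generator, so global random state is untouched
    draws = [rng.random() for _ in range(n)]
    return {"seed": seed, "n": n, "mean": sum(draws) / n}


def fingerprint(result: dict) -> str:
    """Hash the result so two runs can be compared with a single string."""
    payload = json.dumps(result, sort_keys=True).encode("utf-8")
    return hashlib.sha256(payload).hexdigest()


first = run_analysis(seed=150)
second = run_analysis(seed=150)

# What one might publish alongside the code and data.
record = {
    "python": platform.python_version(),
    "result": first,
    "sha256": fingerprint(first),
}

print(fingerprint(first) == fingerprint(second))
```

Matching hashes only show that the two runs agree on this machine. Across different library versions, hardware, or parallel schedulers, results can still differ, which is what several of the questions below run into.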

One possible set of tools for reproducible research is using with or .

227 questions
0
votes
0 answers

Role that learning_rate plays in the reproducibility of the model in PyTorch models

I have a Bayesian neural network which is implemented in PyTorch and is trained via an ELBO loss. I have faced some reproducibility issues even when I use the same seed and set the following code: # python seed =…
0
votes
1 answer

How to systematically replicate the results of running n LASSOs on n data sets in R using enet() with lars()

My code used to fit k LASSO Regressions on k csv file-formatted data sets via the enet() function from the following: set.seed(150) system.time(LASSO <- lapply(datasets, function(J) elasticnet::enet(x = as.matrix(dplyr::select(J, …
0
votes
0 answers

R Permission denied, running as administrator

I'm trying to run multiple R files from a master file (as in other statistical software like Stata or Python) and, even though I'm already running RStudio as an administrator, I keep running into this warning: Error in file(filename, "r", encoding =…
0
votes
0 answers

Unable to get reproducibility when resuming from checkpoint in PyTorch

I'm attempting to get 100% reproducibility when resuming from a checkpoint for a reinforcement learning agent I'm training in PyTorch. What I currently find is that if I train the agent from scratch twice in a row, at 10000 timesteps the training…
0
votes
0 answers

Error: Files must have consistent column names: * File 1 column 5 is: 1 * File 2 column 5 is: 0

Note: all of the datasets and scripts referenced here can be easily found in my GitHub Repository for this research project. The code referenced and reprinted both in this question and the one linked to in the previous question should be in the…
0
votes
0 answers

How to increase the efficiency of a for loop used to run Stepwise Regressions iteratively

All of the code in this question can be found in my GitHub Repository for this research project on Estimated Exhaustive Regression. Specifically, in the "Both BE & FS script" and "LASSO code" Rscripts, and you may use the significantly truncated…
0
votes
1 answer

Statistic Shopware 6 versions

Is there a chart or statistic showing which versions of Shopware 6 are actually used by online shops? Background: when developing custom plugins, it's hard to cover all versions from 6.1 to the current one (6.4).
0
votes
0 answers

Setting the seed for random forest with different number of mtry and trees

Good morning, I am familiarizing myself with both machine learning and the caret package to run some algorithms (random forest and support vector machine). I would like to run the random forest in parallel using the caret package and set the seeds for…
0
votes
0 answers

Cannot reproduce GridSearchCV results in distributed pySpark

I have read several threads on this and made every possible change suggested, but I am still having trouble reproducing results across runs. I have made the following changes to my…
0
votes
1 answer

How to quantify the # of correctly selected models by a variable selection algorithm (BE Stepwise) in R

I have run a Backward Elimination Stepwise Regression on 58,000 different randomly generated synthetic datasets sequentially, separated out and reformatted the output in the manner I need it, namely, just the name of each csv formatted dataset and…
0
votes
0 answers

How to return a single global optimum regression model in R when running an Exhaustive Regression via the regsubsets function where id = 3:15

I am comparing the results of a novel procedure which is a new proposed optimal model selection technique in machine/statistical learning which is a modified version of Exhaustive Regression aka All Subsets Regression, double aka Best Subset…
0
votes
0 answers

How to get my (Forward Selection) Stepwise Regression in R to return more than just the intercept?

I am comparing the properties of a new automated optimal factor/variable or more correctly, an optimal model selection technique to two or three standard benchmarks. For those Benchmark methods, we have decided to go with LASSO as the 1st and…
0
votes
1 answer

How can I concatenate the writing of 2 lists (same length) in R to a single csv file such that it has the same # of elements with the elements merged?

I have a file folder with 47,000 csv files containing datasets, but for this question, I'll use my practice code & the results from a different folder I created by copying & pasting just the top 15 csv file formatted datasets into a new folder. I am…
0
votes
1 answer

Is there a way to store factors selected by a (BE) Stepwise Regression run on N datasets via lapply(full_model, FUN(i) {step(i[["Coeffs"]])})?

I have already written the following code, all of which works OK: directory_path <- "~/DAEN_698/sample_obs" file_list <- list.files(path = directory_path, full.names = TRUE, recursive = TRUE) head(file_list, n = 2) > head(file_list, n = 2) [1]…
0
votes
0 answers

Is there a way to fix the fitted coefficients from running the same LASSO on i datasets using lapply(csvs, FUN(i) { enet(i) })?

I am comparing a new statistical learning algorithm, which tries to find the optimal factors & overall model among all candidates, for a paper, and I need to compare it to the two main benchmark methods currently used, namely, LASSO & Stepwise Regression.…