Questions tagged [k-fold]

A cross-validation technique in which the data is partitioned into k subsets (or "folds"). In each iteration, k-1 folds are used for training and the remaining fold for evaluation; the process is repeated k times, leaving out a different fold for evaluation each time.
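
A minimal sketch of the procedure, using scikit-learn's KFold with illustrative data and a placeholder model:

    import numpy as np
    from sklearn.model_selection import KFold
    from sklearn.linear_model import LogisticRegression
    from sklearn.metrics import accuracy_score

    X = np.random.rand(100, 5)          # illustrative feature matrix
    y = np.random.randint(0, 2, 100)    # illustrative binary target

    kf = KFold(n_splits=5, shuffle=True, random_state=0)
    scores = []
    for train_idx, test_idx in kf.split(X):
        # train on k-1 folds, evaluate on the held-out fold
        model = LogisticRegression().fit(X[train_idx], y[train_idx])
        scores.append(accuracy_score(y[test_idx], model.predict(X[test_idx])))

    print(np.mean(scores))  # average score over the k folds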

284 questions
0 votes, 0 answers

What's the correct way to format X and Y from a binary dataframe to use with Stratified K-Fold cross-validation

My data is a dataframe of 25 columns and 2737 rows containing binary data. The goal is to train using each row as an INPUT and get as an OUTPUT a probabilistic prediction of what the next sequence could be. Data in this scenario is always…
Wisdom • 121 • 1 • 1 • 13
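
A minimal sketch of one way to lay X and Y out for StratifiedKFold, assuming (purely for illustration, since the excerpt does not say) that the last column of the dataframe is the target:

    import numpy as np
    import pandas as pd
    from sklearn.model_selection import StratifiedKFold

    # Illustrative binary dataframe: 25 columns, last one treated as the target here.
    df = pd.DataFrame(np.random.randint(0, 2, size=(2737, 25)))

    X = df.iloc[:, :-1].to_numpy()   # features: all but the last column
    y = df.iloc[:, -1].to_numpy()    # target: must be a 1-D array for stratification

    skf = StratifiedKFold(n_splits=5, shuffle=True, random_state=0)
    for train_idx, test_idx in skf.split(X, y):
        X_train, X_test = X[train_idx], X[test_idx]
        y_train, y_test = y[train_idx], y[test_idx]
        # fit a probabilistic classifier on (X_train, y_train) here
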
0 votes, 0 answers

Getting an inaccurate number of rows when using the predict function in a cross-validation exercise

I'm performing a K-fold exercise with K = 10 for polynomials of degree 1 to 5, with the purpose of identifying which polynomial best fits the provided data. Nevertheless, when I try to predict Y-hat using the testing data (X-test), which…
Lucpi • 1
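
A minimal sketch of comparing polynomial degrees with K = 10 folds; the data, the use of PolynomialFeatures, and the mean-squared-error metric are illustrative assumptions, not the asker's code:

    import numpy as np
    from sklearn.model_selection import KFold
    from sklearn.preprocessing import PolynomialFeatures
    from sklearn.linear_model import LinearRegression
    from sklearn.pipeline import make_pipeline
    from sklearn.metrics import mean_squared_error

    x = np.random.rand(200, 1)                                    # illustrative predictor
    y = np.sin(2 * np.pi * x).ravel() + 0.1 * np.random.randn(200)

    kf = KFold(n_splits=10, shuffle=True, random_state=0)
    for degree in range(1, 6):
        errors = []
        for train_idx, test_idx in kf.split(x):
            model = make_pipeline(PolynomialFeatures(degree), LinearRegression())
            model.fit(x[train_idx], y[train_idx])
            y_hat = model.predict(x[test_idx])    # one prediction per test row, so shapes match
            errors.append(mean_squared_error(y[test_idx], y_hat))
        print(degree, np.mean(errors))
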
0 votes, 0 answers

I am getting the following error: "You should leave random_state to its default (None), or set shuffle=True."

I'm trying to test several model families, using different algorithms to see if any perform well, and I want to compare AUC and its standard deviation using K-Fold cross-validation. X = pd.concat([X_train, X_test]) y = pd.concat([y_train, y_test]) from…
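
For context, recent scikit-learn versions raise this error when random_state is set on a splitter whose shuffle is left at its default of False; a minimal sketch of the two valid configurations:

    from sklearn.model_selection import StratifiedKFold

    # Raises the quoted error in recent scikit-learn versions, because
    # shuffle defaults to False and random_state then has no effect:
    # bad_cv = StratifiedKFold(n_splits=5, random_state=42)

    # Either shuffle with a fixed seed...
    cv = StratifiedKFold(n_splits=5, shuffle=True, random_state=42)

    # ...or keep the default, unshuffled, deterministic splits.
    cv_plain = StratifiedKFold(n_splits=5)
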
0 votes, 1 answer

How to do K-fold cross-validation without using Python libraries?

I am trying to do cross-validation; however, I am only allowed to use the libraries below (as the professor demanded): import numpy as np from sklearn import svm from sklearn.datasets import load_iris Therefore, I am not able to use KFold for…
Shan • 3 • 5
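
A minimal sketch of a hand-rolled K-fold loop using only NumPy and the allowed scikit-learn imports (a sketch, not the accepted answer):

    import numpy as np
    from sklearn import svm
    from sklearn.datasets import load_iris

    X, y = load_iris(return_X_y=True)
    k = 5

    rng = np.random.default_rng(0)
    indices = rng.permutation(len(X))          # shuffle once, then cut into k folds
    folds = np.array_split(indices, k)

    scores = []
    for i in range(k):
        test_idx = folds[i]
        train_idx = np.concatenate(folds[:i] + folds[i + 1:])   # all folds except the i-th
        clf = svm.SVC().fit(X[train_idx], y[train_idx])
        scores.append(clf.score(X[test_idx], y[test_idx]))

    print(np.mean(scores))
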
0 votes, 1 answer

k-fold cross validation in quanteda

I've been using the quanteda SML workflow as described in the quanteda tutorial (https://tutorials.quanteda.io/machine-learning/nb/) and found it extremely helpful to set up my own classification task. However, instead of the fixed held-out…
0 votes, 1 answer

K-Folds cross-validator shows KeyError: None of Int64Index

I'm trying to use the K-Folds cross-validator with a decision tree. I use a for loop to train and test data from KFold like this code. df = pd.read_csv(r'C:\\Users\data.csv') # split data into X and y X = df.iloc[:,:200] Y = df.iloc[:,200] X_train, X_test,…
user572575 • 1,009 • 3 • 25 • 45
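
A common cause of that KeyError is indexing a DataFrame with the positional indices KFold yields; a minimal sketch with illustrative data in place of the asker's CSV:

    import numpy as np
    import pandas as pd
    from sklearn.model_selection import KFold
    from sklearn.tree import DecisionTreeClassifier

    # Illustrative stand-in for the asker's data: 200 feature columns plus a label column.
    df = pd.DataFrame(np.random.randint(0, 2, size=(100, 201)))
    X = df.iloc[:, :200]
    Y = df.iloc[:, 200]

    kf = KFold(n_splits=5, shuffle=True, random_state=0)
    for train_idx, test_idx in kf.split(X):
        # KFold yields positional indices, so pandas objects must be indexed with .iloc;
        # X[train_idx] would raise the "None of [Int64Index(...)] are in the [columns]" KeyError.
        X_train, X_test = X.iloc[train_idx], X.iloc[test_idx]
        Y_train, Y_test = Y.iloc[train_idx], Y.iloc[test_idx]
        tree = DecisionTreeClassifier().fit(X_train, Y_train)
        print(tree.score(X_test, Y_test))
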
0 votes, 0 answers

How can I measure the probability of error of a trained model, in particular a random forest?

For the binary classification of a set of images, I trained a random forest on a set of data. I now want to evaluate the error probability of my model. For that, I did two things, and I don't know which corresponds to this error probability: I…
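
One common way to estimate such an error probability is the cross-validated misclassification rate; a minimal sketch with illustrative data (this is one interpretation, not necessarily what the asker tried):

    import numpy as np
    from sklearn.ensemble import RandomForestClassifier
    from sklearn.model_selection import cross_val_score

    X = np.random.rand(300, 20)           # illustrative image features
    y = np.random.randint(0, 2, 300)      # illustrative binary labels

    acc = cross_val_score(RandomForestClassifier(random_state=0), X, y,
                          cv=10, scoring="accuracy")
    error_estimate = 1.0 - acc.mean()     # cross-validated misclassification rate
    print(error_estimate)
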
0 votes, 1 answer

Building a neural network using k-fold cross-validation

I am new to deep learning and am trying to implement a neural network using 4-fold cross-validation for training, testing, and validating. The topic is to classify vehicles using an existing dataset. The accuracy result is 0.7. Training Accuracy An…
zed • 11 • 3
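
A minimal sketch of 4-fold cross-validation around a small neural network; scikit-learn's MLPClassifier and the random data are stand-ins, since the asker's framework and dataset are not shown:

    import numpy as np
    from sklearn.model_selection import StratifiedKFold
    from sklearn.neural_network import MLPClassifier
    from sklearn.preprocessing import StandardScaler
    from sklearn.pipeline import make_pipeline

    X = np.random.rand(400, 30)            # illustrative vehicle features
    y = np.random.randint(0, 4, 400)       # illustrative vehicle classes

    skf = StratifiedKFold(n_splits=4, shuffle=True, random_state=0)
    for fold, (train_idx, test_idx) in enumerate(skf.split(X, y), start=1):
        # Build a fresh network per fold so no weights leak between folds.
        net = make_pipeline(StandardScaler(),
                            MLPClassifier(hidden_layer_sizes=(64,), max_iter=500, random_state=0))
        net.fit(X[train_idx], y[train_idx])
        print(f"fold {fold}: accuracy = {net.score(X[test_idx], y[test_idx]):.3f}")
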
0 votes, 0 answers

10-fold cross-validation for a logistic regression in Google Colab (Python)

y3_data is the death variable (0 for alive, 1 for dead); x3_data are my categorical variables, which all have binary values, for example Diabetes (0 for yes, 1 for no), and so on. I have around 6 variables in x3_data that have a significant P value with…
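
A minimal sketch of 10-fold cross-validation of a logistic regression; the x3_data/y3_data stand-ins below are illustrative, not the asker's data:

    import numpy as np
    import pandas as pd
    from sklearn.linear_model import LogisticRegression
    from sklearn.model_selection import StratifiedKFold, cross_val_score

    # Illustrative stand-ins: 6 binary predictors and a 0/1 death variable.
    x3_data = pd.DataFrame(np.random.randint(0, 2, size=(500, 6)),
                           columns=[f"var{i}" for i in range(6)])
    y3_data = np.random.randint(0, 2, 500)

    cv = StratifiedKFold(n_splits=10, shuffle=True, random_state=42)
    scores = cross_val_score(LogisticRegression(max_iter=1000), x3_data, y3_data,
                             cv=cv, scoring="accuracy")
    print(scores.mean(), scores.std())
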
0 votes, 0 answers

Should the same cross-validation method be used across multiple models?

The assignment is to write a simple ML program that trains and predicts on a dataset of our choice. I want to determine the best model for my data. The response is a class (0/1). I wrote code to try different cross-validation methods (validation…
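
A minimal sketch of the usual practice of reusing one fixed splitter for every candidate model, so all candidates are scored on identical folds (model choices and data are illustrative):

    import numpy as np
    from sklearn.model_selection import StratifiedKFold, cross_val_score
    from sklearn.linear_model import LogisticRegression
    from sklearn.tree import DecisionTreeClassifier
    from sklearn.ensemble import RandomForestClassifier

    X = np.random.rand(300, 10)           # illustrative features
    y = np.random.randint(0, 2, 300)      # binary response (0/1)

    # One fixed, seeded splitter keeps the comparison fair across models.
    cv = StratifiedKFold(n_splits=5, shuffle=True, random_state=0)
    models = {
        "logreg": LogisticRegression(max_iter=1000),
        "tree": DecisionTreeClassifier(random_state=0),
        "forest": RandomForestClassifier(random_state=0),
    }
    for name, model in models.items():
        scores = cross_val_score(model, X, y, cv=cv, scoring="roc_auc")
        print(name, scores.mean())
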
0 votes, 0 answers

Can we apply cross-validation twice on the same dataset?

First, we split the dataset using the stratify parameter: train_test_split(np.array(X), y, train_size=TRAIN_SIZE, stratify=y, random_state=42), and then apply KFold cross-validation: kfold = KFold(n_splits=num_folds, shuffle=True) fold_no = 1 for train,…
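
A minimal sketch of the pattern described: a stratified hold-out split first, then KFold only on the training portion (data and model are illustrative):

    import numpy as np
    from sklearn.model_selection import train_test_split, KFold
    from sklearn.linear_model import LogisticRegression

    X = np.random.rand(500, 8)
    y = np.random.randint(0, 2, 500)

    # First split off a stratified hold-out test set...
    X_train, X_test, y_train, y_test = train_test_split(
        X, y, train_size=0.8, stratify=y, random_state=42)

    # ...then cross-validate only on the training portion; the hold-out set
    # stays untouched until a final evaluation.
    kfold = KFold(n_splits=5, shuffle=True, random_state=42)
    for fold_no, (train_idx, val_idx) in enumerate(kfold.split(X_train), start=1):
        model = LogisticRegression().fit(X_train[train_idx], y_train[train_idx])
        print(f"fold {fold_no}: {model.score(X_train[val_idx], y_train[val_idx]):.3f}")
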
0 votes, 0 answers

Imbalanced categorical predictors: cross-validation with a continuous target

I am working on a project where I want to measure the predictive performance of some categorical variables on click-through rate (continuous). However, the categorical variables are highly imbalanced: packaged_goods: 796 food: 104 person:…
0 votes, 0 answers

How to see the indices of the splits that GridSearchCV used on the data?

When using GridSearchCV() to perform a k-fold cross-validation analysis on some data, is there a way to know which data was used for each split? For example, assume the goal is to build a binary classifier of your choosing, named 'model'. There are…
jensenn • 1 • 1
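
A minimal sketch of one way to recover the indices: pass an explicit, seeded splitter to GridSearchCV and iterate the same splitter afterwards, which regenerates the same folds because the seed is fixed (estimator and data are illustrative):

    import numpy as np
    from sklearn.model_selection import GridSearchCV, StratifiedKFold
    from sklearn.svm import SVC

    X = np.random.rand(200, 5)
    y = np.random.randint(0, 2, 200)

    # Pass an explicit splitter instead of cv=5 so the folds are reproducible.
    cv = StratifiedKFold(n_splits=5, shuffle=True, random_state=0)
    search = GridSearchCV(SVC(), {"C": [0.1, 1, 10]}, cv=cv)
    search.fit(X, y)

    # Re-running the seeded splitter yields the indices GridSearchCV used per fold.
    for fold, (train_idx, test_idx) in enumerate(cv.split(X, y)):
        print(fold, train_idx[:5], test_idx[:5])
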
0 votes, 1 answer

How to split the dataset into multiple folds while keeping the ratio of an attribute fixed

Let's say that I have a dataset with multiple input features and one single output. For the sake of simplicity, let's say the output is binary. Either zero or one. I want to split this dataset into k parts and use a k-fold cross-validation model to…
Mehran • 15,593 • 27 • 122 • 221
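
A minimal sketch showing that StratifiedKFold keeps the ratio of a binary output roughly constant across folds (data is illustrative):

    import numpy as np
    from sklearn.model_selection import StratifiedKFold

    X = np.random.rand(1000, 4)
    y = (np.random.rand(1000) < 0.2).astype(int)   # roughly 20% ones overall

    skf = StratifiedKFold(n_splits=5, shuffle=True, random_state=0)
    for fold, (train_idx, test_idx) in enumerate(skf.split(X, y), start=1):
        # Stratification preserves the class ratio in both partitions of every fold.
        print(f"fold {fold}: train ratio={y[train_idx].mean():.3f}, "
              f"test ratio={y[test_idx].mean():.3f}")
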
0 votes, 0 answers

Creating a random forest function

I am trying to create a function that takes a 2-D numpy array (i.e. the data) and data_indices (a list of (train_indices, test_indices) tuples) as input. For each (train_indices, test_indices) tuple in data_indices, the function should: Train a new…
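
A minimal sketch of such a function, assuming (only for illustration, since the excerpt does not say) that the last column of the array holds the label:

    import numpy as np
    from sklearn.ensemble import RandomForestClassifier
    from sklearn.model_selection import KFold

    def evaluate_forest(data, data_indices):
        """Train a fresh random forest per (train_indices, test_indices) tuple
        and return the test accuracy of each fold.

        Assumes the last column of `data` holds the label (an illustrative choice)."""
        X, y = data[:, :-1], data[:, -1]
        scores = []
        for train_indices, test_indices in data_indices:
            forest = RandomForestClassifier(random_state=0)
            forest.fit(X[train_indices], y[train_indices])
            scores.append(forest.score(X[test_indices], y[test_indices]))
        return scores

    # Example usage with KFold-generated index tuples on illustrative data.
    data = np.column_stack([np.random.rand(150, 4), np.random.randint(0, 2, 150)])
    indices = list(KFold(n_splits=5, shuffle=True, random_state=0).split(data))
    print(evaluate_forest(data, indices))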