Python module providing a bridge between Scikit-Learn’s Machine Learning methods and pandas-style DataFrames
Questions tagged [sklearn-pandas]
1336 questions
-1
votes
1 answer
How does test_size relate to 10-fold cross-validation in Python sklearn?
I am trying to implement an ML algorithm using 10-fold cross-validation, and I would like confirmation that my procedure is correct.
I am doing a binary classification and have about 50 samples of each class…
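
A minimal sketch of how the fold sizes work out, assuming synthetic data with 50 samples per class as the question describes. `test_size` is a parameter of `train_test_split`; `StratifiedKFold` has no such parameter, and with `n_splits=10` each fold holds out roughly one tenth of the data.

```python
import numpy as np
from sklearn.model_selection import StratifiedKFold

# Hypothetical data: 50 samples per class, 4 made-up features.
X = np.random.RandomState(0).rand(100, 4)
y = np.array([0] * 50 + [1] * 50)

cv = StratifiedKFold(n_splits=10, shuffle=True, random_state=0)

# Record (train size, test size) for every fold.
fold_sizes = [(len(train_idx), len(test_idx)) for train_idx, test_idx in cv.split(X, y)]
print(fold_sizes)
```

Every fold here trains on 90 samples and tests on 10, so the held-out fraction follows from `n_splits` rather than from a separate `test_size` value.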

Joe
-1
votes
1 answer
Modifying a DataFrame in Jupyter
I want to delete everything after the ':' in this data (the password) and keep just the emails.
[emails] https://i.stack.imgur.com/dX9RB.png
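
One possible approach, sketched with made-up rows because the real data is only shown in the screenshot. The column names `credentials` and `email` are assumptions, not taken from the question.

```python
import pandas as pd

# Hypothetical sample in "email:password" form.
df = pd.DataFrame(
    {"credentials": ["alice@example.com:hunter2", "bob@example.com:pa:ss"]}
)

# Split once on the first ':' and keep the part before it.
df["email"] = df["credentials"].str.split(":", n=1).str[0]
print(df["email"].tolist())
```

Splitting with `n=1` means a password that itself contains ':' does not affect the extracted email.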

anass
-1
votes
1 answer
Best way to search for 3 comparisons in a Bank Note dataset
So, I need to create a classifier with 3 simple comparisons to detect a fake bill, based on something like this pseudocode:
assume you are examining a bill
with features f_1 ,f_2 ,f_3 and f_4
your rule may look like this :
if ( f_1 > 4) and ( f_2 >…

Neonleon
-1
votes
1 answer
ValueError: Invalid parameter C for estimator LogisticRegressionCV
Can't seem to perform a gridsearch on a logistic regression using an l1 penalty.
reg = LogisticRegressionCV(cv=5,random_state=42, solver='liblinear',penalty='l1')
grid = {'C': [0.001, 0.01, 0.05, 0.1, 1, 10, 100]}
grid_search = GridSearchCV(reg,…
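
A hedged sketch of two ways around this error, using synthetic data from `make_classification` as a stand-in for the asker's dataset. `LogisticRegressionCV` does not expose a `C` parameter; it takes the candidate values through `Cs` and searches them itself, while plain `LogisticRegression` has `C` and can be wrapped in `GridSearchCV`.

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression, LogisticRegressionCV
from sklearn.model_selection import GridSearchCV

# Synthetic stand-in data (an assumption, not the asker's dataset).
X, y = make_classification(n_samples=200, n_features=10, random_state=42)
grid = {"C": [0.001, 0.01, 0.05, 0.1, 1, 10, 100]}

# Option 1: grid-search the plain estimator, which does have a C parameter.
reg = LogisticRegression(solver="liblinear", penalty="l1", random_state=42)
grid_search = GridSearchCV(reg, grid, cv=5)
grid_search.fit(X, y)
print(grid_search.best_params_)

# Option 2: let LogisticRegressionCV search the same values via Cs.
reg_cv = LogisticRegressionCV(
    Cs=grid["C"], cv=5, solver="liblinear", penalty="l1", random_state=42
)
reg_cv.fit(X, y)
print(reg_cv.C_)
```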

Nick
-1
votes
2 answers
How to normalize only one column using sklearn.preprocessing's StandardScaler
If I have a list, say
l = [[1, 2], [1, 3], [4, 5], [5, 10]]
how can I normalize only the second column (2, 3, 5, 10) using sklearn.preprocessing's StandardScaler?
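
A minimal sketch of one way to do this, assuming the list is first converted to a float NumPy array. `StandardScaler` expects 2-D input, so the column is selected as `X[:, [1]]` rather than `X[:, 1]`.

```python
import numpy as np
from sklearn.preprocessing import StandardScaler

l = [[1, 2], [1, 3], [4, 5], [5, 10]]
X = np.array(l, dtype=float)

# Scale only column index 1; the double brackets keep the slice 2-D.
X[:, [1]] = StandardScaler().fit_transform(X[:, [1]])
print(X)
```

The first column is left untouched, and the second ends up with zero mean and unit variance.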

alex ale
-1
votes
1 answer
How to plot a scatter plot to understand the general trend in data, when we have multiple features
Here,
Features are X_train
Target is y_train
When a dataset has n features, how do we select the one feature to plot against the target variable in a scatter plot to understand the general trend of the training data, to select a…

yuvraj singh
-1
votes
1 answer
How to output predicted values as a string in excel?
I was able to output my predicted numerical values to an Excel file, but I was wondering whether it is possible to export the string instead of the numerical value.
Currently it looks like this,
Column 1
Answer…
user13572241
-1
votes
1 answer
In machine learning (linear regression), I got this TypeError during the training/test process. Can someone help me with that?
from sklearn.preprocessing import LabelEncoder, OneHotEncoder
labelencoder = LabelEncoder()
X[:, 3] = labelencoder.fit_transform(X[:, 3])
onehotencoder = OneHotEncoder(categorical_features = [3])
X =…
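
The excerpt is cut off, so the exact TypeError is not visible. One likely cause is that the `categorical_features` argument was removed from `OneHotEncoder` in scikit-learn 0.22. A sketch of the usual replacement with `ColumnTransformer`, on a made-up array whose column 3 holds the category:

```python
import numpy as np
from sklearn.compose import ColumnTransformer
from sklearn.preprocessing import OneHotEncoder

# Hypothetical data: three numeric columns and one categorical column.
X = np.array(
    [
        [1.0, 2.0, 3.0, "NY"],
        [4.0, 5.0, 6.0, "CA"],
        [7.0, 8.0, 9.0, "NY"],
        [1.5, 2.5, 3.5, "FL"],
    ],
    dtype=object,
)

# One-hot encode column 3 and pass the other columns through unchanged.
ct = ColumnTransformer(
    [("onehot", OneHotEncoder(), [3])],
    remainder="passthrough",
    sparse_threshold=0,  # always return a dense array
)
X_encoded = ct.fit_transform(X)
print(X_encoded.shape)
```

The encoded columns come first in the output, followed by the passed-through ones. No separate `LabelEncoder` step is needed, since `OneHotEncoder` accepts string categories directly.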

Joseph Nderph
-1
votes
1 answer
sklearn cross-validation: The least populated class in y has only 1 members, which is less than n_splits=10
I'm working on a machine learning project and I'm stuck with this warning when I try to use cross-validation to find how many neighbours I need to achieve the best accuracy in KNN. Here's the warning:
The least populated class in y has only 1…
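
A small diagnostic sketch, assuming a made-up label vector in which one class has a single member. It lists the classes with fewer samples than `n_splits`, which are the ones stratified splitting cannot place in every fold.

```python
import numpy as np

# Hypothetical labels: classes 0 and 1 have 12 members each, class 2 has one.
y = np.array([0] * 12 + [1] * 12 + [2])
n_splits = 10

classes, counts = np.unique(y, return_counts=True)

# Classes that cannot appear in every one of the n_splits folds.
too_small = classes[counts < n_splits]
print(dict(zip(classes.tolist(), counts.tolist())), too_small.tolist())
```

Possible responses include lowering `n_splits`, collecting more samples for the rare classes, or dropping or merging them. Which one fits depends on the data.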

B1N4RY B1RD
-1
votes
2 answers
KNeighborsClassifier from sklearn.neighbors import error
Guys, I am getting an import error while trying to import KNeighborsClassifier from sklearn.neighbors.
It shows the following error:
ImportError: cannot import name 'kNeighborsClassifier' from 'sklearn.neighbors'…
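
Python names are case-sensitive, and the class is spelled with a capital K. A minimal sketch of the import, with an arbitrary `n_neighbors` value chosen for illustration:

```python
from sklearn.neighbors import KNeighborsClassifier

# n_neighbors=3 is an arbitrary example value.
knn = KNeighborsClassifier(n_neighbors=3)
print(type(knn).__name__)
```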

Abdulrahman Isah
-1
votes
2 answers
ModuleNotFoundError: No module named 'sklearn.ensemble.voting_classifier'
I am trying to load a pickle file using the code below.
# Load model from file
classiferM1 = joblib.load("Model 1_ensemble.pkl")
Now I am facing this error:
ModuleNotFoundError: No module named 'sklearn.ensemble.voting_classifier'
I understand that sklearn…
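
In scikit-learn 0.22 the private module `sklearn.ensemble.voting_classifier` was renamed to `sklearn.ensemble._voting`, so a pickle that records the old path cannot resolve it. The sketch below is an unofficial workaround that aliases the old module name before loading. Pickles are not guaranteed to load across scikit-learn versions, so retraining the model, or installing the version it was saved with, is the safer route.

```python
import sys

import sklearn.ensemble._voting as _voting

# Unofficial workaround: point the old module path at the renamed module
# so that unpickling can find VotingClassifier.
sys.modules["sklearn.ensemble.voting_classifier"] = _voting

# The load call from the question would then follow:
# classiferM1 = joblib.load("Model 1_ensemble.pkl")
```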

Surendra_Suri
-1
votes
1 answer
How to know which features contribute significantly in prediction models?
I am a novice in DS/ML. I am trying to solve the Titanic case study on Kaggle; however, my approach has not been systematic so far. I have used correlation to find relationships between variables and have used KNN and Random Forest classification, however…
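
One common starting point, sketched on synthetic data rather than the Titanic set: a fitted random forest exposes `feature_importances_`, which can be ranked. The feature names `f_0` to `f_5` are placeholders.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier

# Synthetic stand-in data with 6 features, 3 of them informative.
X, y = make_classification(
    n_samples=300, n_features=6, n_informative=3, random_state=0
)
feature_names = [f"f_{i}" for i in range(X.shape[1])]

forest = RandomForestClassifier(n_estimators=100, random_state=0).fit(X, y)

# Rank features from most to least important.
ranked = sorted(
    zip(feature_names, forest.feature_importances_),
    key=lambda pair: pair[1],
    reverse=True,
)
for name, score in ranked:
    print(f"{name}: {score:.3f}")
```

Impurity-based importances like these can favour high-cardinality features, so they are best treated as a rough guide rather than a definitive ranking.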

hash_atifshk
-1
votes
1 answer
Python NameError: name 'ridge_regression_sklearn' is not defined
I am working on k-fold cross-validation using ridge regression. I want to compute y_pred using ridge_regression_sklearn and got the error message that 'ridge_regression_sklearn' is not defined.
Can someone please help me fix it? I didn't find…
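
A NameError like this means the name was never defined or imported in the current session. The asker's intended function is not shown, so the helper below is a hypothetical stand-in with an assumed signature, built on scikit-learn's `Ridge` and run on synthetic data.

```python
import numpy as np
from sklearn.linear_model import Ridge
from sklearn.model_selection import KFold


def ridge_regression_sklearn(X_train, y_train, X_test, alpha=1.0):
    """Hypothetical helper: fit Ridge on the training fold, predict the test fold."""
    model = Ridge(alpha=alpha).fit(X_train, y_train)
    return model.predict(X_test)


# Synthetic regression data (an assumption for illustration).
rng = np.random.RandomState(0)
X = rng.rand(50, 3)
y = X @ np.array([1.0, -2.0, 0.5]) + 0.01 * rng.randn(50)

# Collect out-of-fold predictions across 5 folds.
y_pred = np.empty_like(y)
for train_idx, test_idx in KFold(n_splits=5, shuffle=True, random_state=0).split(X):
    y_pred[test_idx] = ridge_regression_sklearn(X[train_idx], y[train_idx], X[test_idx])
```

The function definition has to run before the loop that calls it, for example in an earlier notebook cell.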

Atif Saeed
-1
votes
1 answer
A function to insert data in dataset using python
I created a program that predicts digits from a dataset. I want two cases when it predicts: if it predicts correctly, the data should be added automatically to the dataset; otherwise it takes the right answer from the user and inserts it into…

Ajay Kumar Joshi
-1
votes
1 answer
How to convert images as input to a ML classifier?
I want to build an image classifier. I gathered images from the web and resized them using the PIL library.
Now I want to convert those images into input. What operations do I need to perform on these
images? I also converted the images into numpy arrays…
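
A minimal sketch of the usual next step for classical scikit-learn classifiers, assuming the images are already same-sized NumPy arrays. Random 32x32 RGB arrays stand in for the real images here.

```python
import numpy as np

# Hypothetical stand-ins for 5 resized RGB images of shape (32, 32, 3).
rng = np.random.RandomState(0)
images = [rng.randint(0, 256, size=(32, 32, 3), dtype=np.uint8) for _ in range(5)]

# Stack into one array, flatten each image to a row, and scale to [0, 1].
X = np.stack(images).reshape(len(images), -1).astype(np.float32) / 255.0
print(X.shape)
```

Each row of `X` is then one sample, and a matching label array `y` with one entry per image can be passed to an estimator's `fit` method.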

sriram anush