Python module providing a bridge between Scikit-Learn’s Machine Learning methods and pandas-style DataFrames
Questions tagged [sklearn-pandas]
1336 questions
0
votes
1 answer
SVM from scratch: generating the model after training
How can I generate my model after training? I didn't use the sklearn package for fit and predict. My code looks like this:
class SVM(object):
    def __init__(self, kernel=polynomial_kernel, C=None):
        self.kernel = kernel
        self.C = C
        if self.C…

Christel Junco
- 123
- 1
- 2
- 15
0
votes
1 answer
In which order are PCA components printed? I need the parameters to solve the PCA formula. How do I know which values are the betas?
I'm using sklearn's PCA. I need to solve:
pca1 = beta1 * c1 + beta2 * c2 + beta3 * c3 + beta4 * c4 + beta5 * c5
I read in the documentation that the components are sorted by explained_variance_. How do I know which values are the betas?
d = {'c1':…

Thaise
- 1,043
- 3
- 16
- 28
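A minimal sketch of how the betas can be read off a fitted PCA, assuming random stand-in data for the truncated dict `d` (the frame `df` and the names `betas` and `pca1_manual` are hypothetical). In scikit-learn, row i of `components_` holds the weights of component i, in the column order of the input, and the data are centered with `mean_` before the weights are applied.

```python
import numpy as np
import pandas as pd
from sklearn.decomposition import PCA

# Hypothetical data standing in for the truncated dict `d` in the question.
rng = np.random.default_rng(0)
df = pd.DataFrame(rng.normal(size=(50, 5)), columns=["c1", "c2", "c3", "c4", "c5"])

pca = PCA(n_components=2)
scores = pca.fit_transform(df)

# Row i of components_ holds the weights (the "betas") of component i,
# in the same column order as the input frame.
betas = pd.DataFrame(pca.components_, columns=df.columns, index=["pca1", "pca2"])

# PCA centers the data first, so the formula applies to (c_j - mean_j).
centered = df.to_numpy() - pca.mean_
pca1_manual = centered @ pca.components_[0]
```

Here `pca1_manual` should match the first column of `scores`, and `betas.loc["pca1"]` lists beta1 through beta5 by column name.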
0
votes
1 answer
Error when using `DataFrameMapper` class from pickle
I am trying to save a DataFrameMapper object to use on new data for a model.
mapper = DataFrameMapper([
    (['price', 'Argentina', 'Canada', 'Australia', 'barcat_numeric'], None),
    ('TTL', CountVectorizer(ngram_range=(1, 2))),
    …

eliavs
- 2,306
- 4
- 23
- 33
0
votes
1 answer
Unknown label type error when fitting x_train and y_train to Perceptron and MLPClassifier using sklearn
This is a snippet of my code (I can't add more for some reason):
per = Perceptron()
per.fit(x_train,y_train)
and this is the resulting error:
ValueError: Unknown label type: (array([0.055, 0.09 , 0.095, 0.1 , 0.105, 0.11 , 0.115, 0.12 , 0.125,
…

Malpa
- 11
- 4
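The float values in the error message suggest that `y_train` is continuous, while `Perceptron` and `MLPClassifier` expect discrete class labels. A rough sketch under that assumption, with made-up data (`y_continuous`, `y_class` and the median threshold are hypothetical): either switch to a regressor, or convert the target to classes first.

```python
import numpy as np
from sklearn.linear_model import Perceptron, SGDRegressor

# Made-up data: continuous targets similar to the values in the error message.
rng = np.random.default_rng(0)
x_train = rng.normal(size=(40, 3))
y_continuous = rng.uniform(0.05, 0.2, size=40)

# A classifier rejects continuous targets.
try:
    Perceptron().fit(x_train, y_continuous)
    raised = False
except ValueError:
    raised = True

# Option 1: if the task really is classification, derive discrete labels
# (here a hypothetical split at the median).
y_class = (y_continuous > np.median(y_continuous)).astype(int)
clf = Perceptron(random_state=0).fit(x_train, y_class)

# Option 2: if the task is regression, use a regressor instead.
reg = SGDRegressor(random_state=0).fit(x_train, y_continuous)
```

Which option applies depends on what the target means; binning a genuinely continuous quantity discards information.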
0
votes
0 answers
set parameters for BayesianRidge
What is the difference between alpha and lambda in linear_model.BayesianRidge() of sklearn?
I would like to estimate a linear regression y = w_0 + w_1 x_1 + w_2 x_2 + e with normally distributed priors for w_0, w_1, w_2. w_0 = N(0, sigma0), w1…

zmicer
- 1
- 2
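For orientation, in scikit-learn's `BayesianRidge` the fitted `alpha_` is the estimated precision of the noise and `lambda_` is the estimated precision of the weights; the constructor arguments `alpha_1`, `alpha_2`, `lambda_1` and `lambda_2` set the Gamma priors over those two precisions. A small sketch on synthetic data (the coefficients and noise level are made up for illustration):

```python
import numpy as np
from sklearn.linear_model import BayesianRidge

# Synthetic data: y = 1 + 2*x_1 - 3*x_2 + noise with standard deviation 0.5.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 2))
y = 1.0 + 2.0 * X[:, 0] - 3.0 * X[:, 1] + rng.normal(scale=0.5, size=200)

model = BayesianRidge().fit(X, y)

# Precision is 1 / variance, so convert back to standard deviations.
noise_std = 1.0 / np.sqrt(model.alpha_)    # spread of the noise term e
weight_std = 1.0 / np.sqrt(model.lambda_)  # spread of the prior on w_1, w_2
```

Note that the intercept w_0 is handled through `fit_intercept` rather than through the weight prior, and a single `lambda_` is shared by all weights, so separate prior variances per coefficient are not expressible with this estimator.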
0
votes
1 answer
Sklearn: how to get mean squared error on classifying training data
I'm working on some classification problems using sklearn for the first time in Python, and was wondering what the best way is to calculate the error of my classifier (like an SVM) solely on the training data.
My sample code for…

Joe J.
- 119
- 1
- 7
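One possible way to measure training error, sketched on a synthetic dataset since the sample code in the question is truncated (`X`, `y` and the default `SVC` are assumptions): predict on the same data the classifier was fitted on and score those predictions.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.metrics import accuracy_score, mean_squared_error
from sklearn.svm import SVC

# Synthetic binary classification data standing in for the real training set.
X, y = make_classification(n_samples=100, n_features=5, random_state=0)

clf = SVC().fit(X, y)
train_pred = clf.predict(X)

# Accuracy and misclassification rate on the training data.
train_accuracy = accuracy_score(y, train_pred)
train_error = 1.0 - train_accuracy

# With 0/1 labels, mean squared error equals the misclassification rate.
train_mse = mean_squared_error(y, train_pred)
```

For labels that are not 0/1, mean squared error depends on the arbitrary numeric coding of the classes, so accuracy or a similar classification metric is usually the more meaningful number.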
0
votes
1 answer
How to tell Pandas/Scikit-Learn how one field impacts predictive model
I am trying to create/validate a predictive model using a fictitious dataset, using Python with sklearn, following this tutorial.
The dataset contains information about baseball pitcher throws, and these are the most important fields:
Result…

Irina
- 1,333
- 3
- 17
- 37
0
votes
1 answer
pandas return index of rows having more than one 'NA' value
My code:
import pandas as pd
from sklearn.preprocessing import LabelEncoder
column_names =…

Pratik Kumar
- 2,211
- 1
- 17
- 41
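A short sketch of one way to find such rows, assuming the missing entries are real NaN/None values (the frame `df` is made up, since the question's code is truncated): count missing values per row and keep the index labels where the count exceeds one.

```python
import numpy as np
import pandas as pd

# Made-up frame with missing values scattered across rows.
df = pd.DataFrame({
    "a": [1.0, np.nan, np.nan, 4.0],
    "b": [np.nan, np.nan, 3.0, np.nan],
    "c": ["x", None, "z", None],
})

# Number of missing values in each row.
na_per_row = df.isna().sum(axis=1)

# Index labels of the rows with more than one missing value.
rows_with_multiple_na = df.index[na_per_row > 1].tolist()
```

If the data hold the literal string 'NA' instead of real missing values, `(df == "NA").sum(axis=1)` would be the analogous count.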
0
votes
1 answer
TypeError when trying to fit GaussianNB on data (Python beginner)
I'm trying to build a prediction model using GaussianNB.
I have a csv file that looks like this:
csv data
My code is as follows:
encoded_df = pd.read_csv('path to file')
y = encoded_df.iloc[:,12]
X = encoded_df.iloc[:,0:12]
model =…

finH
- 3
- 5
0
votes
0 answers
PermissionError when loading fetch_20newsgroups from sklearn.datasets
from sklearn.datasets import fetch_20newsgroups
data = fetch_20newsgroups()
data.target_names
PermissionError: [WinError 5] Access is denied: 'C:\Users\liu.h\scikit_learn_data\20news_home\20news-bydate-test\sci.crypt'

北京挖掘机
- 1
- 1
0
votes
0 answers
Replace column with rows pandas
How do I reshape (pivot) the following using pandas:
0 1 \
trans -0.521058 -0.521058
serie -0.521816 -0.521816
recor -0.468133 -0.468133
to:
trans serie recor
0 -0.521058 -0.521816 …

Gil Shay
- 1
- 1
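The target layout in the question is truncated, but what is visible looks like a plain transpose. A sketch under that assumption, rebuilding the visible part of the frame by hand:

```python
import pandas as pd

# The visible part of the frame from the question: labels as rows, 0 and 1 as columns.
df = pd.DataFrame(
    {0: [-0.521058, -0.521816, -0.468133], 1: [-0.521058, -0.521816, -0.468133]},
    index=["trans", "serie", "recor"],
)

# Transposing swaps rows and columns: labels become columns, 0 and 1 become rows.
reshaped = df.T
```

If the real frame has additional index levels or columns, `pivot` or `unstack` may be needed instead.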
0
votes
0 answers
LabelEncoder in sklearn_pandas mapper with pipeline returns error after cross_val_score
I have a strange error that I cannot understand.
I have the following data:
import numpy as np
import pandas as pd
from sklearn.ensemble import RandomForestClassifier
from sklearn.cross_validation import cross_val_score
from sklearn.pipeline import…

Shin
- 251
- 1
- 3
- 8
0
votes
2 answers
Error using KFold (from sklearn.model_selection import KFold)
I am getting an error while using from sklearn.model_selection import KFold in my jupyter notebook.
The error says "No module named 'sklearn.model_selection'". When I printed
print(sklearn.__version__)
I got the version to be 0.17.1.
Can anyone…

Khan
- 81
- 2
- 7
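The `sklearn.model_selection` module was introduced in scikit-learn 0.18, so version 0.17.1 does not have it; upgrading scikit-learn is the usual fix. A sketch of an import that works on both sides of that change (the flag `HAS_MODEL_SELECTION` and the toy array `X` are illustrative only):

```python
import numpy as np

try:
    # scikit-learn >= 0.18
    from sklearn.model_selection import KFold
    HAS_MODEL_SELECTION = True
except ImportError:
    # scikit-learn < 0.18, e.g. 0.17.1
    from sklearn.cross_validation import KFold
    HAS_MODEL_SELECTION = False

X = np.arange(20).reshape(10, 2)

if HAS_MODEL_SELECTION:
    # New API: configure the splitter, then call split() on the data.
    splits = list(KFold(n_splits=5).split(X))
else:
    # Old API: pass the number of samples; the object itself is iterable.
    splits = list(KFold(len(X), n_folds=5))
```

Each element of `splits` is a pair of index arrays (train indices, test indices).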
0
votes
1 answer
Counting matrix pairs using a threshold
I have a folder with hundreds of txt files I need to analyse for similarity. Below is an example of a script I use to run similarity analysis. In the end I get an array or a matrix I can plot etc.
I would like to see how many pairs there are with…

aviss
- 2,179
- 7
- 29
- 52
0
votes
1 answer
LabelEncoding to multiple columns in pandas
I'm currently working on the Titanic dataset. It has 4-5 non-numeric columns. I want to apply the sklearn.LabelEncoder class to get encoded values for these non-numeric columns. I can, no doubt, apply this method one by one to each column. But the…

Nuance
- 101
- 2
- 14
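One common pattern, sketched here on a tiny made-up frame (the column names follow the public Titanic dataset, the values are invented): loop over the non-numeric columns and keep one fitted `LabelEncoder` per column so the codes can be reversed later.

```python
import pandas as pd
from sklearn.preprocessing import LabelEncoder

# Tiny made-up frame with two non-numeric columns.
df = pd.DataFrame({
    "Sex": ["male", "female", "female"],
    "Embarked": ["S", "C", "S"],
    "Age": [22, 38, 26],
})
cat_cols = ["Sex", "Embarked"]

# One encoder per column, stored so that inverse_transform stays possible.
encoders = {}
encoded = df.copy()
for col in cat_cols:
    encoders[col] = LabelEncoder()
    encoded[col] = encoders[col].fit_transform(df[col])

# Recover the original labels of one column from its codes.
sex_restored = encoders["Sex"].inverse_transform(encoded["Sex"])
```

Keeping the encoders in a dict is a design choice: a single shared `LabelEncoder` refitted on each column would only remember the classes of the last column.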