Highest Voted 'sklearn-pandas' Questions

0

votes

1 answer

Learning_curve error

I tried to use plot_learning_curve to plot the logistic regression below, but got an error. Could anyone help? from sklearn.linear_model import LogisticRegression lg = LogisticRegression(random_state=42, penalty='l1') parameters = {'C':[0.5]} # Use…

python sklearn-pandas

asked Mar 15 '17 at 23:01

Frank Hee

17
6

0

votes

0 answers

sklearn.feature_selection and RFECV

import pandas as pd from sklearn.cross_validation import StratifiedKFold from sklearn.feature_selection import SelectPercentile a = pd.read_csv('NCAA_2003-2016_with_diff.csv') logreg = lm.LogisticRegression() rfecv = RFECV(estimator=logreg,…

python scikit-learn sklearn-pandas

asked Mar 10 '17 at 21:47

Hong

1

0

votes

0 answers

Type error when using sklearn with a pandas DataFrame

I want to use sklearn to make some predictions, and I stored my data in a DataFrame. Data = DataFrame(columns = columns,index = range(1,501)) The data has no problems. from sklearn.cross_validation import train_test_split Xtrain,Xtest,Ytrain,Ytest =…

python-3.x numpy dataframe scikit-learn sklearn-pandas

asked Mar 01 '17 at 07:04

Mengyang LIU

3
2

0

votes

1 answer

Does binary log loss exclude one part of equation based on y?

Assuming the log loss equation to be: $\text{logLoss} = -\frac{1}{N}\sum_{i=1}^{N}\left[y_i \log(p_i) + (1 - y_i)\log(1 - p_i)\right]$, where N is the number of samples, $y_1 \dots y_N$ are the actual values of the dependent variable, and $p_1 \dots p_N$ are the predicted likelihoods from logistic regression. How…

python machine-learning logistic-regression sklearn-pandas cross-entropy

asked Feb 17 '17 at 22:58

Liam Hanninen

1,525
2
19
37
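For the log-loss question above, a small sketch may make the point concrete. Because each label y_i is either 0 or 1, one of the two terms in the sum is multiplied by zero for every sample, so only log(p_i) (when y_i = 1) or log(1 - p_i) (when y_i = 0) contributes. The function names and sample values below are hypothetical and chosen only for illustration; this is not the asker's code.

```python
import math


def binary_log_loss(y_true, p_pred):
    """Mean binary log loss using the full two-term formula."""
    total = 0.0
    for y, p in zip(y_true, p_pred):
        # When y == 1, (1 - y) is 0 and the second term vanishes.
        # When y == 0, y is 0 and the first term vanishes.
        total += y * math.log(p) + (1 - y) * math.log(1 - p)
    return -total / len(y_true)


def binary_log_loss_selected(y_true, p_pred):
    """Same loss, written so that only the surviving term is evaluated."""
    total = 0.0
    for y, p in zip(y_true, p_pred):
        total += math.log(p) if y == 1 else math.log(1 - p)
    return -total / len(y_true)


# Hypothetical labels and predicted probabilities of the positive class.
y_true = [1, 0, 1, 0]
p_pred = [0.9, 0.2, 0.6, 0.4]

loss_full = binary_log_loss(y_true, p_pred)
loss_selected = binary_log_loss_selected(y_true, p_pred)
print(loss_full, loss_selected)
```

Both functions return the same number, which suggests the formula does not so much exclude part of the equation as let the label select which term applies to each sample.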

0

votes

2 answers

Receiving a value error when using OneHotEncoder and fitting data

I'm working on an assignment and we are using OneHotEncoder in scikit-learn to make all categories print out. Here is a sample of the data and the code I used to transform it: grade sub_grade short_emp emp_length_num home_ownership …

python sklearn-pandas

asked Feb 13 '17 at 03:52

macshaggy

357
1
4
17

0

votes

0 answers

Using train_test_split to generate test and train data causes changes in underlying data

I am using train_test_split from sklearn.cross_validation to split the source CSV data file into training and test data using simple Python code like this: from sklearn.cross_validation import train_test_split import pandas as pd dataset =…

python pandas machine-learning scikit-learn sklearn-pandas

asked Feb 07 '17 at 09:46

VS_FF

2,353
3
16
34

0

votes

1 answer

Error in implementing SVC in sklearn

I am trying to implement svc for predicting a continuous variable: print("X_train_dtm type ", type(X_train_dtm)) print("y_train type ", type(y_train)) svc = svm.SVC(kernel='linear', C=C).fit(X_train_dtm, y_train) However I am getting the following…

python python-3.x pandas scikit-learn sklearn-pandas

asked Feb 01 '17 at 14:39

Bonson

1,418
4
18
38

0

votes

0 answers

Python - sklearn making predictions on the wrong column

I'm currently trying to make predictions for the next month's worth of business days for stock prices pulled from Quandl. I got this idea from a tutorial on pythonprogramming.net (which heavily influences the structure of the code here). However, when…

python python-3.x pandas scikit-learn sklearn-pandas

asked Jan 26 '17 at 06:21

Connor McCluskey

43
8

0

votes

1 answer

Difference between statsmodels OLS and scikit-learn linear regression; different models give different R-squared

I am new to Python and trying to calculate a simple linear regression. My model has one dependent variable and one independent variable. I am using linear_model.LinearRegression() from the sklearn package. I got an R-squared value of .16. Then I used…

python linear-regression statsmodels sklearn-pandas

asked Jan 05 '17 at 22:15

SAM244776

1,375
6
18
26

0

votes

1 answer

Divide dataframe into two sets according to a column

I have a DataFrame df. I chose some columns of it and I want to divide them into xtrain and xtest according to a column called Service, so that rows with 1 and 0 go into xtrain and rows with NaN go into xtest. Service 1 0 0 1 Nan Nan xtarin =…

python pandas logistic-regression sklearn-pandas

asked Jan 01 '17 at 13:22

user7308269
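For the question above about dividing a DataFrame by a column, one possible approach is a boolean mask built with `notna()`. This is only a sketch: the `feature` column and all values are made up for illustration, and the `Service` column name is taken from the excerpt.

```python
import numpy as np
import pandas as pd

# Hypothetical data; only the "Service" column name comes from the question.
df = pd.DataFrame({
    "feature": [10, 20, 30, 40, 50, 60],
    "Service": [1, 0, 0, 1, np.nan, np.nan],
})

# True where Service holds a value (1 or 0), False where it is NaN.
has_label = df["Service"].notna()

xtrain = df[has_label]   # rows with a known Service value
xtest = df[~has_label]   # rows where Service is missing

print(xtrain)
print(xtest)
```

Comparing against NaN with `==` would not work here, because NaN is not equal to itself; that is why the mask relies on `notna()`.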

0

votes

1 answer

What's the difference between importing a whole module vs importing just the required method from the module in python?

When using scikit-learn or other similar Python libraries, what's the difference between doing: import sklearn.cluster as sk model = sk.KMeans(n_clusters=n) And from sklearn.cluster import KMeans model = KMeans(n_clusters=n) Is there any…

python scikit-learn sklearn-pandas

asked Dec 28 '16 at 18:41

Alex Kinman

2,437
8
32
51
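For the import question above, a short sketch can show what the two forms do. To stay runnable without scikit-learn installed, it swaps in the standard-library module `json.decoder` for `sklearn.cluster`; the variable names are illustrative only. In both forms Python loads the whole module once and caches it in `sys.modules`; the difference is which name gets bound in the importing namespace.

```python
import sys

# Form 1: bind the module object to a short alias.
import json.decoder as jd

# Form 2: bind one attribute of the module directly.
from json.decoder import JSONDecoder

# Both names refer to the very same class object.
same_object = jd.JSONDecoder is JSONDecoder

# Either form loads the full module and caches it once.
module_cached = "json.decoder" in sys.modules

print(same_object, module_cached)
```

Under these assumptions the choice is mostly about readability and namespace clutter rather than memory use or speed.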

0

votes

1 answer

Mapping back any sklearn result to the original dataframe

I'd like to analyze the predicted values of my random forest results in Excel with the original test data as a reference. The predicted result comes as an array because I use this: predict = rf.predict(test[columns]) How do I map back the predicted…

python pandas scikit-learn sklearn-pandas

asked Dec 19 '16 at 08:51

galeej

535
9
23
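For the question above about mapping predictions back, one option is to attach the prediction array as a new column, since `predict` returns one value per input row in the same order. In this sketch the data, the column names `a` and `b`, and the `predicted` column are hypothetical, and a hard-coded array stands in for `rf.predict(test[columns])` so the example runs without a trained model.

```python
import numpy as np
import pandas as pd

# Hypothetical test data with a non-default index.
test = pd.DataFrame(
    {"a": [1.0, 2.0, 3.0], "b": [0.5, 0.1, 0.9]},
    index=[10, 11, 12],
)
columns = ["a", "b"]

# Stand-in for: predict = rf.predict(test[columns])
predict = np.array([0, 1, 1])

# A NumPy array is assigned by position, so row order is preserved
# and the original index is kept.
result = test.assign(predicted=predict)

print(result)
# result.to_csv("predictions.csv") would write a file that Excel can open.
```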

0

votes

1 answer

Converting dataframe column of years to month day year

I'm doing this for homework. My goal is to have an entirely new column with just the days elapsed. There are 500,000+ rows of this...so my goal is to: In the Pandas dataframe, I have these two date columns which are in different formats. I'd like…

python datetime pandas dataframe sklearn-pandas

asked Dec 11 '16 at 20:13

jhub1

611
3
7
19

0

votes

1 answer

Unable to fit_transform data from csv file in sklearn

I am trying to learn some classification in Scikit-learn. However, I couldn't figure out what this error means. import pandas as pd from sklearn.feature_extraction.text import CountVectorizer data_frame = pd.read_csv('data.csv', header=0)…

pandas machine-learning scikit-learn classification sklearn-pandas

asked Dec 09 '16 at 17:27

Jhooma

3
3

0

votes

2 answers

Does using dummy values improve a model's performance?

I see that many feature engineering workflows apply a get_dummies step to the object features. For example, the sex column, which contains 'M' and 'F', is turned into two columns with a one-hot representation. Why not directly encode 'M' and 'F' as 0 and…

machine-learning feature-selection sklearn-pandas

asked Dec 02 '16 at 09:14

yanachen

3,401
8
32
64
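For the dummy-variable question above, a brief sketch contrasts the two encodings. The data is made up, and the 0/1 mapping is just one arbitrary choice. For a feature with only two levels, a single 0/1 column carries the same information as the two one-hot columns; one-hot encoding tends to matter more with three or more unordered categories, where integer codes would imply an ordering that does not exist.

```python
import pandas as pd

# Hypothetical data with a two-level categorical feature.
df = pd.DataFrame({"sex": ["M", "F", "F", "M"]})

# One-hot encoding: one indicator column per category.
one_hot = pd.get_dummies(df["sex"], prefix="sex")

# Direct binary encoding: a single 0/1 column.
binary = df["sex"].map({"M": 0, "F": 1})

print(one_hot)
print(binary)
```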
