Highest Voted 'sklearn-pandas' Questions

5

votes

2 answers

XGBoost get classifier object from booster object?

I usually get feature importances using regr = XGBClassifier() regr.fit(X, y) regr.feature_importances_ where type(regr) is . However, I have a pickled XGBoost model, which when unpacked returns an object of type . This is the same object as if…

asked Aug 27 '19 at 20:16

L Xandor

1,659
4
24
48

5

votes

1 answer

How to weigh data points with sklearn training algorithms

I am looking to train either a random forest or a gradient boosting algorithm using sklearn. The data I have is structured so that each data point has a variable weight corresponding to the number of times that data point occurs in the…

python scikit-learn sklearn-pandas

asked May 07 '19 at 19:50

Stephen Strosko

597
1
5
18
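
A minimal sketch of one common approach to the question above, assuming the per-row occurrence counts can be passed as sample_weight when fitting. The toy arrays and the name counts are made up for illustration and do not come from the question.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier

# Toy data (hypothetical): four distinct rows and how often each one occurs.
X = np.array([[0.0], [1.0], [2.0], [3.0]])
y = np.array([0, 0, 1, 1])
counts = np.array([10, 1, 1, 10])

# sample_weight scales each row's contribution when the trees are grown,
# so frequent rows influence the splits more than rare ones.
clf = RandomForestClassifier(n_estimators=50, random_state=0)
clf.fit(X, y, sample_weight=counts)

pred = clf.predict(X)
print(pred)
```

GradientBoostingClassifier.fit accepts a sample_weight argument in the same way, so the same pattern should carry over.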

5

votes

3 answers

Making a string out of pandas DataFrame

I have a pandas DataFrame which looks like this: Name Number Description car 5 red And I need to make a string out of it which looks like this: """Name: car Number: 5 Description: red""" I'm a beginner and I really don't get how…

python python-3.x pandas dataframe sklearn-pandas

asked Apr 06 '19 at 23:06

primadonna

142
4
12
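
One possible way to build such a string, sketched under the assumption that the frame has a single row and that each "column: value" pair should go on its own line (the excerpt does not show the line breaks).

```python
import pandas as pd

# The one-row frame described in the question.
df = pd.DataFrame({"Name": ["car"], "Number": [5], "Description": ["red"]})

# Take the first row as a Series and join "column: value" pairs,
# one pair per line.
row = df.iloc[0]
text = "\n".join(f"{col}: {val}" for col, val in row.items())
print(text)
```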

5

votes

3 answers

Read multiple CSV files in Pandas in chunks

How can I import and read multiple CSV files in chunks when their total size is around 20 GB? I don't want to use Spark because I want to use a model in sklearn, so I want a solution in pandas itself. My code is: allFiles =…

python pandas jupyter-notebook sklearn-pandas

asked Mar 04 '19 at 16:38

pythonNinja

453
5
13
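
A rough sketch of chunked reading with pandas alone, assuming the files share the same columns. The temporary files, the column names a and b, and the chunk size are placeholders chosen so that the example runs on its own.

```python
import glob
import os
import tempfile

import pandas as pd

# Write two small CSV files to a temp directory to stand in for the real data.
tmp_dir = tempfile.mkdtemp()
for i in range(2):
    pd.DataFrame({"a": range(5), "b": range(5)}).to_csv(
        os.path.join(tmp_dir, f"part{i}.csv"), index=False
    )

total_rows = 0
running_sum = 0
for path in sorted(glob.glob(os.path.join(tmp_dir, "*.csv"))):
    # chunksize makes read_csv return an iterator of DataFrames
    # instead of loading the whole file into memory at once.
    for chunk in pd.read_csv(path, chunksize=2):
        total_rows += len(chunk)
        running_sum += chunk["a"].sum()

print(total_rows, running_sum)
```

For models that support incremental training through partial_fit (SGDClassifier is one example), each chunk could be passed to partial_fit inside the inner loop instead of being aggregated.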

5

votes

1 answer

How to use custom scoring function in sklearn cross_val_score

I want to use adjusted R-squared in the cross_val_score function. I tried the make_scorer function, but it is not working. from sklearn.cross_validation import train_test_split X_tr, X_test, y_tr, y_test = train_test_split(X, Y, test_size=0.2,…

python python-3.x machine-learning scikit-learn sklearn-pandas

asked Dec 19 '18 at 12:02

merkle

1,585
4
18
33
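
A sketch of one way to do this, assuming a callable with the signature scorer(estimator, X, y) is passed as scoring, which gives the function access to the number of features in each fold. The function name adjusted_r2_scorer and the synthetic data are illustrative only. Note that the excerpt imports from sklearn.cross_validation, which later scikit-learn releases replaced with sklearn.model_selection.

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.metrics import r2_score
from sklearn.model_selection import cross_val_score


def adjusted_r2_scorer(estimator, X, y):
    """Adjusted R^2 computed on the fold that cross_val_score passes in."""
    n, p = X.shape
    r2 = r2_score(y, estimator.predict(X))
    return 1.0 - (1.0 - r2) * (n - 1) / (n - p - 1)


# Synthetic data (hypothetical) with a known linear relationship.
rng = np.random.default_rng(0)
X = rng.normal(size=(100, 3))
y = X @ np.array([1.5, -2.0, 0.5]) + rng.normal(scale=0.1, size=100)

scores = cross_val_score(
    LinearRegression(), X, y, cv=5, scoring=adjusted_r2_scorer
)
print(scores)
```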

5

votes

2 answers

Cross-validation gives Negative R2?

I am partitioning 500 samples out of a 10,000+ row dataset just for the sake of simplicity. Please copy and paste X and y into your IDE. X = array([ -8.93, -0.17, 1.47, -6.13, -4.06, -2.22, -2.11, -0.25, 0.25, 0.49, 1.7 , -0.77, …

python scikit-learn cross-validation sklearn-pandas goodness-of-fit

asked Nov 22 '18 at 03:18

Chipmunkafy

566
2
5
17
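
As background for the question above, a small worked example of why R2 can be negative: it is defined as 1 - SS_res / SS_tot, so any set of predictions that does worse than always predicting the mean of y scores below zero. The numbers are made up and are unrelated to the asker's data.

```python
def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


y_true = [1.0, 2.0, 3.0, 4.0]

# Predicting the mean everywhere gives exactly 0.
print(r_squared(y_true, [2.5, 2.5, 2.5, 2.5]))  # 0.0

# Predictions that are worse than the mean give a negative value.
print(r_squared(y_true, [4.0, 3.0, 2.0, 1.0]))  # -3.0
```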

5

votes

1 answer

python sklearn accuracy_score name not defined

x = df2.Tweet y = df2.Class from sklearn.cross_validation import train_test_split SEED = 2000 x_train, x_validation_and_test, y_train, y_validation_and_test = train_test_split(x, y, test_size=.02, random_state=SEED) x_validation, x_test,…

python-3.x pandas classification logistic-regression sklearn-pandas

asked Oct 28 '18 at 14:49

Shivam...

409
1
8
21

5

votes

3 answers

Install sklearn_pandas with conda via Windows command line

I'd like to install the sklearn_pandas library with conda via the Windows command line. The package is apparently "private" on the conda repository (admittedly this may well be why I cannot install it, but I prefer to ask for advice just in case…

windows command-line scikit-learn conda sklearn-pandas

asked Sep 19 '18 at 10:47

ongenz

890
1
10
20

5

votes

1 answer

pd.get_dummies dataframe same size when Sparse = True as when Sparse = False

I have a dataframe with several string columns that I want to convert to categorical data so that I can run some models and extract important features from. However, due to the amount of unique values, the one-hot encoded data expands into a large…

python pandas scikit-learn sklearn-pandas

asked Aug 06 '18 at 13:57

trystuff

686
1
8
18

5

votes

1 answer

How to Select Top 1000 words using TF-IDF Vector?

I have a document set of 5000 reviews, stored in sample_data. I applied a tf-idf vectorizer to sample_data with a one-gram range. Now I want to get the top 1000 words from the sample_data which…

python-3.x scikit-learn tf-idf sklearn-pandas tfidfvectorizer

asked Aug 02 '18 at 14:03

merkle

1,585
4
18
33

5

votes

2 answers

How to get feature importance in logistic regression using weights?

I have a dataset of reviews with a positive/negative class label. I am applying logistic regression to that dataset. First, I convert the text into a bag of words. Here sorted_data['Text'] is reviews and final_counts is a sparse…

machine-learning scikit-learn logistic-regression sklearn-pandas

asked Jul 22 '18 at 07:34

merkle

1,585
4
18
33

5

votes

2 answers

How to normalize dataframe by standard deviation using scikit-learn?

Given the following dataframe and left-x column: | | left-x | left-y | right-x | right-y | |-------|--------|--------|---------|---------| | frame | | | | | | 0 | 149 | 181 | 170 | 175 | | 1 …

python pandas sklearn-pandas

asked Mar 04 '18 at 23:16

JP Ventura

5,564
6
52
69

5

votes

1 answer

How to groupby and map by two columns pandas dataframe

I have a problem in Python working with a pandas dataframe. I'm trying to build a machine learning model that predicts the surface. I have the surface column in the train dataframe but not in the test dataframe. So, I would like to create some…

python pandas pandas-groupby sklearn-pandas

asked Dec 20 '17 at 19:45

John Karimov

151
1
1
9

5

votes

4 answers

Getting Error on StandardScaler fit_transform

import numpy as np import matplotlib.pyplot as plt import pandas as pd dataset = pd.read_csv('Position_Salaries.csv') X = dataset.iloc[:, 1:2].values y = dataset.iloc[:, 2].values from sklearn.preprocessing import StandardScaler sc_X =…

python arrays machine-learning scikit-learn sklearn-pandas

asked Dec 06 '17 at 13:28

Vikas Kyatannawar

136
1
1
8

5

votes

1 answer

How to convert Countvectorized data back to text data in Python?

How can I convert count-vectorized text data back to textual form? I have text data which I converted into a sparse matrix using CountVectorizer for classification. Now I want the sparse matrix of text data to be converted back into text data. My…

python pandas scikit-learn sklearn-pandas

asked Nov 05 '17 at 08:46

aeapen

871
1
14
28
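
A sketch of what can be recovered from a count matrix, assuming the fitted CountVectorizer is still available. Its inverse_transform method returns the terms present in each document, but word order is lost, so the original sentences cannot be rebuilt exactly. The two-document corpus is hypothetical.

```python
from sklearn.feature_extraction.text import CountVectorizer

docs = ["the cat sat", "the dog sat down"]  # hypothetical corpus

vectorizer = CountVectorizer()
X = vectorizer.fit_transform(docs)  # sparse document-term matrix

# inverse_transform returns, per document, the terms with nonzero counts.
# Word order is not stored in the matrix, so terms are sorted here
# only to get a stable result.
terms_per_doc = vectorizer.inverse_transform(X)
recovered = [" ".join(sorted(terms)) for terms in terms_per_doc]
print(recovered)
```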
