Highest Voted 'sklearn-pandas' Questions

5 votes, 2 answers

How can I get the feature names from sklearn TruncatedSVD object?

I have the following code: import pandas as pd import numpy as np from sklearn.decomposition import TruncatedSVD df = pd.DataFrame(np.random.randn(1000, 25), index=dates, columns=list('ABCDEFGHIJKLMOPQRSTUVWXYZ')) def reduce(dim): svd =…

asked Jun 19 '17 at 14:43

m.awad

187
2
13

5 votes, 2 answers

How to iterate over pandas DataFrameGroupBy and select all entries per grouped variable for specific column?

Let's assume there is a table like this: Id | Type | Guid. I perform the following operation on it: df = df.groupby('Id'). Now I would like to iterate through the first n rows and, for each specific Id, print as a list all the corresponding…

python pandas sqlite sklearn-pandas

asked May 06 '17 at 23:03

Server Khalilov

408
2
5
20

5 votes, 4 answers

Loading sklearn model in Java. Model created with DNNClassifier in python

The goal is to open in Java a model created and trained in Python with tensorflow.contrib.learn.learn.DNNClassifier. At the moment the main issue is finding the name of the "tensor" to pass to the session runner method in Java. I have this test code…

java tensorflow tensorflow-serving sklearn-pandas

asked Apr 24 '17 at 22:56

rjpg

134
1
11

5 votes, 1 answer

LabelEncoder().fit_transform vs. pd.get_dummies for categorical coding

It was recently brought to my attention that if you have a dataframe df like this:

   A      B   C
0  0   Boat  45
1  1    NaN  12
2  2    Cat   6
3  3  Moose  21
4  4   Boat  43

you can encode the categorical data automatically with…

python pandas scikit-learn sklearn-pandas

asked Sep 22 '16 at 17:16

Jonathan Bechtel

3,497
4
43
73

5 votes, 1 answer

Pyspark user defined aggregate calculation on columns

I’m preparing data for input for a classifier in Pyspark. I have been using aggregate functions in SparkSQL to extract features such as average and variance. These are grouped by activity, name and window. Window has been calculated by dividing a…

pyspark apache-spark-sql sklearn-pandas

asked Jul 01 '16 at 09:52

other15

839
2
11
23

4 votes, 1 answer

How to predict on a grouped DataFrame, using a dictionary of models, and return to original test DataFrame?

I have created a dictionary of regression models, indexed by values of group from a training dataset, d import numpy as np import pandas as pd from sklearn.linear_model import LinearRegression from sklearn.pipeline import Pipeline d =…

python pandas dataframe pandas-groupby sklearn-pandas

asked May 11 '22 at 20:20

langtang

22,248
1
12
27

4 votes, 7 answers

How to resolve AttributeError: module 'graphviz.backend' has no attribute 'ENCODING'

I am not sure why I get an AttributeError: module 'graphviz.backend' has no attribute 'ENCODING' when I tried to export regression tree to graphviz. I tried re-installing graphviz and sklearn but it doesn't solve the problem. Appreciate any advice…

graphviz sklearn-pandas

asked Nov 16 '21 at 13:04

Rayner

41
1
1
2

4 votes, 1 answer

GridSearchCV results heatmap

I am trying to generate a heatmap for the GridSearchCV results from sklearn. The thing I like about sklearn-evaluation is that it is really easy to generate the heatmap. However, I have hit one issue. When I give a parameter as None, for…

python matplotlib scikit-learn seaborn sklearn-pandas

asked Jun 26 '21 at 02:19

spockshr

372
2
14

4 votes, 0 answers

Sklearn pipeline not fitted after .fit has been called?

I have a simple pipeline like this pl = Pipeline(steps=[("preprocessor", ColumnTransformer( transformers=[ ('num', Pipeline(steps=[('StandardScaler', StandardScaler())]),…

python scikit-learn pipeline sklearn-pandas

asked Jul 22 '20 at 05:30

L Xandor

1,659
4
24
48

4 votes, 3 answers

Create my custom Imputer for categorical variables sklearn

I have a dataset with a lot of categorical values missing and I would like to make a custom imputer which will fill the empty values with a value equal to "no-variable_name". For example, if a column "Workclass" has a NaN value, replace it with "No…

python pandas machine-learning scikit-learn sklearn-pandas

asked Apr 17 '20 at 18:45

Vasilis Iak

79
7

4 votes, 2 answers

Group by and calculate AUC on folds

What I would like to do, based on the dataset below, is to calculate the AUC for each algorithm and also later for each dataset. I have tried something like this but it is not working: from sklearn.metrics import…

python pandas scikit-learn sklearn-pandas

asked Feb 07 '20 at 16:19

glouis

541
1
7
22

4 votes, 2 answers

How do I use Decision Tree Regressor on new data? (Python, Pandas, Sklearn)

I've started learning Python and machine learning very recently. I have been doing a basic Decision Tree Regressor example involving house prices. So I have trained the algorithm and found the best number of branches, but how do I use this on new…

python pandas machine-learning sklearn-pandas

asked Jan 27 '20 at 18:06

ARH94

43
6

4 votes, 1 answer

Increase performance of Random Forest Regressor in sklearn

There is an optimization problem where I have to call the predict function of a Random Forest Regressor several thousand times. from sklearn.ensemble import RandomForestRegressor rfr = RandomForestRegressor(n_estimators=10) rfr = rfr.fit(X, Y) for…

python scikit-learn sklearn-pandas

asked Nov 16 '19 at 21:41

Bowers

836
8
20

4 votes, 1 answer

How can I set the font of the caption of a Pandas DataFrame?

I am trying to display two tables side-by-side in a Jupyter notebook. I have some code that does this: header = ["Metric", "Test dataset"] table1 = [["accuracy", accuracy_test], ["precision", precision_test], …

html jupyter-notebook sklearn-pandas

asked Nov 14 '19 at 19:46

user274610

509
9
18

4 votes, 3 answers

statsmodels raises TypeError: ufunc 'isfinite' not supported for the input types

I am applying backward elimination using statsmodels.api and the code gives this error: TypeError: ufunc 'isfinite' not supported for the input types, and the inputs could not be safely coerced to any supported types according to the casting rule…

python machine-learning statsmodels sklearn-pandas

asked Oct 19 '19 at 14:02

Anjali

187
1
4
12
