Python module providing a bridge between Scikit-Learn’s Machine Learning methods and pandas-style DataFrames
Questions tagged [sklearn-pandas]
1336 questions
4
votes
1 answer
Cannot Import LinearRegression from Sklearn
from sklearn.linear_model import LinearRegression
gives me this error in Jupyter Notebook:
---------------------------------------------------------------------------
ImportError Traceback (most recent call…

David Yang
- 2,101
- 13
- 28
- 46
4
votes
2 answers
Failed to install package 'sklearn'
After a whole day of struggling, I finally give up and am asking this question. I know it may not be entirely appropriate to ask here, but I am not able to install sklearn in PyCharm and can't even install it using pip.
Config: Windows 10, PyCharm…

Vikas Meena
- 322
- 3
- 20
4
votes
1 answer
Count of each type of label in a column of a pandas data frame
I have the following data frame. I need to find the count of each type of MNTOPCODE for each donor.
CONTID MEDIUMCODE MNTOPCODE CLASCODE EXTRELNO CONTDIREC CONTDATE
000405402 CI CTS CT 0000020 O …

Seema Mudgil
- 365
- 1
- 7
- 15
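For the label-count question above, a minimal sketch of one possible approach with pandas. It assumes CONTID identifies the donor; the rows below are made up for illustration, and only the column names CONTID and MNTOPCODE come from the excerpt.

```python
import pandas as pd

# Hypothetical rows: only the column names come from the question excerpt.
df = pd.DataFrame({
    "CONTID": ["000405402", "000405402", "000405402", "000405403"],
    "MNTOPCODE": ["CTS", "CTS", "CHQ", "CTS"],
})

# Long form: one row per (donor, code) pair that occurs, with its count.
counts_long = (
    df.groupby(["CONTID", "MNTOPCODE"]).size().reset_index(name="count")
)

# Wide form: one row per donor, one column per code, 0 where a code is absent.
counts_wide = pd.crosstab(df["CONTID"], df["MNTOPCODE"])

print(counts_long)
print(counts_wide)
```

The long form is convenient for further filtering, while the wide form reads more like a report; which one fits depends on what the counts are used for.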
4
votes
1 answer
Python - convert rows to columns after group by and populate zeroes for non matching rows
I have a requirement where I need to convert the rows of a dataframe column to columns; however, I am facing an issue after the group by.
Below is a set of 3 users that can have types from type1 to type6.
user_id1 type4
user_id1 type6
user_id1 …

Suraj
- 575
- 1
- 9
- 23
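For the rows-to-columns question above, a rough sketch of one way to get a zero-filled wide table. The column names `user_id` and `type` are assumptions (the excerpt does not show headers), and all rows beyond the first two are invented for illustration.

```python
import pandas as pd

# Hypothetical data: only the first two rows appear in the question excerpt,
# and the column names "user_id" and "type" are assumed.
df = pd.DataFrame({
    "user_id": ["user_id1", "user_id1", "user_id2", "user_id3"],
    "type": ["type4", "type6", "type1", "type4"],
})

# The full set of expected columns, including types no user has.
all_types = [f"type{i}" for i in range(1, 7)]

# Count occurrences per user and type, then add the missing types as zero columns.
wide = pd.crosstab(df["user_id"], df["type"]).reindex(
    columns=all_types, fill_value=0
)

print(wide)
```

The `reindex` step is what guarantees a column for every type from type1 to type6, even when no row in the data carries that type.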
4
votes
0 answers
python scikit-learn cosine similarity value error: could not convert integer scalar
I am trying to produce a cosine similarity matrix using text descriptions of apps. The script below first reads in a csv data file (I can provide the data file if needed) which contains two columns, one with two app categories and the other with…

rangus
- 41
- 2
4
votes
4 answers
Sklearn: Categorical Imputer?
Is there a way to impute categorical values using a sklearn.preprocessing object? I would like to ultimately create a preprocessing object that I can apply to new data and have it transformed the same way as the old data.
I am looking for a way to do…

user1367204
- 4,549
- 10
- 49
- 78
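For the categorical imputation question above, a minimal sketch assuming scikit-learn 0.20 or later, where `SimpleImputer` (in `sklearn.impute`, not `sklearn.preprocessing`) accepts string columns with `strategy="most_frequent"`. The column name and values are hypothetical.

```python
import numpy as np
import pandas as pd
from sklearn.impute import SimpleImputer

# Hypothetical training data and new data with missing categories.
train = pd.DataFrame({"color": ["red", "blue", "red", np.nan]}, dtype=object)
new = pd.DataFrame({"color": [np.nan, "blue"]}, dtype=object)

# Learn the most frequent category on the old data only.
imputer = SimpleImputer(strategy="most_frequent")
imputer.fit(train)

# Apply the same learned fill value to unseen data.
filled_new = imputer.transform(new)
print(filled_new)
```

Because the fill value is learned in `fit` and reused in `transform`, the fitted object can be stored and applied to new data in the same way as the old data.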
4
votes
1 answer
What replaces GridSearchCV._grid_scores_ in scikit?
Since the _grid_scores_ attribute has been replaced by cv_results_, I would like to know how to output the tuples with the parameters and scores.
cv_results_ provides the scores in a form that can be loaded into a dataframe, but the tuple output was much easier to handle.
Please guide…

Ankit Bansal
- 317
- 4
- 14
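For the cv_results_ question above, a small sketch of rebuilding per-candidate tuples from the `cv_results_` dict. The dataset, estimator, and parameter grid are placeholders chosen only to make the example runnable.

```python
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import GridSearchCV

# Placeholder data and grid, used only to produce a fitted search object.
X, y = load_iris(return_X_y=True)
search = GridSearchCV(
    LogisticRegression(max_iter=1000), {"C": [0.1, 1.0, 10.0]}, cv=3
)
search.fit(X, y)

# cv_results_ is a dict of parallel arrays; zip them into one tuple per candidate.
results = search.cv_results_
score_tuples = list(
    zip(results["params"], results["mean_test_score"], results["std_test_score"])
)

for params, mean, std in score_tuples:
    print(params, round(mean, 3), round(std, 3))
```

Each tuple holds the parameter dict, the mean test score, and the standard deviation across folds, which is close to what the older tuple output carried.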
4
votes
1 answer
How are features ranked in RFECV in scikit learn(sklearn)?
I used recursive feature elimination with cross-validation (RFECV) to find the best accuracy score for several features I had (m = 154).
rfecv = RFECV(estimator=logreg, step=1, cv=StratifiedKFold(2),
…

Liam Hanninen
- 1,525
- 2
- 19
- 37
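For the RFECV question above, a hedged sketch of how the fitted attributes can be inspected. The synthetic data and the `scoring="accuracy"` argument are assumptions; the original call is truncated in the excerpt.

```python
from sklearn.datasets import make_classification
from sklearn.feature_selection import RFECV
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import StratifiedKFold

# Synthetic stand-in for the question's data (10 features instead of 154).
X, y = make_classification(
    n_samples=200, n_features=10, n_informative=3, random_state=0
)

logreg = LogisticRegression(max_iter=1000)
# scoring="accuracy" is assumed; the original call is cut off in the excerpt.
rfecv = RFECV(estimator=logreg, step=1, cv=StratifiedKFold(2), scoring="accuracy")
rfecv.fit(X, y)

print(rfecv.n_features_)  # number of features kept
print(rfecv.support_)     # boolean mask of kept features
print(rfecv.ranking_)     # 1 for kept features, larger values for eliminated ones
```

Selected features all share rank 1. Eliminated features get higher ranks, with the largest rank going to the feature dropped first; for a linear model such as logistic regression, each elimination step drops the feature whose coefficient contributes least.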
4
votes
1 answer
Sklearn-Pandas DataFrameMapper: mapper.fit_transform gives ValueError: bad input shape (8, 2)
I was able to replicate the example given in the Github repo. However, when I tried it on my own data, I got the ValueError.
Below is dummy data that gives the same error as my real data.
import pandas as pd
import numpy as np
from…

wi3o
- 1,467
- 3
- 17
- 29
4
votes
2 answers
SKlearn Random Forest error on input
I am trying to run the fit for my random forest, but I am getting the following error:
forest.fit(train[features], y)
returns
---------------------------------------------------------------------------
ValueError …

rontho1992
- 116
- 1
- 11
3
votes
1 answer
python linear regression: dense vs sparse
I need to use linear regression on a sparse matrix. I have been getting poor results, so I decided to test it on a non-sparse matrix represented sparsely. The data is taken from…

Jafar Sadeq
- 45
- 5
3
votes
1 answer
K-Means classification by group
I'm trying to do a K-means analysis in a dataframe like this:
URBAN AREA PROVINCE DENSITY
0 1 TRUJILLO 0.30
1 2 TRUJILLO 0.03
2 3 TRUJILLO 0.80
3 1 LIMA 1.20
4 2 LIMA…

José Rojas
- 313
- 1
- 8
3
votes
1 answer
Problem with negative numbers in sklearn.feature_selection.SelectKBest feature scoring module
I was trying automatic feature engineering and selection, so for that I used the Boston house price dataset available in sklearn.
from sklearn.datasets import load_boston
import pandas as pd
data = load_boston()
x = data.data
y = data.target
y =…

Samar Pratap Singh
- 471
- 1
- 10
- 29
3
votes
2 answers
How to encode a dataset having multiple datatypes?
I have a dataset like:
e = pd.DataFrame({
'col1': ['A', 'A', 'B', 'W', 'F', 'C'],
'col2': [2, 1, 9, 8, 7, 4],
'col3': [0, 1, 9, 4, 2, 3],
'col4': ['a', 'B', 'c', 'D', 'e', 'F']
})
Here I encoded the data using…

Samar Pratap Singh
- 471
- 1
- 10
- 29
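For the mixed-datatype encoding question above, a minimal sketch of one possible approach: encode only the string columns and pass the numeric ones through. The choice of `OrdinalEncoder` is an assumption, since the excerpt is cut off before naming the encoder the asker used.

```python
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.preprocessing import OrdinalEncoder

# The data frame from the question.
e = pd.DataFrame({
    "col1": ["A", "A", "B", "W", "F", "C"],
    "col2": [2, 1, 9, 8, 7, 4],
    "col3": [0, 1, 9, 4, 2, 3],
    "col4": ["a", "B", "c", "D", "e", "F"],
})

# Encode the two string columns; leave the numeric columns untouched.
# OrdinalEncoder is an assumed choice, not necessarily the asker's.
encoder = ColumnTransformer(
    [("categorical", OrdinalEncoder(), ["col1", "col4"])],
    remainder="passthrough",
)

# Output column order: encoded col1, encoded col4, then col2 and col3.
encoded = encoder.fit_transform(e)
print(encoded)
```

Listing the string columns explicitly avoids applying a label-style encoder to the integer columns, which is a common source of surprises with mixed data.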
3
votes
2 answers
Python3 Pandas.DataFrame.info() Error Key: 30
So I was digging around some datasets and trying to use pandas to analyze them, and I stumbled across the following error, and my brain froze :(
Here is the snippet where the exception is being raised:
import pandas as pd
from sklearn.datasets…

Rami
- 101
- 1
- 6