Questions tagged [sklearn-pandas]

Python module providing a bridge between Scikit-Learn’s Machine Learning methods and pandas-style DataFrames

1336 questions
4 votes, 1 answer

Cannot Import LinearRegression from Sklearn

from sklearn.linear_model import LinearRegression gives me this error in Jupyter Notebook: --------------------------------------------------------------------------- ImportError Traceback (most recent call…
David Yang
4 votes, 2 answers

Failed to install package 'sklearn'

After a whole day of struggle, I finally give up and ask this question. I know it may not be entirely appropriate to ask this here, but I'm not able to install sklearn in PyCharm and can't even install it using pip. Config: Windows 10, Pycharm…
Vikas Meena
4 votes, 1 answer

Count of each type of label in a column of a pandas data frame

I have the following data frame. I need to find the count of each type of MNTOPCODE for each donor. CONTID MEDIUMCODE MNTOPCODE CLASCODE EXTRELNO CONTDIREC CONTDATE 000405402 CI CTS CT 0000020 O …
Seema Mudgil
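
A per-donor count like the one asked about above can be sketched with a pandas group-by. This is only a minimal illustration: the column names CONTID and MNTOPCODE come from the excerpt, but the rows below are hypothetical sample data, and treating CONTID as the donor identifier is an assumption.

```python
import pandas as pd

# Hypothetical sample rows; only the column names come from the question.
df = pd.DataFrame({
    "CONTID": ["000405402", "000405402", "000405402", "000406060"],
    "MNTOPCODE": ["CTS", "CTS", "MOS", "CTS"],
})

# One row per (donor, code) pair with the number of times it occurs.
counts = (
    df.groupby(["CONTID", "MNTOPCODE"])
      .size()
      .reset_index(name="count")
)
print(counts)
```
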
4 votes, 1 answer

Python - convert rows to columns after group by and populate zeroes for non matching rows

I have a requirement where I need to convert the rows of a dataframe column to columns; however, I am facing an issue after GROUPBY. Below is a set of 3 users that can have types from type1 to type6. user_id1 type4 user_id1 type6 user_id1 …
Suraj
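
One possible way to turn the rows into columns and fill zeroes for missing combinations is `pd.crosstab` followed by a `reindex`. This is a sketch under assumptions: the column names `user` and `type` and most of the sample rows are hypothetical, since the excerpt is truncated.

```python
import pandas as pd

# Hypothetical rows; the column names "user" and "type" are assumptions.
df = pd.DataFrame({
    "user": ["user_id1", "user_id1", "user_id2", "user_id3"],
    "type": ["type4", "type6", "type1", "type4"],
})

# The full set of types, type1 through type6, as described in the question.
all_types = [f"type{i}" for i in range(1, 7)]

# crosstab counts each (user, type) pair; reindex adds the types that
# never occur in the data as all-zero columns.
wide = pd.crosstab(df["user"], df["type"]).reindex(columns=all_types, fill_value=0)
print(wide)
```
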
4 votes, 0 answers

python scikit-learn cosine similarity value error: could not convert integer scalar

I am trying to produce a cosine similarity matrix using text descriptions of apps. The script below first reads in a csv data file (I can provide the data file if needed) which contains two columns, one with two app categories and the other with…
4 votes, 4 answers

Sklearn: Categorical Imputer?

Is there a way to impute categorical values using a sklearn.preprocessing object? I would like to ultimately create a preprocessing object which I can apply to new data and have it transformed the same way as old data. I am looking for a way to do…
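
As a hedged sketch of one option in recent scikit-learn versions: `SimpleImputer` with `strategy="most_frequent"` accepts string columns and can be fitted once and reused on new data. Note that it lives in `sklearn.impute`, not `sklearn.preprocessing`. The column name and values below are hypothetical.

```python
import numpy as np
import pandas as pd
from sklearn.impute import SimpleImputer

# Hypothetical training and new data with a missing category.
train = pd.DataFrame({"color": ["red", "blue", "red", np.nan]})
new = pd.DataFrame({"color": [np.nan, "blue"]})

# Learn the most frequent category on the old data...
imputer = SimpleImputer(strategy="most_frequent")
imputer.fit(train)

# ...and apply the same fill value to new data.
filled = imputer.transform(new)
print(filled)
```
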
4 votes, 1 answer

What replaces GridSearchCV._grid_scores_ in scikit?

Since the _grid_scores_ attribute has been replaced by cv_results_, I would like to know how to output the tuple with the parameters and scores. cv_results_ provides a dataframe for the score, but the tuple output was way easier to handle. Please guide…
Ankit Bansal
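
A tuple-style view can be rebuilt from `cv_results_`, which is a dict of arrays with keys such as `params`, `mean_test_score`, and `std_test_score`. The sketch below is only an illustration: the estimator, parameter grid, and dataset are assumptions chosen to keep it self-contained.

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import GridSearchCV
from sklearn.svm import SVC

# Illustrative data and grid; any estimator and grid would work the same way.
X, y = load_iris(return_X_y=True)
search = GridSearchCV(SVC(), {"C": [0.1, 1, 10]}, cv=3)
search.fit(X, y)

results = search.cv_results_

# One (params, mean score, std of score) tuple per parameter setting.
scores = list(zip(results["params"],
                  results["mean_test_score"],
                  results["std_test_score"]))
for params, mean, std in scores:
    print(params, round(mean, 3), round(std, 3))
```
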
4 votes, 1 answer

How are features ranked in RFECV in scikit-learn (sklearn)?

I used recursive feature elimination with cross-validation (RFECV) in order to find the best accuracy score for several features I had (m = 154). rfecv = RFECV(estimator=logreg, step=1, cv=StratifiedKFold(2), …
Liam Hanninen
4 votes, 1 answer

Sklearn-Pandas DataFrameMapper: mapper.fit_transform gives ValueError: bad input shape (8, 2)

I was able to replicate the example given in the GitHub repo. However, when I tried it on my own data, I got the ValueError. Below is dummy data that gives the same error as my real data. import pandas as pd import numpy as np from…
wi3o
4 votes, 2 answers

SKlearn Random Forest error on input

I am trying to run the fit for my random forest, but I am getting the following error: forest.fit(train[features], y) returns --------------------------------------------------------------------------- ValueError …
3 votes, 1 answer

python linear regression: dense vs sparse

I need to use linear regression on a sparse matrix. I have been getting poor results, so I decided to test it on a non-sparse matrix represented sparsely. The data is taken from…
3 votes, 1 answer

K-Means classification by group

I'm trying to do a K-means analysis in a dataframe like this: URBAN AREA PROVINCE DENSITY 0 1 TRUJILLO 0.30 1 2 TRUJILLO 0.03 2 3 TRUJILLO 0.80 3 1 LIMA 1.20 4 2 LIMA…
José Rojas
3 votes, 1 answer

Problem with negative numbers in sklearn.feature_selection.SelectKBest feature scoring module

I was trying automatic feature engineering and selection, and for that I used the Boston house price dataset available in sklearn. from sklearn.datasets import load_boston import pandas as pd data = load_boston() x = data.data y = data.target y =…
3 votes, 2 answers

How to encode a dataset having multiple datatypes?

I have a dataset like: e = pd.DataFrame({ 'col1': ['A', 'A', 'B', 'W', 'F', 'C'], 'col2': [2, 1, 9, 8, 7, 4], 'col3': [0, 1, 9, 4, 2, 3], 'col4': ['a', 'B', 'c', 'D', 'e', 'F'] }) Here I encoded the data using…
3 votes, 2 answers

Python3 Pandas.DataFrame.info() Error Key: 30

So I was digging around some datasets and trying to use pandas to analyze them, and I stumbled across the following error... and my brain froze :( Here is the snippet where the exception is being raised: import pandas as pd from sklearn.datasets…