Python module providing a bridge between Scikit-Learn’s Machine Learning methods and pandas-style DataFrames
Questions tagged [sklearn-pandas]
1336 questions
2
votes
1 answer
Data normalization and rescaling value in Python
I have a dataset that contains URLs with a publish date (YYYY-MM-DD) and visits. I want to calculate a benchmark (average) of visits for a complete year. Pages were published on different dates, e.g. weightage/contribution of the 1st page published in Aug…

ashish1780
- 47
- 1
- 10
2
votes
0 answers
Can I use sklearn IterativeImputer to fill in missing categorical data?
I have a dataset of categorical and continuous features, and a lot of them have missing elements. I was wondering if I can use the respective imputer to fill in continuous as well as categorical data.
And if it can't be done, what would be the best way…

Roberto Araya
- 21
- 1
- 4
2
votes
1 answer
Linear regression plot not giving me meaningful visualization
I am using some time-series power consumption data and trying to do a linear regression analysis on it.
The data has the following columns:
Date, Denmark_consumption, Germany_consumption, Czech_consumption, Austria_consumption.
It is time-series…

redmage123
- 413
- 8
- 15
2
votes
0 answers
ModuleNotFoundError: No module named 'sklearn.cross_validation'
I am facing an issue with the following error:
ModuleNotFoundError: No module named 'sklearn.cross_validation'
When I checked the installed packages with pip freeze, I could see scikit-learn installed.
What shall I do?
Thanks

SidStack
- 59
- 1
- 9
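A likely cause of the `sklearn.cross_validation` error above: that module was deprecated in scikit-learn 0.18 and removed in 0.20, and its helpers now live in `sklearn.model_selection`. The sketch below assumes the failing code imported `train_test_split`; the question does not show which name was imported, and the data is made up for illustration.

```python
# Assumption: the original code did
#   from sklearn.cross_validation import train_test_split
# In scikit-learn 0.20 and later, the same helper is imported from model_selection.
import numpy as np
from sklearn.model_selection import train_test_split

# Illustrative data only: 10 samples with 2 features each.
X = np.arange(20).reshape(10, 2)
y = np.arange(10)

# Hold out 3 samples for testing; random_state makes the split repeatable.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=3, random_state=0
)
print(X_train.shape, X_test.shape)
```

Other names that used to live in `sklearn.cross_validation`, such as `cross_val_score` and `KFold`, are also found in `sklearn.model_selection`.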
2
votes
1 answer
Installed scikit-learn doesn't work properly
I am getting this error when I run the following code:
from sklearn.decomposition import LatentDirichletAllocation
ImportError: cannot import name '__check_build' from partially initialized module 'sklearn' (most likely due to a circular…

Sri Test
- 389
- 1
- 4
- 21
2
votes
1 answer
Dropping a column explicitly in DataFrameMapper
Consider the following artificial data:
data = pd.DataFrame({'pet':['cat', 'dog', 'dog', 'fish',
'cat', 'dog', 'cat', 'fish'],
'children': [4., 6, 3, 3, 2, 3, 5, 4],
'salary': …

Abhishek Bhatia
- 547
- 4
- 11
2
votes
3 answers
How to fill missing value using pre-trained model?
I have a time series index with a few variables and a humidity reading. I have already trained an ML model to predict Humidity values based on X, Y and Z. Now, when I load the saved model using pickle, I would like to fill the missing Humidity values…

Sakib Shahriar
- 121
- 1
- 12
2
votes
1 answer
How to use CalibratedClassifierCV on already trained xgboost model?
I want to calibrate my xgboost model which is already trained. According to the documentation:
If “prefit” is passed, it is assumed that base_estimator has been
fitted already and all data is used for calibration.
So I have tried to use it as…

Xaume
- 293
- 2
- 16
2
votes
1 answer
I am currently getting this error when running my code: TypeError: SparseDataFrame() takes no arguments. How do I fix this?
I am currently getting this error when running my code: TypeError: SparseDataFrame() takes no arguments. How do I fix this?
View the code below.
churndrop = churn.drop(['Churn'],axis=1) #drops churn column
x= churndrop #creates dataframe
y=…

Chris
- 41
- 1
- 4
2
votes
2 answers
Groupby and Normalize selected columns Pandas DF
I have a sample DF which I want to normalize based on 2 conditions
Creating sample DF:
sample_df = pd.DataFrame(np.random.randint(1,20,size=(10, 3)), columns=list('ABC'))
sample_df["date"]=…

data_person
- 4,194
- 7
- 40
- 75
2
votes
0 answers
UndefinedMetricWarning: Precision and F-score are ill-defined and being set to 0.0 due to no predicted samples
I am working on a text classification problem and when I attempt to train my model with the data vectorized using TF-IDF it returns this error.
It is my understanding that this error appears when some of the labels were never predicted by the…

bls
- 351
- 2
- 12
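For the `UndefinedMetricWarning` question above: precision for a label is undefined when the model never predicts it, because the denominator (true positives plus false positives) is zero. A minimal sketch with made-up labels, assuming scikit-learn 0.22 or later, where the metric functions accept a `zero_division` argument that sets the substituted value explicitly and suppresses the warning:

```python
from sklearn.metrics import precision_score

# Illustrative labels only: class 2 appears in y_true but is never predicted.
y_true = [0, 1, 2, 2]
y_pred = [0, 1, 1, 1]

# average=None returns one precision value per class (here classes 0, 1, 2).
# zero_division=0 makes the undefined precision for class 2 an explicit 0.0.
per_class = precision_score(y_true, y_pred, average=None, zero_division=0)
print(per_class)
```

Silencing the warning does not change the underlying situation: the classifier still never predicts that label, which may point to class imbalance or too little training data for it.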
2
votes
4 answers
Pandas - Create a column (col C) with values from another (col A) if a condition in another column (col B) is observed
I have a DataFrame, shown in Table A, with two columns. The values in column A are integers starting at 1. The values in column B are binary.
I need to create column C (Table B) in which:
if the values in column B are 1, then get the values on…

Thaise
- 1,043
- 3
- 16
- 28
2
votes
1 answer
Sklearn.linear_model import LinearRegression does not work on data series but does for data frames. Why?
I used the following block of code and I got a traceback error;
Code (in the code below, X_train and y_train are data series (a single column of data)):
from sklearn.linear_model import LinearRegression
regressor =…

Prince Gooner
- 21
- 2
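For the `LinearRegression` question above: scikit-learn estimators expect the feature matrix `X` to be two-dimensional, shaped `(n_samples, n_features)`. A pandas Series is one-dimensional, while a single-column DataFrame is already 2-D, which would explain the difference in behaviour. A minimal sketch with made-up data; the names `X_train` and `y_train` follow the question, the values are assumptions:

```python
import pandas as pd
from sklearn.linear_model import LinearRegression

# Illustrative data only: y is exactly 2 * x.
X_train = pd.Series([1.0, 2.0, 3.0, 4.0], name="x")
y_train = pd.Series([2.0, 4.0, 6.0, 8.0], name="y")

regressor = LinearRegression()

# Convert the 1-D Series into a one-column DataFrame so X has shape (4, 1).
# X_train.to_numpy().reshape(-1, 1) would work as well.
X_2d = X_train.to_frame()
regressor.fit(X_2d, y_train)

slope = regressor.coef_[0]
print(X_2d.shape, slope)
```

The target `y` may stay one-dimensional; only `X` needs the extra axis.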
2
votes
2 answers
How to sort DataFrame rows in pandas by month from Jan to Dec
How can we sort the rows below in the DataFrame by month, from Jan to Dec?
Currently this DataFrame is in alphabetical order.
0 Col1 Col2 Col3 ... Col22 Col23 Col24
1 April 53.0 0.0 ... …

user190245
- 1,027
- 1
- 15
- 31
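For the month-sorting question above, one common approach is to turn the month column into an ordered categorical so that sorting follows calendar order rather than alphabetical order. A minimal sketch, assuming the month names are in a column called `Col1` and are full English names; the sample values are made up, since the question's table is truncated:

```python
import pandas as pd

# Calendar order that the sort should follow.
MONTH_ORDER = [
    "January", "February", "March", "April", "May", "June",
    "July", "August", "September", "October", "November", "December",
]

# Illustrative data only, in alphabetical order like the question's table.
df = pd.DataFrame({
    "Col1": ["April", "August", "December", "February", "January"],
    "Col2": [53.0, 10.0, 7.0, 3.0, 1.0],
})

# An ordered categorical sorts by the position of each value in MONTH_ORDER.
df["Col1"] = pd.Categorical(df["Col1"], categories=MONTH_ORDER, ordered=True)
sorted_df = df.sort_values("Col1").reset_index(drop=True)
print(sorted_df)
```

Values not listed in `MONTH_ORDER` (for example abbreviations such as "Jan") would become missing values, so the category list has to match the spelling used in the data.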
2
votes
2 answers
Merging 2 pandas tables and using them
I have 2 pandas tables:
table A, which looks something like this:
Date a b c d e
0
...
2n
and table B, which looks something like this:
Date f g k h i j
2
...
2n-3
...
The issue is that each table has totally different dates,…

secret
- 505
- 4
- 16