Questions tagged [scikit-learn-pipeline]
92 questions
0
votes
0 answers
I need to understand the following error: "CatBoostError: The left argument is not fitted, only fitted models could be compared."
I am trying to run RandomizedSearchCV on various classification models in a `for` loop for hyperparameter tuning. All the other models run without issue; only CatBoost fails. The issue with CatBoost also arises when I used Pipeline in…

Smit Jani
- 1
- 1
- 1
- 1
0
votes
1 answer
Is get_feature_names_out from scikit-learn SimpleImputer working?
I'd like to access the names of the columns that were imputed by scikit-learn's SimpleImputer and create a DataFrame. According to the documentation, this should be possible with the get_feature_names_out function.
However, when I try the following example…

Joseph
- 1
- 1
0
votes
1 answer
LassoCV getting axis -1 is out of bounds for array of dimension 0 and other questions
Good evening to all,
I am trying to implement for the first time LassoCV with sklearn.
My code is as follows:
numeric_features = ['AGE_2019', 'Inhabitants'] categorical_features = ['familty_type','studying','Job_42','sex','DEGREE', 'Activity_type',…

Gaspard_Boyer
- 21
- 4
0
votes
1 answer
Divide by zero encountered in true_divide f = msb / msw with SelectKBest
I tried to add the SelectKBest function to my pipeline to improve my existing linear model.
Without this new step, the model gave me the following results:
Best test negative MSE of the base model : -62.60
Best test R2 of the base model:…

Gaspard_Boyer
- 21
- 4
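One common cause of this warning, assuming the default score function was left in place: `SelectKBest` defaults to `f_classif`, which treats every distinct target value as a class, so a continuous target can yield zero within-group variance. The sketch below swaps in `f_regression`; the data and step names are hypothetical, and the asker's actual pipeline is not shown in the excerpt.

```python
from sklearn.datasets import make_regression
from sklearn.feature_selection import SelectKBest, f_regression
from sklearn.linear_model import LinearRegression
from sklearn.pipeline import Pipeline

# Synthetic regression data standing in for the asker's dataset.
X, y = make_regression(
    n_samples=200, n_features=10, n_informative=3, noise=5.0, random_state=0
)

pipe = Pipeline([
    # f_regression scores features against a continuous target.
    ("select", SelectKBest(score_func=f_regression, k=3)),
    ("model", LinearRegression()),
])
pipe.fit(X, y)

# Number of features kept by the selection step.
n_selected = pipe.named_steps["select"].get_support().sum()
r2 = pipe.score(X, y)
```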
0
votes
1 answer
Get OOB score within a pipeline for Random Forest
I was wondering for a machine learning project: is it possible to implement RandomForestRegressor inside a pipeline?
Specifically, I need to determine the OOB score from a RandomForestRegressor. But my data requires a lot of preprocessing.
I tried…

Gaspard_Boyer
- 21
- 4
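A minimal sketch of one way to do this, assuming the forest is the final step of the pipeline: fit the pipeline, then read `oob_score_` from the fitted step via `named_steps`. The step names `scale` and `rf` and the synthetic data are hypothetical.

```python
from sklearn.datasets import make_regression
from sklearn.ensemble import RandomForestRegressor
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic data standing in for the preprocessed project data.
X, y = make_regression(n_samples=300, n_features=8, noise=10.0, random_state=0)

pipe = Pipeline([
    ("scale", StandardScaler()),
    # oob_score=True asks the forest to compute the out-of-bag R^2 during fit.
    ("rf", RandomForestRegressor(n_estimators=100, oob_score=True, random_state=0)),
])
pipe.fit(X, y)

# The fitted forest lives inside the pipeline; its OOB score is an attribute.
oob_r2 = pipe.named_steps["rf"].oob_score_
```

The out-of-bag samples pass through the same fitted preprocessing steps as the rest of the training data, so the score reflects the forest alone rather than a fresh fit of the whole pipeline.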
0
votes
1 answer
Fails to save model after running GridSearchCV with a scikit pipeline
I have the following toy example to replicate the issue
import numpy as np
import xgboost as xgb
from sklearn.datasets import make_regression
from sklearn.pipeline import Pipeline
from sklearn.model_selection import GridSearchCV
X, y =…

Li-Pin Juan
- 1,156
- 1
- 13
- 22
0
votes
0 answers
Specifying the columns using strings is only supported for pandas DataFrames
The code runs fine on the train and test data, but predicting on a sample input raises an error:
test1 = pd.DataFrame(data=np.array(['MBBS', 'Psychiatrist', 8, 'Dadar', 'Mumbai', 10,…
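This error typically appears when a `ColumnTransformer` was configured with string column names but receives input without column labels at prediction time. The sketch below, under that assumption, builds the sample as a DataFrame with the same column names used during training. The columns `degree` and `experience`, the target values, and the step names are hypothetical.

```python
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.linear_model import LinearRegression
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import OneHotEncoder

# Hypothetical training frame; the real columns are not shown in the excerpt.
train = pd.DataFrame({
    "degree": ["MBBS", "BDS", "MBBS", "BDS"],
    "experience": [8, 3, 12, 5],
})
fees = [500, 200, 700, 300]

pipe = Pipeline([
    ("prep", ColumnTransformer(
        # The column is selected by name, so inputs need column labels.
        [("cat", OneHotEncoder(handle_unknown="ignore"), ["degree"])],
        remainder="passthrough",
    )),
    ("model", LinearRegression()),
])
pipe.fit(train, fees)

# Build the sample with the training column names instead of a bare array.
sample = pd.DataFrame([["MBBS", 8]], columns=train.columns)
prediction = pipe.predict(sample)
```

Building the sample from a list of rows also keeps the numeric column numeric, whereas routing mixed values through a single NumPy array would turn every value into a string.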
0
votes
0 answers
Building Pipelines
I've recently been trying to set up a Pipeline to produce a machine learning model. I have built my own data preprocessing classes and a new class with an optimized sklearn algorithm: Regresor_Model; however, when I declare the pipeline steps, for…

Ernesto Lopez Fune
- 543
- 5
- 22
0
votes
0 answers
Preprocess and data transformation in machine learning
I have a problem where I have to predict a buyer using machine learning (I created a dummy dataset). I need to transform the data before I can use it for machine learning. I am aggregating information at the id, visit level, which gives me a list of…

PRData
- 31
- 4
0
votes
1 answer
Get features names from scikit pipelines
I am working on an ML regression problem where I defined a pipeline like the one below, based on an online tutorial.
My code looks like this:
pipe1 = Pipeline([('poly', PolynomialFeatures()),
('fit', linear_model.LinearRegression())])
pipe2 =…

The Great
- 7,215
- 7
- 40
- 128
0
votes
0 answers
Sklearn manually add feature in pipeline after feature selection
I would like to add features manually after feature selection. For example, with this simple pipeline below.
pipe = Pipeline([('feature_selection', SelectFromModel(LinearSVC())),
('clf', ExtraTreesClassifier())])
After…

Darren Christopher
- 3,893
- 4
- 20
- 37
0
votes
1 answer
Incompatible row dimensions when using passthrough in GridSearch over sklearn Pipeline with FeatureUnion
I am trying to do grid search over a sklearn pipeline that uses a custom transformer in a pipeline with FeatureUnion. It works fine when the pipeline uses the custom transformer class in FeatureUnion; however, it fails when the custom class is…

MichaelU
- 125
- 7
0
votes
0 answers
Am I implementing Pipeline with GridSearchCV for Regression correctly?
I'm practising machine learning algorithms (Lasso regression and decision trees) using sklearn.pipeline and sklearn.model_selection.GridSearchCV. I have split my dataset into training and test sets. The following is my code. I wanted to know if my…

AyeshaA
- 141
- 7
0
votes
0 answers
Clustering Step within Scikit Pipeline
I am trying to do clustering as a step in a Pipeline so that I can use the cluster as an additional feature. I have used this post as a guide but I am getting an error on the call to fit_transform() within the pipeline. My original transformer is…

this_is_david
- 123
- 8
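The asker's transformer is not shown, so the following is only a sketch of one way to expose a cluster label as an extra feature. `ClusterLabelAppender` is a hypothetical class written for this illustration, not a scikit-learn API; it wraps `KMeans` and appends the predicted label as a new column.

```python
import numpy as np
from sklearn.base import BaseEstimator, TransformerMixin
from sklearn.cluster import KMeans
from sklearn.datasets import make_blobs
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import Pipeline


class ClusterLabelAppender(BaseEstimator, TransformerMixin):
    """Hypothetical transformer: appends the KMeans cluster label as a column."""

    def __init__(self, n_clusters=3, random_state=0):
        self.n_clusters = n_clusters
        self.random_state = random_state

    def fit(self, X, y=None):
        # Fit the clusterer on the training features only.
        self.kmeans_ = KMeans(
            n_clusters=self.n_clusters, n_init=10, random_state=self.random_state
        ).fit(X)
        return self

    def transform(self, X):
        X = np.asarray(X)
        labels = self.kmeans_.predict(X).reshape(-1, 1)
        # Keep the original features and add the label as the last column.
        return np.hstack([X, labels])


X, y = make_blobs(n_samples=150, centers=3, n_features=4, random_state=0)

pipe = Pipeline([
    ("cluster", ClusterLabelAppender(n_clusters=3)),
    ("clf", LogisticRegression(max_iter=1000)),
])
pipe.fit(X, y)

augmented = pipe.named_steps["cluster"].transform(X)
```

Inheriting from `TransformerMixin` supplies `fit_transform`, which is the method a Pipeline calls on intermediate steps during `fit`. The appended label is an arbitrary integer code, so one-hot encoding it may suit some downstream models better.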
0
votes
0 answers
scikit learn pipelines and `ColumnTransformer`
I am confused by the following Pipeline weirdness. Suppose I define a pipeline thus:
pipe = Pipeline([
('transformer', ColumnTransformer([('sc', StandardScaler(), [0, 1])])),
('model', LinearRegression())
])
Now define a DataFrame thus:
df…

Igor Rivin
- 4,632
- 2
- 23
- 35