Questions tagged [scikit-learn-pipeline]
92 questions
2
votes
1 answer
How to use the feature_names_out with scikit's FunctionTransformer
I am trying to use feature_names_out with scikit-learn's FunctionTransformer to keep the same feature names, but I get this error:
Code:
from sklearn.preprocessing import FunctionTransformer
X = pd.Series(data=[1, 2, 3], name='numbers')
transformer =…

saad
- 31
- 3
2
votes
0 answers
How can I get the feature names after several fit_transform's from sklearn?
I'm running a machine learning model that requires multiple transformations. I applied polynomial transformations, interactions, and also a feature selection using SelectKBest:
transformer = ColumnTransformer(
transformers=[("cat",…

Aldla E Aoepql
- 69
- 3
2
votes
1 answer
Unable to load pickled custom estimator sklearn pipeline
I have a sklearn pipeline that uses a custom column transformer, a custom estimator, and several lambda functions.
Because pickle cannot serialize lambda functions, I am using dill.
Here is the custom estimator I have:
class customOLS(BaseEstimator):
…

Obiii
- 698
- 1
- 6
- 26
2
votes
1 answer
Extracting feature importances from an sklearn pipeline containing a multioutputclassifier within gridsearchcv?
I'm wondering whether I can extract feature importances with names from a scikit-learn pipeline that I've built. The pipeline contains a Gradient Boosting Classifier wrapped in a Multi Output classifier. The pipeline is part of a GridSearchCV…

makemyDNA
- 51
- 5
2
votes
2 answers
AttributeError scikit learn pipeline based class
I am trying to write a sklearn-based feature extraction pipeline. My pipeline code could be split into a few parts:
A parent class where all data preprocessing (if required) could happen
from sklearn.base import BaseEstimator,…

abhiieor
- 3,132
- 4
- 30
- 47
2
votes
0 answers
Specific Decision Rule from Decision Tree Classifier Pipeline With Vectorizing and Feature Union
In order to get the specific rules applied to a trained sample on a decision tree classifier, we need to use the decision_path method: decision_path(X[, check_input]).
Now, working on a short text classification model, I have pipelined a feature…

mara gato
- 21
- 3
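For the decision_path question above, a sketch of how the rules for one sample might be read out when the tree is the last step of a Pipeline. The iris data and the StandardScaler step are stand-ins for the vectorizer and feature union in the question; the idea is to transform the sample with every step except the last, then call `decision_path` on the tree itself.

```python
from sklearn.datasets import load_iris
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.tree import DecisionTreeClassifier

X, y = load_iris(return_X_y=True)

# Stand-in pipeline: the scaler plays the role of the real preprocessing.
pipe = Pipeline([
    ("scale", StandardScaler()),
    ("tree", DecisionTreeClassifier(max_depth=3, random_state=0)),
])
pipe.fit(X, y)

# Transform one sample with all steps except the final classifier.
sample = X[:1]
Xt = pipe[:-1].transform(sample)
tree = pipe[-1]

# decision_path returns a sparse indicator matrix of visited nodes.
node_indicator = tree.decision_path(Xt)
node_ids = node_indicator.indices[
    node_indicator.indptr[0]:node_indicator.indptr[1]
]
leaf_id = tree.apply(Xt)[0]

rules = []
for node_id in node_ids:
    if node_id == leaf_id:
        continue  # the leaf has no split rule
    feat = tree.tree_.feature[node_id]
    thr = tree.tree_.threshold[node_id]
    op = "<=" if Xt[0, feat] <= thr else ">"
    rules.append(f"feature[{feat}] {op} {thr:.3f}")

print(rules)
```

Note that the thresholds refer to the transformed features, not to the raw input columns.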
2
votes
1 answer
How to pickle TPOT fitted pipeline?
I'm using the TPOT classifier, and after training the model, I want to save the best pipeline. I can get it using:
model.fitted_pipeline_
This is an example of one of the outputs:
Pipeline(steps=[('extratreesclassifier',
…

Rodrigo A
- 657
- 7
- 23
2
votes
1 answer
Feature mismatch: Prediction through scikit-learn Pipeline
I implemented the following scikit-learn pipeline inside a file called build.py and later pickled it successfully.
preprocessor = ColumnTransformer(transformers=[
('target', TargetEncoder(), COL_TO_TARGET),
('one_hot',…

eager_learner
- 152
- 1
- 9
2
votes
0 answers
scikit-learn: Retrieve model object from the pipeline
I have built the following pipeline, and I want to obtain the random forest model object that gets built inside it. The rf object is only the initialization, and it doesn't have rf.estimators_
grid_params = [{'bootstrap': [True],
…

add-semi-colons
- 18,094
- 55
- 145
- 232
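For the question above about retrieving the model object, a sketch under the assumption (suggested by `grid_params`, but not confirmed by the excerpt) that the pipeline is passed to GridSearchCV. The search clones its estimator before fitting, which would explain why the original `rf` never gets `estimators_`. The step names and the parameter grid below are hypothetical.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import GridSearchCV
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler

X, y = make_classification(n_samples=120, n_features=6, random_state=0)

rf = RandomForestClassifier(n_estimators=10, random_state=0)
pipe = Pipeline([("scale", StandardScaler()), ("rf", rf)])

# Hypothetical grid; parameters are addressed as <step name>__<parameter>.
grid_params = {"rf__bootstrap": [True], "rf__max_depth": [2, 4]}
grid = GridSearchCV(pipe, grid_params, cv=3)
grid.fit(X, y)

# The fitted forest lives inside the refitted best pipeline,
# not in the rf object that was used to build the pipeline.
fitted_rf = grid.best_estimator_.named_steps["rf"]

print(hasattr(rf, "estimators_"))
print(len(fitted_rf.estimators_))
```

If the pipeline is fitted directly, without a search, `pipe.named_steps["rf"]` is the same object as `rf` and is fitted in place.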
1
vote
0 answers
All intermediate steps should be transformers and implement fit and transform or be the string 'passthrough'
I was studying the "In Depth: k-Means Clustering" section of Jake VanderPlas's Python Data Science Handbook and came across the following code block:
from sklearn.datasets import load_digits
from sklearn.manifold import TSNE
from…

nakoshimati
- 11
- 4
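For the "All intermediate steps should be transformers" question above, one common way to hit this message is placing TSNE before another step in a Pipeline: TSNE only offers `fit` and `fit_transform`, not `transform`. Whether that is what happened in the question is an assumption, since the excerpt is cut off. The sketch below reproduces the error and then runs the two steps outside a Pipeline; the sample size and parameters are arbitrary.

```python
from sklearn.cluster import KMeans
from sklearn.datasets import load_digits
from sklearn.manifold import TSNE
from sklearn.pipeline import Pipeline

X, _ = load_digits(return_X_y=True)
X = X[:100]  # small subset to keep the example fast

# TSNE has no transform method, so it cannot be an intermediate step.
has_transform = hasattr(TSNE(), "transform")

error_message = None
try:
    # Depending on the scikit-learn version, the check runs when the
    # Pipeline is constructed or when fit is called.
    Pipeline([
        ("tsne", TSNE()),
        ("kmeans", KMeans(n_clusters=10, n_init=10)),
    ]).fit(X)
except TypeError as exc:
    error_message = str(exc)

# Workaround sketch: call the two estimators one after the other.
X_embedded = TSNE(
    n_components=2, init="random", perplexity=30, random_state=0
).fit_transform(X)
labels = KMeans(n_clusters=10, n_init=10, random_state=0).fit_predict(X_embedded)

print(has_transform)
print(error_message)
```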
1
vote
0 answers
Imputation of mixed data types with pandas and Scikit-Learn
I have to create a pre-processing pipeline dynamically to impute missing values; that is, I want to go through all the columns in a pandas data frame (which I don't know beforehand) and impute their missing values.
To impute the missing values I…

Rodrigo A
- 657
- 7
- 23
1
vote
2 answers
AttributeError: 'ColumnTransformer' object has no attribute '_name_to_fitted_passthrough'
I am predicting the IPL match win probability. While deploying the model using Streamlit, it shows this error:
AttributeError: 'ColumnTransformer' object has no attribute '_name_to_fitted_passthrough'
That's my code:
from sklearn.compose import…

Jagadeesh Pangala
- 11
- 1
- 4
1
vote
0 answers
GLMM alike solution - adding an interaction step as an element of scikit-learn Pipeline for columns transformed in previous steps
I'm trying to create a solution that is somewhat similar to the Mixed Effects Model (GLMM), which is not present in scikit-learn at the moment. Imagine a simple heart-disease dataset from…

Freejack
- 168
- 10
1
vote
1 answer
fit() missing 1 required positional argument: 'y'
X = df.drop(columns="CLASS")
y = df.CLASS
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)
X_train.shape, X_test.shape, y_train.shape, y_test.shape
preprocessor = ColumnTransformer([
('numeric',…

rckoprstyo
- 11
- 1
1
vote
0 answers
How do I extract feature importances from a Sklearn pipeline
I'm wondering how I can extract feature importances, together with the feature names, from Logistic regression, GBM and XGBoost in scikit-learn when the classifier is used in a pipeline with preprocessing. I want to know how to extract feature importances from…

MOT
- 81
- 6