Questions tagged [scikit-learn-pipeline]

92 questions
0
votes
1 answer

sklearn pipeline and grid search

from sklearn.linear_model import LogisticRegression pipe4 = Pipeline([('ss', StandardScaler()), ('clf', knn)]) grid2 = GridSearchCV(pipe4, {'clf':[ knn, LogisticRegression()]}) grid2.fit(X_train, y_train) pd.DataFrame(grid2.cv_results_).T I made…
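
The excerpt above searches over the `clf` step itself. A minimal sketch of that idea, assuming the iris dataset as stand-in data and `KNeighborsClassifier` as the `knn` object (neither is shown in the excerpt):

```python
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler

# Stand-in data (assumption: the original X_train / y_train are not shown).
X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# The step name "clf" doubles as a parameter name, so a grid over "clf"
# replaces the whole final estimator for each candidate.
pipe = Pipeline([("ss", StandardScaler()), ("clf", KNeighborsClassifier())])
param_grid = {"clf": [KNeighborsClassifier(), LogisticRegression(max_iter=1000)]}

grid = GridSearchCV(pipe, param_grid, cv=3)
grid.fit(X_train, y_train)

# The winning estimator is the "clf" step of the refitted pipeline.
best_clf = grid.best_estimator_.named_steps["clf"]
```

Each entry in `grid.cv_results_["params"]` then holds one candidate estimator under the `clf` key.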
0
votes
0 answers

Sklearn Pipeline / OneHotEncoder : consistency in getting categorical features with feature_names_in_ / get_feature_names_out()

Similar questions have been asked before, but this is a particular case, and it seems that sklearn has evolved quite a bit since then (I am using scikit-learn 1.1.2), so I think it is worth a new post. I created an sklearn Pipeline in which I apply…
0
votes
0 answers

SimpleImputer Error, instance is not fitted yet. Custom Transformer and pipeline

I am having issues creating a custom transformer and pipeline. I keep getting this error after running my pipeline: "This SimpleImputer instance is not fitted yet. Call 'fit' with appropriate arguments before using this estimator." I know this has…
0
votes
1 answer

Problems with pipeline GradientBoostingClassifier

I am working on a machine learning classification problem. The target is multiclass, with 3 different classes. I have some problems with this pipeline, and I cannot see what the problem is. from sklearn.ensemble import…
0
votes
0 answers

How to construct a wrapper over sklearn models

I'm trying to implement a pipeline consisting of several steps and for a few of the stages I need data in pandas format. Is it possible to implement a wrapper solution in sklearn where I can get "pandas in, pandas out" as a result of sklearn…
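
One way to get "pandas in, pandas out" is a thin wrapper transformer. This is only a sketch: `PandasWrapper` is a hypothetical name, and it assumes the wrapped transformer keeps the number and order of columns (as `StandardScaler` does).

```python
import pandas as pd
from sklearn.base import BaseEstimator, TransformerMixin, clone
from sklearn.preprocessing import StandardScaler


class PandasWrapper(BaseEstimator, TransformerMixin):
    """Hypothetical wrapper: returns a DataFrame with the input's columns and index.

    Only valid for transformers that preserve column count and order.
    """

    def __init__(self, transformer):
        self.transformer = transformer

    def fit(self, X, y=None):
        # Fit a clone so the object passed to __init__ is left untouched.
        self.transformer_ = clone(self.transformer).fit(X, y)
        self.columns_ = list(X.columns)
        return self

    def transform(self, X):
        values = self.transformer_.transform(X)
        return pd.DataFrame(values, columns=self.columns_, index=X.index)


df = pd.DataFrame({"a": [1.0, 2.0, 3.0], "b": [10.0, 20.0, 30.0]}, index=["x", "y", "z"])
out = PandasWrapper(StandardScaler()).fit_transform(df)
```

In scikit-learn 1.2 and later, calling `set_output(transform="pandas")` on a transformer or pipeline gives a similar result without a custom class.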
0
votes
1 answer

Getting feature names and coefficients from lasso regression in sklearn pipeline

I have a pipeline that uses custom transformers as well. Here is what the pipeline looks like: feature_cleaner = Pipeline(steps=[ ("id_col_remover", columnDropperTransformer(id_cols)), ("missing_remover",…
Obiii
0
votes
0 answers

How to iterate over different strategies in a list and different algorithms in a list using for loop?

I would like to collect the pipeline creation, KFold, and cross_val_score inside a for-loop, then iterate over different strategies in a list and different algorithms in a list. What I have so far: from sklearn.linear_model import…
resssslll
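
A possible shape for such a loop, sketched under assumptions: "strategies" is taken to mean `SimpleImputer` strategies (the excerpt does not say), and the dataset, the injected missing values, and the two models are placeholders.

```python
from itertools import product

import numpy as np
from sklearn.datasets import load_iris
from sklearn.impute import SimpleImputer
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import KFold, cross_val_score
from sklearn.pipeline import Pipeline
from sklearn.tree import DecisionTreeClassifier

# Placeholder data with a few missing values injected (assumption).
X, y = load_iris(return_X_y=True)
X = X.copy()
X[::10, 0] = np.nan

strategies = ["mean", "median"]  # assumed to be imputation strategies
algorithms = {
    "logreg": LogisticRegression(max_iter=1000),
    "tree": DecisionTreeClassifier(random_state=0),
}

cv = KFold(n_splits=5, shuffle=True, random_state=0)
results = {}

# One pipeline per (strategy, algorithm) pair; cross_val_score clones the
# pipeline for every fold, so the model objects can be reused across pairs.
for strategy, (name, model) in product(strategies, algorithms.items()):
    pipe = Pipeline([("impute", SimpleImputer(strategy=strategy)), ("model", model)])
    scores = cross_val_score(pipe, X, y, cv=cv)
    results[(strategy, name)] = scores.mean()
```

`results` ends up keyed by `(strategy, name)`, which makes it easy to sort or load into a DataFrame afterwards.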
0
votes
1 answer

Sklearn Pipeline is not converting categorical values properly

I am trying to use Sklearn Pipeline methods before training multiple ML models. This is my code for the pipeline: def pipeline(self): self.numerical_features = self.X_train.select_dtypes(include='number').columns.tolist() print(f'There…
0
votes
0 answers

Can arguments be passed dynamically to outer pipeline which can be used by any steps inside it?

I have the following scenario: num_cols = ["list", "of", "column", "names"] cat_cols = ["different", "list", "of", "column", "names"] col_transformer = ColumnTransformer([ ('num', Scaler(), num_cols), ('cat', OneHotEncoder(),…
0
votes
0 answers

scikit-learn print(model) parameters don't show

I'm working through the Wine Classification Challenge and I'm not getting the same summary when training the model and printing its params: model = pipeline.fit(X_train, y_train) print(model) For some reason I get this summary of the model: what I…
0
votes
1 answer

Getting number of support vectors of an RBF SVC in a sklearn pipeline

Is it possible to get the number of support vectors and/or their values for an RBF SVC when it is fit using a sklearn Pipeline object? My pipeline looks like this: dim_reduction = TruncatedSVD( n_components = dim_reduction_n_comp, random_state =…
Shahnawaz
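
A fitted step can be pulled back out of the pipeline by name, after which the usual `SVC` attributes are available. A minimal sketch, assuming iris as stand-in data, two SVD components, and the step names `dim_reduction` and `svc` (the excerpt's own values are cut off):

```python
from sklearn.datasets import load_iris
from sklearn.decomposition import TruncatedSVD
from sklearn.pipeline import Pipeline
from sklearn.svm import SVC

X, y = load_iris(return_X_y=True)  # stand-in data (assumption)

pipe = Pipeline([
    ("dim_reduction", TruncatedSVD(n_components=2, random_state=0)),
    ("svc", SVC(kernel="rbf")),
])
pipe.fit(X, y)

# named_steps gives access to the fitted estimator inside the pipeline.
svc = pipe.named_steps["svc"]
n_per_class = svc.n_support_            # support vector count per class
support_vectors = svc.support_vectors_  # values in the reduced (post-SVD) space
```

Note that `support_vectors_` lives in the space the SVC was trained on, i.e. after the dimensionality reduction, not in the original feature space.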
0
votes
1 answer

Very large and identical predictions by Linear Regression in Scikit pipeline

I have an LR pipeline that I train on a dataset and save. During training, I also test it on X_test and the predictions look okay. So I save the model with joblib and load it again to predict on new data. The predictions on new data give…
Obiii
0
votes
1 answer

Custom ColumnTransformer NotFittedError

I have a pipeline that consists of two custom column transformers; one of them works, while the other raises NotFittedError. Here is the pipeline code: class SkipSimpleImputer(SimpleImputer): def __init__(self, **kwargs): …
Obiii
0
votes
1 answer

Unable to use lambda in scikit-learn Pipeline

I have a pipeline which uses lambda functions: preprocess_ppl = ColumnTransformer( transformers=[ ('encode', categorical_transformer, make_column_selector(dtype_include=object)), ('zero_impute', fill_na_zero_transformer, lambda…
Obiii
0
votes
0 answers

Python lzma unable to load joblib

I have a scikit learn pipeline that I serialize using: with lzma.open('outputs/baseModel_LR.joblib',"wb") as f: dill.dump(pipeline, f) When I try to open the file and load the pipeline using: with lzma.open('outputs/baseModel_LR.joblib',"rb")…
Obiii