Questions tagged [mlxtend]

85 questions
2
votes
2 answers

ColumnTransformer(s) in various parts of a pipeline do not play well

I am using sklearn and mlxtend.regressor.StackingRegressor to build a stacked regression model. For example, say I want the following small pipeline: A Stacking Regressor with two regressors: A pipeline which: Performs data imputation 1-hot…
Alberto Santini
2
votes
2 answers

Why won't Colab import fpgrowth from mlxtend.frequent_patterns?

When I import mlxtend.frequent_patterns, the functions fpgrowth and fpmax are not there. However, they are there if I use Jupyter Notebook in Anaconda Navigator. Does anyone know why Colab will not import them? import pandas as pd from mlxtend.preprocessing…
Pete
2
votes
2 answers

Convert a list of lists to array type in Python

I have a matrix like this and want to convert it to an array for processing. How can I do it? [[25 3 0 1 0 2 1] [ 1 21 0 0 0 0 0] [ 0 3 18 0 0 0 0] [ 1 0 0 35 2 0 0] [ 0 0 0 4 27 2 0] [ 0 0 0 0 1 27 0] [ 1 1 0 0 0 …
user567879
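A minimal sketch for the question above, assuming the data starts out as a nested Python list and that NumPy is available. Only the six rows that appear complete in the excerpt are used; the seventh row is truncated there and is left out rather than guessed.

```python
import numpy as np

# The six complete rows from the excerpt; the truncated seventh row is omitted.
rows = [
    [25, 3, 0, 1, 0, 2, 1],
    [1, 21, 0, 0, 0, 0, 0],
    [0, 3, 18, 0, 0, 0, 0],
    [1, 0, 0, 35, 2, 0, 0],
    [0, 0, 0, 4, 27, 2, 0],
    [0, 0, 0, 0, 1, 27, 0],
]

# np.asarray builds a 2-D ndarray from a rectangular list of lists.
matrix = np.asarray(rows)

print(type(matrix).__name__)  # ndarray
print(matrix.shape)           # (6, 7)
```

If the object is already an ndarray, `np.asarray` returns it unchanged, so the call is safe to apply either way.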
2
votes
1 answer

What does it mean AttributeError: 'ColumnSelector' object has no attribute 'n_features_in_'?

I am running a grid search to tune the hyperparameters of a stacking estimator (a StackingClassifier object from the sklearn.ensemble library). I am making use of the scikit library for ML and the RandomizedSearchCV function. In addition to this, the base…
2
votes
2 answers

Is it possible to set the color for the bottom region with `mlxtend.plotting`?

I am trying to reproduce the example in this post, which produces this figure. The colored regions above are plotted by mlxtend.plotting (version '0.14.0'). With the default settings on colab, this code from mlxtend.plotting import…
user11566345
2
votes
0 answers

My StackingCVClassifier Has Lower Accuracy than Base Classifiers Yet Does Very Well on Test Set

I built a simple Stacking Classifier with mlxtend and am trying different base classifiers and I am facing an interesting situation. From all my research it seems to me that stacking classifiers always perform better than their base classifiers. In…
Odisseo
2
votes
2 answers

market basket analysis in python for large transaction dataset

When applying the apriori (support >= 0.01) and association_rules functions from Python's mlxtend package to transaction data with 4.2L+ rows (in the form of a sparse matrix), generating frequent itemsets and association rules takes too much time. Sample…
2
votes
0 answers

scikit-learn mlxtend EnsembleVoteClassifier with sample_weights

I am trying to fit an EnsembleVoteClassifier according to the mlxtend documentation. For a normal grid.fit I can use fit_params to set sample_weight, but with the VotingClassifier it does not work. How can this be solved? from sklearn import datasets iris…
user670186
2
votes
1 answer

mlxtend plot_decision_regions with model fit on Pandas DataFrame?

I'm a big fan of mlxtend's plot_decision_regions function (http://rasbt.github.io/mlxtend/#examples , https://stackoverflow.com/a/43298736/1870832). It accepts an X (just two columns at a time), y, and a (fitted) classifier clf object, and then…
Max Power
1
vote
0 answers

Slurm Cluster Python Script Not Running on Multiple Nodes using SBATCH

We recently set up a Slurm cluster with 2 nodes (1 head node + compute node and 1 compute node) for some HPC CFD simulations. Right now I am trying to run a Python script that is used for feature selection in one of our machine learning projects which…
akhil kumar
1
vote
1 answer

How to scan the candidate itemset by using the item matrix

I am doing a small data mining project and have run into a problem: how to scan the 'item matrix' and count the occurrences of each candidate itemset. This is what the candidate itemsets look like. It is a list of several frozensets. [{'', '',…
Cooper
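One way the counting step described above could look, as a sketch in plain Python. The item names, the matrix, and the candidate itemsets below are all hypothetical, since the real ones are elided in the excerpt; the only assumption about the layout is one row per transaction with a 1 where the item is present.

```python
# Hypothetical data: the question's real item matrix and candidates are not shown.
items = ["bread", "milk", "eggs"]  # assumed column labels
item_matrix = [                    # one row per transaction, 1 = item present
    [1, 1, 0],
    [1, 0, 1],
    [1, 1, 1],
    [0, 1, 0],
]
candidates = [frozenset({"bread", "milk"}), frozenset({"milk", "eggs"})]

# Turn each row into the set of items it contains.
transactions = [
    {item for item, flag in zip(items, row) if flag}
    for row in item_matrix
]

# A candidate occurs in a transaction when it is a subset of that transaction.
counts = {
    candidate: sum(candidate <= transaction for transaction in transactions)
    for candidate in candidates
}

for candidate, count in counts.items():
    print(sorted(candidate), count)
```

Converting the rows to sets once keeps the subset test cheap when there are many candidates to check.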
1
vote
1 answer

Scaling and data leakage on cross validation and test set

I have more of a best practice question. I am scaling my data and I understand that I should fit_transform on my training set and transform on my test set because of potential data leakage. Now if I want to use both (5 fold) Cross validation on my…
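The practice described above (fit the scaler on training data only, then apply it to held-out data) carries over to cross-validation by refitting the scaling statistics inside each fold. A minimal NumPy sketch, assuming synthetic data and plain standardization; the array `X` and the fold layout are illustrative, not taken from the question.

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(loc=5.0, scale=2.0, size=(20, 3))  # synthetic features (assumed)

n_folds = 5
folds = np.array_split(np.arange(len(X)), n_folds)

scaled_folds = []
for k, val_idx in enumerate(folds):
    # Every fold except the k-th forms the training part.
    train_idx = np.concatenate([f for j, f in enumerate(folds) if j != k])

    # Scaling statistics come from the training part only.
    mean = X[train_idx].mean(axis=0)
    std = X[train_idx].std(axis=0)

    X_train = (X[train_idx] - mean) / std
    X_val = (X[val_idx] - mean) / std  # reuses training statistics, no refit
    scaled_folds.append((X_train, X_val))
```

With scikit-learn, placing a StandardScaler and the estimator in a Pipeline and passing that pipeline to cross_val_score gives the same per-fold behavior without the manual loop.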
1
vote
0 answers

Issue calculating variance and bias in Python using mlxtend

I am using the mlxtend library for bias and variance calculation. The code is: y=df[target] x=df.drop(target,axis=1) x_train, x_test, y_train, y_test = train_test_split(X, y, test_size=0.33, random_state=1) model = LinearRegression() mse, bias, var =…
sundarr
1
vote
1 answer

Find corresponding rows with frequent itemsets

My dataset is an adjacency matrix comparable with customer buying information. An example toy dataset: p = {'A': [0,1,0,1], 'B': [1,1,1,1], 'C': [0,0,1,1], 'D': [1,1,1,0]} df = pd.DataFrame(data=p) df Now I am interested in the frequent itemset so…
Tox
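A sketch of the row lookup the question above asks about, using the toy dictionary `p` from the excerpt directly in plain Python. The helper `rows_containing` and the example itemset `('B', 'D')` are hypothetical names chosen for illustration, not part of mlxtend.

```python
# Toy data copied from the question excerpt.
p = {'A': [0, 1, 0, 1], 'B': [1, 1, 1, 1], 'C': [0, 0, 1, 1], 'D': [1, 1, 1, 0]}


def rows_containing(data, itemset):
    """Return the indices of rows in which every item of the itemset is 1."""
    n_rows = len(next(iter(data.values())))
    return [i for i in range(n_rows) if all(data[item][i] for item in itemset)]


itemset = ('B', 'D')  # example itemset, chosen for illustration
matching = rows_containing(p, itemset)
support = len(matching) / len(p['A'])

print(matching)  # [0, 1, 2]
print(support)   # 0.75
```

The same check can be applied to each itemset returned by a frequent-itemset search to recover the rows behind it.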
1
vote
1 answer

How to interpret results of Mlxtend's association rule

I am using mlxtend to find association rules. Here is the code: df = apriori(dum_data, min_support=0.4, use_colnames=True) rules = association_rules(df, metric="lift", min_threshold=1) rules2=rules[ (rules['lift'] >= 1) & (rules['confidence'] >=…
MAC
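To make the metrics filtered on above concrete, here is a sketch that computes support, confidence, and lift by hand for a single rule, using the standard definitions. The transactions and the rule a -> b are hypothetical; the code does not call mlxtend.

```python
# Hypothetical transactions, for illustration only.
transactions = [
    {"a", "b"},
    {"a", "b"},
    {"a"},
    {"b", "c"},
    {"c"},
]
n = len(transactions)

antecedent = {"a"}
consequent = {"b"}

count_a = sum(antecedent <= t for t in transactions)
count_b = sum(consequent <= t for t in transactions)
count_ab = sum((antecedent | consequent) <= t for t in transactions)

support = count_ab / n          # share of transactions containing both sides
confidence = count_ab / count_a  # how often b appears when a does
lift = confidence / (count_b / n)  # confidence relative to b's base rate

print(support, confidence, lift)
```

A lift above 1 indicates that the two sides co-occur more often than they would if they were independent, which is why filters such as `lift >= 1` are common.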