Questions tagged [feature-selection]

In machine learning, this is the process of selecting a subset of the most relevant features for constructing your model.

Feature selection is an important step to remove irrelevant or redundant features from our data. For more details, see Wikipedia.
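A minimal sketch of what feature selection looks like in practice, using scikit-learn's `SelectKBest` on the bundled iris data (the dataset and `k=2` are illustrative choices, not taken from any question below):

```python
# Univariate feature selection: keep the k features with the highest
# ANOVA F-scores against the class labels. Assumes scikit-learn is installed.
from sklearn.datasets import load_iris
from sklearn.feature_selection import SelectKBest, f_classif

X, y = load_iris(return_X_y=True)          # X has 4 features
selector = SelectKBest(score_func=f_classif, k=2)
X_reduced = selector.fit_transform(X, y)   # keep the 2 highest-scoring features

print(X.shape, "->", X_reduced.shape)      # (150, 4) -> (150, 2)
```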

1533 questions
-1 votes • 1 answer

training set features different from test set features in prediction

I have two data sets, a train set and a test set, and I want to predict on them. My train data set has these features: ID, name, age, Time of recruitment, Time fired, status. My test data set has these features: ID, name, age, Time…
-1 votes • 1 answer

Dimension Reduction for Clustering in R (PCA and other methods)

Let me preface this: I have looked extensively on this matter and I've found several intriguing possibilities to look into (such as this and this). I've also looked into principal component analysis and I've seen some sources that claim it's a poor…
BlueRhapsody • 93 • 2 • 13
-1 votes • 1 answer

how to convert mix of text and numerical data to feature data in apache spark

I have a CSV of both textual and numerical data. I need to convert it to feature-vector data (Double values) in Spark. Is there any way to do that? I have seen some examples where each keyword is mapped to some double value, which is then used for the conversion. However, if…
-1 votes • 1 answer

Select important features then impute or first impute then select important features?

I have a dataset with lots of features (mostly categorical features(Yes/No)) and lots of missing values. One of the techniques for dimensionality reduction is to generate a large and carefully constructed set of trees against a target attribute and…
Karup • 2,024 • 3 • 22 • 48
-1 votes • 1 answer

Getting features importance with RandomClassifier Scikit

I try to get the importance weights of every feature from my dataframe. I use this code from scikit documentation: names=['Class label', 'Alcohol', 'Malic acid', 'Ash', 'Alcalinity of ash', 'Magnesium', 'Total phenols', 'Flavanoids', 'Nonflavanoid…
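For questions like this one, a minimal sketch of reading impurity-based importances from a random forest; scikit-learn's bundled wine dataset stands in for the asker's dataframe (the column names in the excerpt suggest the classic wine data, but that is an assumption):

```python
# Fit a random forest, then rank features by their impurity-based importances.
from sklearn.datasets import load_wine
from sklearn.ensemble import RandomForestClassifier

data = load_wine()
forest = RandomForestClassifier(n_estimators=100, random_state=0)
forest.fit(data.data, data.target)

# Pair each feature name with its importance, highest first
ranked = sorted(zip(data.feature_names, forest.feature_importances_),
                key=lambda t: t[1], reverse=True)
for name, score in ranked[:5]:
    print(f"{name:30s} {score:.3f}")
```

The importances are normalized to sum to 1, so each value can be read as a relative share of the forest's total split quality.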
-1 votes • 1 answer

Choosing Attributes for Data Mining Algorithm

I currently need to do risk-analysis data mining on a dataset. This dataset has around 120 attributes. Although I can use common sense, is there any systematic data-reduction methodology that can guide us in choosing which attributes are…
Dino • 781 • 3 • 14 • 32
-1 votes • 1 answer

Selecting samples for supervised machine learning

How does one select a sample size and sample set (for training and testing) for a binary classification problem to be solved by applying supervised learning? The current implementation is based on 15 binary features which we may expand to 20 or…
-1 votes • 1 answer

What do the features given by a feature selection method mean in a binary classifier which has a cross validation accuracy of 0?

So I know that given a binary classifier, the farther away you are from an accuracy of 0.5 the better your classifier is. (I.e. A binary classifier that gets everything wrong can be converted to one which gets everything right by always inverting…
ABC • 1,387 • 3 • 17 • 28
-2 votes • 0 answers

'ps_calc_01', How does the XGBClassifier predict and calculate accuracy?

model = XGBClassifier()
model.fit(X_train[['ps_calc_01']], y_train)
y_pred = model.predict(X_test[['ps_calc_01']])
accuracy = accuracy_score(y_test, y_pred)
print("Accuracy: %.2f%%" % (accuracy * 100.0))

I'm seeing that ('ps_calc_01') is used for…
-2 votes • 1 answer

How to use colors as features in machine learning?

I have colors in RGB form. There are 4 columns 'accent_color'-> (0.6901960784313725, 0.14901960784313725, 0.10588235294117647) 'dominant_colors' -> [(0.6470588235294118, 0.16470588235294117, 0.16470588235294117), (0.0, 0.0, 0.0), (1.0, 1.0,…
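One common way to handle a question like this is to flatten each RGB tuple into three numeric columns that a model can consume directly. A sketch, assuming a pandas dataframe; the column name `accent_color` follows the question, while the values and the `accent_r/g/b` names are made up for illustration:

```python
# Turn a column of (r, g, b) tuples into three separate numeric feature columns.
import pandas as pd

df = pd.DataFrame({
    "accent_color": [(0.69, 0.15, 0.11), (0.10, 0.20, 0.30)],
})
# Expand each 3-tuple into accent_r / accent_g / accent_b columns
rgb = pd.DataFrame(df["accent_color"].tolist(),
                   columns=["accent_r", "accent_g", "accent_b"],
                   index=df.index)
df = df.drop(columns="accent_color").join(rgb)
print(df)
```

A list-of-tuples column such as `dominant_colors` needs an extra step (e.g. keeping only the first tuple, or aggregating) before the same expansion applies.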
-2 votes • 1 answer

how to remove irrelevant features in document classification from Weka?

In Weka, text classification yields a lot of features even after applying feature selection. How can I remove the irrelevant features quickly in the Preprocess tab, rather than one by one? In text classification the number of features is high and it needs time to remove one…
-2 votes • 1 answer

I face this error: AttributeError: module 'numpy' has no attribute 'corroef'

I am trying to use correlation to extract features, but I ran into this problem. Please help me fix it: AttributeError: module 'numpy' has no attribute 'corroef'. This is my code to correlate the features: cor_list = [] feature_name =…
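The attribute error in this question comes from a typo: the NumPy function is `np.corrcoef`, not `np.corroef`. A minimal sketch:

```python
# np.corrcoef returns the Pearson correlation matrix; the off-diagonal
# entry [0, 1] is the correlation between the two input vectors.
import numpy as np

x = np.array([1.0, 2.0, 3.0, 4.0])
y = np.array([2.0, 4.0, 6.0, 8.0])
r = np.corrcoef(x, y)[0, 1]   # Pearson correlation of x and y
print(r)                      # 1.0 for perfectly linear data
```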
-2 votes • 1 answer

Which features to drop during feature selection

During feature selection (after doing extensive feature engineering), is there any set of rules that governs which features to drop and which to keep? I know that highly correlated features should be dropped or merged into new features, however I…
SOURIN ROY • 21 • 1 • 5
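One commonly cited rule for questions like the one above is to drop one feature from each highly correlated pair. A sketch, assuming pandas and NumPy; the 0.9 threshold and the toy dataframe are illustrative choices, not from the question:

```python
# Drop one member of each pair of features whose absolute correlation
# exceeds a threshold.
import numpy as np
import pandas as pd

df = pd.DataFrame({
    "a": [1, 2, 3, 4, 5],
    "b": [2, 4, 6, 8, 10],   # perfectly correlated with "a"
    "c": [5, 3, 4, 1, 2],
})
corr = df.corr().abs()
# Look only at the upper triangle so each pair is considered once
upper = corr.where(np.triu(np.ones(corr.shape, dtype=bool), k=1))
to_drop = [col for col in upper.columns if (upper[col] > 0.9).any()]
print(to_drop)   # ['b']
```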
-2 votes • 1 answer

What is "neg_mean_absolute_error" and where can I find it?

I am new to machine learning and am trying to learn feature selection from this link. They have a line of code, given below:

search = GridSearchCV(pipeline, grid, scoring='neg_mean_squared_error', n_jobs=-1, cv=cv)

But whenever I try to…
-2 votes • 1 answer

Feature selection methodology to reduce Overfit in classification model

My dataset has over 200 variables and I am running a classification model on it, which is leading to overfitting. What is suggested for reducing the number of features? I started with feature importance, however due to such a large number of…