Highest Voted 'feature-engineering' Questions

-1

votes

2 answers

How to find the difference between time and feed the difference in a new column?

I have a dataframe trades_df which looks like this:

Open Time         Open Price  Close Time
19-08-2020 12:19  1.19459     19-08-2020 12:48
28-08-2020 03:09  0.90157     08-09-2020 12:20

It has columns open_time and close_time in the format 19-08-2020…

asked Jun 16 '22 at 02:04

user18587858
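
For the time-difference question above, one possible approach is to parse both columns as day-first datetimes and subtract them. This is only a sketch: the sample rows come from the excerpt, while the output column names duration and duration_minutes are hypothetical.

```python
import pandas as pd

# Sample rows taken from the question excerpt; column names follow its text.
trades_df = pd.DataFrame({
    "open_time": ["19-08-2020 12:19", "28-08-2020 03:09"],
    "close_time": ["19-08-2020 12:48", "08-09-2020 12:20"],
})

# The dates are day-first, so an explicit format avoids month/day mix-ups.
fmt = "%d-%m-%Y %H:%M"
open_ts = pd.to_datetime(trades_df["open_time"], format=fmt)
close_ts = pd.to_datetime(trades_df["close_time"], format=fmt)

# Hypothetical output columns: a Timedelta and the same value in minutes.
trades_df["duration"] = close_ts - open_ts
trades_df["duration_minutes"] = trades_df["duration"].dt.total_seconds() / 60
```

Passing an explicit format rather than relying on inference matters here because a value such as 08-09-2020 is ambiguous between 8 September and 9 August.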

-1

votes

1 answer

Machine learning - does the independent variable data need to be balanced as well?

I know that we need balanced data in y to get a better model. However, I'm wondering whether we need balanced data in the independent variables as well. In the following dataframe, X3 is a categorical independent variable. X1 X2 …

machine-learning scikit-learn xgboost lightgbm feature-engineering

asked Jun 01 '22 at 09:26

John

129
12

-1

votes

1 answer

How to divide all numeric columns by each other?

I have a dataframe with more than 100 features, half of which are numeric columns. I want to generate new features by dividing columns by each other. Is there an easy way to do it? Example: df = pd.DataFrame(np.random.randint(0,100,size=(100, 4)),…

python pandas dataframe machine-learning feature-engineering

asked May 12 '22 at 15:45

seems-to-work-enjoyer

13
3
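
For the column-ratio question above, a minimal sketch could pick the numeric columns with select_dtypes and build every ordered pair of ratios. The column names A to D, the label column, and the _div_ naming scheme are assumptions, since the excerpt is cut off before the column names.

```python
import itertools

import numpy as np
import pandas as pd

# Hypothetical stand-in for the question's data: four numeric columns plus
# one non-numeric column that must be left out of the ratios.
rng = np.random.default_rng(0)
df = pd.DataFrame(rng.integers(0, 100, size=(100, 4)), columns=list("ABCD"))
df["label"] = "x"

numeric_cols = df.select_dtypes(include="number").columns

# One new column per ordered pair (A/B and B/A are both kept).
ratios = {
    f"{num}_div_{den}": df[num] / df[den]
    for num, den in itertools.permutations(numeric_cols, 2)
}

# Division by zero yields inf; replace it with NaN so it can be imputed later.
ratio_df = pd.DataFrame(ratios).replace([np.inf, -np.inf], np.nan)
result = pd.concat([df, ratio_df], axis=1)
```

With n numeric columns this creates n * (n - 1) new features, so for around 50 numeric columns it may be worth restricting the pairs before concatenating.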

-1

votes

1 answer

Should we always first perform feature normalization and then the feature reduction?

Sometimes feature reduction with methods like PCA reduces the number of features, and we could then scale only the relevant variables. Is there a rule that normalization/scaling must be done first, followed by feature reduction?

machine-learning data-science feature-engineering machine-learning-model feature-scaling

asked Apr 19 '22 at 12:00

Sharat Ainapur

19
8

-1

votes

1 answer

Modify equivalent values in a column

I'm working with Pandas, and I have a question about how to change equivalent values. I want binary values in the "class" column: I want to keep 1, and I want 2 and 3 to be changed to 0. Ah! And I don't just have these lines, I have 70 in…

python pandas feature-engineering

asked Dec 27 '21 at 18:25

StaLLoNe_CoBRa

23
6
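
For the binarization question above, one way to keep 1 and turn 2 and 3 into 0 is a boolean comparison cast to int. The sample values below are made up for illustration; only the "class" column name comes from the excerpt.

```python
import pandas as pd

# Made-up sample; the question only states that "class" holds 1, 2 and 3.
df = pd.DataFrame({"class": [1, 2, 3, 1, 3]})

# True where the value is 1, False otherwise, then cast to 1/0.
df["class"] = (df["class"] == 1).astype(int)
```

Note that this maps every value other than 1 to 0, not only 2 and 3, which is an assumption about the data.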

-1

votes

1 answer

Is there any other way (to combine values of one column into different groups), instead of using 'df.replace( )' several times in the below problem?

In : char_df['Loan_Title'].unique() Out: array(['debt consolidation', 'credit card refinancing', 'home improvement', 'credit consolidation', 'green loan', 'other', 'moving and relocation', 'credit cards', 'medical expenses', 'refinance', 'credit…

python pandas machine-learning data-science feature-engineering

asked Dec 08 '21 at 15:35

Castle

9
2
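
For the Loan_Title question above, a single dictionary lookup with Series.map may be an alternative to several df.replace() calls. The group names and the assignment of titles to groups below are hypothetical, because the excerpt does not say which groups are wanted.

```python
import pandas as pd

# A few of the titles listed in the excerpt.
char_df = pd.DataFrame({
    "Loan_Title": [
        "debt consolidation",
        "credit consolidation",
        "credit card refinancing",
        "credit cards",
        "home improvement",
        "green loan",
    ]
})

# Hypothetical grouping: group name -> raw titles that belong to it.
groups = {
    "consolidation": ["debt consolidation", "credit consolidation"],
    "credit card": ["credit card refinancing", "credit cards"],
    "home": ["home improvement"],
}

# Invert to raw title -> group so one map() call handles every group.
lookup = {title: group for group, titles in groups.items() for title in titles}

# Titles missing from the lookup fall into a catch-all "other" group.
char_df["Loan_Group"] = char_df["Loan_Title"].map(lookup).fillna("other")
```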

-1

votes

1 answer

How to Exclude Holidays and Weekends from Bank Data in Python

I have bank data with dates and amounts, and a separate holiday CSV file that lists holiday dates. I have to add the amount from each holiday to the next working day and set the amount on the holiday itself to 0.

python data-science feature-engineering

asked Sep 28 '21 at 08:42

firestorm

1
3
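
For the holiday question above, one possible sketch assigns every non-working day to the next working day with a backward fill, then sums per target day. The data, the column names date, amount, and adjusted, and the single holiday date are all assumptions; in practice the holidays would be read from the separate CSV file.

```python
import pandas as pd

# Hypothetical bank data: Friday through Tuesday, one row per day.
bank = pd.DataFrame({
    "date": pd.to_datetime(
        ["2021-09-24", "2021-09-25", "2021-09-26", "2021-09-27", "2021-09-28"]
    ),
    "amount": [10, 20, 30, 40, 50],
})
# Hypothetical holiday list (a Monday); normally loaded from the holiday CSV.
holidays = pd.to_datetime(["2021-09-27"])

bank = bank.sort_values("date").reset_index(drop=True)

# A day is "off" if it falls on a weekend or appears in the holiday list.
is_off = (bank["date"].dt.dayofweek >= 5) | bank["date"].isin(holidays)

# Working days keep their own date; off days take the next working date.
target = bank["date"].where(~is_off).bfill()

# Sum the amounts per target date, then write 0 on the off days themselves.
carried = bank.groupby(target)["amount"].sum()
bank["adjusted"] = bank["date"].map(carried).where(~is_off, 0)
```

Off days at the very end of the data have no following working day, so their amounts are dropped in this sketch; whether that is acceptable depends on the task.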

-1

votes

2 answers

How to refer to other rows in Pandas DataFrame in context of a single row?

I have the following example Pandas DataFrame df:

UserID  Total  Date
1       20     2019-01-01
1       18     2019-01-02
1       22     2019-01-03
1       16     2019-01-04
1       17     2019-01-05
1       26     2019-01-06
1       30     2019-01-07
1       28     …

python python-3.x pandas dataframe feature-engineering

asked Sep 20 '21 at 11:51

Taher Elhouderi

233
2
11

-1

votes

1 answer

How to plot a scatter plot to understand the general trend in data, when we have multiple features

Here, the features are X_train and the target is y_train. When a dataset has 'n' features, how do we select the one feature to plot against the target variable in a scatter plot to understand the general trend of the training data, to select a…

python machine-learning feature-selection sklearn-pandas feature-engineering

asked May 15 '21 at 17:05

yuvraj singh

88
2
7

-1

votes

1 answer

Data pre-processing and feature engineering

I have been doing some reading on data pre-processing and feature engineering, including feature selection, feature importance and feature construction. My understanding is that feature engineering is applied in the data preprocessing stage. Additionally,…

feature-selection feature-engineering data-preprocessing

asked Apr 27 '21 at 19:54

Shosho

69
6

-1

votes

1 answer

How to fill the NaN values in the age feature for the Titanic data?

I want to fill the NaN values in the age feature. In the Titanic train data, pclass and embarked are independent features. Based on these features, I want to fill the NaN values of the age feature. Pclass - (0,1,2) unique values, Embarked -…

python machine-learning feature-engineering

asked Feb 26 '21 at 16:10

Amit Saini

136
2
16
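
For the Titanic question above, a common pattern is to fill each missing age with a statistic of its (Pclass, Embarked) group via groupby and transform. The choice of the median and the tiny sample frame below are assumptions, not something the excerpt specifies.

```python
import numpy as np
import pandas as pd

# Tiny made-up sample with the three columns named in the question.
df = pd.DataFrame({
    "Pclass": [0, 0, 0, 2, 2, 2],
    "Embarked": ["S", "S", "S", "Q", "Q", "Q"],
    "Age": [30.0, 40.0, np.nan, 20.0, np.nan, 22.0],
})

# Median age of each (Pclass, Embarked) group, aligned to the original rows.
group_median = df.groupby(["Pclass", "Embarked"])["Age"].transform("median")

# Only the missing ages are replaced; known ages stay untouched.
df["Age"] = df["Age"].fillna(group_median)
```

Rows whose Embarked value is itself missing, or whose group has no known age at all, would still be NaN after this step and need a fallback such as the overall median.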

-1

votes

1 answer

sklearn ValueError: Input contains NaN

I get ValueError: Input contains NaN when I run: from sklearn.preprocessing import OrdinalEncoder; data_.iloc[:,1:-1] = OrdinalEncoder().fit_transform(data_.iloc[:,1:-1]). Here is data_: Age Sex Embarked Survived 0 22.0 male S …

python scikit-learn feature-engineering

asked Jan 20 '21 at 08:30

xyssyxxys

1
1

-1

votes

1 answer

Feature Extraction Using Representation Learning

I'm new to machine learning, and I've been given a task where I'm asked to extract features from a data set with continuous data using representation learning (for example a stacked autoencoder). Then I'm to combine these extracted features with the…

python feature-extraction feature-selection feature-engineering

asked Dec 29 '20 at 14:50

annatn998

75
8

-1

votes

1 answer

Pandas: how to add column representing the intersection of 2 attributes in a Dataframe

Let's say I have 2 CSV files (very large files). The first file represents restaurants and has 6 attributes: restaurant_id, name, star_rating, city, zone, closed. The second file represents the categories of the restaurants and has 2 attributes…

python pandas feature-extraction feature-engineering

asked Sep 21 '20 at 18:40

Lynn

121
8
25

-1

votes

1 answer

How should I deal with NaN values when the data isn't categorical and determining them isn't practical?

I'm currently doing the House Prices Kaggle competition, and there is a feature for the year in which the garage was built. There are houses without a garage, so the feature is NaN for them. How should I deal with this situation? Imputing those values with 0…

machine-learning data-science nan data-cleaning feature-engineering

asked May 23 '20 at 17:25

Yuval

1
1
