Questions tagged [standardization]

Standardization, sometimes loosely called normalization, rescales a vector of real values so that it has a mean of zero and a standard deviation of one. The rescaled values are also called standard scores or z-scores.
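
As a rough illustration of the definition above, here is a minimal sketch using only the Python standard library. The function names `standardize` and `unstandardize` and the sample data are hypothetical, chosen for this example. The sketch assumes the population standard deviation (dividing by n); libraries differ on this default, so results may not match a tool that divides by n - 1.

```python
from statistics import fmean, pstdev


def standardize(values):
    """Return (z_scores, mean, std) where z = (x - mean) / std."""
    mean = fmean(values)
    # Assumption: population standard deviation (divides by n, not n - 1).
    std = pstdev(values, mu=mean)
    if std == 0:
        raise ValueError("cannot standardize a constant vector")
    return [(x - mean) / std for x in values], mean, std


def unstandardize(z_scores, mean, std):
    """Map z-scores back to the original scale: x = z * std + mean."""
    return [z * std + mean for z in z_scores]


# Hypothetical sample data, chosen so the mean is 5 and the std is 2.
data = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]
z, mean, std = standardize(data)
restored = unstandardize(z, mean, std)
```

Keeping the fitted `mean` and `std` around is what makes it possible to map standardized values back to the original scale later.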

72 questions
1
vote
3 answers

Normalization of a nested dictionary in python

I am new to Python and I have a nested dictionary for which I want to normalize the values of the dictionary. For example: nested_dictionary={'D': {'D': '0.33', 'B': '0.17', 'C': '0.00', 'A': '0.17', 'K': '0.00', 'J': '0.03'}, 'A': {'A': '0.50',…
1
vote
0 answers

How to get back original value after standardization in sklearn

I am using StandardScaler() to standardize the inputs. How can I convert the predictions back to the original data? I am using the following code, but it throws an error. X_train = sc.fit_transform(X_train) X_test = sc.transform(X_test) #custom inputs…
NormA
  • 21
  • 2
1
vote
3 answers

Faster method of standardizing DF

I have a df containing roughly 3000 variables and 14000 datapoints. I need to standardize the df both within group and within df, creating 6000 total variables. My current implementation is below: col_names =…
Redratz
  • 136
  • 7
1
vote
1 answer

How to find out StandardScaling parameters .mean_ and .scale_ when using Column Transformer from Scikit-learn?

I want to apply StandardScaler only to the numerical parts of my dataset using sklearn.compose.ColumnTransformer (the rest is already one-hot encoded). I would like to see the .scale_ and .mean_ parameters fitted to the training data, but…
1
vote
1 answer

Standardizing a vector in R so that values shift towards boundaries

I have a vector as follows: a <- c(0.211, 0.028, 0.321, 0.072, -0.606, -0.364, -0.066, 0.172, -0.917, 0.062, 0.117, -0.136, -0.296, 0.022, 0.046, -0.19, 0.057, -0.625, -0.01, 0.158, 0.407, -0.328, -0.347, -0.512, -0.101, 0.008, -0.406, -0.014,…
Saurabh
  • 1,566
  • 10
  • 23
1
vote
1 answer

How do you remerge the response variable to the data frame after removing it for standardization?

I have a dataset with 61 columns (60 explanatory variables and 1 response variable). All the explanatory variables are numerical, and the response is categorical (Default). Some of the explanatory variables have negative values (financial data), and therefore…
thosed
  • 13
  • 4
1
vote
2 answers

Standardizing or Normalizing discrete variable?

When we have discrete variables such as age, number of sick leaves, number of kids in the family, and number of absences in a dataframe that I want to use to build a prediction model with a binary outcome, is it okay to include these variables along with…
1
vote
1 answer

Standardization Result is different between Patsy & Pandas - Python

I found an interesting question and I would love to hear your interpretation. from patsy import dmatrix,demo_data df = pd.DataFrame(demo_data("a", "b", "x1", "x2", "y", "z column")) Patsy_Standarlize_Output = dmatrix("standardize(x2) +…
vae
  • 132
  • 6
0
votes
1 answer

How to calculate % improvement with differing denominators?

I have several business units with different numbers of customers, and we are tracking how many of these customers completed a requested task. I want to track who is performing the best, but a simple % improvement will be skewed based on the total…
0
votes
0 answers

Standardization of all variables LASSO/OLS

I am running two regressions on the same variables (lasso and OLS regression). Should I standardize all variables X (including the dependent variable y)? When standardizing features/regressors and the output variable, the mean squared error is a…
0
votes
0 answers

How to apply standardization for train set inside gridsearchcv?

It is a question more about theory than a problem in code itself. I have the following Pipeline, which will then be used in a GridSearchCV: my_model = Pipeline([('scaler', MinMaxScaler()), ('model', model())]) cv = GridSearchCV(my_model ,…
0
votes
0 answers

Do Stata's lasso commands automatically standardize dummy variables?

Stata's lasso command automatically standardizes independent variables. It seems to automatically standardize dummy variables as well. Is this right? If so, is there a way to do lasso without standardizing the dummy variables? lasso dependent…
nnr
  • 1
  • 1
0
votes
0 answers

Pipeline including StandardScaler and Gaussian Naive Bayes with low accuracy

I have tried to do classification using the pipeline below, which includes Gaussian Naive Bayes and StandardScaler: GNB_pipeline = Pipeline(steps=[ ('scale',StandardScaler()), ('model', GaussianNB()) ]) But when I receive an accuracy which is…
0
votes
0 answers

How to standardize data with different dilution factors and minimum values for each dilution factor?

I have been having trouble running statistical analysis on a dataset from a mouse experiment. The dataset is derived from 3 different dilution factors (1:10, 1:20, 1:40), each with a different minimum threshold value of detection (5, <10, <20…
0
votes
0 answers

Do you need to standardize WoE-encoded variables when using L1 regularized Logistic Regression?

I read that numerical variables should be standardized before a Logistic Regression with L1 Lasso Regularization is trained on them. If you have a categorical variable, it is recommended to encode it with Weight of Evidence. This transforms the…