Questions tagged [standardized]

Shifting and rescaling data to ensure zero mean and unit variance.

Overview

Specifically, when $x_i$, $i = 1, \ldots, n$, is a batch of data, its mean is:

$$m = \frac{1}{n} \sum_{i=1}^{n} x_i$$

and its variance is:

$$s^2 = \frac{1}{\nu} \sum_{i=1}^{n} (x_i - m)^2$$

where $\nu$ is either $n$ or $n-1$ (choices vary with application).

Standardization replaces each $x_i$ with $z_i = (x_i - m)/s$. Do not confuse standardization with normalization.
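
As a rough illustration of the definition above, here is a minimal Python sketch using only the standard library. The function name `standardize` and the parameter `ddof` are assumptions made for this example (the name `ddof` is borrowed from NumPy's convention): `ddof=0` divides the sum of squares by n, and `ddof=1` divides by n-1.

```python
import math


def standardize(xs, ddof=0):
    """Return the z-scores of xs.

    ddof=0 divides the sum of squares by n; ddof=1 divides by n - 1.
    Both the function name and the ddof parameter are illustrative.
    """
    n = len(xs)
    if n - ddof <= 0:
        raise ValueError("need more than ddof observations")
    m = sum(xs) / n                                     # mean
    var = sum((x - m) ** 2 for x in xs) / (n - ddof)    # variance
    s = math.sqrt(var)                                  # standard deviation
    if s == 0:
        raise ValueError("cannot standardize constant data")
    return [(x - m) / s for x in xs]


data = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]
z = standardize(data)            # mean is 5.0, s is 2.0 when dividing by n
print(z)                         # [-1.5, -0.5, -0.5, -0.5, 0.0, 0.0, 1.0, 2.0]
z_sample = standardize(data, ddof=1)   # slightly smaller magnitudes
```

The choice of divisor is one common reason two tools can return slightly different z-scores for the same data, so it is worth checking which convention a given library uses.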


Tag usage

Questions on this tag should be about implementation and programming problems, not about the statistical or theoretical properties of the technique. Consider whether your question might be better suited to Cross Validated, the StackExchange site for statistics, machine learning and data analysis.

145 questions
0
votes
0 answers

PCA on Panel Data? (Python, PCA, Scikit.learn)

Dataset looks like: all features x are numerical and scaled except for name (which is currently indexed alongside year) [name, year, x1, x2, x3, x4, ...] josh 2001 ... #the various values for the x_features, for that name, at that time josh…
Alex
  • 188
  • 11
0
votes
1 answer

Naming a dataframe like the path

I have a lot of CSV files that need to be standardized. I created a dictionary for doing so and so far the function that I have looks like this: inputpath <- ("input") files<- paste0(inputpath, "/", list.files(path = inputpath, pattern…
0
votes
0 answers

Standardizing (z-score) multiple variables at once

I have a dataset that looks something like this, but with hundreds of variables: set.seed(123) df <- data.frame(id= c(1,1,1,2,2,2,3,3,3), time=c(1,2,3,1,2,3,1,2,3),y = rnorm(9), x1 = rnorm(9), x2 = c(0,0,0,0,1,0,1,1,1), x3 = rnorm(9), c1 = rnorm(9), …
Alex
  • 1,207
  • 9
  • 25
0
votes
1 answer

Why do I get different results when using the StandardScaler in GridSearchCV?

I want to optimize the hyperparameters of an SVM by GridSearchCV. But the score of the best estimator is very different from the score when I run the SVM with the best parameters. #### Hyperparameter search with GridSearchCV### pipeline = Pipeline([ …
Code Now
  • 711
  • 2
  • 9
  • 20
0
votes
1 answer

Training Keras model without validation set and normalization of images

I'm using Keras on Python to train a CNN autoencoder. In the fit() method I have to provide validation_split or validation_data. First, I would like to use 80% of my data as training data and 20% as validation data (random split). As soon as I have…
machinery
  • 5,972
  • 12
  • 67
  • 118
0
votes
1 answer

Generate internal age and sex z-scores

I have the following data frame, with data from 1000 people on sex, three repeated height measures and the age at each measure. data <- data.frame( child_id = 1:1000, sex = rbinom(n = 1000, size = 1, prob = 0.5), height_5 = rnorm(1000, mean = 80, sd…
aelhak
  • 441
  • 4
  • 14
0
votes
2 answers

How to standardize categorical variables associated with timestamps

I have a dataset that has 8 mixed features (6 numeric and 2 categorical). Since the numeric values have different ranges, I will have to normalize the dataset as a whole to be able to perform further actions such as machine learning algorithms,…
Alex Davies
  • 191
  • 2
  • 11
0
votes
1 answer

How does SAS proc stdize method=range work?

How does PROC STDIZE METHOD = RANGE work? I thought that it would work like this: Score = (Observation - Min) / (Max - Min) However, the range is [1,100] and there is never a 0 i.e. when you would subtract the min observation from itself on the…
78282219
  • 159
  • 1
  • 12
0
votes
1 answer

Is there a standardized Message format?

There are several different messaging apps and services available such as: Slack, HipChat, IRC, Zoom Chat, etc... Is there a standardized (or common) message format being used (or available) to represent these messages to ease developer…
Benjamin Dean
  • 1,218
  • 11
  • 11
0
votes
1 answer

Standardized mean difference in R

I have got a data frame like the following: region group mid_pop 1 2 1146 2 4 1682 3 3 2891 4 1 7654 5…
Eshmel
  • 49
  • 2
  • 9
0
votes
2 answers

Standardize variable by group - why is the mean always zero?

I have the following data: df = pd.DataFrame({'sound': ['A', 'B', 'B', 'A', 'B', 'A'], 'score': [10, 5, 6, 7, 11, 1]}) print(df) sound score 0 A 10 1 B 5 2 B 6 3 A 7 4 B 11 5 A …
Simon
  • 9,762
  • 15
  • 62
  • 119
0
votes
1 answer

Unable to scope variables into R functions for `standardize::standardize`

I am trying to create a custom function that allows me to apply mixed effects standardization to a large dplyr data frame using the standardize package. I have been unsuccessful in parsing function arguments into the standardize function despite…
romsdocs
  • 68
  • 6
0
votes
0 answers

slicing StandardScaler for unrolled time steps

Here's a simplified explanation: I have a dataframe of multiple unrolled time-steps that is used to predict price2 and volume2 (the next step). Before I train my network, I want to use StandardScaler. However, when I went to invert the predicted…
johan bender
  • 53
  • 1
  • 1
  • 5
0
votes
2 answers

Different Significance in Stargazer for Standardised/Unstandardised Coefficients

I've performed a multiple linear regression on a large data set using m1 <- lm(y ~ x + x1 + x2..., dataset) added standardised beta coefficients using lm.beta m1_stnd <- lm.beta(m1) and tabulated the results using…
0
votes
0 answers

Java Weka - how to standardize a single Instance

I need to implement a Weka classifier that standardizes the input data before processing. I use the following code for this: private Filter standardize = new Standardize(); ... public void buildClassifier(Instances instances) throws Exception { …