First, I apologize for the vague question; let me explain. I have a pandas DataFrame containing two columns, square feet and number of bedrooms. I am trying to predict price using linear regression and want to run gradient descent on the feature matrix. Since the square feet values are roughly 1000 times larger than the number of bedrooms, gradient descent does not converge nicely, so I am trying to handle this scale variance in the features by normalizing them.
The particular normalization I am doing is to subtract each column's mean from its individual cells (for both bedrooms and square feet) and divide the result by that column's standard deviation. The code I have written is this:
# X is my DataFrame. Single brackets return a Series, so mean()/std()
# come back as scalars; with double brackets, X[['bedrooms']].mean()
# returns a one-element Series, which breaks the element-wise lambdas.
meanb = X['bedrooms'].mean()
meanFeet = X['sqrfeet'].mean()
stdb = X['bedrooms'].std()
stdFeet = X['sqrfeet'].std()
norb = lambda x: (x - meanb) / stdb
nors = lambda x: (x - meanFeet) / stdFeet
X['bedrooms'] = X['bedrooms'].apply(norb)
X['sqrfeet'] = X['sqrfeet'].apply(nors)
The question: is there an easier way of doing this? My approach won't scale if I have thousands of columns. I am wondering if there is something like a DataFrame.applymap() method that would compute the mean and std for each individual column and apply the normalization to that column's cells. Note that the columns can have different ranges of values, but all are numeric.
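For context, here is a minimal sketch of the kind of whole-frame one-liner I am hoping exists; the column names and sample values below are made up for illustration, and I have not verified that this is the idiomatic approach:

import pandas as pd

# Made-up sample data, just to show the shape of the problem.
X = pd.DataFrame({'sqrfeet': [2100, 1600, 2400, 1416],
                  'bedrooms': [3, 2, 4, 2]})

# X.mean() and X.std() each return one value per column, and the
# arithmetic broadcasts column-wise across the whole frame, so every
# column is normalized at once without naming columns explicitly.
X_norm = (X - X.mean()) / X.std()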