Why number of coefficients is larger than number of features when using linear regression by Graphlab

Question

In a linear regression model fit with GraphLab, when there is 1 feature, there are 2 coefficients. But when there are 3 features, the number of coefficients is much larger, even 40. Why not 4? What do these coefficients mean, and why do several of them have the same name? Sorry, my English is not very good...

score 0 · Answer 1 · answered Oct 09 '17 at 12:34


Looking at your screenshot, we can see an index value for each bathrooms coefficient.

But according to the documentation: "Note that the index column in the coefficients is only applicable for categorical features, lists, and dictionaries."

Also: "All SFrame columns of type str are automatically transformed into categorical variables. Notice that the number of coefficients and the number of features aren't the same."

It looks like you are passing that data in as a string type. Check the type of the bathrooms column in your train_data.
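To make the effect concrete, here is a minimal pure-Python sketch of the counting rule described in the quotes above (one coefficient per numeric feature, one per category minus 1 for a string feature, plus the intercept). It does not call GraphLab; the function `count_coefficients`, the sample rows, and the column names `sqft_living` and `bedrooms` are hypothetical and used only for illustration.

```python
def count_coefficients(rows, features):
    """Count coefficients under the assumed rule: a numeric feature adds 1,
    a str feature adds (distinct categories - 1), plus 1 for the intercept."""
    total = 1  # intercept
    for name in features:
        values = [row[name] for row in rows]
        if all(isinstance(v, str) for v in values):
            # String column: treated as categorical, one dummy per
            # category except the reference category.
            total += len(set(values)) - 1
        else:
            total += 1
    return total


# Hypothetical data: bathrooms was loaded as strings by mistake.
rows = [
    {"sqft_living": 1180.0, "bedrooms": 3.0, "bathrooms": "1"},
    {"sqft_living": 2570.0, "bedrooms": 3.0, "bathrooms": "2.25"},
    {"sqft_living": 770.0, "bedrooms": 2.0, "bathrooms": "1"},
    {"sqft_living": 1960.0, "bedrooms": 4.0, "bathrooms": "3"},
]
features = ["sqft_living", "bedrooms", "bathrooms"]

as_strings = count_coefficients(rows, features)

# Casting the column to float makes it an ordinary numeric feature again.
numeric_rows = [dict(r, bathrooms=float(r["bathrooms"])) for r in rows]
as_numbers = count_coefficients(numeric_rows, features)

print(as_strings, as_numbers)
```

With real data, a string column with many distinct values is how 3 features can turn into dozens of coefficients. Casting the column to a numeric type before training should bring the count back to one coefficient per feature plus the intercept.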

PS. example from documentation: image


Anton Alekseev


Thank you for your answer, but I still have a question. According to the documentation, the number of dummy coefficients equals the total number of categories minus 1. But why? I would expect each category to have its own dummy coefficient. – csorg Oct 10 '17 at 05:18
A standard linear regression model includes a bias (intercept) term β0: y = β0 + β1d1 + β2d2 + β3d3 + ε, where dn are the input variables, βn are the coefficients, and ε is Gaussian noise. The reference category is therefore already captured by this bias, so it needs no dummy coefficient of its own. You can find a detailed description [here](https://stats.stackexchange.com/a/115052). – Anton Alekseev Oct 10 '17 at 07:00
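The point about the bias can be checked with a small pure-Python sketch. For a single categorical predictor, the least-squares fit with an intercept and k-1 dummies reproduces each category's mean: the intercept equals the reference category's mean, and each dummy coefficient is that category's mean minus the reference mean. A k-th dummy would be redundant, because the k dummy columns always sum to the all-ones intercept column. The helper names, the toy data, and the choice of the alphabetically first category as the reference are assumptions for illustration, not GraphLab's actual behavior.

```python
from collections import defaultdict


def fit_reference_coding(categories, y):
    """Fit y ~ intercept + dummies for one categorical predictor.

    Uses the closed form for this special case: fitted values are the
    per-category means of y.
    """
    groups = defaultdict(list)
    for category, value in zip(categories, y):
        groups[category].append(value)
    means = {c: sum(v) / len(v) for c, v in groups.items()}

    reference = sorted(means)[0]   # assumed choice of reference category
    intercept = means[reference]   # beta_0 absorbs the reference category
    # One offset (dummy coefficient) for every non-reference category.
    offsets = {c: m - intercept for c, m in means.items() if c != reference}
    return reference, intercept, offsets


def predict(category, intercept, offsets):
    # The reference category has no dummy, so its prediction is the intercept.
    return intercept + offsets.get(category, 0.0)


categories = ["a", "a", "b", "b", "c", "c"]
y = [1.0, 3.0, 4.0, 6.0, 9.0, 11.0]

reference, intercept, offsets = fit_reference_coding(categories, y)
print(reference, intercept, offsets)
print(predict("a", intercept, offsets), predict("c", intercept, offsets))
```

Three categories yield one intercept and two dummy coefficients, yet every category still gets its own prediction.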
