Questions tagged [regression]

Regression analysis is a collection of statistical techniques for modeling and predicting one or multiple variables based on other data.

Wiki

Regression is a common applied statistical technique and a cornerstone of machine learning. Various algorithms and software packages can be used to fit and use regression models.

In other words, regression is a statistical method that estimates the relationship between a dependent variable (usually denoted by Y) and one or more other variables (known as independent variables). Typically the dependent variables are modeled with probability distributions whose parameters are assumed to vary (deterministically) with the independent variables.
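
As a rough illustration of the description above, here is a minimal sketch of the simplest case: ordinary least squares, where the mean of Y is assumed to be a linear function of one independent variable. The synthetic data, the true coefficients (2.0 and 0.5), and all variable names are assumptions made for this example only.

```python
import numpy as np

# Hypothetical synthetic data: y depends linearly on x, plus Gaussian noise.
rng = np.random.default_rng(0)
x = rng.uniform(0.0, 10.0, size=200)
y = 2.0 + 0.5 * x + rng.normal(0.0, 1.0, size=200)

# Design matrix with a column of ones so the model includes an intercept.
X = np.column_stack([np.ones_like(x), x])

# Ordinary least squares: choose coefficients minimizing ||X @ coef - y||^2.
coef, _, _, _ = np.linalg.lstsq(X, y, rcond=None)
intercept, slope = coef

# Fitted values and the coefficient of determination (R^2).
y_hat = X @ coef
r_squared = 1.0 - np.sum((y - y_hat) ** 2) / np.sum((y - y.mean()) ** 2)

print(f"intercept={intercept:.3f}, slope={slope:.3f}, R^2={r_squared:.3f}")
```

The estimated intercept and slope should land near the values used to generate the data, and R^2 summarizes how much of the variation in y the fitted line accounts for.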

Tag usage

Questions with this tag should be about implementation and programming problems, not about the statistical or theoretical properties of the technique. Consider whether your question might be better suited to Cross Validated, the StackExchange site for statistics and machine learning.

9532 questions
15 votes, 3 answers

Regression tree in R

I am having trouble making a regression tree in R. I have a data frame with 17 attributes: library(rpart) rt.model <- rpart(razlika ~ ., learn). I get an error: Error in `[.data.frame`(frame, predictors) : undefined columns selected. Seems weird…
Borut Flis
15 votes, 2 answers

Error in dataframe *tmp* replacement has x data has y

I'm a beginner in R. Here is some very simple code where I'm trying to save the residual term: # Create variables for child's EA: dat$cldeacdi <- rowMeans(dat[,c('cdcresp', 'cdcinv')],na.rm=T) dat$cldeacu <- rowMeans(dat[,c('cucresp',…
Marishka Usacheva
15 votes, 1 answer

Mixed Effects Models in Spark or other technology

Is it possible to run a mixed-effects regression model in Spark? (as we can do with lme4 in R, with MixedModels in Julia or with Statsmodels MixedLM in Python). Any example would be great. I've read there is a GLMix function but I don't know if the…
skan
15 votes, 3 answers

Using categorical data as features in sklearn LogisticRegression

I'm trying to understand how to use categorical data as features in sklearn.linear_model's LogisticRegression. I understand of course I need to encode it. What I don't understand is how to pass the encoded feature to the Logistic regression so it's…
15 votes, 5 answers

How to implement the Softmax derivative independently from any loss function?

For a neural networks library I implemented some activation functions and loss functions and their derivatives. They can be combined arbitrarily and the derivative at the output layers just becomes the product of the loss derivative and the…
danijar
15 votes, 1 answer

Predict.lm() in R - how to get nonconstant prediction bands around fitted values

So I am currently trying to draw the confidence interval for a linear model. I found out I should use predict.lm() for this, but I have a few problems really understanding the function and I do not like using functions without knowing what's…
lisa
15 votes, 2 answers

Fit a no-intercept model in caret

In R, I specify a model with no intercept as follows: data(iris) lmFit <- lm(Sepal.Length ~ 0 + Petal.Length + Petal.Width, data=iris) > round(coef(lmFit),2) Petal.Length Petal.Width 2.86 -4.48 However, if I fit the same model…
Zach
14 votes, 4 answers

Orthogonal regression fitting in scipy least squares method

The leastsq method in the scipy library fits a curve to some data. This method assumes that the Y values in the data depend on some X argument, and it minimizes the distance between the curve and the data points along the Y axis (dy). But what if I need to…
Vladimir
14 votes, 2 answers

How to change points and add a regression to a cloudplot (using R)?

To make clear what I'm asking I've created an easy example. Step one is to create some data: gender <- factor(rep(c(1, 2), c(43, 41)), levels = c(1, 2),labels = c("male", "female")) numberofdrugs <- rpois(84, 50) + 1 geneticvalue <-…
MarkDollar
14 votes, 2 answers

Multivariate polynomial regression with Python

Recently I started to learn sklearn, numpy and pandas and I made a function for multivariate linear regression. I'm wondering, is it possible to make multivariate polynomial regression? This is my code for multivariate polynomial regression, it shows…
taga
14 votes, 2 answers

Is LASSO regression implemented in Statsmodels?

I would love to use a linear LASSO regression within statsmodels, so as to be able to use the 'formula' notation for writing the model, which would save me quite some coding time when working with many categorical variables and their interactions.…
famargar
14 votes, 3 answers

How to use formula in R to exclude main effect but retain interaction

I do not want the main effect because it is collinear with a finer factor fixed effect, so it is annoying to have these NAs. In this example, lm(y ~ x * z), I want the interaction of x (numeric) and z (factor), but not the main effect of z.
wolfsatthedoor
14 votes, 2 answers

Add Regression Plane to 3d Scatter Plot in Plotly

I am looking to take advantage of the awesome features in Plotly but I am having a hard time figuring out how to add a regression plane to a 3d scatter plot. Here is an example of how to get started with the 3d plot, does anyone know how to take it…
Josh
14 votes, 1 answer

How to get comparable and reproducible results from LogisticRegressionCV and GridSearchCV

I want to score different classifiers with different parameters. For a speedup on LogisticRegression I use LogisticRegressionCV (which is at least 2x faster) and plan to use GridSearchCV for the others. The problem is that while it gives me equal C parameters, but not…
14 votes, 3 answers

How to fit a polynomial curve to data using scikit-learn?

Problem context: Using scikit-learn with Python, I'm trying to fit a quadratic polynomial curve to a set of data, so that the model would be of the form y = a_2 x^2 + a_1 x + a_0 and the a_n coefficients will be provided by a model. The problem I don't…
Juan Carlos Coto