Questions tagged [regression]

Regression analysis is a collection of statistical techniques for modeling and predicting one or more variables based on other data.

Wiki

Regression is a common applied statistical technique and a cornerstone of machine learning. Various algorithms and software packages can be used to fit and use regression models.

In other words, regression estimates the relationship, and the strength of that relationship, between one dependent variable (usually denoted by Y) and one or more other variables (known as independent variables). Typically the dependent variable is modeled with a probability distribution whose parameters are assumed to vary (deterministically) with the independent variables.
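
As a minimal sketch of the definition above, assume a single independent variable x and a straight-line relationship, so that the mean of Y is modeled as intercept + slope * x. The function names below are hypothetical and not taken from any package; the fit uses the standard closed-form least-squares estimates.

```python
def fit_simple_linear_regression(xs, ys):
    """Return (intercept, slope) minimising the squared error of y ~ intercept + slope * x."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need at least two (x, y) pairs of equal length")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of squared deviations of x, and of cross-deviations of x and y.
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    if sxx == 0:
        raise ValueError("all x values are identical; the slope is undefined")
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope


def predict(intercept, slope, x_new):
    """Predicted mean of Y at x_new under the fitted line."""
    return intercept + slope * x_new


# Illustrative, noise-free data generated from y = 1 + 2x (made up for this sketch).
xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [1.0, 3.0, 5.0, 7.0, 9.0]

intercept, slope = fit_simple_linear_regression(xs, ys)
print(intercept, slope)                # 1.0 2.0
print(predict(intercept, slope, 5.0))  # 11.0
```

In practice a library such as statsmodels, scikit-learn, or R's lm/glm would be used instead; the sketch only shows what "fitting" means in the simplest case.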

Tag usage

Questions with this tag should be about implementation and programming problems, not about the statistical or theoretical properties of the technique. Consider whether your question might be better suited to Cross Validated, the Stack Exchange site for statistics and machine learning.

9532 questions
17 votes, 2 answers

Plot logistic regression curve in R

I want to plot a logistic regression curve of my data, but whenever I try to, my plot produces multiple curves. Here's a picture of my last attempt: [image] Here's the relevant code I am using: fit = glm(output ~ maxhr, data=heart,…
cafemolecular
17 votes, 3 answers

ValueError: endog must be in the unit interval

While using statsmodels, I am getting this weird error: ValueError: endog must be in the unit interval. Can someone give me more information on this error? Google is not helping. Code that produced the error: """ Multiple regression with dummy…
Edward Yu
17 votes, 4 answers

Best way to plot interaction effects from a linear model

In an effort to help populate the R tag here, I am posting a few questions I have often received from students. I have developed my own answers to these over the years, but perhaps there are better ways floating around that I don't know about. The…
Jake
16 votes, 5 answers

Weighted logistic regression in Python

I'm looking for a good implementation for logistic regression (not regularized) in Python. I'm looking for a package that can also get weights for each vector. Can anyone suggest a good implementation / package? Thanks!
user5497
16 votes, 2 answers

How to calculate the adjusted R2 value using scikit

I have a dataset for which I have to develop various models and compute the adjusted R2 value of all models. cv = KFold(n_splits=5,shuffle=True,random_state=45) r2 = make_scorer(r2_score) r2_val_score = cross_val_score(clf, x, y,…
Ahamed Moosa
16 votes, 1 answer

Drawing regression line, confidence interval, and prediction interval in Python

I'm new to the regression game and hope to plot a functionally arbitrary, nonlinear regression line (plus confidence and prediction intervals) for a subset of data that satisfies a certain condition (i.e. with mean replicate value exceeding a…
neither-nor
16 votes, 1 answer

Multi-output neural network combining regression and classification

If you have both a classification and regression problem that are related and rely on the same input data, is it possible to successfully architect a neural network that gives both classification and regression outputs? If so, how might the loss…
16 votes, 2 answers

Naive Bayes For Regression

I was wondering whether I can apply naive Bayes to a regression problem, and how it would be done. I have 4096 image features and 384 text features, and it won't be very bad if I assume independence between them. Can anyone tell me how to proceed?
Deven
16 votes, 1 answer

Weighted linear regression with Scikit-learn

My data:

State    N   Var1  Var2
Alabama  23  54    42
Alaska   4   53    53
Arizona  53  75    65

Var1 and Var2 are aggregated percentage values at…
KubiK888
16 votes, 3 answers

How to get R-squared for robust regression (RLM) in Statsmodels?

When it comes to measuring goodness of fit - R-Squared seems to be a commonly understood (and accepted) measure for "simple" linear models. But in case of statsmodels (as well as other statistical software) RLM does not include R-squared together…
Primer
16 votes, 1 answer

Getting statsmodels to use heteroskedasticity corrected standard errors in coefficient t-tests

I've been digging into the API of statsmodels.regression.linear_model.RegressionResults and have found how to retrieve different flavors of heteroskedasticity corrected standard errors (via properties like HC0_se, etc.) However, I can't quite…
sparc_spread
16 votes, 1 answer

Different Robust Standard Errors of Logit Regression in Stata and R

I am trying to replicate a logit regression from Stata in R. In Stata I use the option "robust" to get the robust standard error (heteroscedasticity-consistent standard error). I am able to replicate exactly the same coefficients from Stata, but I…
chl111
16 votes, 3 answers

Trend lines (regression, curve fitting) Java library

I'm trying to develop an application that would compute the same trend lines that Excel does, but for larger datasets. But I'm not able to find any Java library that calculates such regressions. For the linear model I'm using Apache Commons Math,…
Fgblanch
16 votes, 3 answers

geom_smooth on a subset of data

Here is some data and a plot: set.seed(18) data = data.frame(y=c(rep(0:1,3),rnorm(18,mean=0.5,sd=0.1)),colour=rep(1:2,12),x=rep(1:4,each=6)) ggplot(data,aes(x=x,y=y,colour=factor(colour)))+geom_point()+…
Remi.b
16 votes, 4 answers

Only run unit tests whose respective source code has changed?

I am running unit tests and Selenium tests in our Jenkins CI server. As we all know, tests take long to run in a large project. Is there a tool/framework for Java which could only trigger tests whose respective source code has changed? This because…
user1340582