Questions tagged [regression]

Regression analysis is a collection of statistical techniques for modeling and predicting one or more variables based on other data.

Wiki

Regression is a common applied statistical technique and a cornerstone of machine learning. Various algorithms and software packages can be used to fit and use regression models.

In other words, regression is a statistical method for estimating the strength and form of the relationship between one dependent variable (usually denoted by Y) and one or more independent variables. Typically, the dependent variable is modeled with a probability distribution whose parameters are assumed to vary deterministically with the independent variables.
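As a minimal sketch of that idea, assume the simplest case: Y is normally distributed with mean a + b*x, so the mean is the parameter that varies with the single independent variable x. Under that assumption, ordinary least squares gives the estimates of a and b. The function names and the data below are made up for illustration and use only the Python standard library; they are not taken from any particular regression package.

```python
from statistics import mean


def fit_simple_linear_regression(x, y):
    """Return (intercept, slope) minimizing squared error for y ~ intercept + slope * x."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    x_bar, y_bar = mean(x), mean(y)
    # Sum of squared deviations of x, and cross-deviations of x and y.
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    if sxx == 0:
        raise ValueError("x must contain at least two distinct values")
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    return intercept, slope


def predict(intercept, slope, x_new):
    """Predicted mean of Y for each value in x_new."""
    return [intercept + slope * xi for xi in x_new]


# Hypothetical data lying exactly on the line y = 1 + 2x.
x = [0.0, 1.0, 2.0, 3.0]
y = [1.0, 3.0, 5.0, 7.0]
intercept, slope = fit_simple_linear_regression(x, y)
```

In practice a library routine (for example `lm()` or `glm()` in R) would normally be used instead; the sketch only shows what such a fit computes in the one-predictor case.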

Tag usage

Questions with this tag should be about implementation and programming problems, not about the statistical or theoretical properties of the technique. Consider whether your question might be better suited to Cross Validated, the Stack Exchange site for statistics and machine learning.

9532 questions
2 votes, 2 answers

Apply glm() by filtering a column by its value in R

I have a dataframe with what I'll call a dependent variable, various independent variables (indicators) and a filtering variable. My goal is to run regressions by filtering different categories in my filtering variable. For example, if I want to run…
rg4s • 811 • 5 • 22
2 votes, 2 answers

Using a loop to run a regression using different datasets in R?

I have the following dataset: n <- 2 strata <- rep(1:4, each=n) y <- rnorm(n = 8) x <- 1:8 df <- cbind.data.frame(y, x, strata) I want to perform the following processes using a loop data_1 <- subset(df, strata == 1) data_2 <- subset(df, strata ==…
Muhammad Kamil • 635 • 2 • 15
2 votes, 0 answers

Get Altair regression parameters

I am trying to get the Altair regression line parameters in a variable, but I can't quite figure out how to do it. I managed to show them on my plot, but I can't access them in any other way. I read this post, but after trying to use the…
Eboyer • 85 • 1 • 6
2 votes, 1 answer

Having trouble with overfitting in simple R logistic regression

I am a newbie to R and I am trying to perform a logistic regression on a set of clinical data. My independent variables are AGE, TEMP, WBC, NLR, CRP, PCT, ESR, IL6, and TIME. My dependent variable is the binary outcome CRKP. After using glm.fit, I was given…
2 votes, 1 answer

How to repeat univariate regression and extract P values?

I am using lapply to perform several glm regressions on one dependent variable, one independent variable at a time, but I'm not sure how to extract the P values for all of them at once. There are 200 features in my dataset, but the code below only gave me the P…
Shykp • 35 • 4
2 votes, 0 answers

XGBoostRegressor outputs warning about parameters that might not be used

I'm building a model with XGBRegressor with multiple outputs: from xgboost import XGBRegressor from sklearn.multioutput import MultiOutputRegressor XBC = XGBRegressor(booster = 'gbtree',learning_rate…
2 votes, 1 answer

Stats Models out of sample prediction of new data where features have been transformed

I'm puzzled as to why I'm unable to arrive at the same values the model is predicting. Consider the model below. I'm trying to understand the relationship between insurance charges, age, and whether a client is a smoker. Notice age variable…
Francisco • 492 • 4 • 19
2 votes, 1 answer

Regression in data frame in R

Hey I have following test data: test = data.frame(Date = c(as.Date("2010-10-10"), as.Date("2010-10-10"), as.Date("2010-12-10"), as.Date("2010-12-10")), Rate = c(0.1, 0.15, 0.16, 0.2), FCF = c(100,200,150,250)) Now I want to group the data by Date,…
2 votes, 1 answer

Convert coefficients from log odds to marginal effects with imputed data in R

I am using multiple imputation on missing data and then using the pool_mi function to get coefficients. Since my data is clustered, I also had to calculate cluster-robust SEs for my regression model using the lm.cluster function. However the output for…
2 votes, 0 answers

Error in cbind2(1, newx) %*% nbeta : Cholmod error- Lasso Regression Error in R

I'm working on a prediction problem using Lasso regression. I have a total of 137 training observations and 100000 test observations for predicting total revenue. To build the model, I split the training data into train and test sets (train = 96, test = 97-137). When I ran the lasso…
Ekram Diab • 21 • 1
2 votes, 2 answers

How to keep just one variable in stargazer regression output? (opposite of "omit")

Does anyone know what could be the opposite of stargazer's argument "omit" when making a regression table output? I'm trying to show just one (or a few) covariates from a regression. I know one could use "omit" and then list all the variables' names…
2 votes, 1 answer

making predictions from 2D data

I am working on making predictions from 2D data. Data size is 7640x200x2; for each 200x2 matrix, I want a 2x1 array predicted from it. I am a beginner, and I am confused about how to build a useful model. I have tried a cnn+lstm model, but the result was…
2 votes, 1 answer

Boundary (singular) fit in lmer

I know this error has already been raised on Stack Overflow, but the solutions to the other questions don't seem to apply to my problem. I have a very simple model that predicts energy expenditure based on the number of days. a<-lmer(energy ~ days…
JMarcelino • 904 • 4 • 13 • 30
2 votes, 1 answer

Exponential regression using nls

I am having trouble fitting an exponential curve to my data. Here's my code: x<-c(0.134,0.215,0.345,0.482,0.538,0.555) y<-c(0,0,0.004,0.291,1.135,1.684) plot(x,y) estimates<- list(b1 = 0.1, b2 = 5e-7) nlfit <- nls(y ~ b1 * (exp(x/b2)-1),…
M00nQ • 23 • 3
2 votes, 1 answer

GEKKO - parameter estimation with custom objective function - error code -13

I have been successful in performing a steady-state parameter estimation employing the same techniques presented in the Gekko tutorials (linear and non-linear regression). Below is the code: # -*- coding: utf-8 -*- """ Spyder Editor This is a…