Questions tagged [regression]

Regression analysis is a collection of statistical techniques for modeling and predicting one or more variables based on other data.

Wiki

Regression is a common applied statistical technique and a cornerstone of machine learning. Various algorithms and software packages can be used to fit and use regression models.

In other words, regression is a statistical method for estimating the relationship between a dependent variable (usually denoted by Y) and one or more other variables (known as independent variables). Typically the dependent variable is modeled with a probability distribution whose parameters are assumed to vary (deterministically) with the independent variables.
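As a minimal sketch of the definition above, the following fits an ordinary least squares line with NumPy. The data are synthetic and the true coefficients (2.0 and 0.5) and noise level are assumptions chosen purely for illustration: Y is modeled as normally distributed, with a mean that varies deterministically with x.

```python
import numpy as np

# Synthetic data (hypothetical): the mean of y is 2.0 + 0.5 * x,
# with Gaussian noise of standard deviation 1.0 around that mean.
rng = np.random.default_rng(0)
x = rng.uniform(0.0, 10.0, size=200)
y = 2.0 + 0.5 * x + rng.normal(0.0, 1.0, size=200)

# Design matrix with an intercept column followed by the predictor.
X = np.column_stack([np.ones_like(x), x])

# Ordinary least squares: choose coefficients minimizing squared residuals.
coef, *_ = np.linalg.lstsq(X, y, rcond=None)
intercept, slope = coef

# Fitted conditional mean of y, and the estimated noise standard deviation
# (two degrees of freedom are used by the intercept and slope).
y_hat = X @ coef
residual_std = np.sqrt(np.sum((y - y_hat) ** 2) / (len(y) - 2))

print(f"intercept={intercept:.3f} slope={slope:.3f} sigma={residual_std:.3f}")
```

The estimated intercept and slope should land close to the values used to generate the data. Dedicated packages such as statsmodels or scikit-learn wrap the same idea and add inference and diagnostics.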

Tag usage

Questions with this tag should be about implementation and programming problems, not about the statistical or theoretical properties of the technique. Consider whether your question might be better suited to Cross Validated, the Stack Exchange site for statistics and machine learning.

9532 questions
2
votes
1 answer

Plotting Polynomial Regression Curves in R

I have run polynomial regressions on the data that I am including from Quadratic to Septic but I am stuck trying to plot these regression curves on my scatter plot. I am asking for help creating code that will work for each polynomial order. Time <-…
Ben G.
  • 35
  • 4
2
votes
0 answers

Using geom_dl to label multiple geom_smooth plots

I have a dataset of all football (soccer) transfers made in the last fifteen seasons. I am doing a regression analysis with the placement of the teams. I am plotting the lines via geom_smooth for all of the five respective leagues. I have chosen to…
Antlserum
  • 21
  • 1
2
votes
1 answer

How to effectively use Categorical Variables in a regression?

I'm trying to understand how to use categorical variables in a linear regression in R. I have some insurance data that has a categorical variable of coverage type (basic, extended and premium). When I run a simple linear…
Josh Ortega
  • 179
  • 8
2
votes
2 answers

Keras Sequential Model Non-linear Regression Model Bad Prediction

To test a nonlinear sequential model using Keras, I made some random data x1,x2,x3 and y = a + b*x1 + c*x2^2 + d*x3^3 + e (a,b,c,d,e are constants). Loss is getting low really quickly but the model actually predicts a pretty wrong number. I've done…
2
votes
1 answer

Non linear regression using Xgboost

I have a dataframe with 36540 rows. The objective is to predict y HITS_DAY. #data https://github.com/soufMiashs/Predict_Hits I am trying to train a non-linear regression model but the model doesn't seem to learn much. X_train, X_test, y_train, y_test =…
SoufianeS
  • 59
  • 1
  • 9
2
votes
1 answer

How do I make a year index that statsmodels vector Autoregression can recognize?

I am struggling to make statsmodels.tsa.api.VAR recognize my index as having an annual frequency. I have a data frame that is a panel, with a country (panel dimension) and a year variable (time dimension): df = df.set_index(['country', 'year']) Then I…
Jorge Alonso
  • 103
  • 11
2
votes
2 answers

TypeError: unsupported operand type(s) for -: 'str' and 'int' in PyCaret regression

I read multiple available questions about this topic, but still do not understand my problem. I am trying to build a regression, using PyCaret: from pycaret.regression import * fooPy = setup(data = foo, target = 'pts', session_id = 123) I receive…
Anakin Skywalker
  • 2,400
  • 5
  • 35
  • 63
2
votes
2 answers

Return regression line for all groups in ggplot scatterplot

I'm creating a scatterplot in ggplot where I am classifying the points based on company point. I would like to add a single trend line which shows the regression of all points. However, when I add geom_smooth() it adds a trend line for each class.…
Danny
  • 554
  • 1
  • 6
  • 17
2
votes
0 answers

Using tensorflow.data to generate dataset of images and multiple labels

I am trying to train a neural network to draw a bounding box around an object. I have generated the data myself, 256x256 rgb images and five labels per image (two corners of bounding box + a rotational component). In order to not run out of memory…
2
votes
1 answer

Panel regression gives error "exog does not have full column rank"

I am trying to estimate a panel regression (see: https://bashtage.github.io/linearmodels/doc/panel/examples/examples.html). My data is formatted like this (that's just an example snippet; in the original file there are 11 columns plus the timestamp and…
CSBossmann
  • 193
  • 2
  • 11
2
votes
1 answer

Robust linear regression with scipy?

Is there a function in scipy for doing robust linear regression? My current solution: slope, intercept, r_value, p_value, std_err = stats.linregress(income, exp)
walter
  • 51
  • 1
  • 3
2
votes
0 answers

xgboost regression predictions

I have a logistic regression xgboost model trained with the following hyperparameters (obtained with a grid search) in Python: Hyperparams selected {'gamma': 0, 'learning_rate': 0.1, 'max_depth': 3, 'min_child_weight': 1, 'n_estimators': 125} This…
2
votes
1 answer

Constraining OLS (or WLS) coefficients using statsmodels

I have a regression of the form model = sm.GLM(y, X, w = weight), which ends up being a simple weighted OLS. (Note that specifying w as the error weights array actually works in sm.GLM identically to sm.WLS despite it not being in the…
wj wj
  • 75
  • 6
2
votes
1 answer

How can I generate a random n-dimensional dataset for my regression task?

I need to generate a random n-dimensional dataset having m tuples. The first four dimensions are expected to be correlated with the ground truth vector y and the remaining ones are to be arbitrarily generated. I will use the dataset for my…
2
votes
1 answer

fbprophet yearly seasonality volatility

I am new to using fbprophet and have a question about using the predict function. As an example, I am using fbprophet to extrapolate Apple's revenue for the next 5 years. Below is the code using the default settings. m =…