Is it possible to somehow increase my sample size for a logistic regression?

[image]

The red dots are false cases and the green dots are true cases. I want to create more data (let's say 500 points) sampled from the data in the figure. The goal is to make the logistic regression line tilt more to the right, like this: [image]

What is an easy way to do this?

Regards,

Dante

  • You can calculate the average of the set of interest and then generate a set of random data points around it, with their probability of occurrence based on a normal distribution fitted to the data. – Calcium Owl Oct 07 '22 at 11:57
  • This is basically what [SMOTE](https://arxiv.org/abs/1106.1813) does. See the [over-sampling guide](https://imbalanced-learn.org/stable/over_sampling.html) in the `imbalanced-learn` documentation. – Alexander L. Hayes Oct 07 '22 at 12:11
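
The idea from the first comment can be sketched with NumPy alone. This is only a rough illustration, not SMOTE itself: it fits a normal distribution (mean and covariance) to one class and draws new points from it. The `true_cases` array and the `oversample_gaussian` helper are hypothetical stand-ins, since the original data is only available as a figure.

```python
import numpy as np

def oversample_gaussian(points, n_new, seed=0):
    """Draw n_new synthetic points from a normal distribution fitted to `points`."""
    rng = np.random.default_rng(seed)
    points = np.asarray(points, dtype=float)
    mean = points.mean(axis=0)          # centre of the class
    cov = np.cov(points, rowvar=False)  # spread of the class (columns are variables)
    return rng.multivariate_normal(mean, cov, size=n_new)

# Hypothetical stand-in for the green (true) cases: columns are x and y.
true_cases = np.array([
    [120.0, 1.0],
    [150.0, 1.1],
    [135.0, 0.9],
    [160.0, 1.2],
    [145.0, 1.0],
])

synthetic = oversample_gaussian(true_cases, n_new=500)
augmented = np.vstack([true_cases, synthetic])  # original plus synthetic points
print(augmented.shape)  # (505, 2)
```

Points generated this way follow the shape of the existing class, so they reinforce the pattern already in the data rather than adding new information.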

1 Answer

There are multiple ways of doing it, but the quickest one, I would say, is to generate data with random x (between 25 and 200) and y (between 0.5 and 1.5). It's definitely not a clean way to do it, but it is fast.
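
A minimal sketch of that approach, assuming NumPy and a uniform distribution over the two ranges (the answer does not say which distribution to use, and the variable names are made up for illustration):

```python
import numpy as np

rng = np.random.default_rng(seed=0)  # fixed seed so the result is reproducible
n_new = 500                          # number of points asked for in the question

# Uniform draws over the ranges named in the answer.
x_new = rng.uniform(25, 200, size=n_new)
y_new = rng.uniform(0.5, 1.5, size=n_new)

# One row per generated point: column 0 is x, column 1 is y.
new_points = np.column_stack([x_new, y_new])
print(new_points.shape)  # (500, 2)
```

These rows can then be appended to the existing true cases before refitting the model.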

iven