I have a binary classification problem with an imbalanced dataset. How do I decide between undersampling and oversampling?

  • Welcome to SO, which is about *specific coding* questions; your question is way too broad, please do take some time to read [How to Ask](https://stackoverflow.com/help/how-to-ask) and [What topics can I ask about here?](https://stackoverflow.com/help/on-topic). – desertnaut Oct 18 '19 at 11:11
  • This question would fit better on https://stats.stackexchange.com/. By the way, if your question had a single correct answer, there would be either no oversampling methods or no undersampling methods out there. In other words: post details in your question so that people can answer it, i.e. how imbalanced is the data? How many data points? How many in the minority class, and how many in the majority class? Which ML algorithm will you use? How many features? – Florian H Oct 18 '19 at 11:11
  • @FlorianH In this level of (no) detail, I highly doubt that the question is a good fit even for Cross Validated, where most probably it will also get closed as too broad. – desertnaut Oct 18 '19 at 11:12
  • @desertnaut Yes, I'm with you on that; that's why I said he needs to post more details. But even if he comes up with the details, it's still not going to be a specific programming question, and it will fit better on Cross Validated. – Florian H Oct 18 '19 at 11:15
  • @FlorianH agree, *if*... I'm saying simply that starting a comment as "*This question fits better to CV*" may easily give the wrong impression of "post to CV as-is"; there are already 2 votes for migration there, which IMHO should not be the case (as is, the question is simply too broad, and should be voted for closure as such). – desertnaut Oct 18 '19 at 11:19
  • @desertnaut agree, but I can't edit the comment anymore :) – Florian H Oct 18 '19 at 11:30
  • @FlorianH you can delete & repost - just sayin'... ;) – desertnaut Oct 18 '19 at 11:38
  • @desertnaut Yes, I could, but I think the discussion after that comment clarifies it enough :) – Florian H Oct 18 '19 at 12:02

1 Answer

Try describing your dataset in more detail.
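
As a rough sketch of how you might collect those details, the snippet below counts the samples per class and reports an imbalance ratio. The function name `summarize_imbalance` and the 950/50 example labels are hypothetical, made up purely for illustration; substitute your own label column.

```python
from collections import Counter


def summarize_imbalance(labels):
    """Return basic class-balance statistics for a sequence of class labels.

    Hypothetical helper for illustration only; it is not part of any library.
    """
    counts = Counter(labels)
    if len(counts) < 2:
        raise ValueError("expected at least two classes")

    # most_common() sorts classes from most to least frequent.
    ordered = counts.most_common()
    majority_class, n_major = ordered[0]
    minority_class, n_minor = ordered[-1]
    n_samples = sum(counts.values())

    return {
        "n_samples": n_samples,
        "counts": dict(counts),
        "majority_class": majority_class,
        "minority_class": minority_class,
        # How many majority samples exist per minority sample.
        "imbalance_ratio": n_major / n_minor,
        # Share of the whole dataset that belongs to the minority class.
        "minority_fraction": n_minor / n_samples,
    }


# Made-up labels: 950 negatives and 50 positives.
example_labels = [0] * 950 + [1] * 50
summary = summarize_imbalance(example_labels)
for key, value in summary.items():
    print(f"{key}: {value}")
```

Posting numbers like these, together with the number of features and the algorithm you plan to use, gives people something concrete to answer.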