I can't decide how to balance my dataset of "distress situations", since the real-world rate of distress isn't something that can be measured the way "the percentage of rotten apples in a factory" can.
For now, I've chosen to just use a 50%-50% split of distress voice snippets and random non-distress snippets.
I'd be glad for some advice from the community: what are the best practices in this situation? I chose the 50-50 approach to avoid statistical bias, and I'm using a Sequential (Keras) model.
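
To make the 50-50 setup concrete, here is a minimal sketch of one way such a split could be built, by randomly undersampling the larger class. The function name `balance_50_50`, the file names, and the class sizes (30 distress, 200 non-distress) are all hypothetical placeholders, not my actual data or pipeline.

```python
import random


def balance_50_50(distress, non_distress, seed=0):
    """Return shuffled (snippet, label) pairs with equal counts per class.

    Label 1 = distress, label 0 = non-distress.
    """
    rng = random.Random(seed)
    # The smaller class determines how many snippets each class contributes.
    n = min(len(distress), len(non_distress))
    # Randomly undersample each class down to n items (no replacement).
    pos = rng.sample(distress, n)
    neg = rng.sample(non_distress, n)
    pairs = [(s, 1) for s in pos] + [(s, 0) for s in neg]
    # Shuffle so the two classes are interleaved before training.
    rng.shuffle(pairs)
    return pairs


# Hypothetical file names standing in for real audio snippets.
distress = [f"distress_{i}.wav" for i in range(30)]
non_distress = [f"other_{i}.wav" for i in range(200)]

balanced = balance_50_50(distress, non_distress)
```

Note that undersampling like this throws away most of the non-distress snippets (170 of the 200 in this made-up example), which is part of why I'm unsure whether 50-50 is the right choice.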