
How to determine the best threshold value for a deep learning model

I am working on predicting epileptic seizures using a CNN, and I want to determine the best classification threshold for my model in order to get the best results.

I have been trying for more than two weeks to find out how to do this.

Any help would be appreciated.

Code:

history = model.fit_generator(
    generate_arrays_for_training(indexPat, filesPath, end=75),
    validation_data=generate_arrays_for_training(indexPat, filesPath, start=75),
    steps_per_epoch=int(len(filesPath) - int(len(filesPath) / 100 * 25)),
    validation_steps=int(len(filesPath) - int(len(filesPath) / 100 * 75)),
    verbose=2,
    epochs=50,
    max_queue_size=2,
    shuffle=True,
    callbacks=[callback, call],
)
Eda
  • what threshold? – Proko Jun 14 '20 at 18:22
  • @Proko A classification (decision) threshold: for example, a value above the threshold indicates "spam" and a value below indicates "not spam". – Eda Jun 14 '20 at 18:27

2 Answers


In general, choosing the right classification threshold depends on the use case. Remember that choosing the threshold is not part of hyperparameter tuning: it is set after training, yet its value greatly affects how the model behaves once deployed.

If you increase it, you require the model to be very confident in its predictions, which filters out false positives: you are targeting precision. This is appropriate when your model is part of a mission-critical pipeline where a decision made on a positive output is costly (in money, time, human resources, computational resources, etc.).

If you decrease it, the model will label more examples as positive, letting you catch more potentially positive examples (you target recall). This matters when a false negative is disastrous, e.g. in medical cases: you would rather examine a low-probability patient for cancer than ignore him and find out later that he was indeed sick.

For more examples, please see When is precision more important than recall?

Now, choosing between recall and precision is a trade-off, and you have to decide based on your situation. Two tools to help you are the ROC curve and the precision-recall curve (see How to Use ROC Curves and Precision-Recall Curves for Classification in Python), which show how the model trades off false positives and false negatives as the classification threshold varies.
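As a sketch of the ROC-curve approach: one common heuristic is to pick the threshold that maximises Youden's J statistic (J = TPR - FPR). The labels and scores below are synthetic stand-ins; in practice you would use your validation labels and the CNN's predicted probabilities.

```python
import numpy as np
from sklearn.metrics import roc_curve

# Synthetic stand-in data: positives tend to score higher than negatives.
rng = np.random.default_rng(0)
y_true = rng.integers(0, 2, size=1000)
y_score = np.clip(0.3 * rng.normal(size=1000) + 0.5 * y_true + 0.25, 0, 1)

# roc_curve returns one (fpr, tpr) point per candidate threshold.
fpr, tpr, thresholds = roc_curve(y_true, y_score)

# Youden's J picks the point furthest above the diagonal.
best = int(np.argmax(tpr - fpr))
print("best threshold:", thresholds[best])
```

The same idea works with `precision_recall_curve` if you care about the precision/recall balance rather than TPR/FPR.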

Proko

Many ML algorithms predict a score for class membership, which must be interpreted before it can be mapped to a class label. You achieve this with a threshold, such as 0.5, whereby values greater than or equal to the threshold are mapped to one class and the rest to the other.

Class 0 = Prediction < 0.5; Class 1 = Prediction >= 0.5
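A minimal sketch of this mapping, with scores at or above the threshold assigned to the positive class (the probabilities here are made up for illustration):

```python
import numpy as np

probs = np.array([0.1, 0.4, 0.5, 0.9])  # predicted probabilities
labels = (probs >= 0.5).astype(int)     # threshold at 0.5
print(labels)  # [0 0 1 1]
```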

It's crucial to find the best threshold value for the problem you're working on rather than simply assuming a default classification threshold such as 0.5.

Why? The default threshold often results in poor performance on classification problems with severe class imbalance.
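A quick illustration of this failure mode, using synthetic data with 5% positives whose scores rarely reach 0.5: the default threshold misses most true positives, while a lower one recovers them.

```python
import numpy as np
from sklearn.metrics import recall_score

# Synthetic imbalanced data: ~5% positives, noisy scores that peak below 0.5.
rng = np.random.default_rng(2)
y_true = (rng.random(2000) < 0.05).astype(int)
y_score = np.clip(0.1 * rng.normal(size=2000) + 0.3 * y_true + 0.1, 0, 1)

for t in (0.5, 0.3):
    recall = recall_score(y_true, (y_score >= t).astype(int))
    print(f"threshold {t}: recall {recall:.2f}")
```

Here the lower threshold yields far higher recall, which is exactly the behaviour you want to tune for rather than assume.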

ML thresholds are problem-specific and must be fine-tuned. You can read a short article about it here.

One of the best ways to get the best results from your deep learning model is to tune the threshold used to map predicted probabilities to class labels.

The best threshold for the CNN can be computed directly from ROC curves and precision-recall curves. In some cases, you can use a grid search to fine-tune the threshold and find the optimal value.
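A sketch of that grid search, scoring each candidate threshold with F1 (the data is synthetic; for your CNN, `y_score` would come from `model.predict` on a held-out validation set):

```python
import numpy as np
from sklearn.metrics import f1_score

# Synthetic stand-in for validation labels and predicted probabilities.
rng = np.random.default_rng(1)
y_true = rng.integers(0, 2, size=500)
y_score = np.clip(0.3 * rng.normal(size=500) + 0.5 * y_true + 0.25, 0, 1)

# Evaluate F1 at each candidate threshold and keep the best one.
thresholds = np.arange(0.05, 0.95, 0.05)
scores = [f1_score(y_true, (y_score >= t).astype(int)) for t in thresholds]
best_t = thresholds[int(np.argmax(scores))]
print(f"best threshold by F1: {best_t:.2f}")
```

Any metric that matches your goal (F-beta, balanced accuracy, Youden's J) can replace F1 in the loop.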

The code below will help you check which option gives the best results. GitHub link:

# `ds` is a deepchecks Dataset wrapping your data and labels,
# and `clf` is your trained model.
from deepchecks.checks.performance import PerformanceReport

check = PerformanceReport()
check.run(ds, clf)
Norman97