I have a function F, [bool] = F(DATASET, tresh1, tresh2), that takes as input a DATASET and some parameters, for example two threshold values, tresh1 and tresh2, and returns a boolean: 1 if the DATASET is "good", 0 otherwise. The answer depends on the values of tresh1 and tresh2, of course.
Suppose I have 100 DATASETs available and I know which ones are good and which are not. I would like to "train" my function F, i.e. find a pair of values tresh1_ and tresh2_ such that F(DATASET, tresh1_, tresh2_) returns true for all (or most of) the "good" DATASETs and false otherwise.
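A minimal sketch of that training step, written MATLAB-style to match the signature above. It assumes (hypothetically) that the 100 datasets live in a cell array `datasets` with a matching logical column vector `labels`, and that plausible bounds for the thresholds are known:

```matlab
% Exhaustive grid search: try every candidate pair (tresh1, tresh2) and
% keep the pair that classifies the most training datasets correctly.
% The [0, 1] ranges and the 50-point resolution are assumptions.
t1_grid = linspace(0, 1, 50);
t2_grid = linspace(0, 1, 50);
best_acc = -Inf;
for i = 1:numel(t1_grid)
    for j = 1:numel(t2_grid)
        pred = false(numel(datasets), 1);
        for k = 1:numel(datasets)
            pred(k) = F(datasets{k}, t1_grid(i), t2_grid(j));
        end
        acc = mean(pred == labels);   % fraction of correct answers
        if acc > best_acc
            best_acc = acc;
            tresh1_ = t1_grid(i);
            tresh2_ = t2_grid(j);
        end
    end
end
```

With only two parameters and 100 datasets, an exhaustive grid like this is usually affordable before reaching for anything fancier.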
I then expect that F(DATASET_, tresh1_, tresh2_), where DATASET_ is a new dataset (different from the previous 100), returns true if DATASET_ really is "good".
I could see this as a clustering-like problem: for every DATASET in the training set I pick random tresh1 and tresh2 values and record which pairs make F return the correct answer and which do not. From those samples I then select a region of the (tresh1, tresh2) plane where the values are "good". Is that a sound method? Are there better ones?
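This is roughly what that idea looks like in code, as a sketch under the same hypothetical assumptions as above (`datasets`, `labels`, and the [0, 1] bounds): sample random pairs, score each by training accuracy, then summarize the high-scoring region:

```matlab
% Random search variant: sample threshold pairs uniformly, score each by
% training accuracy, then take the centre of the near-optimal region.
n_samples = 2000;                        % number of random pairs (an assumption)
t1 = rand(n_samples, 1);
t2 = rand(n_samples, 1);
acc = zeros(n_samples, 1);
for s = 1:n_samples
    pred = false(numel(datasets), 1);
    for k = 1:numel(datasets)
        pred(k) = F(datasets{k}, t1(s), t2(s));
    end
    acc(s) = mean(pred == labels);
end
good = acc >= max(acc) - 0.01;           % pairs within 1% of the best score
tresh1_ = mean(t1(good));                % centre of the "good" region
tresh2_ = mean(t2(good));
```

One caveat with averaging the good pairs: it implicitly assumes the good region is connected and roughly convex. If it is not, the centre can land in a bad spot, so re-checking the training accuracy at (tresh1_, tresh2_) afterwards is worth doing.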
In general, this looks to me like a "parameter calibration problem". Are there classic techniques to solve it?
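Whatever optimizer is used, the classic safeguard against overfitting the thresholds to the 100 known datasets is to calibrate on one subset and evaluate on a held-out subset. A sketch, reusing the hypothetical `datasets` and `labels` from above:

```matlab
% Holdout validation: calibrate thresholds on a training split, then
% measure accuracy on datasets never seen during calibration.
n = numel(datasets);
idx = randperm(n);
n_train = round(0.7 * n);                % a 70/30 split is an arbitrary choice
train_idx = idx(1:n_train);
test_idx  = idx(n_train + 1:end);
% ... run the grid or random search above on datasets(train_idx) only,
%     producing tresh1_ and tresh2_ ...
correct = 0;
for k = test_idx
    correct = correct + (F(datasets{k}, tresh1_, tresh2_) == labels(k));
end
test_acc = correct / numel(test_idx);
fprintf('held-out accuracy: %.2f\n', test_acc);
```

If the held-out accuracy is much lower than the training accuracy, the thresholds are overfitting; k-fold cross-validation is the more thorough version of this same check.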