Calculate threshold for vector

Question

I have a vector that I need to convert to a binary vector by thresholding (values above the threshold become 1, values below become 0). The values are either close to zero or far from it, so if I plot the vector, they either lie near the x-axis or shoot up high, and there is a clear gap between the two groups. The values change every time, so I need to calculate the threshold dynamically, and there is no limit on the maximum or minimum values the vector can take. I know that Otsu's method is used for grayscale images, but since the range of values in my vector varies, I don't think I can use it directly. Is there a standard way to calculate a threshold for my case? If not, are there any good workarounds?

Could you post samples of your data (both standard cases and extreme cases) and the relevant code you have got so far, please? — kkuilla, Mar 03 '14 at 09:55
You could specify the proportion of values you want to become 1. For example, for 50% your threshold would be `median(vector)`. — Luis Mendo, Mar 03 '14 at 09:55
Normalize each vector to [0 1] and for thresholding use 0.5? — Divakar, Mar 03 '14 at 10:00
Unless you define how you would like to separate the values in more detail it's arbitrary. But it sounds like a classification problem. Matlab has loads of toolboxes for that. — bdecaf, Mar 03 '14 at 13:43
Ok. For now, I am normalizing the vector and applying otsu's method with an increment of 0.01. It works fine. Thank you — BaluRaman, Mar 04 '14 at 03:51
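
The approach mentioned in the last comment (normalize, then search thresholds in steps of 0.01) might look roughly like the sketch below. This is not the asker's actual code: the sample data and all variable names are made up, and the search simply maximizes the between-class variance, which is the criterion Otsu's method uses, directly on the values rather than on an image histogram.

```matlab
x  = [0.1 0.3 0.2 9 0.4 11 10 0.2];     % made-up sample data
xn = (x - min(x)) / (max(x) - min(x));  % rescale to [0, 1]; assumes max(x) > min(x)

candidates = 0.01:0.01:0.99;            % threshold grid with a 0.01 increment
score = zeros(size(candidates));
for k = 1:numel(candidates)
    isHigh = xn >= candidates(k);       % class 1: at or above the candidate
    w1 = mean(isHigh);                  % fraction of values in class 1
    w0 = 1 - w1;                        % fraction of values in class 0
    if w0 > 0 && w1 > 0
        % between-class variance, the quantity Otsu's method maximizes
        score(k) = w0 * w1 * (mean(xn(isHigh)) - mean(xn(~isHigh)))^2;
    end
end
[~, best] = max(score);                 % first maximum if several candidates tie
t = candidates(best);                   % threshold on the normalized scale
x_bin = xn >= t;                        % gives 0 0 0 1 0 1 1 0 for this data
```

Because the vector is rescaled first, the fixed 0.01 grid works whatever the original range of the data is.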

Luis Mendo · Answer 1 · 2014-03-03T13:31:05.783

I suggest you choose the fraction of values that should become 1 and use the matching percentile as the threshold, computed with the `prctile` function from the Statistics Toolbox. Note that the percentile `p` is the share of values that fall below the threshold, so roughly `100-p` percent of the values become 1:

x = [3 45 0.1 0.4 10 5 6 1.2];
p = 70; %// percentile used as threshold; roughly (100-p)% of values become 1

threshold = prctile(x,p);
x_quant = x>=threshold;

This approach makes the threshold adapt automatically to your values. Since your data are unbounded, percentiles may work better than averages, because a single very large value can pull the mean, and hence the threshold, much further than desired.

In the example, the threshold evaluates to 6.4, so only 45 and 10 are set to 1:

x_quant =

     0     1     0     0     1     0     0     0
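
To illustrate the point about outliers, here is a small sketch comparing a mean-based threshold with a percentile-based one. The data are made up, and `median` (the 50th percentile, available in base MATLAB) stands in for `prctile` so that no toolbox is needed.

```matlab
% Made-up data: values near zero, a few high values, and one extreme outlier
x = [0.1 0.2 0.1 0.3 8 9 10 1000];

t_mean   = mean(x);         % 128.4625, pulled far upward by the single value 1000
t_median = median(x);       % 4.15, does not depend on how large the outlier is

b_mean   = x >= t_mean;     % 0 0 0 0 0 0 0 1 : only the outlier is set to 1
b_median = x >= t_median;   % 0 0 0 0 1 1 1 1 : all four high values are set to 1
```

Replacing 1000 with an even larger value moves `t_mean` further up but leaves `t_median` unchanged.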

score 0 · Answer 2 · answered Mar 03 '14 at 10:00

If the limits don't differ within a single vector and the 0 and 1 values are roughly equally likely, why not simply use the mean of the vector as the threshold?

>> X=[6 .5 .9  3 .4 .6 7]

X =

    6.0000    0.5000    0.9000    3.0000    0.4000    0.6000    7.0000

>> X>=mean(X)

ans =

     1     0     0     1     0     0     1

If the probabilities of ones and zeros differ, you might want to multiply the mean by a factor in the comparison to compensate. Note that this is a very simplistic approach, which can surely be improved to better fit your problem.
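
A minimal sketch of the scaled-mean idea, assuming high values are much more common than low ones. The data and the scale factor `k` are made up for illustration and would need tuning for real data.

```matlab
X = [6 7 5 8 6 7 0.5 6.5];   % made-up data: mostly high values, one low value

plain  = X >= mean(X);       % mean is 5.75, so the high value 5 is set to 0
k      = 0.5;                % hypothetical scale factor; k < 1 lowers the threshold
scaled = X >= k * mean(X);   % threshold 2.875, so only 0.5 is set to 0
```

Here `plain` is `1 1 0 1 1 1 0 1` and `scaled` is `1 1 1 1 1 1 0 1`. When low values dominate instead, a factor `k > 1` may be needed, or the mean may already work as it is.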
