I have a vector that I need to convert to a binary vector by thresholding (above threshold = 1, below = 0). The values in the vector are either close to zero or far from it, so if I plot the vector, the values either lie near the x-axis or shoot up high (there is a clear difference between the two groups). The values change each time, so I need to calculate the threshold dynamically. There is no limit on the maximum or minimum values the vector can take. I know that Otsu's method is used for grayscale images, but since the range of values in my vector varies, I think I cannot use it. Is there a standard way to calculate a threshold for my case? If not, are there any good workarounds?

BaluRaman
  • Could you post samples of your data (both standard cases and extreme cases) and the relevant code you have got so far, please? – kkuilla Mar 03 '14 at 09:55
  • You could specify the proportion of values you want to become 1. For example, for 50% your threshold would be `median(vector)`. – Luis Mendo Mar 03 '14 at 09:55
  • Normalize each vector to [0 1] and for thresholding use 0.5? – Divakar Mar 03 '14 at 10:00
  • Unless you define how you would like to separate the values in more detail it's arbitrary. But it sounds like a classification problem. Matlab has loads of toolboxes for that. – bdecaf Mar 03 '14 at 13:43
  • OK. For now, I am normalizing the vector and applying Otsu's method with an increment of 0.01. It works fine. Thank you. – BaluRaman Mar 04 '14 at 03:51
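
For readers who want to try the Otsu idea from the last comment, here is a minimal sketch. It is not the asker's code: instead of normalizing and stepping the threshold in increments of 0.01, it tries every split point of the sorted values and keeps the one that maximizes the between-class variance, which is the criterion Otsu's method uses. The sample data and all variable names are assumptions made for illustration, and no toolbox is required.

```matlab
% Sample data (assumed for illustration): values near zero plus a few large ones
x = [3 45 0.1 0.4 10 5 6 1.2];

xs = sort(x(:));        % sorted values, as a column
n  = numel(xs);

bestVar   = -Inf;       % largest between-class variance found so far
threshold = xs(1);      % fallback if the vector has a single element

for k = 1:n-1
    w0 = k / n;                 % fraction of values in the low class
    w1 = 1 - w0;                % fraction of values in the high class
    m0 = mean(xs(1:k));         % mean of the low class
    m1 = mean(xs(k+1:n));       % mean of the high class
    v  = w0 * w1 * (m0 - m1)^2; % between-class variance for this split
    if v > bestVar
        bestVar   = v;
        threshold = (xs(k) + xs(k+1)) / 2;  % midpoint between the two classes
    end
end

x_bin = x > threshold;  % logical vector: 1 above the threshold, 0 otherwise
```

Because the split is chosen by comparing variances within the data itself, the selected split does not depend on the absolute range of the values, so this variant should not need the normalization step. For the sample vector above it picks a threshold of 27.5, so only the value 45 becomes 1.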

2 Answers

I suggest you specify the percentage of values that should become 0, and use the corresponding percentile value as the threshold (computed with the `prctile` function from the Statistics Toolbox); the remaining values become 1:

x = [3 45 0.1 0.4 10 5 6 1.2];
p = 70; %// percentile used as threshold: roughly p percent of values become 0

threshold = prctile(x,p);
x_quant = x>=threshold;

This approach makes the threshold adapt automatically to your values. Since your data are unbounded, using percentiles may be better than using averages, because a single very large value can shift an average-based threshold more than desired.

In the example,

x_quant =

     0     1     0     0     1     0     0     0
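
To illustrate the point about averages being sensitive to a single large value, here is a small sketch that compares a mean-based threshold with a median-based one (the median is the 50th percentile and is available in base MATLAB). The data, including the extreme value 1000, are made up for this illustration.

```matlab
% Made-up data: five low values and three high values
x = [0.1 0.4 0.2 8 9 0.3 10 0.5];

% Same data with one extreme value appended (hypothetical outlier)
x_out = [x 1000];

thr_mean   = mean(x_out);    % pulled far upward by the outlier
thr_median = median(x_out);  % stays among the typical values

bin_mean   = x_out > thr_mean;    % only the outlier exceeds the mean
bin_median = x_out > thr_median;  % 8, 9, 10 and 1000 exceed the median

n_mean   = sum(bin_mean);    % number of ones with the mean threshold
n_median = sum(bin_median);  % number of ones with the median threshold
```

With the mean as threshold only the outlier becomes 1, whereas the median still separates the high values 8, 9 and 10 from the low ones.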
Luis Mendo

If the limits don't differ within a single vector and the 0 and 1 values are roughly equally likely, why don't you simply use the mean of the vector as the threshold?

>> X=[6 .5 .9  3 .4 .6 7]

X =

    6.0000    0.5000    0.9000    3.0000    0.4000    0.6000    7.0000

>> X>=mean(X)

ans =

     1     0     0     1     0     0     1

If the probabilities of ones and zeros differ, you might want to multiply the mean by a factor in the comparison to compensate. Note that this is a very simplistic approach, which can surely be improved to better fit your problem.
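
As a rough sketch of the scaled-mean idea, the snippet below multiplies the mean by a factor `c` before comparing. The data and the value `c = 0.3` are assumptions picked for this example; in practice the factor would have to be tuned to your own data.

```matlab
% Made-up data: mostly low values, two moderate ones and one very large one
X = [0.5 0.4 3 0.6 40 0.3 4];

c = 0.3;                          % hypothetical scaling factor, to be tuned

plain_bin  = X >= mean(X);        % plain mean: only the value 40 passes
scaled_bin = X >= c * mean(X);    % scaled mean: 3, 40 and 4 pass
```

Here the single large value drags the plain mean up to about 6.97, so the moderate values 3 and 4 are lost, while the scaled threshold of about 2.09 keeps them.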

ben