I have a vector for which I need to calculate a threshold to convert it to a binary vector (above threshold = 1, below = 0). The values of the vector are either close to zero or far from it, so if I plot the vector, the values either lie near the x-axis or shoot up high (there is a clear difference between the two groups). The values in the vector change each time, so I need to calculate the threshold dynamically. There is no limit on the maximum or minimum values the vector can take. I know that Otsu's method is used for grayscale images, but since the range of values in my vector varies, I think I cannot use it. Is there a standard way to calculate a threshold for my case? If not, are there any good workarounds?
-
Could you post samples of your data (both standard cases and extreme cases) and the relevant code you have got so far, please? – kkuilla Mar 03 '14 at 09:55
-
You could specify the proportion of values you want to become 1. For example, for 50% your threshold would be `median(vector)`. – Luis Mendo Mar 03 '14 at 09:55
-
Normalize each vector to [0 1] and for thresholding use 0.5? – Divakar Mar 03 '14 at 10:00
-
Unless you define how you would like to separate the values in more detail it's arbitrary. But it sounds like a classification problem. Matlab has loads of toolboxes for that. – bdecaf Mar 03 '14 at 13:43
-
OK. For now, I am normalizing the vector and applying Otsu's method with an increment of 0.01. It works fine. Thank you – BaluRaman Mar 04 '14 at 03:51
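The approach in the last comment (normalize, then apply Otsu's method in steps of 0.01) could look roughly like the sketch below. This is not the commenter's actual code: the sample data, the variable names, and the choice to evaluate the between-class variance directly on the samples rather than on a histogram are all assumptions made for illustration.

```matlab
x = [3 45 0.1 0.4 10 5 6 1.2];           % example data (assumed)

% Normalize to [0, 1]; assumes x is not constant (max > min).
xn = (x - min(x)) / (max(x) - min(x));

best_var = -Inf;
best_t = 0;
for t = 0:0.01:1                          % candidate thresholds, step 0.01
    hi = xn(xn >= t);                     % values that would become 1
    lo = xn(xn <  t);                     % values that would become 0
    if isempty(hi) || isempty(lo)
        continue;                         % skip splits with an empty class
    end
    w_lo = numel(lo) / numel(xn);
    w_hi = numel(hi) / numel(xn);
    % Between-class variance, the quantity Otsu's method maximizes
    between_var = w_lo * w_hi * (mean(hi) - mean(lo))^2;
    if between_var > best_var
        best_var = between_var;
        best_t = t;
    end
end

x_bin = xn >= best_t;                                 % binary vector
threshold = min(x) + best_t * (max(x) - min(x));      % in original units
```

On this sample the criterion isolates only the largest value (45), so the result depends strongly on how far apart the two groups are. If the Image Processing Toolbox is available, `graythresh` applied to the normalized vector is another option.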
2 Answers
I suggest you specify the percentage of values that will become 1, and use the corresponding percentile value as the threshold (computed with the `prctile` function from the Statistics Toolbox):
x = [3 45 0.1 0.4 10 5 6 1.2];
p = 70; %// percentile used as the threshold: values at or above it (roughly the top 30%) become 1
threshold = prctile(x,p);
x_quant = x>=threshold;
This approach makes the threshold adapt automatically to your values. Since your data are unbounded, using percentiles may be better than using averages, because with the average a single large value can shift your threshold more than desired.
In the example,
x_quant =
0 1 0 0 1 0 0 0

If the limits don't differ within a single vector and the 0 and 1 values are roughly equally likely, why not simply use the mean of the vector as the threshold?
>> X=[6 .5 .9 3 .4 .6 7]
X =
6.0000 0.5000 0.9000 3.0000 0.4000 0.6000 7.0000
>> X>=mean(X)
ans =
1 0 0 1 0 0 1
If ones and zeros are not equally likely, you might want to multiply the mean by a factor in the comparison to compensate. Note that this is a very simplistic approach, which can surely be improved to better fit your problem.
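A minimal sketch of scaling the mean, assuming made-up data in which high values dominate. The factor `k` is hypothetical and would need tuning for your own vectors.

```matlab
X = [8 9 7 0.2 8 9 10];       % made-up data: mostly high values, one low
k = 0.5;                      % hypothetical scale factor, tune for your data

plain  = X >= mean(X);        % unscaled mean as the threshold
scaled = X >= k * mean(X);    % scaled mean as the threshold
```

Here the plain mean (about 7.31) sends the value 7 to 0 even though it clearly belongs to the high group, while the scaled threshold (about 3.66) keeps it at 1.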
