
Consider the following example:

a=rand(5,1)
b=rand(5,1);
bs=sum(b);
B=b./bs;
cB=cumsum(B)

%OUTPUT

a =

0.7803
0.3897
0.2417
0.4039
0.0965


cB =

0.0495
0.4030
0.7617
0.9776
1.0000

Now I want the position of the number in cB that is immediately greater than each number in a. That is, I want 5 positions, one for each number in a. So my output should be

P= [4;2;2;3;2]

Please help.

Amro
Misha
  • Bit late, but what is the definition of 'immediately greater'? The first number that is greater, or the smallest number that is greater? Or can `cB` be assumed to be strictly increasing? – Dennis Jaheruddin Sep 16 '13 at 15:41

2 Answers

5

The other suggestions are decent, but both miss the point: they are inefficient for large problems. This is a job best done by histc. (I admit that histc is not obviously the tool you would look for to solve this problem. I wish they had chosen a more obvious name, because few people seem to know about it. histc is used for histogramming, but also for the evaluation of splines.)

For your test case...

a = [0.7803 0.3897 0.2417 0.4039 0.0965];
cB = [0.0495 0.4030 0.7617 0.9776 1.0000];

[~,b] = histc(a,cB);
b = b + 1
b =
     4     2     2     3     2

histc returns, for each target, the index of the element just UNDER (or equal to) it, so you need to add 1.
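To make the +1 step and its limits concrete, here is a minimal sketch using the same cB. The probe values in `t` are assumptions chosen only for illustration: one below cB(1), one exactly equal to an element, and one above cB(end). The last case is worth a guard, because histc reports 0 for anything outside the edges, whether it lies below or above them.

```matlab
cB = [0.0495 0.4030 0.7617 0.9776 1.0000];
t  = [0.01 0.4030 1.5];      % hypothetical probes: below, equal to cB(2), above

[~, k] = histc(t, cB);       % k(i) = last index with cB(k) <= t(i); 0 if outside the edges
pos = k + 1;                 % index of the next larger element

% A probe at or above cB(end) has no strictly greater element,
% so mark it instead of returning a misleading index.
pos(t >= cB(end)) = NaN;
```

A probe below cB(1) yields k = 0 and therefore pos = 1, which is the desired answer. A probe equal to an element is assigned the following index, so "greater" here means strictly greater.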

Edit:

Eitan points out that IF cB is not monotonic, then there are problems. However, there are problems with ANY solution in that case, since the solution will not be unique. Without more information, such as whether you want the first or the last qualifying index, there is no valid answer to the problem for completely general cB. For example, if we had:

cB = [1 3 2 4];
a = 2.5;

There are two solutions one might arrive at: an index of either 2 or 4. Note that I have had to provide solutions to exactly this problem for clients in the past, long before histc was available in MATLAB. For example, in spline codes, a common problem is locating the knot interval a point falls in. In that case, of course, the knots must be in sorted order. (I'll ignore the issue of replicated breaks.)

There is also a case in spline codes where the bin edges are not sorted: finding the inverse value of a spline, which need not be monotonic at all. In that case, it may well be appropriate to solve for the rightmost solution. Only the customer can make that decision.
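The ambiguity can be sketched directly with find. The variable names below are illustrative, and the third reading (smallest value that exceeds a, regardless of position) is included only as one more possible interpretation, not as the intended one.

```matlab
cB = [1 3 2 4];
a  = 2.5;

firstIdx = find(cB > a, 1, 'first');   % leftmost element greater than a
lastIdx  = find(cB > a, 1, 'last');    % rightmost element greater than a

% Another reading: the smallest value that exceeds a, wherever it sits
cand = find(cB > a);                   % all qualifying indices
[~, j] = min(cB(cand));
smallestIdx = cand(j);
```

Here firstIdx is 2 and lastIdx is 4, so the answer depends entirely on which convention is wanted. For a strictly increasing cB all three readings coincide.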

Since cB was generated in the example to be strictly monotone, I can only presume that monotonicity is part of the assumptions for this question.
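If that assumption is in doubt, it can be checked before calling histc. This is a minimal sketch; the variable names `isStrict` and `isNonDec` are made up for illustration.

```matlab
cB = [0.0495 0.4030 0.7617 0.9776 1.0000];

d = diff(cB);               % successive differences
isStrict = all(d > 0);      % strictly increasing: the question is well posed
isNonDec = all(d >= 0);     % non-decreasing: ties allowed, edges still sorted
```

For the cB in the question both flags are true. A cumsum over weights that include zeros would give isNonDec true but isStrict false.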

  • +1: Great approach, but you have to make sure that `cB` is non-decreasing. While it is so in the given example, I'm not sure it is the general case, and it has to be verified in case it is not. – Eitan T Sep 15 '13 at 16:20
  • Yes, a monotonic cB is necessary, but since it was constructed from cumsum, it can only fail to be monotonic if some of the elements were zero or negative. –  Sep 15 '13 at 16:21
  • I'm not sure the generation of `cB` reflects the general case. For the sake of completeness, I believe this should be addressed by the answer. – Eitan T Sep 15 '13 at 16:24
  • The problem is, if the cB vector is unsorted, then the solution need not be unique. Only if cB is strictly monotonic is the question well posed without providing more information. –  Sep 15 '13 at 16:29
  • The solution should be unique if all members in `cB` are unique. The monotonicity of `cB` is a requirement of `histc` alone... – Eitan T Sep 15 '13 at 16:32
  • +1 for large vectors, `bsxfun` will build a huge intermediate matrix. This is definitely the way to go (assuming the input is well behaved of course) – Amro Sep 15 '13 at 16:33
  • @EitanT - No. The solution is NOT unique for general cB. cB = [1 3 2 4], a = 2.5. Which element of cB is immediately greater than 2.5? It is either the element at index 2 or 4, but both are valid choices in this vector. –  Sep 15 '13 at 16:39
  • @EitanT - I will point out that for at least one client in my career, their choice for this problem was indeed the LAST solution, not the first. –  Sep 15 '13 at 16:41
  • @woodchips Perhaps we interpret the question differently. I was under the impression that "_the number in_ `cB` _which is immediately greater than the number in_ `a`" means "the smallest number in `cB` which is greater than the number in `a`" (implying that order is not important). – Eitan T Sep 15 '13 at 16:43
  • @EitanT - but I'll claim that is strictly your presumption, something that you have added to the problem without any inference at all that it is true. My argument is that since cB was clearly created to be monotonic, the implication is of monotonicity. –  Sep 15 '13 at 16:57
  • @EitanT - I should also note that the solutions posed by Amro ALSO fail for non-monotonic cB. Perhaps you should be having this strenuous argument with him too? (Test out my example on the bsxfun solution of his.) –  Sep 15 '13 at 17:00
  • @woodchips At the time of reading the question, the wording seemed pretty clear. Apparently it is not, if you infer otherwise, so discussing it any further is futile :) and who's arguing? ;) – Eitan T Sep 15 '13 at 17:08
  • @EitanT - my point is still, why have you not been arguing this with Amro? His solutions also fail on non-monotone cB. –  Sep 15 '13 at 17:13
  • @woodchips Indeed they fail, but I haven't been aware of that until we descended into the discussion about monotonicity. – Eitan T Sep 15 '13 at 17:16
4

Try this (with `a` and `cB` as column vectors, as in the question):

pos = sum(bsxfun(@le, cB, a')) + 1

Another one (equivalent to a loop):

pos = arrayfun(@(x) find(x < cB, 1, 'first'), a)
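As a sketch of what the bsxfun one-liner computes, here it is spelled out for the question's column vectors. The names `M` and `pos2` are illustrative. The second form forces the shapes with `(:)` so that it does not depend on whether the inputs are rows or columns, which is an addition for illustration rather than part of the original one-liner.

```matlab
a  = [0.7803; 0.3897; 0.2417; 0.4039; 0.0965];
cB = [0.0495; 0.4030; 0.7617; 0.9776; 1.0000];

% M(i,j) is true when cB(i) <= a(j); size is numel(cB)-by-numel(a)
M = bsxfun(@le, cB, a');
pos = sum(M, 1) + 1;        % count the elements <= a(j), then step to the next one

% Orientation-independent variant: force column and row shapes explicitly
pos2 = sum(bsxfun(@le, cB(:), a(:).'), 1) + 1;
```

The intermediate matrix has numel(cB) * numel(a) entries, so memory grows quickly for large inputs. Like the histc approach, the counting trick relies on cB being sorted.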
Amro