2

I have a vector a=[1 2 3 1 4 2 5]'

I am trying to create a new vector that gives, for each row, the occurrence number of that element in a. For instance, with this vector, the result would be [1 1 1 2 1 2 1]': the fourth element is 2 because it is the second time that 1 appears.

The only way I can see to achieve this is to create a zero vector c with one row per unique element (here: c = [0 0 0 0 0] because I have 5 unique elements), plus a zero vector d of the same length as a. Then, going through the vector a, I add one to the row of c that corresponds to the element being read and copy that updated count into the current row of d.
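A minimal sketch of the approach described above. It assumes the values of a are not necessarily 1..k, so it first maps them to consecutive indices with unique; the name idx is illustrative and not part of the original description.

```matlab
a = [1 2 3 1 4 2 5]';

% Map each value to an index 1..k, where k is the number of unique values
[~, ~, idx] = unique(a);

c = zeros(max(idx), 1);   % running count per unique value
d = zeros(size(a));       % occurrence number for each row of a

for i = 1:numel(a)
    c(idx(i)) = c(idx(i)) + 1;   % one more occurrence of this value
    d(i) = c(idx(i));            % record which occurrence this is
end
```

For the example vector, d comes out as [1 1 1 2 1 2 1]'.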

Can anyone think of something better?

teaLeef

5 Answers

9

This is a nice way of doing it:

C=sum(triu(bsxfun(@eq,a,a.')))

My first suggestion was this not-very-nice for loop:

F = zeros(1, length(a));           % preallocate the result
for i=1:length(a)
    F(i)=sum(a(1:i)==a(i));        % how often a(i) occurs in a(1:i)
end
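To see why the triu one-liner works, it may help to split it into steps on the question's example. Entry (i,j) of the comparison matrix is true when a(i) == a(j); keeping the upper triangle leaves only pairs with i <= j, so each column sum counts how often a(j) has appeared up to and including position j. The names E and U are illustrative only.

```matlab
a = [1 2 3 1 4 2 5]';

E = bsxfun(@eq, a, a.');   % E(i,j) is true where a(i) == a(j)
U = triu(E);               % keep only pairs with i <= j
C = sum(U);                % column sums, returned as a row vector
```

Here C is the 1-by-7 row [1 1 1 2 1 2 1]; use C.' if a column is needed. Note that E has numel(a)^2 entries, which is where the memory cost comes from.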
David
  • This is a for loop indeed but this is nice and much simpler than what I do! – teaLeef Oct 30 '13 at 23:27
  • I added a different method which might be more efficient. – David Oct 30 '13 at 23:30
  • +1 for using `triu`. Very clever! You could even turn it into a one-liner if you skipped B. – nispio Oct 30 '13 at 23:30
  • @nispio Good idea, nothing like a one-liner! I will edit my answer! – David Oct 30 '13 at 23:33
  • +1 Very nice. Just a quibble: you are taking the complex conjugate of `a` (with `'`) – Luis Mendo Oct 30 '13 at 23:45
  • 1
    @Luis That's true! I haven't seen that before! Thanks. So as I understand it, using `'` gives the complex conjugate transpose, and `.'` is a "normal" transpose then. I will update my answer. – David Oct 30 '13 at 23:48
  • Caution with this solution - if a is big you will get "??? Error using ==> bsxfun. Out of memory. Type HELP MEMORY for your options." –  Oct 31 '13 at 04:34
  • @chappjc _Everybody_ ignores that detail. It has become a kind of pet peeve for me :-) – Luis Mendo Oct 31 '13 at 10:26
4

This does what you want, without loops:

m = max(a);
aux = cumsum([ ones(1,m); bsxfun(@eq, a(:), 1:m) ]);
aux = (aux-1).*diff([ ones(1,m); aux ]);
result = sum(aux(2:end,:).');
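A simplified restatement of the same idea, in case the lines above are hard to follow. This is a sketch and not the code above verbatim: it assumes a contains positive integers, and the names hits and counts are illustrative.

```matlab
a = [1 2 3 1 4 2 5]';
m = max(a);

hits = bsxfun(@eq, a(:), 1:m);        % hits(i,k) is true where a(i) == k
counts = cumsum(hits);                % counts(i,k): occurrences of k in a(1:i)
result = sum(counts .* hits, 2).';    % keep only the count at each row's own value
```

Each row of hits has exactly one true entry, so the row sum picks out the running count for that row's value.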
Luis Mendo
4

My first thought:

% Assumes a is a column vector of positive integers
M = cumsum(bsxfun(@eq,a,1:max(a)));    % M(i,k): occurrences of k in a(1:i)
v = M(sub2ind(size(M),1:numel(a),a.'))
chappjc
4

On a completely different level, you can look into tabulate to get information about the frequency of the values. For example:

tabulate([1 2 4 4 3 4])

  Value  Count  Percent
  1      1      16.67%
  2      1      16.67%
  3      1      16.67%
  4      3      50.00%
bla
  • 5
    Thanks, I didn't know about `tabulate`. I would give +1 if this answered the question. :) – nispio Oct 30 '13 at 23:40
0

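Please note that the solutions proposed by David, chappjc and Luis Mendo are beautiful but cannot be used if the vector is big, because they build intermediate matrices with on the order of numel(a)^2 (or numel(a)*max(a)) elements. In this case a couple of naïve approaches are: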

% Big vector
a = randi(1e4, [1e5, 1]);
a1 = a;
a2 = a;

% Super-naive solution
tic
x = sort(a);
x = x([find(diff(x)); end]);
for hh = 1:size(x, 1)
  inds = (a == x(hh));
  a1(inds) = 1:sum(inds);
end
toc

% Other naive solution
tic
x = sort(a);
y(:, 1) = x([find(diff(x)); end]);
y(:, 2) = histc(x, y(:, 1));
for hh = 1:size(y, 1)
  a2(a == y(hh, 1)) = 1:y(hh, 2);
end
toc

% The two solutions are of course equivalent:
all(a1(:) == a2(:))

Actually, now the question is: can we avoid the last loop? Maybe using arrayfun?
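One possible loop-free answer to that question, offered as a sketch rather than a benchmarked solution: sort the vector, find where each run of equal values starts, and number the elements within each run. It relies on MATLAB's sort being stable (equal elements keep their original order), and it only allocates vectors of length numel(a). The variable names are illustrative, and the small vector from the question is used so the result is easy to check.

```matlab
a = [1 2 3 1 4 2 5]';

[s, order] = sort(a);              % stable sort: ties keep their original order
starts = [true; diff(s) ~= 0];     % true at the first element of each run of equal values
idx = (1:numel(s)).';              % positions within the sorted vector
runStart = idx(starts);            % sorted position where each run begins
grp = cumsum(starts);              % run number of each sorted element
occSorted = idx - runStart(grp) + 1;   % 1, 2, 3, ... within each run

result = zeros(size(a));
result(order) = occSorted;         % put the counts back in the original order
```

For the example this gives [1 1 1 2 1 2 1]', the same as the loop-based versions.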