1

How to obtain the coordinates of the first and the last appearances (under column-major ordering) of each label present in a matrix?

Example of a label matrix (where labels are 1 to 4):

L = [    
     1 1 1 1 0 0 0 0
     0 0 0 0 2 2 0 0
     0 0 0 0 0 0 2 0
     0 0 0 0 0 0 0 0
     0 0 0 0 0 3 0 0
     0 0 0 0 0 0 3 3
     0 0 0 4 0 0 0 0
     4 4 4 0 0 0 0 0
    ];

For the above example L, I would like to obtain a matrix of coordinates like:

M = [
    1 1 1
    1 4 1
    2 5 2
    3 7 2
    5 6 3
    6 8 3
    8 1 4
    7 4 4 ];

Here, the 1st column of M contains the row index, the 2nd contains the column index, and the 3rd contains the label. There should be 2 rows for each label: one for its first appearance and one for its last.

Dev-iL
Mac.

4 Answers

5

With a for loop, you can do it like this:

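% Assumes L is of class double; with an integer class such as uint8
% the coordinates saturate at intmax (see the comments below).
M=zeros(2*max(L(:)),3);
for k=1:max(L(:))
   [r,c]=find(L==k);        % row and column subscripts of label k
   s=sortrows([r c],2);     % order the occurrences by column
   M(k*2-1:k*2,:)=[s(1,:) k; s(end,:) k];   % first and last occurrence
end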

M =
 1     1     1
 1     4     1
 2     5     2
 3     7     2
 5     6     3
 6     8     3
 8     1     4
 7     4     4

Maybe somehow with regionprops options you can do it without the loop...

Adiel
  • I totally agree with what you propose, but I don't understand why, for some points, the matrix gives me [255 255 label]. The label is correct, but why 255 for x and y? :-o – Mac. Mar 23 '17 at 10:56
  • It's OK, I found out why! It wasn't a double matrix but a uint8 one ^^. Thanks for your help :-) – Mac. Mar 23 '17 at 11:01
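For readers who hit the same [255 255 label] result: the short sketch below (an illustration added here, not code from the thread; the values 300 and 4 are made up) shows the likely mechanism. When a double is concatenated with a uint8, MATLAB converts the whole row to uint8, so any coordinate above 255 saturates.

```matlab
r = 300;              % a coordinate stored as double (hypothetical value)
k = uint8(4);         % a label taken from a uint8 matrix (hypothetical value)

row = [r r k];        % mixed concatenation: the result is uint8
disp(class(row))      % uint8
disp(row)             % the two coordinates are clipped to 255

fixed = [r r double(k)];   % casting the label first keeps everything double
disp(fixed)
```

Converting the whole label matrix once with `L = double(L);` before running the loop should have the same effect.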
5

If you're looking for a vectorized solution, you can do this:

nTags = max(L(:));
whois = bsxfun(@eq,L,reshape(1:nTags,1,1,[]));   % one logical page per label
% whois = L == reshape(1:nTags,1,1,[]); % >=R2016b syntax.
[X,Y,Z] = ind2sub(size(whois), find(whois));     % sorted by label (Z), then column-major
tmp = find(diff([0; Z; nTags+1]));               % positions where the label changes
tmp = reshape([tmp(1:end-1) tmp(2:end)-1].',[],1);   % interleave first/last index per label
M = [X(tmp), Y(tmp), repelem(1:nTags,2).'];
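To make the `diff` step easier to follow, here is a small sketch on a hand-made label vector (the vector `Z` and the names `edges`, `firstIdx` and `lastIdx` are illustrative and not part of the answer). It assumes, as the code above does, that the labels are sorted and that every label from 1 to `nTags` occurs at least once.

```matlab
Z = [1; 1; 1; 2; 2; 3];      % labels as returned by ind2sub, already sorted
nTags = 3;

% Pad with sentinels so the first and last runs also produce a jump.
edges = find(diff([0; Z; nTags+1]));   % [1; 4; 6; 7]

firstIdx = edges(1:end-1);             % start of each run: [1; 4; 6]
lastIdx  = edges(2:end) - 1;           % end of each run:   [3; 5; 6]

% Interleave as first1, last1, first2, last2, ...
tmp = reshape([firstIdx lastIdx].', [], 1);
disp(tmp.')
```

Indexing `X` and `Y` with `tmp` then picks the first and last occurrence of each label in column-major order.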

Or with extreme variable reuse:

nTags = max(L(:));
Z = bsxfun(@eq,L,reshape(1:nTags,1,1,[]));
[X,Y,Z] = ind2sub(size(Z), find(Z));
Z = find(diff([0; Z; nTags+1])); 
Z = reshape([Z(1:end-1) Z(2:end)-1].',[],1);
M = [X(Z), Y(Z), repelem(1:nTags,2).'];

Here's my benchmarking code:

function varargout = b42973322(isGPU,nLabels,lMat)
if nargin < 3
  lMat = 1000;
end
if nargin < 2
  nLabels = 20; % If nLabels > intmax('uint8'), change the type of L to some other uint.
end
if nargin < 1
  isGPU = false;
end
%% Create L:
if isGPU
  L = sort(gpuArray.randi(nLabels,lMat,lMat,'uint8'),2);
else
  L = sort(randi(nLabels,lMat,lMat,'uint8'),2);
end
%% Equality test:
M{3} = DeviL2(L);
M{2} = DeviL1(L);
M{1} = Adiel(L);
assert(isequal(M{1},M{2},M{3}));
%% Timing:
% t(3) = timeit(@()DeviL2(L)); % This is always slower, so it's irrelevant.
t(2) = timeit(@()DeviL1(L));
t(1) = timeit(@()Adiel(L));
%% Output / Print
if nargout == 0
  disp(t);
else
  varargout{1} = t;  
end

end

function M = Adiel(L)
  M=[];
  for k=1:max(L(:))
     [r,c]=find(L==k);
     s=sortrows([r c],2);
     M=[M;s(1,:) k; s(end,:) k];
  end
end

function M = DeviL1(L)
  nTags = max(L(:));
  whois = L == reshape(1:nTags,1,1,[]); % >=R2016b syntax.
  [X,Y,Z] = ind2sub(size(whois), find(whois));
  tmp = find(diff([0; Z; nTags+1])); tmp = reshape([tmp(1:end-1) tmp(2:end)-1].',[],1);
  M = [X(tmp), Y(tmp), repelem(1:nTags,2).'];
end

function M = DeviL2(L)
  nTags = max(L(:));
  Z = L == reshape(1:nTags,1,1,[]);
  [X,Y,Z] = ind2sub(size(Z), find(Z));
  Z = find(diff([0; Z; nTags+1])); 
  Z = reshape([Z(1:end-1) Z(2:end)-1].',[],1);
  M = [X(Z), Y(Z), repelem(1:nTags,2).'];
end
Graham
Dev-iL
  • +1 for the creativity! But besides the code being hard to read, I'm not sure that it's faster... Sometimes `bsxfun` can be terrible in that respect. – Adiel Mar 23 '17 at 11:06
  • Wow, OK, I'm not used to this style of code at all, but it will help me understand vectorization :-D. Thank you! – Mac. Mar 23 '17 at 11:06
  • Thanks guys! @Adiel care to back these claims with a benchmark (with a larger dataset...)? :) – Dev-iL Mar 23 '17 at 11:08
  • So, I checked it for an L of 800x800 with 20100 different labels. My solution takes 12.03 sec., your first takes 17.64 sec. and your second takes 21.88 sec. The final results are identical. So `bsxfun` is beautiful but not always the best way... – Adiel Mar 23 '17 at 12:01
  • @Adiel I also benchmarked it. Which method is preferable depends on the number of labels. I tried it on 4000x4000 with 20 labels, and my 2nd method is better by about 30%. That is unless the matrices are `gpuArray`, in which case the loop loses every time (it is slower by a factor of 1.3 to 4). – Dev-iL Mar 23 '17 at 12:35
  • I stuck to the label density of the original question, but it makes sense that the method should be chosen depending on the specific case. – Adiel Mar 23 '17 at 12:45
5

I just had to try it with accumarray:

R = size(L, 1);
[rowIndex, colIndex, values] = find(L);  % Find nonzero values
index = (colIndex-1).*R+rowIndex;        % Create a linear index
labels = unique(values);                 % Find unique values
nLabels = numel(labels);
minmax = zeros(2, nLabels);
minmax(1, :) = accumarray(values, index, [nLabels 1], @min);  % Collect minima
minmax(2, :) = accumarray(values, index, [nLabels 1], @max);  % Collect maxima
temp = ceil(minmax(:)/R);
M = [minmax(:)-R.*(temp-1) temp repelem(labels, 2, 1)];  % Convert index to subscripts

M =

     1     1     1
     1     4     1
     2     5     2
     3     7     2
     5     6     3
     6     8     3
     8     1     4
     7     4     4
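As a rough illustration of the two building blocks used above (the sample numbers are made up, and `lo`, `hi`, `idx`, `row` and `col` are names chosen for this sketch), the following shows `accumarray` collecting a per-label minimum and maximum, and checks that the manual index arithmetic agrees with `ind2sub`.

```matlab
R = 8;                         % number of rows, as in the 8x8 example
values = [1; 2; 1; 2];         % labels (used as group subscripts)
index  = [10; 20; 5; 30];      % linear indices belonging to those labels

lo = accumarray(values, index, [2 1], @min);   % smallest index per label
hi = accumarray(values, index, [2 1], @max);   % largest index per label

% Manual linear-index to subscript conversion, as in the answer.
idx = [lo(1); hi(1); lo(2); hi(2)];
col = ceil(idx / R);
row = idx - R .* (col - 1);

% The built-in conversion should give the same subscripts.
[r, c] = ind2sub([R R], idx);
disp([row col r c])
```

Using `ind2sub` in place of the arithmetic is arguably easier to read; which one is faster was not measured here.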

Here are the timings (in seconds, as returned by `timeit`) that I got with Dev-iL's script and Adiel's newest code (note that the number of labels can't go above 127 because of how Adiel's code uses the uint8 values as indices):

                       |   Adiel |  Dev-iL | gnovice
-----------------------+---------+---------+---------
  20 labels, 1000x1000 |  0.0753 |  0.0991 |  0.0889
20 labels, 10000x10000 | 12.0010 | 10.2207 |  8.7034
 120 labels, 1000x1000 |  0.1924 |  0.3439 |  0.1387

So, for moderate numbers of labels and (relatively) smaller sizes, Adiel's looping solution looks like it does best, with my solution lying between his and Dev-iL's. For larger sizes or greater numbers of labels, my solution starts to take the lead.

gnovice
0

You can retrieve the unique values (your labels) of the matrix with `unique`.

Once you have them, you can use `find` to get their indices.

Then assemble your matrix from those indices.
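One possible way to turn this outline into code is sketched below. It is only an interpretation of the steps above, not the answerer's own code; it assumes that 0 is the background and that L is of class double.

```matlab
% Label matrix from the question.
L = [1 1 1 1 0 0 0 0
     0 0 0 0 2 2 0 0
     0 0 0 0 0 0 2 0
     0 0 0 0 0 0 0 0
     0 0 0 0 0 3 0 0
     0 0 0 0 0 0 3 3
     0 0 0 4 0 0 0 0
     4 4 4 0 0 0 0 0];

labels = unique(L(L > 0));            % labels present, background 0 excluded
M = zeros(2*numel(labels), 3);

for i = 1:numel(labels)
    k = labels(i);
    % L(:) is traversed in column-major order, so 'first' and 'last'
    % give the first and last appearance under that ordering.
    idx = [find(L(:) == k, 1, 'first'); find(L(:) == k, 1, 'last')];
    [r, c] = ind2sub(size(L), idx);   % linear indices to row/column
    M(2*i-1:2*i, :) = [r, c, [k; k]];
end

disp(M)
```

Unlike the loop over `1:max(L(:))`, this version only visits labels that actually occur, so gaps in the label numbering should not be a problem.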

Adriaan
  • With the other three answers providing a complete approach, as opposed to this very general guideline, I suggest either extending this answer to give a complete approach as well, or removing it altogether if you think it adds no value to the question in light of the other answers. – Adriaan Mar 23 '17 at 15:37