Given a matrix whose rows each hold the beginning (first column) and the end (second column) of an index interval, I would like to create a vector of all the indices. For instance, if A = [2 4; 8 11 ; 12 16], I would like to get the vector index = [2 3 4 8 9 10 11 12 13 14 15 16].

I'm looking for the fastest way to do that. So far I have found only two possibilities:

1) with a loop

index = [];
for n = 1:size(A, 1)
    index = [index A(n, 1):A(n, 2)];
end

2) with arrayfun

index = cell2mat(arrayfun(@(n) A(n, 1):A(n, 2), 1:size(A, 1), 'uni', 0));

Interestingly, arrayfun is much faster than the loop version, and I don't know why. That seems odd, especially since the arrayfun version also needs a conversion from cell to matrix. What do you think about that? Do you have any other suggestions?

Thanks for your help

Robert Seifert
andrew_077
  • The first sentence has been removed! It was Hello everyone ;) and sorry I made a mistake, the vector index would be equal to [2 3 4 8 9 10 11 12 13 14 15 16] – andrew_077 Sep 10 '16 at 19:46
  • The first sentence was removed because it constitutes noise in your question, salutations are unnecessary. You can (and should) edit your question using the [edit] link under it. `arrayfun` is just a wrapper around a loop, the only reason it's faster is that you failed to pre-allocate in your loopy version, so in each iteration some (slow) memory allocation is going on. – Andras Deak -- Слава Україні Sep 10 '16 at 20:24
  • @Divakar Your duplicate answer does not seem to give the right result – Robert Seifert Sep 11 '16 at 07:21
  • @thewaywewalk Ah thanks, was a bug indeed! Should be fixed now. – Divakar Sep 11 '16 at 08:51
  • Welcome to StackOverflow! Please consider accepting one of the answers by clicking the green check mark on the left to indicate the system that your problem is solved. Thank you! – Robert Seifert Oct 20 '16 at 13:15
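
To illustrate the pre-allocation point made in the comments, here is a minimal sketch of the loop with the result allocated once up front. The names `lens` and `pos` are my own and do not come from the question, and the sketch assumes every row of A satisfies start <= end.

```matlab
A = [2 4; 8 11; 12 16];          % example from the question

lens  = A(:,2) - A(:,1) + 1;     % number of indices in each interval
index = zeros(1, sum(lens));     % allocate the full result once (assumed fix)
pos   = 0;                       % last position written so far
for n = 1:size(A, 1)
    index(pos + (1:lens(n))) = A(n,1):A(n,2);   % fill the slice in place
    pos = pos + lens(n);
end
```

Because `index` never grows inside the loop, no reallocation happens per iteration, which is the cost the comment attributes to the original loop.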

2 Answers

It's hard to tell how fast this is, but at least it involves no looping:

A = [1,3;11,13;31,33;41,42;51,54;55,57;71,72];

%// prepare A
A = A.';

%// create index matrix
idx = bsxfun(@plus, A, [0; 1]);
%// special case: when one interval ends right before the next begins
%// (54 and 55), both boundary markers must be removed; 71 and 72 belong
%// to the same interval and must be kept
A = A(:); 
dA = diff(A); dA(1:2:end) = 0;
idx = idx(~( [0;dA] == 1 | [dA;0] == 1 ));

%// create mask
mask = zeros(max(A),1);
mask(idx(:)) = (-1).^(0:numel(idx)-1);

%// index vector
out = find(cumsum(mask))

out.' =

      1  2  3 11 12 13 31 32 33 41 42 51 52 53 54 55 56 57 71 72
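
As a rough trace of the steps above, here is the same idea applied to the matrix from the question, where the intervals 8:11 and 12:16 touch. The variable names `B`, `marks`, `gaps` and `touch` are illustrative only, and the values in the comments were worked out by hand for this input.

```matlab
A = [2 4; 8 11; 12 16];                   % example from the question

B     = A.';                              % 2-by-n: starts in row 1, ends in row 2
marks = bsxfun(@plus, B, [0; 1]);         % [2 8 12; 5 12 17]: each start and each end+1
gaps  = diff(B(:)); gaps(1:2:end) = 0;    % [0 4 0 1 0]': keep only the gaps between intervals
touch = [0; gaps] == 1 | [gaps; 0] == 1;  % true where an interval ends right before the next starts
marks = marks(~touch);                    % [2 5 8 17]': both markers at 12 are dropped

mask = zeros(max(B(:)), 1);
mask(marks) = (-1).^(0:numel(marks)-1);   % +1 at each start, -1 just past each end
out  = find(cumsum(mask)).';              % positions where the running sum is nonzero
```

The running sum of `mask` is 1 inside an interval and 0 outside, so `find` returns exactly the covered indices.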
Robert Seifert
  • Very nice! Isn't writing `idx = [A(1,:); A(2,:)+1]` much simpler/faster than calling `bsxfun`? Also you have a small typo in `infex`, and `out` is a column vector. – EBH Sep 10 '16 at 21:00
  • @EBH I'd say `idx = [A(1,:); A(2,:)+1]` is slower, but simpler - probably. out is a column vector, true, I just transposed it for displaying purposes. – Robert Seifert Sep 10 '16 at 22:11
  • Your solution generates partial results. Please test with A =[1, 3;11,15;21,25;31,33;41,48; 51,54;61,67;71,72;81,82;91,94]; – rahnema1 Sep 11 '16 at 08:27
  • @thewaywewalk It seems that works! Added to the benchmark. – rahnema1 Sep 12 '16 at 12:46

Here is a collection of methods:

Method 1 from https://stackoverflow.com/a/39422485/6579744 :

lo = A(:,1);
up=A(:,2);
index=cumsum(accumarray(cumsum([1;up(:)-lo(:)+1]),[lo(:);0]-[0;up(:)]-1)+1);
index= index(1:end-1);

Method 2: this is from https://stackoverflow.com/a/38507276/6579744 . I also provided the same answer, but because Divakar's answer came before mine, his (modified) answer is used here:

start_idx = A(:,1)';
end_idx = A(:,2)';
lens = end_idx - start_idx + 1;
shift_idx = cumsum(lens(1:end-1))+1;
id_arr = ones(1,sum(lens));
id_arr([1 shift_idx]) = [start_idx(1) start_idx(2:end) - end_idx(1:end-1)];
index = cumsum(id_arr);
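
To see why the cumsum approach of Method 2 works, here is a small trace on the matrix from the question. It uses the same variable names as the method; the values in the comments were worked out by hand for this input only.

```matlab
A = [2 4; 8 11; 12 16];                 % example from the question

start_idx = A(:,1)';                    % [2 8 12]
end_idx   = A(:,2)';                    % [4 11 16]
lens      = end_idx - start_idx + 1;    % [3 4 5]
shift_idx = cumsum(lens(1:end-1)) + 1;  % [4 8]: output positions where intervals 2 and 3 begin

id_arr = ones(1, sum(lens));            % default step of +1 inside an interval
id_arr([1 shift_idx]) = [start_idx(1), start_idx(2:end) - end_idx(1:end-1)];
% id_arr is now [2 1 1 4 1 1 1 1 1 1 1 1]:
% the first value, then jumps of 4 (4 -> 8) and 1 (11 -> 12)

index = cumsum(id_arr);                 % [2 3 4 8 9 10 11 12 13 14 15 16]
```

Each entry of `id_arr` is the increment from the previous output value, so a single cumulative sum reconstructs all intervals at once.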

Method 3: this is mine

N = A(:,2) - A(:,1) +1;
s=cumsum([ 1; N]);
index=(1:s(end)-1) -repelem(s(1:end-1),N).' + repelem(A(:,1),N).';  % transpose: repelem returns columns for column input
%octave    index=(1:s(end)-1) -repelems(s(1:end-1),[1:numel(N);N']) + repelems(A(:,1),[1:numel(N);N']);

Method 4: another answer from this thread by thewaywewalk

A = A.';
idx = bsxfun(@plus, A, [0; 1]);
A = A(:); 
dA = diff(A); dA(1:2:end) = 0;
idx = idx(~( [0;dA] == 1 | [dA;0] == 1 ));
mask = zeros(max(A),1);
mask(idx(:)) = (-1).^(0:numel(idx)-1);
index = find(cumsum(mask));

Method 5, your second method:

index = cell2mat(arrayfun(@(n) A(n, 1):A(n, 2), 1:size(A, 1), 'uni', 0));

Method 6 from https://stackoverflow.com/a/39423102/6579744 :

sz= size(A, 1);
index_c = cell(1,sz);
for n = 1:sz
    index_c{n} = [A(n, 1):A(n, 2)];
end
index = cell2mat(index_c);

Method 7 only works in Octave:

idx = 1:size(A ,1);
index_a =bsxfun(@(a,b) (a(b):A (b,2))',A (:,1),idx);
index = index_a(index_a ~= 0); 

Method 8, your first method:

index = [];
for n = 1:size(A, 1)
    index = [index A(n, 1):A(n, 2)];
end

Test data:

i= 1:500:10000000;
j= i+randi([1 490],1, numel(i));
A = [i', j'];

Results measured in Octave; they may be different in MATLAB:

method1: 0.077063 seconds
method2: 0.094579 seconds
method3: 0.145004 seconds
method4: 0.180826 seconds
method5: 0.317095 seconds
method6: 0.339425 seconds
method7: 3.242287 seconds
method8: doesn't complete in 15 seconds

The code used in the benchmark is in the Online Demo.
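
For readers who cannot reach the demo, here is a minimal sketch of how two of the methods might be timed with tic/toc. This is not the benchmark code used for the numbers above: the data set is deliberately smaller, and the names `starts`, `stops`, `t2`, `t5`, `index2` and `index5` are my own.

```matlab
% Smaller data than in the benchmark so the sketch runs quickly (assumption)
starts = 1:500:1000000;
stops  = starts + randi([1 490], 1, numel(starts));
A = [starts', stops'];

% Method 5 (arrayfun)
tic;
index5 = cell2mat(arrayfun(@(n) A(n,1):A(n,2), 1:size(A,1), 'UniformOutput', false));
t5 = toc;

% Method 2 (cumsum)
tic;
start_idx = A(:,1)';
end_idx   = A(:,2)';
lens      = end_idx - start_idx + 1;
id_arr    = ones(1, sum(lens));
id_arr([1 cumsum(lens(1:end-1))+1]) = [start_idx(1) start_idx(2:end) - end_idx(1:end-1)];
index2 = cumsum(id_arr);
t2 = toc;

% Both methods must agree before the timings mean anything
assert(isequal(index2, index5));
fprintf('method2: %f seconds\nmethod5: %f seconds\n', t2, t5);
```

A single tic/toc run is noisy, so repeating each measurement several times and taking the median would give steadier numbers.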

rahnema1