I am given a list of indices, e.g. i = [3 5], and a vector v = 1:6. I need a function f that returns the logical map for the vector v given the indices i, e.g.:

f(i, length(v)) = [0 0 1 0 1 0]

Since I will call this function several million times, I would like to make it as fast as possible. Is there a built-in function that performs this task?

Eitan T
blubb

7 Answers

I know I'm late to the game, but I really wanted to find a faster solution that is just as elegant as ismember. And indeed there is one, which employs the undocumented ismembc function:

ismembc(v, i)

Benchmark

N = 7;
i = [3 5];

%// slayton's solution
tic
for ii = 1:1e5
    clear idx;
    idx(N) = false;
    idx(i) = true;
end
toc

%// H.Muster's solution
tic
for ii = 1:1e5
    v = 1:N;
    idx = ismember(v, i);
end
toc

%// Jonas' solution
tic
for ii = 1:1e5
    idx = sparse(i, 1, true, N, 1);
end
toc

%// ismembc solution
tic
for ii = 1:1e5
    v = 1:N;
    idx = ismembc(v, i);
end
toc

Here's what I got:

Elapsed time is 1.482971 seconds.
Elapsed time is 6.369626 seconds.
Elapsed time is 2.039481 seconds.
Elapsed time is 0.776234 seconds.

Amazingly, ismembc is indeed the fastest!

Edit:
For very large values of N (i.e. when v is a large array), the faster solution is actually slayton's (and HebeleHododo's, for that matter). You have quite a variety of strategies to choose from, so pick carefully :)

Edit by H.Muster:
Here are benchmark results including _ismemberoneoutput:

Slayton's solution:
   Elapsed time is 1.075650 seconds.
ismember:
   Elapsed time is 3.163412 seconds.
ismembc:
   Elapsed time is 0.390953 seconds.
_ismemberoneoutput:
   Elapsed time is 0.477098 seconds.

Interestingly, Jonas' solution does not run for me, as I get an "Index exceeds matrix dimensions" error.

Edit by hoogamaphone:
It's worth noting that ismembc requires both inputs to be numerical, sorted, non-sparse, non-NaN values, which is a detail that could be easily missed in the source documentation.
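
To illustrate the sorting requirement, here is a minimal sketch that sorts the indices before passing them to ismembc. It assumes, as described above, that ismembc is available in your MATLAB release and that its second argument must be sorted; the unsorted example indices are chosen only for illustration.

```matlab
% Sketch: ismembc expects its second argument to be sorted, so sort the
% indices first when their order is not guaranteed.
v = 1:6;
i = [5 3];                      % unsorted on purpose

mask = ismembc(v, sort(i));     % sort(i) gives [3 5]
disp(mask)                      % mask is true at positions 3 and 5
```

Sorting adds its own cost, so if the indices are already known to be sorted it is cheaper to skip the sort call.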

Eitan T
  • Please note that my answer also includes the faster solution `builtin('_ismemberoneoutput', v, i)`, which might be basically the same as `ismembc`. Nevertheless, `ismembc` is a nice find. – H.Muster Jan 30 '13 at 15:40
  • @H.Muster I'm getting a `Cannot find builtin function '_ismemberoneoutput'` error when trying to use `_ismemberoneoutput`. If it works for you, can you benchmark all four solutions then? – Eitan T Jan 30 '13 at 15:42
  • Interesting. Which version of MATLAB are you running? I am on R2012a (64-bit). – H.Muster Jan 30 '13 at 15:43
  • My MATLAB version is R2010b 64-bit. – Eitan T Jan 30 '13 at 15:45
  • I tested `_ismemberoneoutput` with your benchmark and it is slightly slower than `ismembc`. Might be also interesting to know for @blubb. – H.Muster Jan 30 '13 at 15:47
  • Hmmph. I can't even Google `_ismemberoneoutput`. Oh well, can you at least post your benchmark results for clarification? This could be used as a valuable reference in the future! – Eitan T Jan 30 '13 at 15:50
  • I found it while looking into the code of `ismember`. Is it ok if I edit the benchmark results into your answer (at the bottom after your benchmarks)? – H.Muster Jan 30 '13 at 15:52
  • I forgot to benchmark HebeleHododo's solution, which performs almost as fast as slayton's. Also, make sure you're using the right variables in Jonas' solution (I got that error too at first). – Eitan T Jan 30 '13 at 16:11
  • Thank you for this wonderful compilation. This is a perfect example of what is making the SE community so unique! – blubb Jan 30 '13 at 16:44
  • `ismembc` works fine if the indices are sorted, but it fails if they are not. – hoogamaphone Apr 03 '14 at 13:24
  • @Chris right, but it is implied in the question that `v` is sorted. – Eitan T Apr 03 '14 at 21:35

Simply create a vector of logical indices and set the desired locations to true/false

idx = false( size( v) );
idx( i ) = true;

This can be wrapped in a function like so:

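function idx = getLogicalIdx(sz, i)
  % sz: size of the output, e.g. size(v); i: indices to set to true
  % (the argument is named sz to avoid shadowing the built-in size)
  idx = false(sz);
  idx(i) = true;
end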

If you need an indexing vector of the same size for each of your million operations, allocate the vector once and then operate on it in each iteration:

idx = false(size(v)); % allocate the vector
while( keepGoing)

  idx(i) = true; % set the desired values to true for this iteration

  doSomethingWithIndecies(idx);

  idx(i) = false; % set indices back to false for next iteration

end
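
As a concrete, runnable variant of the loop above, the following sketch reuses one preallocated mask across several index sets. The cell array indexSets and the variable selected are hypothetical stand-ins for whatever drives your iterations and whatever you do with the mask.

```matlab
v = 1:6;
indexSets = {[3 5], [1 2], 6};    % hypothetical index lists, one per iteration

idx = false(size(v));             % allocate the mask once
for k = 1:numel(indexSets)
    i = indexSets{k};

    idx(i) = true;                % mark the indices for this iteration
    selected = v(idx);            % use the mask, e.g. for logical indexing
    idx(i) = false;               % reset only the entries that were touched
end
```

Resetting only the touched entries keeps the per-iteration cost proportional to the number of indices rather than to the length of v.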

If you really need performance, then you can write a MEX function to do this for you. Here is a very basic, untested function that I wrote that is about 2x faster than the other methods:

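#include <matrix.h>
#include <mex.h>

/* Usage from MATLAB: idx = f(M, i)
 * Returns an M-by-1 logical vector that is true at the 1-based indices in i. */
void mexFunction(int nlhs, mxArray *plhs[],
                 int nrhs, const mxArray *prhs[])
{
    if (nrhs != 2 || !mxIsDouble(prhs[1]))
        mexErrMsgTxt("Usage: idx = f(M, i), with i a double array.");

    mwSize M = (mwSize) mxGetScalar(prhs[0]);    /* length of the output */
    const double *in = mxGetPr(prhs[1]);         /* 1-based indices */
    size_t N = mxGetNumberOfElements(prhs[1]);

    plhs[0] = mxCreateLogicalMatrix(M, 1);       /* initialized to false */
    mxLogical *out = mxGetLogicals(plhs[0]);

    for (size_t k = 0; k < N; k++) {
        /* reject indices outside 1..M instead of writing out of bounds */
        if (in[k] < 1 || in[k] > (double) M)
            mexErrMsgTxt("Index out of range.");
        /* MATLAB indices are 1-based, C arrays are 0-based */
        out[(size_t) in[k] - 1] = 1;
    }
}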

There are several different ways to allocate a vector in MATLAB. Some are faster than others; see this Undocumented Matlab post for a good summary.

Here are some quick benchmarks comparing the different methods. The last method is by far the fastest, but it requires you to use the same size logical indexing vector for each operation.

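N = 1000;
ITER = 1e5;

% indices must lie in 1..N, otherwise idx grows beyond N elements;
% they are sorted because ismembc requires sorted input
i = sort(randi(N,100,1));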

fprintf('Create using false()\n');
tic;
for j = 1:ITER
    clear idx;
    idx = false( N, 1 );
    idx(i) = true;
end
toc;

fprintf('Create using indexing\n');
tic;
for j = 1:ITER
    clear idx;
    idx(N) = false;
    idx(i) = true;
end
toc;

fprintf('Create once, update as needed\n');
tic;
idx = false(N,1);
for j = 1:ITER
    idx(i) = true;
    idx(i) = false;
end
toc;

fprintf('Create using ismembc\n');
tic;
for j = 1:ITER
    idx = ismembc(1:N, i); % assumes i is sorted
end
toc;
slayton
  • Thank you. This works reasonably well, however, I will perform this operation several million times. Is there any way to speed this up? – blubb Jan 30 '13 at 15:00
  • @blubb Yes, you can speed this up in a number of ways. Most of them have to do with preallocation. For example, if you are going to be creating logical indexing vectors of the same length, create one first and then reuse it... – slayton Jan 30 '13 at 15:13

You can use ismember

 i = [3 5];
 v = 1:6;

 ismember(v,i)

will return

ans =

     0     0     1     0     1     0

For a probably faster version, you can try

builtin('_ismemberoneoutput', v, i)

Note that I tested this only for row vectors, as in your example.
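
Because `_ismemberoneoutput` is an internal function that is not present in every MATLAB release, a guarded call may be safer. The following is only a sketch: it assumes that builtin raises an error when the internal function is missing, and it falls back to the documented ismember in that case.

```matlab
i = [3 5];
v = 1:6;

try
    % internal helper; may not exist in your release
    mask = builtin('_ismemberoneoutput', v, i);
catch
    % documented fallback with the same result
    mask = ismember(v, i);
end
```

In a tight loop it would be cheaper to perform this check once up front and store a function handle to whichever variant works.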

H.Muster
  • This is what I was looking for. It should be noted though, that at least in my particular case, @slayton's solution is about 30% faster if implemented with preallocation. – blubb Jan 30 '13 at 15:21
  • If you don't already have the vector `v` then this is going to be slow, as you'll need to allocate a vector with `1:N` each time you want to call this. – slayton Jan 30 '13 at 15:27
  • @blubb: please note Eitan's answer, which is as nice as `ismember` but significantly faster. It seems fair to accept his answer, rather than mine. – H.Muster Jan 30 '13 at 15:52
  • @H.Muster thanks, but it's slower than slayton's solution when `v` is a large array. – Eitan T Jan 30 '13 at 16:07
  • @EitanT: yes, but the OP liked `ismember` over slayton's solution for whatever reasons, although it is slower. Hence, he should like your solution even more... – H.Muster Jan 30 '13 at 16:09

Just address a new variable with the idx matrix; it will fill in the zeros for you:

idx = [3 5];
a(idx) = true

No need for a function, nor for passing the length in unless you want trailing zeros too.
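
To make the point about trailing zeros concrete, here is a small sketch comparing the result with and without forcing the full length. The target length n is a hypothetical value chosen to match the question's example.

```matlab
idx = [3 5];
n = 6;                 % hypothetical target length

clear a                % start from a nonexistent variable
a(idx) = true;         % a only has max(idx) = 5 elements

clear b
b(n) = false;          % force the full length first
b(idx) = true;         % b now has n elements, with a trailing false
```

Both a and b are logical because the first assignment into a nonexistent variable takes the class of the right-hand side.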

Dan

I expect that @slayton's solution is fastest. However, here's a one-liner alternative, that may at least save you some memory if the vectors are large.

idx = [3 5];   % indices to mark
vecLen = 6;    % length of the vector
logicalIdx = sparse(idx,1,true,vecLen,1);
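
As a quick check of what the one-liner produces, the sketch below inspects the result and converts it to a full row vector. The variable fullRow is introduced here only for illustration.

```matlab
idx = [3 5];
vecLen = 6;
logicalIdx = sparse(idx, 1, true, vecLen, 1);   % vecLen-by-1 sparse logical

stored = nnz(logicalIdx);        % only the true entries are stored
fullRow = full(logicalIdx)';     % dense 1-by-vecLen logical row vector
```

Note that the result is a column vector, so it needs the transpose if a row vector like the one in the question is expected.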
Jonas

You can write a function like this:

function logicalIdx = getLogicalIdx(idx, v)
    logicalIdx = zeros(1,size(v,2));
    logicalIdx(idx) = 1;
end

When you call the function:

v = 1:6;
idx = [3 5];
getLogicalIdx(idx,v)

The output will be:

ans =

     0     0     1     0     1     0
HebeleHododo

Can you simply do v(i) = 1?

For example, if you say x = zeros(1,10); and a = [1 3 4];

x(a) = 1 will return 1 0 1 1 0 0 0 0 0 0
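
One caveat, sketched below: a vector created with zeros is of class double, so it has to be converted before it can serve as a logical mask for indexing. The data vector and the variable picked are hypothetical and only serve to show the mask in use.

```matlab
x = zeros(1, 10);
a = [1 3 4];
x(a) = 1;              % x is of class double, not logical

mask = logical(x);     % convert to a logical mask
data = 11:20;          % hypothetical data vector
picked = data(mask);   % selects elements 1, 3 and 4 of data
```

Starting from false(1, 10) and assigning true instead avoids the conversion step.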

QuantumLicht