3

I have a variable pth which is a cell array of dimension 1xn where n is a user input. Each of the elements in pth is itself a cell array and length(pth{k}) for k=1:n is variable (result of another function). Each element pth{k}{kk} where k=1:n and kk=1:length(pth{k}) is a 1D vector of integers/node numbers, again of variable length. So to summarise, I have a variable number of variable-length vectors organised in a variable number of cell arrays.

I would like to try and find all possible intersections when you take a vector at random from pth{1}, pth{2}, pth{3}, etc... There are various functions on the File Exchange that seem to do that, for example this one or this one. The problem I have is you need to call the function this way:

mintersect(v1,v2,v3,...)

and I can't write all the inputs in the general case because I don't know explicitly how many there are (this would be n above). Ideally, I would like to do something like this:

mintersect(pth{1}{1},pth{2}{1},pth{3}{1},...,pth{n}{1})
mintersect(pth{1}{1},pth{2}{2},pth{3}{1},...,pth{n}{1})
mintersect(pth{1}{1},pth{2}{3},pth{3}{1},...,pth{n}{1})
etc...
mintersect(pth{1}{1},pth{2}{length(pth{2})},pth{3}{1},...,pth{n}{1})
mintersect(pth{1}{1},pth{2}{1},pth{3}{2},...,pth{n}{1})
etc...

keep going through all the possible combinations, but I can't write this in code. This function from the File Exchange looks like a good way to find all possible combinations but again I have the same problem with the function call with the variable number of inputs:

allcomb(1:length(pth{1}),1:length(pth{2}),...,1:length(pth{n}))

Does anybody know how to work around this issue of function calls with variable number of input arguments when you can't physically specify all the input arguments because their number is variable? This applies equally to MATLAB and Octave, hence the two tags. Any other suggestion on how to find all possible combinations/intersections when taking a vector at random from each pth{k} welcome!

EDIT 27/05/20

Thanks to Mad Physicist's answer, I have ended up using the following which works:

disp('Computing intersections for all possible paths...')
grids = cellfun(@(x) 1:numel(x), pth, 'UniformOutput', false);
idx = cell(1, numel(pth));
[idx{:}] = ndgrid(grids{:});
idx = cellfun(@(x) x(:), idx, 'UniformOutput', false);
idx = cat(2, idx{:});
valid_comb = [];
k = 1;

for ii = idx'
    indices = reshape(num2cell(ii), size(pth));
    selection = cellfun(@(p,k) p{k}, pth, indices, 'UniformOutput', false);
    if my_intersect(selection{:})
        valid_comb = [valid_comb k];
    end
    k = k+1;
end

My own version is similar but builds the selection with a nested for loop instead of cellfun (it still passes the result as a comma-separated list):

disp('Computing intersections for all possible paths...')
grids = cellfun(@(x) 1:numel(x), pth, 'UniformOutput', false);
idx = cell(1, numel(pth));
[idx{:}] = ndgrid(grids{:});
idx = cellfun(@(x) x(:), idx, 'UniformOutput', false);
idx = cat(2, idx{:});
n_comb = size(idx, 1);
n_pipes = numel(pth);   % one column of idx per element of pth
temp = cell(n_pipes,1);
valid_comb = [];

for k = 1:n_comb
  for kk = 1:n_pipes
    temp{kk} = pth{kk}{idx(k,kk)};
  end
  if my_intersect(temp{:})
    valid_comb = [valid_comb k];
  end
end

In both cases, valid_comb has the indices of the valid combinations, which I can then retrieve using something like:

valid_idx = idx(valid_comb(1),:);
for k = 1:n_pipes
  pth{k}{valid_idx(k)} % do something with this
end

When I benchmarked the two approaches with some sample data (pth being 4x1 and the 4 elements of pth being 2x1, 9x1, 8x1 and 69x1), I got the following results:

>> benchmark

Elapsed time is 51.9075 seconds.
valid_comb =  7112

Elapsed time is 66.6693 seconds.
valid_comb =  7112

So Mad Physicist's approach was about 15s faster.

I also misunderstood what mintersect did, which isn't what I wanted. I wanted to find the combinations where no element is present in two or more vectors, so I ended up writing my own version of mintersect:

function valid_comb = my_intersect(varargin)

  % Returns true if a valid combination i.e. no combination of any 2 vectors 
  % have any elements in common

  comb_idx = combnk(1:nargin,2);
  [nr,nc] = size(comb_idx);
  valid_comb = true;
  k = 1;

  % Use a while loop so that as soon as an intersection is found, the execution stops
  while valid_comb && (k<=nr)
    temp = intersect(varargin{comb_idx(k,1)},varargin{comb_idx(k,2)});
    valid_comb = isempty(temp) && valid_comb;
    k = k+1;
  end

end
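For what it's worth, the same "no node shared between any two vectors" test can probably be phrased without enumerating pairs. This is only a sketch of a different technique, not the function above: pool the de-duplicated vectors and check whether pooling introduced any repeats. The names sel_ok, sel_bad and pool are made up for illustration, and it uses only core functions (unique, cellfun), so it should not need combnk.

```matlab
% Hypothetical selections: one vector per pth{k}.
sel_ok  = {[1 2 3], [4 5], [6 7]};
sel_bad = {[1 2 3], [4 5], [3 8]};   % node 3 appears in two vectors

% De-duplicate each vector first so a repeat inside one vector is not counted.
pool = @(c) cellfun(@(v) unique(v(:)).', c, 'UniformOutput', false);

parts = pool(sel_ok);
pooled = [parts{:}];                 % all nodes in one row vector
ok_is_valid = numel(unique(pooled)) == numel(pooled);    % true

parts = pool(sel_bad);
pooled = [parts{:}];
bad_is_valid = numel(unique(pooled)) == numel(pooled);   % false
```

Unlike the while loop above, this does not stop at the first overlap, so whether it is actually faster would need benchmarking on real data.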
am304
  • Not sure why none of the answers mention this, but the canonical way to create a function that takes a variable number of arguments is through [varargin](https://octave.org/doc/v5.2.0/Variable_002dlength-Argument-Lists.html) – Tasos Papastylianou May 23 '20 at 09:38
  • @TasosPapastylianou. Because it has literally nothing to do with the question. The question is how to process a variable number of arguments without doing stuff you shouldn't do – Mad Physicist May 23 '20 at 23:48
  • @MadPhysicist I disagree, but it did help me realise I had misunderstood the structure of the initial cell array by one dimension, so thank you for the comment :) I have updated my answer. As to why I disagree, to me varargin is clearly a part of the solution, as shown in the functions OP links to. As I see it, passing variable arguments consists of two parts: a. being able to generate them programmatically, and b. having a function that can deal with them. Comma-separated-lists deal with the first, and varargin deals with the second. 'b' matters as the linked functions don't do what OP asks. – Tasos Papastylianou May 24 '20 at 10:42

3 Answers

2

A couple of helpful points to construct a solution:

  • This post shows you how to construct a Cartesian product between arbitrary arrays using ndgrid.
  • cellfun accepts multiple cell arrays simultaneously, which you can use to index specific elements.
  • You can capture a variable number of arguments from a function using cell arrays, as shown here.

So let's get the inputs to ndgrid from your outermost array:

grids = cellfun(@(x) 1:numel(x), pth, 'UniformOutput', false);

Now you can create an index that contains the product of the grids:

index = cell(1, numel(pth));
[index{:}] = ndgrid(grids{:});

You want to make all the grids into column vectors and concatenate them sideways. The rows of that matrix will represent the Cartesian indices to select the elements of pth at each iteration:

index = cellfun(@(x) x(:), index, 'UniformOutput', false);
index = cat(2, index{:});
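To make the shape of index concrete, here is a small worked example under made-up sizes (two choices in the first group, three in the second). The variable names mirror the snippets above; the sizes are purely illustrative.

```matlab
% Toy sizes (hypothetical): 2 choices in the first group, 3 in the second.
grids = {1:2, 1:3};

index = cell(1, numel(grids));
[index{:}] = ndgrid(grids{:});          % one N-D grid per group
index = cellfun(@(x) x(:), index, 'UniformOutput', false);
index = cat(2, index{:});               % one row per combination

disp(index)
%    1   1
%    2   1
%    1   2
%    2   2
%    1   3
%    2   3
```

Note that the first column varies fastest, which is the opposite order to the listing in the question, but every combination appears exactly once.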

If you turn a row of index into a cell array, you can run it in lockstep over pth to select the correct elements and call mintersect on the result.

for i = index'
    indices = num2cell(i');
    selection = cellfun(@(p, i) p{i}, pth, indices, 'UniformOutput', false);
    mintersect(selection{:});
end

This is written under the assumption that pth is a row array. If that is not the case, you can change the first line of the loop to indices = reshape(num2cell(i), size(pth)); for the general case, and simply indices = num2cell(i); for the column case. The key is that the cell form of indices must be the same shape as pth to iterate over it in lockstep. It is already generated to have the same number of elements.
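Putting the pieces together, a self-contained sketch on toy data might look like the following. Assumptions: pth here is a made-up 3x1 column cell, the loop runs over row numbers rather than over the columns of index', and since mintersect is a File Exchange function, the built-in intersect folded over the selection stands in for it.

```matlab
% Toy data (hypothetical): 3 groups stored as a column cell array.
pth = {{[1 2 3], [4 5]}; {[2 3 7]}; {[3 9], [5 2 3], [8]}};

grids = cellfun(@(x) 1:numel(x), pth, 'UniformOutput', false);
index = cell(1, numel(pth));
[index{:}] = ndgrid(grids{:});
index = cellfun(@(x) x(:), index, 'UniformOutput', false);
index = cat(2, index{:});

n_comb = size(index, 1);
common = cell(n_comb, 1);
for r = 1:n_comb
    % Give the index cell the same shape as pth so cellfun pairs them up.
    indices = reshape(num2cell(index(r, :)), size(pth));
    selection = cellfun(@(p, k) p{k}, pth, indices, 'UniformOutput', false);

    % Stand-in for mintersect: fold the built-in intersect over the selection.
    acc = selection{1};
    for s = 2:numel(selection)
        acc = intersect(acc, selection{s});
    end
    common{r} = acc;
end
```

With this data there are 2*1*3 = 6 combinations; common{1} is 3, common{3} is [2 3], and the rest are empty. The reshape makes the loop body independent of whether pth is a row or a column.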

Mad Physicist
  • @CrisLuengo. Something like this? – Mad Physicist May 23 '20 at 03:49
  • Yes, like this. – Cris Luengo May 23 '20 at 15:25
  • @CrisLuengo. Do you know if there's a way to optimize the stuff surrounding `index = cat(2, index{:});`. I wrote this on mobile, so I don't have a place to test any of it reliably. – Mad Physicist May 23 '20 at 19:29
  • I suppose you could have added an extra singleton dimension in front during the ndgrid construction, then vertcat and index the whole thing via (:,:). – Tasos Papastylianou May 24 '20 at 16:36
  • Hi, thanks for your answer. It's got me a long way towards the solution and I certainly wouldn't have been able to figure this out on my own. It falls over though at the line `selection = cellfun(@(p, i) p(i), pth, indices, 'UniformOutput', false)` with the error message `error: cellfun: dimensions mismatch` (I tried both in Octave and MATLAB). Likewise, `pth(indices)` or even `pth{indices}` results in the error message `error: pth(): subscripts must be either integers 1 to (2^63)-1 or logicals`. Could it be because `pth` is a 2D cell array so you need something like `pth{k}{k}`? – am304 May 26 '20 at 10:01
  • For reference, with some example data `size(pth)` is `3x1`, `size (pth{1})` is `75x1`, `size(pth{2})` is `216x1` and `size(pth{3})` is `60x1`, each of the `pth{k}`being a cell array. – am304 May 26 '20 at 10:33
  • 1
    @am304 hang on a bit. I wrote this on mobile, so it's completely untested. I'll fix it up as soon as I get to my matlab. – Mad Physicist May 26 '20 at 14:06
  • @MadPhysicist Thank you, I'm working on it as well and am close to finding a solution using the first part of what you proposed. – am304 May 26 '20 at 14:45
  • 1
    @am304. The answer was written with the assumption that `pth` is a row array, so you can do `indices = reshape(num2cell(i'), size(pth));` – Mad Physicist May 26 '20 at 14:54
  • 1
    @am304, I've updated my answer with a little blurb at the bottom. – Mad Physicist May 26 '20 at 14:57
  • Thanks, I found that for `mintersect` to work, I had to change `p(i)` in `selection = cellfun(@(p, i) p(i), pth, indices, 'UniformOutput', false);` to `p{i}`. My own method, which I developed in parallel, was a bit slower than yours when I benchmarked it with some sample data. I also misunderstood what `mintersect` did, which is not what I wanted, so I ended up writing my own version. Anyhow, I wouldn't have been able to solve it without your help, so I will accept your answer and update my question with what I ended up using. – am304 May 27 '20 at 13:54
0

I believe this does the trick. Calls mintersect on all possible combinations of vectors in pth{k}{kk} for k=1:n and kk=1:length(pth{k}).

It uses eval and some messing around with sprintf/compose. Note that the use of eval is typically very much discouraged. I can add more comments if this is what you need.

% generate some data
n = 5;
pth = cell(1,n);

for k = 1:n
    pth{k} = cell(1,randi([1 10]));
    for kk = 1:numel(pth{k})
        pth{k}{kk} = randi([1 100], randi([1 10]), 1);
    end
end

% get all combs
str_to_eval = compose('1:length(pth{%i})', 1:numel(pth));
str_to_eval = strjoin(str_to_eval,',');
str_to_eval = sprintf('allcomb(%s)',str_to_eval);
% use eval to get all combinations for a given pth
all_combs = eval(str_to_eval);

% and make strings to eval in intersect
comp = num2cell(1:numel(pth));
comp = [comp ;repmat({'%i'}, 1, numel(pth))];
str_pattern = sprintf('pth{%i}{%s},', comp{:});
str_pattern = str_pattern(1:end-1); % get rid of last ,

strings_to_eval = cell(size(all_combs,1),1); % one string per combination (row)
for k = 1:size(all_combs,1)
    strings_to_eval{k} = sprintf(str_pattern, all_combs(k,:));
end

% and run eval on all those strings 
result = cell(size(all_combs,1),1); % one result per combination (row)
for k = 1:size(all_combs,1)
    result{k} = eval(['mintersect(' strings_to_eval{k} ')']);
    %fprintf(['mintersect(' strings_to_eval{k} ')\n']); % for debugging
end

For a randomly generated pth, the code produces the following strings to evaluate (where some pth{k} have only one cell for illustration):

mintersect(pth{1}{1},pth{2}{1},pth{3}{1},pth{4}{1},pth{5}{1})
mintersect(pth{1}{1},pth{2}{1},pth{3}{1},pth{4}{2},pth{5}{1})
mintersect(pth{1}{1},pth{2}{1},pth{3}{1},pth{4}{3},pth{5}{1})
mintersect(pth{1}{1},pth{2}{1},pth{3}{2},pth{4}{1},pth{5}{1})
mintersect(pth{1}{1},pth{2}{1},pth{3}{2},pth{4}{2},pth{5}{1})
mintersect(pth{1}{1},pth{2}{1},pth{3}{2},pth{4}{3},pth{5}{1})
mintersect(pth{1}{2},pth{2}{1},pth{3}{1},pth{4}{1},pth{5}{1})
mintersect(pth{1}{2},pth{2}{1},pth{3}{1},pth{4}{2},pth{5}{1})
mintersect(pth{1}{2},pth{2}{1},pth{3}{1},pth{4}{3},pth{5}{1})
mintersect(pth{1}{2},pth{2}{1},pth{3}{2},pth{4}{1},pth{5}{1})
mintersect(pth{1}{2},pth{2}{1},pth{3}{2},pth{4}{2},pth{5}{1})
mintersect(pth{1}{2},pth{2}{1},pth{3}{2},pth{4}{3},pth{5}{1})
mintersect(pth{1}{3},pth{2}{1},pth{3}{1},pth{4}{1},pth{5}{1})
mintersect(pth{1}{3},pth{2}{1},pth{3}{1},pth{4}{2},pth{5}{1})
mintersect(pth{1}{3},pth{2}{1},pth{3}{1},pth{4}{3},pth{5}{1})
mintersect(pth{1}{3},pth{2}{1},pth{3}{2},pth{4}{1},pth{5}{1})
mintersect(pth{1}{3},pth{2}{1},pth{3}{2},pth{4}{2},pth{5}{1})
mintersect(pth{1}{3},pth{2}{1},pth{3}{2},pth{4}{3},pth{5}{1})
mintersect(pth{1}{4},pth{2}{1},pth{3}{1},pth{4}{1},pth{5}{1})
mintersect(pth{1}{4},pth{2}{1},pth{3}{1},pth{4}{2},pth{5}{1})
mintersect(pth{1}{4},pth{2}{1},pth{3}{1},pth{4}{3},pth{5}{1})
mintersect(pth{1}{4},pth{2}{1},pth{3}{2},pth{4}{1},pth{5}{1})
mintersect(pth{1}{4},pth{2}{1},pth{3}{2},pth{4}{2},pth{5}{1})
mintersect(pth{1}{4},pth{2}{1},pth{3}{2},pth{4}{3},pth{5}{1})
rinkert
  • 2
    Using `eval` is a [very bad idea](https://blogs.mathworks.com/loren/2005/12/28/evading-eval/). – Sardar Usama May 22 '20 at 23:28
  • @SardarUsama I know, but isn't there some use for it? In this case, typing all combinations manually is infeasible. So perhaps some efficiency is lost, but it takes me less time. The alternative to `eval` in my answer would be to write all strings to an m file and then run that, but if you create this m file dynamically, wouldn't it result in the same thing? What would be a better approach in this case? – rinkert May 22 '20 at 23:49
  • 2
    There is no need for `eval`, nor to convert the data to string. You could instead collect the inputs in a cell array and then do `mintersect(params{:});`. I’m not going to downvote this, but I don’t think it’s a good solution. – Cris Luengo May 22 '20 at 23:52
  • Thanks for your answer. I am well aware of the "evils of `eval`" and if there were no other option, I would have considered and probably accepted your answer, but I feel MadPhysicist's answer is superior precisely because it doesn't rely on `eval` so I have accepted his answer instead. Thanks again. – am304 May 27 '20 at 13:58
0

As Mad Physicist pointed out, I misunderstood the structure of your initial cell array; however, the point stands. The way to pass an unknown number of arguments to a function is via comma-separated-list generation, and the function needs to support it by being declared with varargin. Updated example below.

Create a helper function to collect a random subcell from each main cell:

% in getRandomVectors.m
function Out = getRandomVectors(C)   % C: a double-jagged array, as described
    N   = length(C);
    Out = cell(1, N);
    for i = 1 : length(C)
        Out{i} = C{i}{randi( length(C{i}) )};
    end
end

Then assuming you already have an mintersect function defined something like this:

% in mintersect.m
function Intersections = mintersect( varargin )
    Vectors = varargin;
    N = length( Vectors );
    Intersections = cell( N, N );   % preallocate the N-by-N result
    for i = 1 : N
        for j = 1 : N
            Intersections{i,j} = intersect( Vectors{i}, Vectors{j} );
        end
    end
end

Then call this like so:

C = { { 1:5, 2:4, 3:7 }, {1:8}, {2:4, 3:9, 2:8} }; % example double-jagged array

In  = getRandomVectors(C);   % In is a cell array of randomly selected vectors
Out = mintersect( In{:} );   % Note the csl-generator syntax
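As a quick illustration of what the {:} does on the receiving side, the sketch below uses a throwaway anonymous function (count_args, a made-up name) declared with varargin and compares expanding the cell with passing the cell itself.

```matlab
% Hypothetical helper: reports how many arguments it received.
count_args = @(varargin) numel(varargin);

In = {1:5, 1:8, 3:9};             % three vectors collected in a cell array

n_expanded = count_args(In{:});   % cs-list: three separate arguments -> 3
n_packed   = count_args(In);      % the cell itself: one argument     -> 1
```

So the function sees each vector as its own input only when the cell is expanded with {:}.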

PS. I note that your definition of mintersect differs from those linked. It may just be you didn't describe what you want too well, in which case my mintersect function is not what you want. What mine does is produce all possible intersections for the vectors provided. The one you linked to produces a single intersection which is common to all vectors provided. Use whichever suits you best. The underlying rationale for using it is the same though.

PS. It is also not entirely clear from your description whether what you're after is a random vector k for each n, or the entire space of possible vectors over all n and k. The above solution does the former. If you want the latter, see MadPhysicist's solution on how to create a cartesian product of all possible indices instead.

Tasos Papastylianou
  • OP is not looking for a definition of `mintersect`. They are looking for the part that comes before that. They are not looking for a random selection, but the Cartesian product of all selections. Furthermore, `mintersect` is named so exactly because it's the "minimal intersection": a single output for all input arrays. – Mad Physicist May 24 '20 at 15:19
  • @MadPhysicist quite possibly. In which case the phrase "find all possible combinations/intersections when taking a vector at random from each pth{k}" is simply bad wording. In the original question, the way it is worded, it seemed like the only reason for the cartesian product was to save it and then take a random component from it representing a particular combination at random. In which case the random component can be obtained without the need for the full cartesian product in the first place (i.e. an XY problem). – Tasos Papastylianou May 24 '20 at 16:03
  • I agree that there may be different interpretations to the question. – Mad Physicist May 25 '20 at 14:54
  • 1
    Thanks for your answer and apologies if my question wasn't clear enough. The question is not, however, how to *define* a function with multiple input variables (I am well aware of `varargin` and know how to use it) but rather how to *call* a function which has been defined with `varargin` when the number of input arguments is not explicitly known. I have checked out the comma-separated list but I can't make it work without some major restructuring of the data. For example, if I try `pth{:}{1}` to take the first `k` across all `n`, I get the error message: `error: a cs-list cannot be further indexed`. – am304 May 26 '20 at 08:39
  • 1
    To clarify the question, I am trying to find the single intersection (which may well be empty) across all `n` for each possible combination of `k`. – am304 May 26 '20 at 08:42
  • 1
    @am304 hi and thanks for clarifying! Yes, in that case, MadP's answer is what you're after. As for cs-lists, yes, it's true, unfortunately they cannot be broadcasted in the way you hoped. Think of them more as a convenient way of dumping a comma separated list of arguments into functions (or into brackets). It's comparable to python's `fun(*list)` syntax, or R's (dplyr-specific) `fun(!!!list)` syntax, or julia's `fun(list...)` so-called 'splat' syntax, etc. In other words you need to construct your cell array of arguments before you can dump it into a function. – Tasos Papastylianou May 26 '20 at 08:56
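A minimal sketch of that last point, on made-up data: since `pth{:}{1}` cannot be indexed further, the first vector of every group is gathered into a cell with cellfun first, and only then expanded. horzcat is used here purely as an example of a function that accepts any number of inputs.

```matlab
% Hypothetical data: three groups of node vectors.
pth = {{[1 2 3], [4 5]}, {[2 3 7]}, {[3 9], [5 2 3]}};

% pth{:}{1} is an error (a cs-list cannot be indexed further),
% so build the argument cell first...
args = cellfun(@(p) p{1}, pth, 'UniformOutput', false);

% ...then expand it into separate arguments.
joined = horzcat(args{:});   % same as horzcat(pth{1}{1}, pth{2}{1}, pth{3}{1})
```

Here joined comes out as [1 2 3 2 3 7 3 9], and the same args{:} expansion would work for any varargin-style function.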