I generated an X-by-10 array of numbers with Matlab. This array is 'mentally' divided into column sets of 4, 3 and 3. Two rows of this array are given below:

[1 2 3 4 ; 5 6 7 ; 8 9 10] [1 2 3 4 ; 8 9 10 ; 5 6 7]

The semicolons mark the mental divisions. I will need to process this array further, but rows that differ only by a permutation of the 'mental columns' carry the same information. The second row above is the first one with its second and third 'mental columns' swapped.

Is there any simple way to get rid of the permutations with built-in Matlab functions? Something like a `unique` that recognizes permutations.

Mathusalem
  • Would `[1 2 4 3 ; 5 6 7 ; 8 9 10]` also be considered a permutation of `[1 2 3 4 ; 5 6 7 ; 8 9 10]`? – Eitan T Jul 25 '13 at 16:18
  • In principle yes, but those have already been deleted. Within the 'mental rows', the permutations are taken care of by construction. – Mathusalem Jul 25 '13 at 16:23
  • So the solution needs only to take care of permutations of "mental" columns, right? – Eitan T Jul 25 '13 at 16:25
  • Yes, exactly. I could in principle map this problem onto the one you linked before if I transform the number sequences into integers (e.g. 1 2 3 4 is mapped onto 10203040 or something like that), and then kill the permutations by sorting and `unique`. But it seems a bit tedious. – Mathusalem Jul 25 '13 at 16:27
  • related question: [Find permutations of rows in matrix](http://stackoverflow.com/questions/16758058/find-permutations-of-rows-in-matrix) (doesn't solve this specific problem, though) – Eitan T Jul 25 '13 at 16:40

2 Answers


Suppose your rows are stored in a matrix A, and the column set widths are stored in len (in your case that would be len = [4, 3, 3]). First we should represent this data properly in a cell array:

X = mat2cell(A, ones(size(A, 1), 1), len);

Then we list all possible orderings (permutations) of the columns of that cell array:

cols = perms(1:numel(len));

Now, given two rows of X with indices r1 and r2, we check whether one is a permutation of the other (i.e. the same "mental" columns in a different order):

any(arrayfun(@(n)isequal(X(r1, :), X(r2, cols(n, :))), 1:size(cols, 1)))
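To see the check in action, here is a minimal sketch on some assumed sample data. The matrix and the helper name `samerow` are only for illustration and are not part of the original problem.

```matlab
% Assumed sample data: rows 1 and 3 differ only by a swap of the second
% and third "mental" columns; row 2 is unrelated.
A = [1:10; 11:20; 1:4 8:10 5:7];
len = [4 3 3];

X = mat2cell(A, ones(size(A, 1), 1), len);   % 3-by-3 cell array of blocks
cols = perms(1:numel(len));                  % all 6 orderings of the blocks

% Hypothetical helper wrapping the check above for two row indices
samerow = @(r1, r2) any(arrayfun(@(n) isequal(X(r1, :), X(r2, cols(n, :))), ...
                                 1:size(cols, 1)));

p13 = samerow(1, 3);   % rows 1 and 3 match up to block order
p12 = samerow(1, 2);   % rows 1 and 2 do not
```

Orderings that move a 3-wide block into the 4-wide slot simply never compare equal, so they do no harm here.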

Following this, we can now find all possible pairs of rows (without repetition), and for each pair of rows check if they are a permutation of each other:

rows = nchoosek(1:size(A, 1), 2);
N = size(cols, 1);
isperm = @(ii, jj)any(arrayfun(@(n)isequal(X(ii, :), X(jj, cols(n, :))), 1:N));
remove_idx = arrayfun(isperm, rows(:, 1), rows(:, 2));

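Note that remove_idx has one entry per pair of rows, not per row of A, so it cannot be used to index A directly. Instead we delete the second row of every flagged pair, which keeps the first occurrence of each group of permuted rows:

A(unique(rows(remove_idx, 2)), :) = [];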

Example

Let's take the following data as input:

A = [1:10; 11:20; 1:4 8:10 5:7];
len = [4 3 3];

That is:

A =
    1    2    3    4    5    6    7    8    9   10
   11   12   13   14   15   16   17   18   19   20
    1    2    3    4    8    9   10    5    6    7

len =
   4   3   3

And run the following code:

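X = mat2cell(A, ones(size(A, 1), 1), len);
cols = perms(1:numel(len));
rows = nchoosek(1:size(A, 1), 2);
N = size(cols, 1);
isperm = @(ii, jj)any(arrayfun(@(n)isequal(X(ii, :), X(jj, cols(n, :))), 1:N));
remove_idx = arrayfun(isperm, rows(:, 1), rows(:, 2))
% remove_idx is indexed by pair, so drop the second row of each flagged pair
A(unique(rows(remove_idx, 2)), :) = []

The pairs are (1, 2), (1, 3) and (2, 3). Only the pair (1, 3) is flagged, so row 3 is removed:

remove_idx =
   0
   1
   0

A =
    1    2    3    4    5    6    7    8    9   10
   11   12   13   14   15   16   17   18   19   20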
Eitan T
  • Glad to help :) Keep in mind that `arrayfun` might be slow though, for large matrices you should consider replacing it with a for loop (which would probably be better JIT-accelerated)... – Eitan T Jul 25 '13 at 18:41
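For reference, a plain for-loop version of the same pairwise check could look roughly like the sketch below. The sample matrix and the `keep` mask are assumptions made for illustration, and whether this is actually faster than `arrayfun` would need to be timed on real data.

```matlab
A = [1:10; 11:20; 1:4 8:10 5:7];   % assumed sample data
len = [4 3 3];

X = mat2cell(A, ones(size(A, 1), 1), len);
cols = perms(1:numel(len));

keep = true(size(A, 1), 1);        % keep(i) turns false once row i is a duplicate
for ii = 1:size(A, 1)
    if ~keep(ii), continue; end    % already matched to an earlier row
    for jj = ii+1:size(A, 1)
        if ~keep(jj), continue; end
        for n = 1:size(cols, 1)
            if isequal(X(ii, :), X(jj, cols(n, :)))
                keep(jj) = false;  % row jj repeats row ii up to block order
                break;
            end
        end
    end
end
B = A(keep, :);                    % first occurrence of each group survives
```

On this sample the loop keeps rows 1 and 2 and drops row 3. Skipping rows that are already marked also saves comparisons when there are many duplicates.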

This is not much better than transforming the number sequences into integers, but it follows the same idea: map each group of numbers to a unique identifier. Here is my code:

M=[1 2 3 4 5 6 7 8 9 10;1 2 3 4 8 9 10 5 6 7;5 6 7 8 9 10 11 1 2 3;5 6 7 8 1 2 3 9 10 11];
% chop the matrix into mental columns
M4=M(:,1:4);
M3=[M(:,5:7);M(:,8:10)]; % stack the two 3-wide blocks on top of each other

[u_M4,ia_M4,ic_M4]=unique(M4,'rows'); % give each distinct 4-wide block an integer id
[u_M3,ia_M3,ic_M3]=unique(M3,'rows'); % give each distinct 3-wide block an integer id
idx3=[ic_M3(1:length(ic_M3)/2) ic_M3(length(ic_M3)/2+1:end)]; % un-stack the ids: one row of M per row, one 3-wide block per column
sort_idx3=sort(idx3,2); % sort the two ids in each row so that swapped blocks give the same pair
idx=[ic_M4 sort_idx3]; % id matrix: one id per mental column
[u_idx,ia_idx,ic_idx]=unique(idx,'rows'); % find the distinct id rows (this is why the sort is needed)
R=M(ia_idx,:); % keep one row of the original matrix per distinct id row
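The same idea can be sketched without hard-coding the 4/3/3 split: put every row into a canonical form by sorting the blocks that share a width, then call `unique` on the canonical rows. This is only an illustration under assumptions, not the code above; the variable names (`len`, `first`, `last`, `C`) are made up for the example, and the `'stable'` option needs a reasonably recent MATLAB release.

```matlab
M = [1 2 3 4 5 6 7 8 9 10; 1 2 3 4 8 9 10 5 6 7; ...
     5 6 7 8 9 10 11 1 2 3; 5 6 7 8 1 2 3 9 10 11];
len = [4 3 3];                         % block widths (assumed, as in the question)

last = cumsum(len);                    % last column of each block
first = last - len + 1;                % first column of each block
C = M;                                 % canonical form, filled in row by row
for w = unique(len)                    % handle each block width separately
    b = find(len == w);                % blocks that share this width
    if numel(b) < 2, continue; end     % a lone block has nothing to swap with
    for r = 1:size(M, 1)
        blocks = zeros(numel(b), w);
        for k = 1:numel(b)
            blocks(k, :) = M(r, first(b(k)):last(b(k)));
        end
        blocks = sortrows(blocks);     % fixed order, whatever the input order
        for k = 1:numel(b)
            C(r, first(b(k)):last(b(k))) = blocks(k, :);
        end
    end
end
[~, ia] = unique(C, 'rows', 'stable'); % first occurrence of each canonical row
R = M(ia, :);
```

For the matrix M above this keeps rows 1 and 3. Only blocks of equal width are reordered, since blocks of different widths can never be swapped for one another.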
Cici