
I have a large matrix with 3 rows and many columns. I want to remove the columns that contain only zeros, except for the first such column.

For example, given the following matrix:

    1 1 0 0 0 0 0 0
A = 1 0 0 1 0 0 0 0
    1 0 1 1 0 0 0 0

It would thus be transformed to:

    1 1 0 0 0 
A = 1 0 0 1 0 
    1 0 1 1 0 
rayryeng
Sardar Usama
  • Your example still has a zero column, is this intentional? – sco1 Mar 24 '16 at 20:33
  • Yes, I want to keep the first column where the zeros start. – Sardar Usama Mar 24 '16 at 20:37
  • Will there always be or do you always want a column of all zeros? – TroyHaskin Mar 24 '16 at 20:38
  • There is an important edge case you have to consider. What happens if you have a column of zeroes in between relevant data? For example, let's say your matrix `A` had a column of zeroes in between the second and third column and these both have at least one row that is non-zero. What would be the desired output? Would you keep this column of zeroes while throwing out the other zero columns? – rayryeng Mar 24 '16 at 20:39
  • It's not guaranteed that zero columns will appear, but if they do, I want to keep the first one. The algorithm is such that zero columns will never come between two non-zero columns. – Sardar Usama Mar 24 '16 at 20:40

2 Answers


One approach is to use `all` along the rows of every column to check whether every element in that column equals 0, then use `find` to get the indices of those columns. Next, make a copy of the original matrix and delete every all-zero column except the first one encountered:

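    ind = find(all(A == 0, 1));  % indices of the all-zero columns
    out = A;                     % work on a copy of the original
    out(:,ind(2:end)) = [];      % delete every all-zero column after the first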

With your example, we get:

>> out
out =
     1     1     0     0     0
     1     0     0     1     0
     1     0     1     1     0

A nice property of this approach is that it handles the edge cases without extra checks. If there are no all-zero columns, `find` returns an empty array, and indexing an empty array with `2:end` also produces an empty array. The same holds when there is exactly one all-zero column. In both cases the removal step on the last line has no effect and the matrix is left unchanged.
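As a quick check of those edge cases, here is a small sketch that runs the same three lines over a few test matrices. The matrices and the names `cases` and `results` are made up for illustration and are not from the question.

```matlab
% Hypothetical test matrices (not from the question)
cases = {[1 2; 3 4; 5 6], ...           % no all-zero column
         [1 0; 2 0; 3 0], ...           % exactly one all-zero column
         [1 0 0 0; 2 0 0 0; 3 0 0 0]};  % three all-zero columns

results = cell(size(cases));
for k = 1:numel(cases)
    A = cases{k};
    ind = find(all(A == 0, 1));  % indices of all-zero columns (may be empty)
    out = A;
    out(:, ind(2:end)) = [];     % ind(2:end) is empty when ind has 0 or 1 entries
    results{k} = out;
end
```

The first two matrices should come back unchanged, and the third should be reduced to its first two columns.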


If the constraint holds that all-zero columns only appear at the end of your matrix and never in between valid data, we can also do this in one line by combining `any` and `all` with logical indexing:

    % Keep the non-zero columns plus the column where the zero run starts
    out = A(:,any(A,1) | diff([false all(A == 0, 1)]));

We build a logical mask in two parts. The first part, `any(A,1)`, is true for every column that contains at least one non-zero value; these columns sit at the beginning of your data. The second part takes the output of the same `all` call as before, prepends `false`, and uses `diff` to compute pairwise differences. Assuming the first column is never all zeros (which is your case) and the zero columns are contiguous, the difference is non-zero at exactly one position among the zero columns: the first all-zero column. Combining the two parts with `|` sets that location to true along with all of the non-zero columns, and indexing into your matrix with the mask gives the result.
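To make the mask construction concrete, here is a sketch that splits the one-liner into named steps on the matrix from the question. The variable names are mine, and the explicit `== 1` comparison is an addition so that the intermediate result is logical; for trailing zero columns it selects the same columns as the one-liner.

```matlab
A = [1 1 0 0 0 0 0 0;
     1 0 0 1 0 0 0 0;
     1 0 1 1 0 0 0 0];

nonzeroCols = any(A, 1);                    % [1 1 1 1 0 0 0 0]
zeroCols    = all(A == 0, 1);               % [0 0 0 0 1 1 1 1]
firstZero   = diff([false zeroCols]) == 1;  % [0 0 0 0 1 0 0 0], start of the zero run
mask        = nonzeroCols | firstZero;      % [1 1 1 1 1 0 0 0]
out         = A(:, mask);                   % first five columns of A
```

One difference from the `find` version is worth noting: if all-zero columns did appear between non-zero columns, this mask would keep the first column of every run of zeros, whereas the `find` version keeps only the first all-zero column overall.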

rayryeng
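    b = any(A, 1);               % true for columns with at least one non-zero
    b(find(b == 0, 1)) = true;   % also keep the first all-zero column, if any
    A = A(:, b)                  % select the kept columns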

Fixed per the comments to match the OP's requirement.

hiandbaii
  • This will also remove the first column where the zeros start. I want to keep that column, as I explained in the example! – Sardar Usama Mar 24 '16 at 20:38
  • This removes all columns of complete zero. The OP also wants to retain the first occurrence of a column of all zero. – rayryeng Mar 24 '16 at 20:52