I am going crazy over an error message. I used exactly the same script with a different matrix, and now I can no longer compute the row sums.

I get this annoying error message:

'x' must be an array of at least two dimensions

I want to compute the row sum of the 15th column of the matrix impact.

share <- rowSums(impact[,15],na.rm=T)

head(impact)
        ID  key bank    group   iob X2014.01    X2014.02    X2014.03    X2014.04    X2014.05    X2014.06    X2014.07    X2014.08    X2014.09    X2014.10    X2014.11    X2014.12    X2015.01    X2015.02    X2015.03    X2015.04    X2015.05    X2015.06    X2015.07    X2015.08
2   1   NA  NA  2   1   0.445205069 0.472390737 0.870477062 0.217721722 0.45105155  0.081988816 0.787682077 0.117770855 0.140369528 0.369301296 0.134638046 0.317541225 0.119500371 0.04335953  0.21347215  0.98924849  0.056345003 0.630135217 0.775518542 0.497615742
10  1   NA  NA  2   1   0.168419591 0.425645354 0.646613563 0.664511712 0.750356605 0.93621874  0.535499019 0.654868051 0.346500111 0.257706661 0.538854079 0.440520153 0.902426669 0.62364293  0.034292533 0.164502657 0.708733663 0.416106117 0.55308097  0.961736416
18  1   NA  NA  2   1   0.619040555 0.831943026 0.502364121 0.897383629 0.161324917 0.645435861 0.381065769 0.144287435 0.211246426 0.824972697 0.966528838 0.084932473 0.401207104 0.828860666 0.094734978 0.998390905 0.761376766 0.544001075 0.901412357 0.611515683
26  1   NA  NA  2   1   0.650375963 0.82854139  0.678481275 0.053565344 0.725918141 0.462696627 0.781661878 0.247926698 0.896495716 0.067714926 0.854996151 0.007778748 0.087166199 0.162193333 0.337942796 0.924925652 0.629788632 0.199940498 0.394249739 0.296213669
34  1   NA  NA  2   1   0.550807858 0.422672911 0.975977621 0.686356795 0.161541393 0.51490188  0.206613536 0.042012755 0.625714656 0.260060599 0.920103236 0.995255399 0.155289084 0.361658753 0.911763522 0.671250837 0.993388857 0.390214068 0.945968449 0.274847887
42  1   NA  NA  2   1   0.934880255 0.920203832 0.432055682 0.598642825 0.175905258 0.533883496 0.002016901 0.001015627 0.14724496  0.655515358 0.659772253 0.102383326 0.59884333  0.949273788 0.656322346 0.87928498  0.676120876 0.834748556 0.657029437 0.877257774
zx8754
richpiana
  • `impact` seems to be a vector. Yet without data we cannot say more than that. Can you provide data (the head of impact for example)? – DeveauP Apr 12 '16 at 13:44
  • @DeveauP impact is a matrix according to R – richpiana Apr 12 '16 at 13:47
  • Yes, but you are only providing the 15th column (`impact[,15]`), which is a vector, not a matrix. If you want the sums for each column, use `colSums(impact)`. If you just want the sum for the 15th column, simply use `sum(impact[,15])`. – slamballais Apr 12 '16 at 13:49
  • Nice catch @Laterow . I was about to post this as an answer. – RHertel Apr 12 '16 at 13:49
  • @RHertel well, looks like two people beat you to it :P I wanted to report it as [a dupe](http://stackoverflow.com/questions/17293999/rowsum-for-matrix-over-specified-number-of-columns-in-r), but too late now. (well, not too late..) – slamballais Apr 12 '16 at 13:51
  • @Laterow THANK YOU, i didn't understand why it was not working for one single column. Your explanation is great – richpiana Apr 12 '16 at 13:54
  • OP should use `rowSums(impact[,15, drop=FALSE])` if building a programmatic approach where 15 can be replaced by any vector > 0 indicating columns to be summed. – Pierre L Apr 12 '16 at 13:55
  • @RHertel Laterow is suggesting that OP use `impact[,x]` when x is length 1 and use `rowSums(impact[,x])` when x is greater than 1. It is better to have one function that handles all cases regardless of length: `rowSums(impact[,x,drop=FALSE])` – Pierre L Apr 12 '16 at 14:16
  • @PierreLafortune but, again, where is the difference between your suggestion and a simple subset of the form `impact[,15]`? I think the OP wants to perform at least some kind of sum; and in this case, I think that `colSums`, as suggested by Laterow, would be the function of choice - instead of `rowSums` which was probably used mistakenly by the OP. Moreover, I don't see any suggestion by Laterow to use `rowSums`. – RHertel Apr 12 '16 at 14:19
  • @RHertel OP has used the `rowSums` function successfully in the past and recently encountered the error message `must be an array of at least two dimensions` and did not understand what was different about this case from the others. The issue is `drop=TRUE` default behavior of `'['`. When the parameter is switched off, the function will work as expected. – Pierre L Apr 12 '16 at 14:24
  • @PierreLafortune Let's leave it at that. On my computer your solution doesn't work. It yields a named vector with the entries of the column 15. No sum. I usually upvote and appreciate your answers; in this case I have to disagree. No hard feelings ;-) – RHertel Apr 12 '16 at 14:27
  • None taken. The sum of a row with one column is the single value in the row. `rowSums(data.frame(1:5))` is `1:5`. – Pierre L Apr 12 '16 at 14:34

3 Answers

3

Instead of switching functions depending on how many columns are selected, you should address the default behavior of `[` directly. The help page ?`[` says that "the result is coerced to the lowest possible dimension", which means that a subset containing a single column is coerced to a vector. We can cancel this effect with drop=FALSE. Example:

rowSums(impact[, 15, drop=FALSE])

# Or, if impact is a data frame (not a matrix), subset without a comma
rowSums(impact[15])

Compared with switching functions, this has an advantage when the code is used programmatically: we can replace 15 with any index vector for subsetting:

col_seq <- seq_len(ncol(impact))
# Draw a random number of column indices; the size argument must be a single number
indx <- sample(col_seq, sample(col_seq, 1), replace=TRUE)
rowSums(impact[indx])

Update

Let's further explain why with another example:

df <- head(mtcars)
df[10:11]
#                  gear carb
#Mazda RX4            4    4
#Mazda RX4 Wag        4    4
#Datsun 710           4    1
#Hornet 4 Drive       3    1
#Hornet Sportabout    3    2
#Valiant              3    1

If we wanted to get the row sums of this subset, we have a few options. Keep in mind what a row sum is: the sum of each row (i.e. 4+4, 4+4, 4+1, 3+1, ...):

rowSums(df[10:11])
        Mazda RX4     Mazda RX4 Wag        Datsun 710    Hornet 4 Drive Hornet Sportabout           Valiant 
                8                 8                 5                 4                 5                 4 

Let's verify that the answer is correct:

all(rowSums(df[10:11]) == df[10] + df[11])
[1] TRUE

If we had one column, the row sums would simply be the column itself:

df[10]
#                  gear
#Mazda RX4            4
#Mazda RX4 Wag        4
#Datsun 710           4
#Hornet 4 Drive       3
#Hornet Sportabout    3
#Valiant              3

We can ask: what are the row sums of this subset? The definition is the same as before, the sum of each row. In this case, though, that is simply the column itself.

Why use rowSums at all when it isn't even needed here? Because sometimes we build code programmatically, and we may not know in advance that the index will be of length one. With one call that finds the sums whether there are many columns or just one, we can program without worrying about the length of the index:

all(rowSums(df[,10, drop=FALSE]) == df[10])
[1] TRUE
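
As a rough illustration of the programmatic case described above, the subsetting can be wrapped in a small helper. The name `row_sums_of` and its arguments are hypothetical and not part of base R; this is only a sketch, assuming `x` is a numeric matrix or data frame:

```r
# Hypothetical helper: row sums over any set of columns, including a single one
row_sums_of <- function(x, cols, na.rm = TRUE) {
  # drop = FALSE keeps the subset two-dimensional for matrices and data frames alike
  rowSums(x[, cols, drop = FALSE], na.rm = na.rm)
}

df <- head(mtcars)
one  <- row_sums_of(df, 10)      # single column: same values as df$gear
many <- row_sums_of(df, 10:11)   # two columns: gear + carb for each row
```

The caller never has to check the length of `cols`, which is the point of keeping `drop = FALSE` in one place.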
Pierre L
  • It is not a mistake. I'll provide more details. – Pierre L Apr 12 '16 at 14:37
  • Not necessary, I got it, thanks. Sorry for the confusion. We interpreted the OP's intention very differently. Now I understand how you see it - although I tend to stick to my interpretation. If the OP had posted the desired output with a small example (I think that the expression "compute the row sum of the 15 th column" is not very clear), the misunderstanding could have been avoided. – RHertel Apr 12 '16 at 14:38
  • @RHertel I added a further explanation – Pierre L Apr 12 '16 at 14:48
2

The problem here is that you are trying to take the rowSums of just a column vector.

test_matrix <- matrix(1, nrow = 3, ncol = 2)

If we just grab the 2nd column here we end up with just a vector.

test_matrix[,2]

[1] 1 1 1

You cannot take the rowSums of a vector, which is why you are getting an error. You are effectively telling R to grab only the data in the 15th column (giving you a numeric vector; try class(impact[,15]) and you will see this to be true) and then passing that to the rowSums function, which requires a two-dimensional object such as a matrix (not a vector). If you just want the sum of the 15th column, then just take the sum of that subset (i.e. sum(impact[,15])).
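
The difference can be checked on a toy matrix. This is only a sketch: `m` is a made-up stand-in for `impact`, and the error is captured with `tryCatch` so the script keeps running:

```r
# Toy matrix standing in for `impact` (hypothetical data)
m <- matrix(1:6, nrow = 3, ncol = 2)

dim(m[, 2])                 # NULL: the single column was dropped to a plain vector
dim(m[, 2, drop = FALSE])   # still a matrix, with 3 rows and 1 column

# rowSums() rejects the vector; capture the message instead of stopping
msg <- tryCatch(rowSums(m[, 2]), error = function(e) conditionMessage(e))
msg

# Total of the second column, which is what sum() is for
col_total <- sum(m[, 2])
```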

mfidino
0

The row sums of a single column are just the values of that column itself.

Therefore, impact[, 15] is what you want.

If you want the sum of that column, sum(impact[, 15]) is what you want.
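
One caveat, shown in the sketch below with a made-up matrix `m`: if `na.rm=TRUE` is passed as in the question, the row sums of a single column match the column only where it has no missing values, because a row whose only value is NA sums to 0:

```r
# Hypothetical stand-in for `impact`, with a missing value in the first column
m <- matrix(c(1, 2, NA, 4, 5, 6), nrow = 3)

per_row <- rowSums(m[, 1, drop = FALSE], na.rm = TRUE)  # one value per row
total   <- sum(m[, 1], na.rm = TRUE)                    # one value for the whole column
```

Here `per_row` has three elements (the NA row contributes 0), while `total` is a single number.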

NewNameStat