Why does `sum` on a Matrix return a Matrix, not a Vector?

Question

If I do

mat = rand(8,8)
sum(mat, 1)

the return type is a Matrix with a single row, whereas sum(mat, 2) gives a Matrix with a single column. This surprises me, as singleton dimensions are generally dropped in 0.5, so I would expect both operations to return a Vector. Why is the singleton dimension not dropped here?

I might have expected this to be done in order to preserve the orientation (e.g. sum(mat, 1) is a row Vector), but the behaviour is the same on 0.6, which has explicit 1-d RowVectors, so that does not appear to be the explanation.

Thanks!

score 4 · Accepted Answer · edited May 23 '17 at 12:16

Yes, reductions like sum preserve the dimensionality of the array. This is intentional as it enables broadcasting the result back across the original array. This means that you can, for example, normalize the columns of an array with ./:

julia> A = rand(1:100, 4, 3)
4×3 Array{Int64,2}:
 94  50  32
 46  15  78
 34  29  41
 79  22  58

julia> A ./ sum(A, 1)
4×3 Array{Float64,2}:
 0.371542  0.431034  0.15311
 0.181818  0.12931   0.373206
 0.134387  0.25      0.196172
 0.312253  0.189655  0.277512

While the two-dimensional case could perhaps be handled by RowVectors, that approach does not generalize to higher dimensions.
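As a rough illustration of the higher-dimensional case, here is a minimal sketch with a 3-d array. It is written in Julia 1.x syntax, which is an assumption on my part: there the dimension is passed as the `dims` keyword, whereas the 0.5/0.6 code in this thread passes it positionally. The array `A` and its shape are made up for the example.

```julia
# Assumption: Julia 1.x, where the dimension is the keyword `dims`
# (the thread's 0.5/0.6 code writes sum(A, 3) instead).
A = reshape(collect(1.0:24.0), 2, 3, 4)   # a 2×3×4 array

s = sum(A, dims=3)    # reduce along the third dimension
size(s)               # (2, 3, 1): the reduced dimension stays as a singleton

B = A ./ s            # broadcasting expands the singleton back to length 4
size(B)               # (2, 3, 4)
```

Because the reduced dimension is kept, the same `A ./ sum(A, dims=d)` pattern works for any `d`, which a row/column vector distinction alone could not express.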

That said, there are other cases where dropping dimensions would be similarly useful. This is an open design question on the issue tracker.
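When a plain Vector is what you want, the singleton dimension can be dropped explicitly. A small sketch, again assuming Julia 1.x syntax (`dims` keyword and `dropdims`; the 0.5/0.6 releases discussed here spelled these differently):

```julia
# Assumption: Julia 1.x names and keyword syntax.
mat = rand(8, 8)

rowmat = sum(mat, dims=1)          # 1×8 Matrix
v1 = vec(rowmat)                   # 8-element Vector: flattens all dimensions
v2 = dropdims(rowmat, dims=1)      # 8-element Vector: drops only the named singleton
```

`vec` flattens whatever it is given, while `dropdims` only removes the dimension you name, so the latter is the safer choice for arrays with more than two dimensions.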

answered Feb 20 '17 at 19:59

mbauman

Nice, thanks! I did think about the broadcast, but not that it wouldn't generalize to higher dimensions. I see the issue, so, thanks! This came up because I am defining functions on row and column sums of matrices, and it seems natural to define the input arguments as `Vector`s, but that means I have to call, e.g. `myfunc(vec(sum(mat,1)))`, which looks clumsy. But allowing the arguments to be Matrices seems like it could cause problems. I guess I can address this with dispatch, though. – Michael K. Borregaard Feb 20 '17 at 20:10

There's often no need to restrict argument types so strictly. Instead of `Vector`, you could almost certainly use `AbstractVector` — that'll include views and reshaped vectors and many other custom vector types. And it's common to go even wider than that to all dimensionalities and maybe even to `Any`. Sure, passing something nonsensical might error a little later than you might otherwise like, but it enables someone to pass you something that [looks like a duck](https://en.wikipedia.org/wiki/Duck_typing#In_Julia) and it would still quack just fine. – mbauman Feb 20 '17 at 20:32
Thanks, yes, I agree that it is not normally necessary to restrict types in function arguments (though it is important inside types), and I like the principle! Also, I did use AbstractVector (sorry for being unclear). In this case, though, a row matrix and a column matrix behave differently from a Vector (e.g. when passed to a function such as `size`), so might it not lead to unexpected bugs if I ignored the dimensionality? In the end I did something along the lines of `f(x::AbstractMatrix) = f(vec(x)); f(x::AbstractVector)...` – Michael K. Borregaard Feb 21 '17 at 09:34
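
The dispatch pattern mentioned in the last comment could be sketched roughly as follows. The function name `total_abs` is hypothetical, and the shape check in the matrix method is an added assumption rather than something from the thread; the `dims` keyword assumes Julia 1.x.

```julia
# `total_abs` is a hypothetical name used only for illustration.
total_abs(x::AbstractVector) = sum(abs, x)

# Accept single-row or single-column matrices by flattening them first.
# The shape check is an added safeguard, not part of the original comment.
function total_abs(x::AbstractMatrix)
    1 in size(x) || throw(ArgumentError("expected a single-row or single-column matrix"))
    return total_abs(vec(x))
end

mat = rand(8, 8)
r = total_abs(sum(mat, dims=1))   # 1×8 Matrix, forwarded to the vector method
c = total_abs(sum(mat, dims=2))   # 8×1 Matrix, forwarded likewise
```

This keeps the real logic in the `AbstractVector` method, so callers no longer need to write `vec(...)` themselves.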
