What is the recommended way to iterate a matrix over rows?

Question

Given a matrix m = [10i+j for i=1:3, j=1:4], I can iterate over its rows by slicing the matrix:

for i=1:size(m,1)
    print(m[i,:])
end

Is this the only possibility? Is it the recommended way?

And what about comprehensions? Is slicing the only possibility to iterate over the rows of a matrix?

[ sum(m[i,:]) for i=1:size(m,1) ]

@jverzani mapslices does the job, although in some cases it will require I define an anonymous function. Thanks for the suggestions. — Nico, Feb 14 '14 at 21:29
For any new readers, make sure you check out the answer by Seanny123, as it contains a good solution for v1.1+ that was not originally available when this question was asked and answered. — Colin T Bowers, Oct 26 '22 at 10:37

tholy · Accepted Answer · 2022-06-26T10:53:09.100

68

The solution you listed yourself, as well as mapslices, both work fine. But if by "recommended" what you really mean is "high-performance", then the best answer is: don't iterate over rows.

The problem is that since arrays are stored in column-major order, for anything other than a small matrix you'll end up with a poor cache hit ratio if you traverse the array in row-major order.

As pointed out in an excellent blog post, if you want to sum over rows, your best bet is to do something like this:

msum = zeros(eltype(m), size(m, 1))
for j = 1:size(m,2)
    for i = 1:size(m,1)
        msum[i] += m[i,j]
    end
end

We traverse both m and msum in their native storage order, so each time we load a cache line we use all the values, yielding a cache hit ratio of 1. You might naively think it's better to traverse it in row-major order and accumulate the result to a tmp variable, but on any modern machine the cache miss is much more expensive than the msum[i] lookup.

Many of Julia's internal algorithms that take a dims keyword, like sum(m; dims=2), handle this for you.
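As a rough check of the points above, the following sketch compares the hand-written column-major loop with `sum(m; dims=2)` and with the row-slicing comprehension from the question. The function name `rowsums_colmajor` is made up for illustration; `m` is the matrix from the question.

```julia
# Matrix from the question: m[i, j] == 10i + j
m = [10i + j for i = 1:3, j = 1:4]

# Hypothetical helper: row sums computed while walking memory in column-major order
function rowsums_colmajor(m::AbstractMatrix)
    msum = zeros(eltype(m), size(m, 1))
    for j in axes(m, 2)        # outer loop over columns
        for i in axes(m, 1)    # inner loop walks down one column (contiguous for a Matrix)
            msum[i] += m[i, j]
        end
    end
    return msum
end

by_loop   = rowsums_colmajor(m)
by_dims   = vec(sum(m; dims=2))                    # sum returns a 3×1 matrix; vec flattens it
by_slices = [sum(m[i, :]) for i in 1:size(m, 1)]   # m[i, :] copies each row

println(by_loop)
```

For the 3×4 matrix from the question, all three approaches give `[50, 90, 130]`; they differ only in memory access pattern and allocations, which start to matter for large matrices.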

edited Jun 26 '22 at 10:53

answered Feb 15 '14 at 14:11


I think this answers my question, I will wait another day to accept the answer. I like this answer very much because it's made me realise that since Julia is column-major, I better arrange my data vectors as columns rather than rows. – Nico Feb 15 '14 at 16:27
The blog post you've linked to no longer exists. See http://docs.julialang.org/en/release-0.4/manual/performance-tips/#access-arrays-in-memory-order-along-columns instead. – aventurin May 09 '16 at 18:42
404 seems due to trailing slash. This URL works: http://julialang.org/blog/2013/09/fast-numeric – Isaiah Norton May 25 '16 at 20:03
That was correct, but what about a 3-dimensional array, e.g. `A[i,j,k]`? What is the order: `k --> j --> i` or `i --> j --> k`? – Alireza Ghavaminia May 26 '18 at 03:27
Dimensions are ordered fastest-to-slowest. – tholy Jun 26 '22 at 10:50

score 26 · Answer 2 · answered Feb 21 '19 at 22:23

As of Julia 1.1, there are iterator utilities for iterating over the columns or rows of a matrix. To iterate over rows:

M = [1 2 3; 4 5 6; 7 8 9]

for row in eachrow(M)
    println(row)
end

Will output:

[1, 2, 3]
[4, 5, 6]
[7, 8, 9]
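A short sketch building on the same iterator, assuming Julia 1.1 or later: `eachrow` also works inside a comprehension, and `enumerate` supplies the row index alongside each row. Each `row` is a view into `M` rather than a copy, which the last line demonstrates.

```julia
M = [1 2 3; 4 5 6; 7 8 9]

# Row sums via a comprehension over the row iterator
rowsums = [sum(row) for row in eachrow(M)]

# enumerate pairs each row with its index
for (i, row) in enumerate(eachrow(M))
    println("row ", i, ": ", sum(row))
end

# Rows are views: writing through one modifies M itself
first(eachrow(M))[1] = 100
```

Because the rows are views, no per-row copy is made, unlike `M[i, :]`, which allocates a new vector for each row.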

Seanny123


Is there any way to also get the row indices using this method? That is, not just each row itself, but also its index. – Skumin Mar 13 '20 at 16:48
Skumin, you can use `for (i, row) in enumerate(eachrow(M))`. In fact, you can apply `enumerate` to any iterator – Reiner Martin Mar 02 '21 at 07:55
What are the performance implications of using `eachrow(df)`? Is it comparable to a naive loop? – zeawoas Sep 03 '21 at 10:09

score 4 · Answer 3 · answered Mar 16 '15 at 16:36

In my experience, explicit loops are much faster than comprehensions.

Iterating over columns is also good advice.

Besides, you can use the new macros @simd and @inbounds to accelerate it further.
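As an illustration of where these macros go, here is a sketch of a row-sum kernel. The function name `rowsums_fast!` is hypothetical. `@inbounds` disables bounds checking, so the explicit size check up front matters; `@simd` only permits the compiler to vectorize the inner loop and does not guarantee a speedup.

```julia
# Hypothetical helper: accumulate the row sums of m into out
function rowsums_fast!(out::Vector, m::Matrix)
    # Validate sizes before turning bounds checks off
    length(out) == size(m, 1) || throw(DimensionMismatch("out needs one entry per row"))
    fill!(out, zero(eltype(out)))
    @inbounds for j in 1:size(m, 2)      # columns in the outer loop
        @simd for i in 1:size(m, 1)      # contiguous inner loop, eligible for SIMD
            out[i] += m[i, j]
        end
    end
    return out
end

m = [10i + j for i = 1:3, j = 1:4]
out = rowsums_fast!(zeros(Int, 3), m)
```

Whether the macros help in practice depends on the element type and matrix size, so it is worth benchmarking before and after.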

Sisyphuss


Jake Ireland · Answer 4 · 2021-10-04T21:32:05.553

In my case, I could not use the eachrow iterator, or nested loops, as I needed to zip eachindex with something else, and iterate over that zip iterator. Hence, I wrote:

ncols = size(m, 2)
for i in eachindex(m)
    rowi, coli = fldmod1(i, ncols)
    elem = m[rowi, coli]
end

Note that this will only work where eachindex returns linear indexing. If eachindex returns an iterator of Cartesian coordinates, you may need to iterate over 1:prod(size(m)) instead.
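One way to sidestep the caveat above is to iterate over `1:length(m)` explicitly, which is always a linear range. The sketch below does that, zips it with a second collection (`labels`, a made-up stand-in for the "something else"), and checks that the elements are visited row by row. Keep in mind that `fldmod1(i, ncols)` maps the counter to a row-major position, so `m[rowi, coli]` is in general not the same element as `m[i]`.

```julia
m = [10i + j for i = 1:3, j = 1:4]
ncols = size(m, 2)

# Hypothetical second collection to zip with, one entry per element
labels = string.(1:length(m))

visited = Int[]
for (i, label) in zip(1:length(m), labels)
    rowi, coli = fldmod1(i, ncols)   # row-major position of the i-th step
    push!(visited, m[rowi, coli])
end

# The same visiting order is obtained by flattening the transposed matrix
expected = vec(permutedims(m))
```

Here `visited` starts with `11, 12, 13, 14, 21`, that is, the whole first row followed by the start of the second.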
