Efficient implementation of summed area table/integral image in R

Question

I am trying to construct a summed area table or integral image given an image matrix. For those of you who dont know what it is, from wikipedia:

A summed area table (also known as an integral image) is a data structure and algorithm for quickly and efficiently generating the sum of values in a rectangular subset of a grid

In other words, its used to sum up values of any rectangular region in the image/matrix in constant time.

I am trying to implement this in R. However, my code seems to take too long to run.

Here is the pseudo code from this link. in is the input matrix or image and intImg is whats returned

for i=0 to w do
   sum←0

   for j=0 to h do
      sum ← sum + in[i, j]

      if i = 0 then
         intImg[i, j] ← sum
      else
         intImg[i, j] ← intImg[i − 1, j] + sum
      end if
   end for 
end for

And here is my implementation

w = ncol(im)
h = nrow(im)
intImg = c(NA)
length(intImg) = w*h

for(i in 1:w){ #x
  sum = 0;
  for(j in 1:h){ #y
    ind = ((j-1)*w)+ (i-1) + 1 #index
    sum = sum + im[ind]
    if(i == 1){
      intImg[ind] = sum
    }else{
      intImg[ind] = intImg[ind-1]+sum
    }
  }
}
intImg = matrix(intImg, h, w, byrow=T)

Example of input and output matrix:

enter image description here

However, on a 480x640 matrix, this takes ~ 4 seconds. In the paper they describe it to take on the order of milliseconds for those dimensions.

Am I doing something inefficient in my loops or indexing?

I considered writing it in C++ and wrapping it in R, but I am not very familiar with C++.

Thank you

a link to `im` would make this reproducible. it's hard for me to parse what you're doing without an example, but you can probably use `colSums()` and `rowSums()` or some other vectorized function(s). — Chase, May 14 '13 at 14:50
`im` is just any matrix. I've added an example of the input and output in the question — Omar Wagih, May 14 '13 at 14:55
How much improvement do you get by changing `intImg = c(NA)` to `intImg <- rep(NA, 480*460)`? — joran, May 14 '13 at 14:59
Hi joran, I've tried that. Doesn't change much. What's the difference? — Omar Wagih, May 14 '13 at 15:01
You need to start being much more specific. "Doesn't change much" and providing images of example matrices isn't very helpful. — joran, May 14 '13 at 15:02
@by0 - it relates to how R has to (re)allocate memory for every iteration of a loop. — Chase, May 14 '13 at 15:02
Doesn't change much, meaning its still around 4 seconds. 4.02 seconds to be exact. Here, you can just copy paste this in your R console for an example matrix that isn't an image: `m = matrix(c(4,1,2,2,0,4,1,3,3,1,0,4,2,1,3,2), 4,4,byrow=T)` — Omar Wagih, May 14 '13 at 15:05

score 6 · Accepted Answer · answered May 14 '13 at 15:15

6

You could try to use apply (isn't faster than your for-loops if you pre-allocating the memory):

areaTable <- function(x) {
  return(apply(apply(x, 1, cumsum), 1, cumsum))
}

areaTable(m)
#      [,1] [,2] [,3] [,4]
# [1,]    4    5    7    9
# [2,]    4    9   12   17
# [3,]    7   13   16   25
# [4,]    9   16   22   33

answered May 14 '13 at 15:15

sgibb

25,396
3
68
74

1

It will be quicker - though probably not by a huge margin - as you are doing 8 `cumsum` calls (plus the loop forming code and allocation etc) whereas the OP's code is doing lots of calls to `+` to give the equivalent of your `cumsum` calls. As it is interpreted code all those function calls will add up. – Gavin Simpson May 14 '13 at 15:26
@GavinSimpson I'm actually surprised at how much faster it is, based on a quick timing of a 500x500 example. – joran May 14 '13 at 15:30
Yes, its vastly faster. I gave up waiting for two nested `for` loops, the double-apply took less than a second. – Spacedman May 14 '13 at 15:32
@joran I should have stuck to my initial thought - I went back and added the caveat before submitting the comment in a moment of self-doubt. `cumsum` is very efficient code and all those `+` calls will mount up as the dimensions of the input increase. Profiling the OP's code will show that. – Gavin Simpson May 14 '13 at 15:33
@GavinSimpson: you are right, I didn't thought about this fact of interpreted code. – sgibb May 14 '13 at 16:06
@sgibb, explain how can we make it fast in c++? – Abc Dec 27 '16 at 07:47

Efficient implementation of summed area table/integral image in R

1 Answers1