Water capacity of a 2D array

Question

I have to do a small exercise for my university course, but I have been stuck on it for a while. The exercise is about calculating the water capacity of a 2D array. The user enters the width (w) and the height (h) of the 2D array, and then all the elements of the array, which represent the height at that location. A really simple example:

10 10 10
10 2 10
10 10 10

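The output will then be 8, because that is the maximum amount of water that fits in there: the centre cell has height 2 and can be filled up to the surrounding height of 10 before the water overflows. Another example is: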

 6 4
 1 5 1 5 4 3
 5 1 5 1 2 4
 1 5 1 4 1 5
 3 1 3 6 4 1

Output will be 14.

It is also important to mention that the width and height of the array cannot be larger than 1000 and the heights of the elements cannot be larger than 10^5.

Now I basically have the solution, but it is not fast enough for larger inputs. What I did is the following: I add the heights to a TreeSet, and every time I poll the last one (the highest). Then I go through the array (not looking at the edges), use DFS, and check for every position whether the water can stay in there. If the water doesn't flow out of the array, I calculate the positions that are under water; if it flows out of the array, I poll again and do the same.

I also tried looking at the peaks in the array, by going vertically and horizontally. For the example above you get this:

0 5 0 5 4 0
5 0 5 0 0 4
0 5 0 4 0 5
3 1 3 6 4 0

What I did with this was give the peaks a color, let's say black, and then for all the white cells take the minimum peak value with DFS again and use that minimum to calculate the water capacity. But this doesn't work, because for example:

Now 3 is a peak, but the water level is 7 everywhere. So this won't work.

But because my solution is not fast enough, I am looking for a more efficient one. This is the part of the code where the magic happens:

    while (p.size() != 0 || numberOfNodesVisited!= (w-2)*(h-2)) {
        max = p.pollLast();
        for (int i=1; i < h-1; i++) {
            for (int j=1; j < w-1; j++) {
                if (color[i][j] == 0) {
                    DFSVisit(profile, i, j);
                    if (!waterIsOut) {
                        sum+= solveSubProblem(heights, max);
                        numberOfNodesVisited += heights.size();
                        for(int x = 0; x < color.length; x++) {
                            color2[x] = color[x].clone();
                        }
                    } else {
                        for(int x = 0; x < color2.length; x++) {
                            color[x] = color2[x].clone();
                        }
                        waterIsOut = false;
                    }
                    heights.clear();
                }
            }
        }
    }

Note that I am resetting the paths and the colors every time; I think this is the part that has to be improved.

And my DFS: I have three colors: 2 (black) if the cell is visited, 1 (gray) if it is an edge, and 0 (white) if it is not visited and not an edge.

public void DFSVisit(int[][] profile, int i, int j) {
    color[i][j] = 2; // black
    heights.add(profile[i][j]);
    if (!waterIsOut && heights.size() < 500) { 
        if (color[i+1][j] == 0 && max > profile[i+1][j]) { // up
            DFSVisit(profile, i+1, j);
        } else if (color[i+1][j] == 1 && max > profile[i+1][j]) {
            waterIsOut = true;
        }
        if (color[i-1][j] == 0 && max > profile[i-1][j]) { // down
            DFSVisit(profile, i-1, j);
        } else if (color[i-1][j] == 1 && max > profile[i-1][j]) {
            waterIsOut = true;
        }
        if (color[i][j+1] == 0 && max > profile[i][j+1]) { // right
            DFSVisit(profile, i, j+1);
        } else if (color[i][j+1] == 1  && max > profile[i][j+1]) {
            waterIsOut = true;
        }
        if (color[i][j-1] == 0  && max > profile[i][j-1]) { //left
            DFSVisit(profile, i, j-1);
        } else if (color[i][j-1] == 1  && max > profile[i][j-1]) {
            waterIsOut = true;
        }
    }
}

UPDATE @dufresnb referred to talentbuddy.co, where the same exercise is given at https://www.talentbuddy.co/challenge/526efd7f4af0110af3836603. However, I tested a lot of the solutions there and only a few of them actually make it through my first four test cases; most of them already fail on the easy ones. Talent buddy did a bad job of making test cases: in fact they only have two. If you want to see the solutions they have, just register and enter this code (language C); it is enough to pass their test cases:

#include <stdio.h>

void rain(int m, int *heights, int heights_length) {
    //What tests do we have here?
    if (m==6)
        printf("5");
    else if (m==3)
        printf("4");
    //Looks like we need some more tests.
}

UPDATE @tobias_k's solution is a working solution; however, just like my solution, it is not efficient enough to pass the larger input test cases. Does anyone have an idea for a more efficient implementation?

Any ideas and help will be much appreciated.

@moffeltje see it as 3D block building, or was this meant to be funny? — Chantal, Mar 23 '15 at 15:07
No it was not, I don't understand why the first example has output 8. — moffeltje, Mar 23 '15 at 15:12
How the "volume" of water is calculated is unclear to me as well. The formula to calculate that, besides getting at the answer to 'solving' an array, may also help to reframe the problem. — Surreal Dreams, Mar 23 '15 at 15:13
@moffeltje the border of the "jar" is ten "elements" high. The center is two "elements" high. So you can add 8 units of water until it overflows the jar. See the numbers as the third dimension of a jar or something like that. — Tom, Mar 23 '15 at 15:16
I think you could use a sort of repeated [flood-fill](http://en.wikipedia.org/wiki/Flood_fill) approach. Repeat for all numbers from `0` to the highest in the array: Starting from the borders, flood-fill all cells that have a height lower than or equal to the current number, and for each cell first reached in that turn memorize that number. That's how high the water can stand in that cell. Then just subtract the original number from those numbers. Complexity for an NxN array with highest number K should be `O(K*N^2)`, prob. `O(N^2)` with some optimization. Not sure whether that's better or worse than yours. — tobias_k, Mar 23 '15 at 15:25
@tobias_k The approach is basically the same as mine, and it works but it's not fast enough. — Chantal, Mar 23 '15 at 20:34
@Chantal Well, since your array has NxM cells, I'm afraid _O(NxM)_ is as fast as it possibly gets... — tobias_k, Mar 23 '15 at 21:00
@tobias_k Mine is not O(NxM), nor is yours I think. Because you have to go from the highest to lowest in array (in worst case) and then for every value you have to go through the array and check if you can fill it with DFS. This will be in O(n^3). — Chantal, Mar 23 '15 at 21:13
Yes, if you see it this way, you could say the array is in fact three-dimensional, and we are iterating the height levels. But we do not have to repeat the entire process for each new height again. We can just start from where we left off in the previous iteration. I'll try to write something up... — tobias_k, Mar 23 '15 at 21:16
Look into the watershed algorithm, http://cmm.ensmp.fr/~beucher/wtshed.html — dranxo, Mar 25 '15 at 00:20

tobias_k · Accepted Answer · 2015-03-27T10:53:17.167

Here's my take on the problem. The idea is as follows: You repeatedly flood-fill the array using increasing "sea levels". The level at which a node is first flooded is the same level at which the water would stay pooled over that node when the "flood" retreats.

for each height starting from the lowest to the highest level:
- put the outer nodes into a set, called fringe
- while there are more nodes in the fringe set, pop a node from the set
  - if this node was first reached in this iteration and its height is less than or equal to the current flood height, memorize the current flood height for that node
  - add all its neighbours that have not yet been flooded and have a height less than or equal to the current flood height to the fringe
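
The steps above can be sketched in Python roughly as follows. This is only an illustration of the unoptimized idea, not the code from this answer: the function name `naive_volume` and the `seen` set are assumptions made for the sketch, and the fringe is restarted from the border for every level, which is exactly what makes it slow.

```python
def naive_volume(grid):
    """Repeated flood fill from the border, one full pass per water level."""
    rows, cols = len(grid), len(grid[0])
    border = [(r, c) for r in range(rows) for c in range(cols)
              if r in (0, rows - 1) or c in (0, cols - 1)]
    # level at which each cell is first flooded (None = not flooded yet)
    fill = [[None] * cols for _ in range(rows)]
    lowest = min(grid[r][c] for r, c in border)
    highest = max(map(max, grid))
    for level in range(lowest, highest + 1):
        # restart from the outer nodes that the current level can flood
        fringe = [(r, c) for r, c in border if grid[r][c] <= level]
        seen = set(fringe)
        while fringe:
            r, c = fringe.pop()
            if fill[r][c] is None:      # first reached in this iteration
                fill[r][c] = level
            for r2, c2 in ((r - 1, c), (r + 1, c), (r, c - 1), (r, c + 1)):
                if (0 <= r2 < rows and 0 <= c2 < cols
                        and (r2, c2) not in seen and grid[r2][c2] <= level):
                    seen.add((r2, c2))
                    fringe.append((r2, c2))
    # water level minus ground level, summed over all cells
    return sum(fill[r][c] - grid[r][c]
               for r in range(rows) for c in range(cols))
```

Called on the two examples from the question, `naive_volume` gives 8 and 14 respectively.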

As it stands, this will have complexity O(nmz) for an n x m array with maximum elevation z, but with some optimization we can get it down to O(nm). For this, instead of using just one fringe, and each time working our way from the outside all the way inwards, we use multiple fringe sets, one for each elevation level, and put the nodes that we reach in the fringe corresponding to their own height (or the current fringe, if they are lower). This way, each node in the array is added to and removed from a fringe exactly once. The outer loop still steps through every height level, so strictly speaking the total is O(nm + z). In terms of the number of cells, that's as fast as it possibly gets.

Here's some code. I've done it in Python, but you should be able to transfer this to Java -- just pretend it's executable pseudo-code. You can add a counter to see that the body of the while loop is indeed executed 24 times, and the result, for this example, is 14.

# setup and preparations
a = """1 5 1 5 4 3
       5 1 5 1 2 4
       1 5 1 4 1 5
       3 1 3 6 4 1"""
array = [[int(x) for x in line.strip().split()] 
         for line in a.strip().splitlines()]
cols, rows = len(array[0]), len(array)
border = set([(i, 0     ) for i in range(rows)] + 
             [(i, cols-1) for i in range(rows)] + 
             [(0, i     ) for i in range(cols)] + 
             [(rows-1, i) for i in range(cols)])
lowest  = min(array[x][y] for (x, y) in border) # lowest on border
highest = max(map(max, array))                  # highest overall

# distribute fringe nodes to separate fringes, one for each height level
import collections
fringes = collections.defaultdict(set) # maps height levels to sets of points
for (x, y) in border:
    fringes[array[x][y]].add((x, y))

# 2d-array how high the water can stand above each cell
fill_height = [[None for _ in range(cols)] for _ in range(rows)]
# for each consecutive height, flood-fill from current fringe inwards
for height in range(lowest, highest + 1):
    while fringes[height]: # while this set is non-empty...
        # remove next cell from current fringe and set fill-height
        (x, y) = fringes[height].pop()
        fill_height[x][y] = height
        # put not-yet-flooded neighbors into fringe for their elevation
        for x2, y2 in [(x-1, y), (x, y-1), (x+1, y), (x, y+1)]:
            if 0 <= x2 < rows and 0 <= y2 < cols and fill_height[x2][y2] is None:
                # get fringe for that height, auto-initialize with new set if not present
                fringes[max(height, array[x2][y2])].add((x2, y2))

# sum of water level minus ground level for all the cells
volume = sum(fill_height[x][y] - array[x][y] for x in range(rows) for y in range(cols))
print("VOLUME", volume)

To read your larger test cases from files, replace the a = """...""" at the top with this:

with open("test") as f:
    a = f.read()

The file should contain just the raw array as in your question, without dimension information, separated with spaces and line breaks.
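
The loop over `range(lowest, highest + 1)` above steps through every height level, including levels that no cell actually has. As a possible variation (a different technique, not part of the code above), the per-height fringe sets can be swapped for a single min-heap keyed by water level. The sketch below assumes a hypothetical function name `trapped_volume` and a list-of-lists grid; it runs in O(nm log(nm)) regardless of the maximum elevation.

```python
import heapq

def trapped_volume(grid):
    """Flood inwards from the border, always expanding the lowest water level first."""
    rows, cols = len(grid), len(grid[0])
    queued = [[False] * cols for _ in range(rows)]  # cell already pushed on the heap?
    heap = []
    # start with all border cells; their water level is their own height
    for r in range(rows):
        for c in range(cols):
            if r in (0, rows - 1) or c in (0, cols - 1):
                heapq.heappush(heap, (grid[r][c], r, c))
                queued[r][c] = True
    volume = 0
    while heap:
        level, r, c = heapq.heappop(heap)
        volume += level - grid[r][c]  # water standing above this cell
        for r2, c2 in ((r - 1, c), (r + 1, c), (r, c - 1), (r, c + 1)):
            if 0 <= r2 < rows and 0 <= c2 < cols and not queued[r2][c2]:
                queued[r2][c2] = True
                # a neighbour holds water up to the current level, or its own height if higher
                heapq.heappush(heap, (max(level, grid[r2][c2]), r2, c2))
    return volume
```

With the `array` variable from the code above, `trapped_volume(array)` should agree with `volume`. The trade-off is a logarithmic factor per cell in exchange for not depending on the range of heights.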

edited Mar 27 '15 at 10:53

answered Mar 23 '15 at 21:43

tobias_k


I don't see why this would work. And I don't get why you would start from the lowest to the highest, then you would have to do a lot more assignments (I think). Also what are you doing with the bounds? You cannot flood fill their neighbors. I never did anything in Python, but I ran your code and tested it on some test cases and, well, the small ones worked. But the larger ones take too much time to copy into your array a. I have no idea how to read a file with Python though, maybe you can implement that it reads from a text file? Then I can test if your code is faster than mine. – Chantal Mar 24 '15 at 10:54
I thought this over a bit and now the complexity is down to O(nm), i.e. it looks at each cell of the array _exactly once_. This is as fast as it gets, and it worked for my tests. You should be able to use other tests by simply pasting the array in the format used in your question between the `"""` at the top of the code. Make sure not to add empty lines, though. – tobias_k Mar 24 '15 at 11:06
Your code works for the large test case. Xcode crashed when copying it into a, however using a text editor worked fine, so I was able to test without reading a file. I am now just trying to understand your code. Basically you have a set of (x,y) every time, which has a height that is also used as the index of the array (direct address table)? And with fill height you just use a set as an index, I don't think that is possible with Java though haha. I think I will just make an object in Java. – Chantal Mar 24 '15 at 12:21
I will write the code in java and then hopefully it will be enough to get through the hidden test cases of my teacher, thanks for your solution! Will keep you updated. – Chantal Mar 24 '15 at 12:23
The indexes to the dictionary (`HashMap` in Java) are tuples. In Java, there are no tuples, but you could just create your own `Point` class, or use `java.awt.Point` (it's from a graphics lib, but works just as well), or just make it another 2d-array. In Java, this will be a whole lot more code, but it is definitely doable. I also added a few more comments to make the transition to Java easier. – tobias_k Mar 24 '15 at 12:33
Sorry, I haven't started coding in Java yet, I am busy with another exercise right now, which has a deadline today. Don't worry though, I will keep you updated if it gets through the hidden test cases!:) – Chantal Mar 24 '15 at 18:08
Let us [continue this discussion in chat](http://chat.stackoverflow.com/rooms/73780/discussion-between-chantal-and-tobias-k). – Chantal Mar 25 '15 at 14:56
your solution works, but it is also not efficient enough to pass the larger input test cases. – Chantal Mar 26 '15 at 18:24
@Chantal I'm sorry to hear that, but I also find it hard to believe. When I tried my Python code on your 1000x1000 test, it completed in a few seconds, and the Java version should be even faster. However, when I tried the Java version on the test cases, I noticed that the result for test 5 was off. I did not know why then, and I do not have the code at hand now, but I think this might have been an integer overflow. Try changing the data type of the sum from `int` to `long`, then it should work, _if_ this is the problem. – tobias_k Mar 26 '15 at 18:49
@Chantal Also, as I said (and can be observed with the counter) the algorithm has complexity O(nm) for an n x m matrix. And that's definitely as fast as it gets, since you have to look at each element once just to determine the sum! There is literally no way to make this significantly faster. – tobias_k Mar 26 '15 at 18:51
I think you are wrong. Your loop `while fringes[height]` has complexity O(nm), which is the size of the matrix; however, the loop surrounding that loop from lowest to highest doesn't make this algorithm O(nm), it makes it O(n^2). See our little chat btw. – Chantal Mar 26 '15 at 19:53
@Chantal Excuse me, but you are wrong. The _entire_ `for height in range(lowest, highest + 1):` has complexity O(nm), as each node is added _exactly once_ to a fringe, i.e. all the fringes together hold exactly n*m elements, however those may be distributed to the iterations of the outer loop. Just check the counter. The initialization code has O(nm), the main-loop has O(nm), and adding the sum has another O(nm), makes O(3nm) = O(nm) in total. I would _really_ be interested in how you or your teacher would improve on this. – tobias_k Mar 26 '15 at 20:01
I don't think so. Set a counter in your first loop and a counter in your second loop and then try this test case: 3 3 1 100 1 1 100 100 100 1 100 then you will see it for yourself. And again have a little look at our chat, or can't you see it anymore? – Chantal Mar 26 '15 at 21:00
@Chantal I see what you mean: If the "height" is much larger than the width or height. However, this will just add a constant factor, i.e. for height _z_, the complexity would then be _O(nm+z)_. Unless z is much larger than n+m, this should not matter at all. About the running time on the grading server: Have you removed all those `System.out.println`s from the code? They eat up lots of time. – tobias_k Mar 26 '15 at 21:35
Yes ofc I did, but with the system outs the output is incorrect, so the server will just say wrong output, so keeping the system outs is not even an option. I agree though with the height being a constant, however your assumption that it doesn't matter when z is not larger than n+m is wrong. z can be in the worst case from 1 to 10000 and if the heights in the matrix with size in worst case 1000*1000 only contain heights from lets say {1, 9999, 9999, 9999, ... ,10000} it iterates for no reason from 2 to 9998, because the matrix has no such height. You don't like the chat do you?:) – Chantal Mar 26 '15 at 22:00
@Chantal I found the bottleneck. Replacing the hash map with a simple 2d-integer-array I got the execution time down dramatically, particularly in Java. Even though hash maps should have O(1) lookup, update, etc., there's still some overhead, e.g. for checking equality etc. – tobias_k Mar 27 '15 at 10:59
Let us [continue this discussion in chat](http://chat.stackoverflow.com/rooms/73961/discussion-between-chantal-and-tobias-k). – Chantal Mar 27 '15 at 18:25

score 0 · Answer 2 · answered Mar 23 '15 at 15:29

talentbuddy.co has this problem as one of their coding tasks. It's called rain. If you make an account, you can view other people's solutions.

#include <iostream>
#include <vector>

bool check(int* myHeights, int x, int m, bool* checked,int size)
{
    checked[x]=true;
    if(myHeights[x-1]==myHeights[x] && (x-1)%m!=0 && !checked[x-1])
    {
        if(!check(myHeights,x-1,m,checked,size))return false;
    }
    else if((x-1)%m==0 && myHeights[x-1]<=myHeights[x])
    {
        return false;
    }
    if(myHeights[x+1]==myHeights[x] && (x+1)%m!=m-1 && !checked[x+1])
    {
        if(!check(myHeights,x+1,m,checked,size))return false;
    }
    else if((x+1)%m==m-1 && myHeights[x+1]<=myHeights[x])
    {
        return false;
    }
    if(myHeights[x-m]==myHeights[x] && (x-m)>m && !checked[x-m])
    {
        if(!check(myHeights,x-m,m,checked,size))return false;
    }
    else if((x-m)<m && myHeights[x-m]<=myHeights[x])
    {
        return false;
    }
    if(myHeights[x+m]==myHeights[x] && (x+m)<size-m && !checked[x+m])
    {
        if(!check(myHeights,x+m,m,checked,size))return false;
    }
    else if((x+m)>size-m && myHeights[x+m]<=myHeights[x])
    {
        return false;
    }
    return true;
}

void rain(int m, const std::vector<int> &heights) 
{
    int total=0;
    int max=1;
    if(m<=2 || heights.size()/m<=2)
    {
        std::cout << total << std::endl;
        return;
    }
    else
    {
        int myHeights[heights.size()];
        for(int x=0;x<heights.size();++x)
        {
            myHeights[x]=heights[x];
        }
        bool done=false;
        while(!done)
        {
            done=true;
            for(int x=m+1;x<heights.size()-m;++x)
            {
                if(x<=m || x%m==0 || x%m==m-1)
                {
                    continue;
                }

                int lower=0;
                if(myHeights[x]<myHeights[x-1])++lower;
                if(myHeights[x]<myHeights[x+1])++lower;
                if(myHeights[x]<myHeights[x-m])++lower;
                if(myHeights[x]<myHeights[x+m])++lower;

                if(lower==4)
                {
                    ++total;
                    ++myHeights[x];
                    done=false;
                }
                else if(lower>=2)
                {
                    bool checked[heights.size()];
                    for(int y=0;y<heights.size();++y)
                    {
                        checked[y]=false;
                    }
                    if(check(myHeights,x,m,checked,heights.size()))
                    {
                        ++total;
                        ++myHeights[x];
                        done=false;
                    }
                }
            }
        }
    }
    std::cout << total << std::endl;
    return;
}

Note: this is not java, but the principle is the same (I didn't actually look at the code for veracity) — Russell Uhl, Mar 23 '15 at 15:31
I figured it isn't really as helpful to spell out the answer as it is for them to have to translate the logic themselves :) — dufresnb, Mar 23 '15 at 15:34
I agree. I just wanted to make sure the OP didn't copy/paste and wonder why nothing worked. — Russell Uhl, Mar 23 '15 at 15:35
Thx, will try this out soon and let you know! and Russell I am not stupid haha. — Chantal, Mar 23 '15 at 15:45
None of these solutions work: they all pass the simple test cases, but not the complicated ones. It's disturbing to see what code people send in there that actually gets through the test cases. It's also disturbing to see that Talent buddy has such bad test cases. Anyway, I am not any further; my own code is still the best right now. — Chantal, Mar 23 '15 at 17:45
