8

You are given an image of a surface photographed by a satellite. The image is a bitmap where water is marked by '.' and land is marked by '*'. A group of adjacent '*'s forms an island. (Two '*'s are adjacent if they are horizontal, vertical or diagonal neighbours.) Your task is to print the number of islands in the bitmap.

Example Input:-

.........**
**......***
...........
...*.......
*........*.
*.........*

Output:- 5

Here is my implementation, which takes O(r * c) time and O(r * c) space, where r is the total number of rows and c is the total number of columns.

#include <stdio.h>
#define COLS 12

void markVisted(char map[][COLS], int visited[][COLS], int row, int col, int rowCount)
{
    if((row < 0) || (row >= rowCount) || (col < 0) || (col >= COLS) || (map[row][col] != '*') || (visited[row][col] == 1)) return;

    visited[row][col] = 1;

    //calling neighbours
    markVisted(map, visited, row+1, col, rowCount);
    markVisted(map, visited, row, col+1, rowCount);
    markVisted(map, visited, row-1, col, rowCount);
    markVisted(map, visited, row, col-1, rowCount);
    markVisted(map, visited, row+1, col+1, rowCount);
    markVisted(map, visited, row-1, col-1, rowCount);
    markVisted(map, visited, row-1, col+1, rowCount);
    markVisted(map, visited, row+1, col-1, rowCount);
}
int countIslands(char map[][COLS], int visited[][COLS], int rowCount)
{
    int i, j, count = 0;
    for(i=0; i<rowCount; ++i){
        for(j=0; j<COLS; ++j){

            if((map[i][j] == '*') && (visited[i][j] == 0)){
                ++count;
                markVisted(map, visited, i, j, rowCount);
            }
        }
    }
    return count;
}

int main()
{
    char map[][COLS] = {
                    "*..........",
                    "**........*",
                    "...........",
                    "...*.......",
                    "*........*.",
                    "..........*"               
                    };
    int rows = sizeof(map)/sizeof(map[0]);
    int visited[rows][COLS], i, j;  

    for(i=0; i<rows; ++i){
        for(j=0; j<COLS; ++j) visited[i][j] = 0;
    }

    printf("No. of islands = %d\n", countIslands(map, visited, rows));


    return 0;
}

Please suggest some better logic for this problem.
Also, suggestions to improve my solution are welcome.

Mathew Thompson
Eight
  • The question is interesting, but describing your algorithm by printing the source code is poor. You need to provide an explanation. –  Aug 13 '12 at 14:16
  • 1
    Sometimes, though, the source *is* the best documentation. – JayC Aug 13 '12 at 14:20
  • 5
    Your algorithm is well written and fine enough and there is no algorithm better than yours, because any algorithm to solve this should check each node state at least one time and will have a same order as your algorithm. – Saeed Amiri Aug 13 '12 at 14:21
  • 1
    By `O (n ^ 2)` you really mean `O (r * c)`, right? – pmg Aug 13 '12 at 14:22
  • Right, if `n` is your input size this is actually `O(n)` as you do one thing for each input cell. – Claudiu Aug 13 '12 at 14:22
  • 1
    I think this might be more on-topic in [Code Review](http://codereview.stackexchange.com/), but it appears to be welcome here, so I don't know for sure. – hyper-neutrino Apr 08 '17 at 21:10

5 Answers

9

I think the confusion here is that your algorithm does actually run in linear time, not quadratic time.

When using big-O notation, n stands for the input size. Your input here is not just r or just c, but rather, r * c, as it is a grid of nodes. Your algorithm runs in O(r * c), as you said in your question... thus your algorithm runs in O(n) which is linear time.

It seems to me that any algorithm that solves this problem will have to read each input cell once in the worst case. Thus the best running time you can hope for is O(n). As your algorithm already runs in O(n), no algorithm can be asymptotically faster in the worst case than the one you proposed.

I can think of some clever tricks. For example, if you have a block of *s, in certain cases you could check only the diagonals. That is, if you have

......
.****.
.****.
.****.
.****.
......

it won't matter if you only read these cells:

......
.*.*..
..*.*.
.*.*..
..*.*.
......

unless for example you have something in the bottom-left-most corner, in which case you would need to read that bottom-left-most *. So maybe in certain cases your algorithm can run more quickly, but for the worst case (which is what O measures), it will have to be O(n).

EDIT: Also even in that case where you only read half the nodes, the run-time would be O(n/2) which is still of the same order (O(n)).

Claudiu
2

This is closely related to connected-component labeling. The number of connected components is just a byproduct of the labeling. The algorithm described in the linked Wikipedia article works in linear time.
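As a rough illustration of labeling in a single scan, here is a minimal sketch that uses a union-find (disjoint-set) structure. It is not the exact algorithm from the linked article. The names `count_components` and `find_root`, and the flat row-by-row grid layout without string terminators, are assumptions made for this example.

```c
#include <assert.h>
#include <stdlib.h>

/* Find the representative of x's set, shortening the path as we go. */
static int find_root(int *parent, int x)
{
    while (parent[x] != x) {
        parent[x] = parent[parent[x]];
        x = parent[x];
    }
    return x;
}

/* Count 8-connected groups of '*' in a rows x cols grid stored row by row
   in a flat array. Returns -1 if memory allocation fails. */
int count_components(const char *grid, int rows, int cols)
{
    assert(grid != NULL && rows >= 0 && cols >= 0);
    int n = rows * cols, count = 0;
    if (n == 0) return 0;

    int *parent = malloc((size_t)n * sizeof *parent);
    if (parent == NULL) return -1;

    /* Neighbours that were already scanned: left, up-left, up, up-right. */
    static const int dr[4] = {  0, -1, -1, -1 };
    static const int dc[4] = { -1, -1,  0,  1 };

    for (int r = 0; r < rows; ++r) {
        for (int c = 0; c < cols; ++c) {
            int id = r * cols + c;
            parent[id] = id;
            if (grid[id] != '*') continue;
            ++count;                      /* tentatively a new component */
            for (int k = 0; k < 4; ++k) {
                int nr = r + dr[k], nc = c + dc[k];
                if (nr < 0 || nc < 0 || nc >= cols) continue;
                if (grid[nr * cols + nc] != '*') continue;
                int a = find_root(parent, id);
                int b = find_root(parent, nr * cols + nc);
                if (a != b) {             /* two labels turn out to be one island */
                    parent[a] = b;
                    --count;
                }
            }
        }
    }
    free(parent);
    return count;
}
```

For the 6 x 11 example input from the question, passed as one 66-character array, this returns 5. The scan needs no recursion, at the cost of one `int` per cell for the parent table.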

Nicolas Barbey
1
  1. Create an undirected graph, where each island node connects to its neighboring island nodes.

  2. While there are unvisited nodes:

    • pick an unvisited node, do a depth-first traversal and mark every node visited, increase number_of_islands.
  3. Done.

Both (1) and (2) take O(n) time.
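The steps above can be sketched without building the graph explicitly, by treating each '*' cell as a node and its up-to-eight '*' neighbors as edges. This is only one possible reading of the steps. The name `count_islands_dfs`, the flat grid layout, and the explicit stack (used here instead of recursion) are assumptions made for this example.

```c
#include <assert.h>
#include <stdlib.h>

/* Count islands with an iterative depth-first traversal over the implicit
   graph of '*' cells. The grid is a flat rows x cols array.
   Returns -1 if memory allocation fails. */
int count_islands_dfs(const char *grid, int rows, int cols)
{
    assert(grid != NULL && rows >= 0 && cols >= 0);
    int n = rows * cols, count = 0;
    if (n == 0) return 0;

    unsigned char *visited = calloc((size_t)n, 1);
    int *stack = malloc((size_t)n * sizeof *stack);
    if (visited == NULL || stack == NULL) {
        free(visited);
        free(stack);
        return -1;
    }

    for (int start = 0; start < n; ++start) {
        if (grid[start] != '*' || visited[start]) continue;
        ++count;                          /* found an unvisited node: new island */
        int top = 0;
        visited[start] = 1;
        stack[top++] = start;
        while (top > 0) {
            int cur = stack[--top];
            int r = cur / cols, c = cur % cols;
            for (int dr = -1; dr <= 1; ++dr) {
                for (int dc = -1; dc <= 1; ++dc) {
                    int nr = r + dr, nc = c + dc;
                    if (nr < 0 || nr >= rows || nc < 0 || nc >= cols) continue;
                    int next = nr * cols + nc;
                    if (grid[next] != '*' || visited[next]) continue;
                    visited[next] = 1;    /* mark on push: each node is pushed once */
                    stack[top++] = next;
                }
            }
        }
    }
    free(visited);
    free(stack);
    return count;
}
```

Because a cell is marked when it is pushed, the stack never holds more than r * c entries, and deep recursion on large islands is avoided.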

Ali Ferhat
  • 1
    this is basically what the OP's code does except he uses an implicit graph. this algorithm is of the same order as the OP's. – Claudiu Aug 13 '12 at 14:26
  • Yes. But my answer is more concise, more understandable, more obviously correct, uses correct terminology and reusable components. I'd hire me instead of OP if I were the interviewer :-) – Ali Ferhat Aug 13 '12 at 14:33
  • your answer is also slower (has more overhead), especially if you implement it naively (e.g. using pointers for the graph, thus you now have memory allocation all over the place). and it isn't as simple. i would rather add a comment in the OP's code saying "do depth-first search on neighboring islands" and leave it at that instead of re-designing the whole thing – Claudiu Aug 13 '12 at 14:41
  • Correct. Mine is only a simple, quick, correct answer to an *interview question* (that can be reused to answer many similar questions as well) – Ali Ferhat Aug 13 '12 at 14:57
  • yea true. gave you the +1 cause i do see the value of being able to express it so succinctly (instead of pasting a bunch of code). i guess i would just replace "create a undirected graph" with "treat the array as an undirected graph". – Claudiu Aug 13 '12 at 18:06
1

Asymptotically, your approach is already optimal: O(n).

However, I noticed a couple of things:

First:

Inside the function markVisted, you check a cell's neighbors in the order:

down, right, up, left, down-right, up-left, up-right, down-left

A better order would be:

left, right, up-left, up, up-right, down-left, down, down-right

This will play nicer in the cache since it is starting by reading values directly next to the current position, and sequentially in a given row.

(Note that starting with the diagonals would mark a larger number of locations as visited early on, but since the visited check is the last test in the condition, it wouldn't save any time.)

Secondly:

In the worst case of a satellite image that contains only land, your algorithm will visit each cell multiple times (something like once for each neighbor the cell has).

This means you are doing approximately eight times as many array accesses as necessary.

I believe that you can solve this problem with a more or less linear pass over the data.

I'm currently thinking of an approach; if it works, I'll add the code as an edit.

Xantix
0

Without any a priori knowledge about the nature of the islands, the algorithm cannot be made more efficient than O(n) in time; memory-wise, however, your algorithm can be improved. The visited array is simply redundant. Here is a quick attempt (pardon the use of ASCII arithmetic: not so readable, but quicker to code).

#include <stdio.h>
#define COLS 12


int main()
{
    char map[][COLS] = {
            "*..........",
            "**........*",
            "...........",
            "...*.......",
            "*........*.",
            "..........*"               
            };
    int rows = sizeof(map)/sizeof(map[0]);
    int i, j;  

    int main_count = 0;

    if(map[0][0] == '*') {
        main_count++;
    }
    for(j=0; j<COLS-1; j++) {
        if((map[0][j]-map[0][j+1])==4) {
            main_count++;   
        }
    }

    for(i=1; i<rows; ++i){
        if(map[i][0] == '*' && (map[i][0]-map[i-1][0]-map[i-1][1])==-50) {
            main_count++;
        }
        for(j=0; j<COLS-1; j++) {
            if((map[i][j]-map[i][j+1])==4) {
                if( j==COLS-2 && (map[i][j+1]-map[i-1][j]-map[i-1][j+1])==-50) {
                    main_count++;
                }   
                if( j!=COLS-2 && (map[i][j+1]-map[i-1][j]-map[i-1][j+1])-map[i-1][j+1]==-96) {
                    main_count++;
                }   
            }
        }
    }

    printf("Number of islands: %d\n", main_count);

    return 0;
}
Arik G