Finding Islands In a Binary Matrix

Let's define some terms first

Before going any further lets establish what is a binary matrix. Well, that part is easy, it's a matrix of any dimension that consists of only 0's and 1's in each matrix cell. 

Here's a 4 x 4 binary matrix

Here's a 4 x 4 binary matrix

At first glance it resembles an adjacency matrix commonly used to represent a graph. On closer look we see a bit of a difference. First, we don't assign column and row headers to represent nodes. Second, the matrix can be of any dimension; 4x2, 10x10, 1x1, which isn't true for an adjacency matrix. A binary matrix can be thought of as a chess board or some sort of a coordinate plane divided into a grid. Each cell in the grid can be indexed into wit X(i,j) where i is the index of the column and j is the index of the row . A Cell containing a value of 1 represents  presence of some resource in the grid and 0 suggests an absence.

This leads us directly into the "Islands" problem. We can easily imagine that the each cell containing a valueof 1 represents a part of an island surrounded either by 0's (water) or neighboring 1's extending the island.

To fully understand the setup, imagine that a cell is considered a neighbor if it's adjacent either horizontally, vertically or diagonally to another cell.  To formalize the definition of neighbor we can assume that a matrix cell X(i, j) can only have neighbors in cells with coordinates:

(i-1,j-1),(i,j-1),(i+1,j-1),(i-1,j),(i+1,j),(i-1,j+1),(i,j+1),(i+1,j+1)

So why does this problem exists?

This problem tends to pop up in a lot of technical coding interviews because the solution can be trivial if the candidate understands core concepts of computer science such as recursion and depth first search. It also reveals in-depth knowledge of graphs and common graph traversal algorithms.

Does it have practical applications in real world problems?

Earlier I mentioned that the binary matrix is different from an adjacency matrix, however, if we augment our definition of a binary matrix by not allowing the uneven dimensions, then our new binary matrix definition can represent an un-directed graph, albeit with a few redundant data points such a self loops. With this new definition, the islands problem is simply a problem of looking for connected components of a graph. Read more about graph connectivity here.

Approaches

Many algorithms exist to find connected components of a graph. The approach I'm going to present in this article is likely not the most efficient, and I will present better solutions in future articles. However, my solution here will guarantee a correct answer and will demonstrate techniques of using depth first search and graph traversal. 

The algorithm: 

  1. Iterate through each cell in the NxM matrix, starting from cell (0,0) until (n,m) where N = (0,n) and M=(0,m). 
  2. For each cell (i,j) check if the cell exists and has not been visited before.
    1. If the above condition holds then start depth first search (DFS) at the cell (i,j)
    2. Else, continue to next cell
  3. Once DFS call returns increment islands count by 1.
  4. When iteration finishes simply return the island count.

DFS is a recursive function which will iterate through each cell t in set T of neighbors of cell (i,j). See definition of neighbors from above.

  1. If the t in T exists and has not been visited and contains 1
    1. Mark the cell visited
    2. Recursively call DFS for t
  2. Otherwise, continue to next t in T.
  3. Once, DFS recursive calls complete simply exit DFS

Notice, I mention we need to track if cells were visited or not. There are many ways to achieve this, including, keeping another matrix of visited cells or simply modify the original matrix and set the visited cells to something other than 1 or 0 (for example 2). The first approach is non-destructive but will require 2n space where n is number of cells. I prefer the second approach and my implementation will demonstrate this technique, but it could be easily transformed into a non-destructive solution if that's a requirement.

Complexity?

The worst case complexity of this algorithm is O(n). Lets examine this further. Our iteration will attempt to visit each cell in the matrix if the matrix is empty (all cells contain 0's). If the matrix is full (all cells contains 1's) the iteration will begin with cell (0,0) and the first DFS call will recurse through each cell. Since all cells will be marked visited by the DFS our iteration will simply skip over them, however it will still need to visit each cell, to ensure that ALL cells have been visited.

Implementation

I'm going to show a Java implementation of the above algorithm and a few test cases to validate my solution. You can also check out the complete solution on Github https://github.com/dkhanaferov/islands

The main entry point of the algorithm is the findIslandsCount(...) function which takes a two dimensional array representing a matrix. We can see it's very simple, all this method does is iterated through each cell of the two dimensional array and calls the DFS function if the cell has not been visited and contains a 1.

public int findIslandsCount(int[][] matrix) {

        int numIslands = 0;

        for(int i=0; i<matrix.length; i++) {
            for(int j=0; j<matrix[i].length; j++) {
                if(matrix[i][j]!=VISITED && matrix[i][j] == 1) {
                    dfs(i,j, matrix);
                    numIslands++;
                }
                else {
                    matrix[i][j]=VISITED;
                }
            }
        }

        return numIslands;
    }

The DFS function is equally simple.  All we need to do is iterate over all the neighbors of the current cell and check if we should recursively call the DFS on the each neighbor. We are legally allowed to recurse into the neighbor if the cell exists in the matrix and if it contains a 1 and if it has not been visited before.

/**
     * Depth first search recursive implementation will recursively look for 'legal' cells to visit
     * Will return when no more cells can be visited.
     *
     * @param i {@link int} row index
     * @param j {@link int} column index
     * @param matrix {@link int[][]} input matrix
     */
    private void dfs(int i, int j, int[][] matrix) {

        //Mark current cell as visited
        matrix[i][j] = VISITED;

        //iterate through each neighbors and if legal recursively call dfs
        for(int n=0; n<columns.length; n++) {

            int c = j+columns[n];
            int r = i+rows[n];

            if(isLegal(c, r, matrix)) {
                dfs(r, c, matrix);
            }
        }
    }

The only method we are still missing is the isLegal method, which simply checks if the neighbor is part of the island.

/**
     * Checks if the cell identified by coordinates (r,c) is legally part of an island
     * Which basically means the cell exists within the matrix (bounded by matrix indexes) and
     * has not been visited and contains a value of 1
     *
     * @param c {@link int} column index
     * @param r {@link int} row index
     * @param matrix {@link int[][]} input matrix
     * @return {@link boolean} is legal or not
     */
    private boolean isLegal(int c, int r, int[][] matrix) {
        return c>=0 && r>=0 && r<matrix.length && c<matrix[0].length && matrix[r][c]!=VISITED && matrix[r][c]==1;
    }

Finally, you probably noticed use of the rows and columns arrays to generate indexes (coordinates) of neighbors cells in the dfs() method. I defined these as constants like so:

private final int VISITED = 2;

    //define indexes of neighbors for fast lookup
    private final int[] columns = {-1, 0, 1, -1,1, -1,0,1};
    private final int[] rows =    {-1,-1,-1,0,0,1,1,1};

The idea here is if you take the each element from the two arrays at the same index you get the offset of the neighbor cell. So for example, (-1, -1) is the index offset of the top left neighbor cell. In order to calculate the proper index of that cell for any cell in the matrix all you need to do is add the cell's row and column index to the appropriate offset.

For example:

int c = j+columns[n];
int r = i+rows[n];

Thats about it for this algorithm. It's pretty compact and simple to read and understand. Check out the full solution on Github if you want to compile and run the test cases on your own.


Some test cases:

Bellow are some test cases I used to validate my algorithm. They are nowhere near a complete test coverage but I believe I cover enough cases to be convinced that the algorithm works.

@Resource(name = "dfs")
    private IslandsAlgorithm dfsAlgorithm;

    @Test
    public void emptyMatrix() {

        int[][] matrix = {
                {0,0,0,0},
                {0,0,0,0},
                {0,0,0,0},
                {0,0,0,0}
        };

        int expected = 0;
        int actual = dfsAlgorithm.findIslandsCount(matrix);

        assertEquals(expected, actual);
    }

    @Test
    public void fullMatrix() {

        int[][] matrix = {
                {1,1,1,1},
                {1,1,1,1},
                {1,1,1,1},
                {1,1,1,1}
        };

        int expected = 1;
        int actual = dfsAlgorithm.findIslandsCount(matrix);

        assertEquals(expected, actual);
    }

    @Test
    public void oneIslandsMatrixVariant1() {

        int[][] matrix = {
                {1,1,0,1},
                {0,0,1,0},
                {1,1,0,1},
                {1,1,0,1}
        };

        int expected = 1;
        int actual = dfsAlgorithm.findIslandsCount(matrix);

        assertEquals(expected, actual);
    }

    @Test
    public void oneIslandsMatrixVariant2() {

        int[][] matrix = {
                {1,0,0,1},
                {1,0,1,0},
                {1,1,0,1},
                {0,1,0,1}
        };

        int expected = 1;
        int actual = dfsAlgorithm.findIslandsCount(matrix);

        assertEquals(expected, actual);
    }

    @Test
    public void oneIslandsMatrixVariant3() {

        int[][] matrix = {
                {1,1,1,1},
                {1,1,1,0},
                {1,1,1,1},
                {0,1,1,1}
        };

        int expected = 1;
        int actual = dfsAlgorithm.findIslandsCount(matrix);

        assertEquals(expected, actual);
    }

    @Test
    public void oneIslandsMatrixVariant4() {

        int[][] matrix = {
                {0,0,0,0},
                {0,1,0,0},
                {0,0,0,0},
                {0,0,0,0}
        };

        int expected = 1;
        int actual = dfsAlgorithm.findIslandsCount(matrix);

        assertEquals(expected, actual);
    }

    @Test
    public void oneIslandsMatrixVariant5() {

        int[][] matrix = {
                {1,0,0,0},
                {0,1,0,0},
                {0,0,1,0},
                {0,0,0,1}
        };

        int expected = 1;
        int actual = dfsAlgorithm.findIslandsCount(matrix);

        assertEquals(expected, actual);
    }

    @Test
    public void oneIslandsMatrixVariant6() {

        int[][] matrix = {
                {1,0,0,1},
                {0,1,1,0},
                {0,1,1,0},
                {1,0,0,1}
        };

        int expected = 1;
        int actual = dfsAlgorithm.findIslandsCount(matrix);

        assertEquals(expected, actual);
    }

    @Test
    public void twoIslandsMatrix() {

        int[][] matrix = {
                {1,1,1,1},
                {0,0,0,0},
                {1,1,1,1},
                {1,1,1,1}
        };

        int expected = 2;
        int actual = dfsAlgorithm.findIslandsCount(matrix);

        assertEquals(expected, actual);
    }

    @Test
    public void threeIslandsMatrix() {

        int[][] matrix = {
                {1,1,0,1},
                {0,0,0,0},
                {1,1,1,1},
                {1,1,1,1}
        };

        int expected = 3;
        int actual = dfsAlgorithm.findIslandsCount(matrix);

        assertEquals(expected, actual);
    }

    @Test
    public void fourIslandsMatrixVariant1() {

        int[][] matrix = {
                {1,1,0,1},
                {0,0,0,0},
                {1,1,0,1},
                {1,1,0,1}
        };

        int expected = 4;
        int actual = dfsAlgorithm.findIslandsCount(matrix);

        assertEquals(expected, actual);
    }

    @Test
    public void fourIslandsMatrixVariant2() {

        int[][] matrix = {
                {1,0,0,1},
                {0,0,0,0},
                {0,0,0,0},
                {1,0,0,1}
        };

        int expected = 4;
        int actual = dfsAlgorithm.findIslandsCount(matrix);

        assertEquals(expected, actual);
    }