Buttercola: String and Array

Tuesday, September 16, 2014

Leetcode: Plus One

Given a non-negative number represented as an array of digits, plus one to the number.

The digits are stored such that the most significant digit is at the head of the list.

Understand the problem:
The problem gives an array of digits, return a new array plus one to the number. Note that the digits are stored such that the most significant digit is at the head of the list.
We can take several examples to show the problem:
For the input array 123, the new array is 124.
For the input array 99, the new array is 100.

Solution:
The solution is again very similar to adding the two integers. So we can add from the last digit. Note the carry number after the loop iteration.

Code (Java):

public class Solution {
    public int[] plusOne(int[] digits) {
        ArrayList<Integer> result = new ArrayList<Integer>();
        if (digits == null || digits.length == 0) {
            int[] temp = {1};
            return temp;
        }
        
        int carry = 0;
        for (int i = digits.length - 1; i >= 0; i--) {
            if (i == digits.length - 1) {
                carry = carry + digits[i] + 1;
            } else {
                carry += digits[i];
            }
            result.add(0, carry % 10);
            carry /= 10;
        }
        
        if (carry == 1) {
            result.add(0, 1);
        }
        int resultArray[] = new int[result.size()];
        for (int i = 0; i < result.size(); i++) {
            resultArray[i] = result.get(i);
        }
        
        return resultArray;
    }
}

Discussion:
The solution itself is correct but not very efficient. In this solution, we need to traverse all digits of the array, no matter if there is a carry value. Actually we know that only if the digit is 9, there is a carry value to consider. So one better idea is to plus one at the original digital array. Then we can safely stop the process as long as there is no carry value.

Code (Java):

public class Solution {
    public int[] plusOne(int[] digits) {
        if (digits == null || digits.length== 0) {
            int[] temp = {1};
            return temp;
        }
        
        int carry = 1;
        int i;
        for (i = digits.length - 1; i >= 0; i--) {
            if (digits[i] == 9) {
                digits[i] = 0;
            } else {
                carry += digits[i];
                digits[i] = carry;
                break;
            }
        }
        
        if (i < 0) {
            int[] result = new int[digits.length + 1];
            result[0] = 1;
            return result; 
        } else {
            return digits;
        }
    }
}

Summary:
The second solution is smart by not easy to understand at the first glance. Both of the solutions have its pros and cons. In the first solution, we have to iterate all digits of the array no matter if there exists the carry value. However, it does not modify the original array, which is sometimes preferred in certain cases. The second solution is neat but modify the original input array. In addition, the second solution is restricted to only plus one problem, whereas the first solution is more general to any adding two numbers problems.

So the take-away message is no matter what solution you come out, you need to justify and analyze it.

Wednesday, September 3, 2014

Longest Common Prefix

Write a function to find the longest common prefix string amongst an array of strings.

Understand the problem:
The question asks for finding the longest common prefix (LCP) string among an array of strings. We can take an example to illustrate this problem. For the array of strings
a
bc
cd
The LCP string is empty as there is no common prefix at all.

For the array of strings
a
ab
abc
The LCP string is "a" since a is the longest common prefix string

For an array of strings
abc
abd
abfghi
The LCP string is "ab" since it is the longest common prefix string.

Solution:
Based on the analysis above, we can come up a solution. We compare the character with the same index for each of the string. If the same, we go to the next. There are two termination conditions:
a. the LCP is the shortest string, as in the example 2.
b. the LCP is the substring of a string, we continue until we found the character is different.

Code (Java):

public class Solution {
    public String longestCommonPrefix(String[] strs) {
        if (strs == null || strs.length == 0) return "";
        
        for (int i = 0; i < strs[0].length(); i++) {
            for (int j = 0; j < strs.length - 1; j++) {
                if (i >= strs[j].length() || i >= strs[j + 1].length() || strs[j].charAt(i) != strs[j + 1].charAt(i)) {
                    return strs[j].substring(0, i);
                }
            }
        }
        return strs[0];
    }
}

Discussion:
The time complexity of this solution is O(m * n), where m is the length of the LCP string, n is the number of strings in the array.

Summary:
It is not a very difficult question, but requires careful implementation. Note the outer loop, where i < strs[0].length(). Think about why we choose to use strs[0].length() ? Actually what we wanna find is the LCP string. So if strs[0] itself is the LCP string, it is fine. If strs[0] is not, we have other conditions inside of the loop to terminate the loop. How about we choose strs[1]? It is basically correct, however be careful about the size of the array, which has only one element. So we must change the code slightly like this:

public class Solution {
    public String longestCommonPrefix(String[] strs) {
        if (strs == null || strs.length == 0) return "";
        if (strs.length == 1) return strs[0];
        
        for (int i = 0; i < strs[1].length(); i++) {
            for (int j = 0; j < strs.length - 1; j++) {
                if (i >= strs[j].length() || i >= strs[j + 1].length() || strs[j].charAt(i) != strs[j + 1].charAt(i)) {
                    return strs[j].substring(0, i);
                }
            }
        }
        return strs[1];
    }
}

Update on 9/28/15:

public class Solution {
    public String longestCommonPrefix(String[] strs) {
        if (strs == null || strs.length == 0) {
            return "";
        }
        
        StringBuffer sb = new StringBuffer();
        
        for (int i = 0; i < strs[0].length(); i++) {
            char curr = strs[0].charAt(i);
            int j = 1;
            for (j = 1; j < strs.length; j++) {
                String str = strs[j];
                if (str == null || str.length() == 0 || i >= str.length() || str.charAt(i) != curr) {
                    break;
                }
            }
            if (j == strs.length) {
                sb.append(curr);
            } else {
                break;
            }
        }
        
        return sb.toString();
    }
}

Tuesday, September 2, 2014

Leetcode: ZigZag Conversion

The string "PAYPALISHIRING" is written in a zigzag pattern on a given number of rows like this: (you may want to display this pattern in a fixed font for better legibility)

P   A   H   N
A P L S I I G
Y   I   R

And then read line by line: "PAHNAPLSIIGYIR"

Write the code that will take a string and make this conversion given a number of rows:

string convert(string text, int nRows);

convert("PAYPALISHIRING", 3) should return "PAHNAPLSIIGYIR".

Understand the problem:
The crux of the problem is to understand what is the "zigzag" format. We can use examples to illustrate this. For number of rows is 2,

0 A C
1 B D

For nRows = 3,
0 A E
1 B D F
2 C G

For nRows = 4
0 A G
1 B F H
2 C E I
3 D M

It is actually easier to understand the problem putting the characters into a mesh grid.

Solution:
We can divide the zigzag format into several zigzags. For nRows = 4,
1 | 7 |
2 6 | 8 12 |
3 5 | 9 11 |
4 | 10 |

So each zigzag has number of nRows * 2 - 2 elements.

For each zigzag, the first row and last row, the distance between two indices is nRow * 2 - 2. For example, distance between 1 and 7 is 6, distance between 4 and 10 is 6, which can be calculated as 4 * 2 - 2 = 6.

For each other rows from 1 to nRows - 2, the distance between two indices is (nRows - i - 1 ) * 2 = nRows * 2 - 2 - 2 * i. Note that the distance between A to B is defined as [A, B). Also note that for each zigzag, there are at most two elements in a row.

Based on the analysis above, we can come up the solution below:

Code (Java):

public class Solution {
    public String convert(String s, int nRows) {
        if (s == null || s.length() == 0 || nRows <= 0)
            return "";
        if (nRows == 1) return s;
        
        StringBuilder sb = new StringBuilder(); //non-thread safe
        int size = 2 * nRows - 2; // number of elements for each zigzag
        
        for (int i = 0; i < nRows; i++) {
            for (int j = i; j < s.length(); j += size) {
                sb.append(s.charAt(j));
                
                if (i != 0 && i != nRows - 1 && (j + size - 2 * i) < s.length()) {
                    sb.append(s.charAt(j + size - 2 * i));
                }
            }
        }
        return sb.toString();
    }
}

Discussion:
The time complexity is O(n) since we only traverse the string once. The space complexity is O(n) as well since we performed out-of-place transform.

Summary:
The crux of this problem is to convert the indices from zigzag-major to row-major. Be familiar with the zigzag-major format.

Tuesday, August 26, 2014

Leetcode: Triangle

Given a triangle, find the minimum path sum from top to bottom. Each step you may move to adjacent numbers on the row below.

For example, given the following triangle

[
     [2],
    [3,4],
   [6,5,7],
  [4,1,8,3]
]

The minimum path sum from top to bottom is 11 (i.e., 2 + 3 + 5 + 1 = 11).

Note:
Bonus point if you are able to do this using only O(n) extra space, where n is the total number of rows in the triangle.

Understand the problem:
As described by the problem, given a triangle, find the minimum path sum from top to bottom. Each step you may move to adjacent numbers on the row below.
So the key to understand this problem is to know what is adjacent numbers?
For index = 1 in row i, the adjacent numbers that 1 can go is index 0, 1, 2 on row i + 1.

In addition, the problem allows negative numbers in the triangle, so we must consider all possible paths.

However, after reading some posts, I found that the adjacent numbers for A[ i ][ j ] is actually A[ i + 1 ][ j ] and A[ i + 1 ] [ j + 1 ], no A[i + 1] [j - 1].

Recursive Solution:
Consider the DFS. Use post-order traversal, i.e. bottom-up traversal.

Code (Java):

public class Solution {
    public int minimumTotal(List<List<Integer>> triangle) {
        return postOrderTraversal(triangle, 0, 0);
    }
    
    private int postOrderTraversal(List<List<Integer>> triangle, int layer, int next) {
        if (layer == triangle.size()) return 0;
        
        int lMax = postOrderTraversal(triangle, layer + 1, next);
        int rMax = postOrderTraversal(triangle, layer + 1, next + 1);
        
        return Math.min(lMax, rMax) + triangle.get(layer).get(next);
    }
}

Discussion:
Now let's analyze the complexity. The time complexity is O(2^n) since it has many repeated calculations. Therefore we can think of a DP solution.

DP Solution:

public class Solution {
    public int minimumTotal(List<List<Integer>> triangle) {
        int[][] dp = new int[triangle.size()][triangle.size()];
        return postOrderTraversal(triangle, 0, 0, dp);
    }
    
    private int postOrderTraversal(List<List<Integer>> triangle, int layer, int next, int[][] dp) {
        if (layer == triangle.size()) return 0;
        
        if (dp[layer][next] != 0) return dp[layer][next];
        
        int lMax = postOrderTraversal(triangle, layer + 1, next, dp);
        int rMax = postOrderTraversal(triangle, layer + 1, next + 1, dp);
        
        dp[layer][next] = Math.min(lMax, rMax) + triangle.get(layer).get(next);
        return dp[layer][next];
    }
}

Discussion:
From the solution, we can see that the space complexity now becomes O(n^2) since we allocated O(n^2) space of the array, where n is the number of rows.

A Space-optimal Solution:
For the DP solution, we can find actually we only need to store layer i + 1 for calculating layer i. Consequently, we only need to create an array with size row (maximal number of numbers for the last row).

Since DFS cannot traverse the "tree" in level order. In this solution, we use an iterative approach.

Code (Java):

public class Solution {
    public int minimumTotal(List<List<Integer>> triangle) {
        if (triangle == null || triangle.isEmpty()) return 0;
        
        int[] dp = new int[triangle.size()];
        
        // save the last row
        for (int i = 0; i < triangle.size(); i++) {
            dp[i] = triangle.get(triangle.size() - 1).get(i);
        }
        
        // start from the second last row bottom-up
        for (int i = triangle.size() - 2; i >= 0; i--) {
            for (int j = 0; j < triangle.get(i).size(); j++) {
                dp[j] = triangle.get(i).get(j) + Math.min(dp[j], dp[j + 1]);
            }
        }
        return dp[0];
    }
}

Summary:
This problem is a very classical DP problem. Make sure at least you understand the first recursive solution. The idea of post-order traversal is often used in many maximal/minimum path sum problems. Make sure you understand the first DP approach. The last solution is very tricky. Understand the idea.

Update on 10/9/2014:
Top-down 2D DP:
dp[i][j] -- the minimum path sum from root to position i, j
dp[i][j] = min(dp[i - 1][j - 1], dp[i - 1][j]) + triangle[i][j];
initial: dp[0][0] = 0;
dp[i][0] = dp[i - 1][0] + triangle[i][j];
Others are set to Integer.MAX_VALUE

Final state: min (dp[n - 1][0] ... dp[n - 1][n - 1])

public class Solution {
    int min = Integer.MAX_VALUE;
    public int minimumTotal(List<List<Integer>> triangle) {
        if (triangle == null || triangle.size() == 0) {
            return 0;
        }
        
        // 2D DP solution
        int n = triangle.size();
        int[][] dp = new int[n][n];
        
        // Initialize
        dp[0][0] = triangle.get(0).get(0);
        for (int i = 1; i < n; i++) {
            dp[i][0] = dp[i - 1][0] + triangle.get(i).get(0);
        }
        
        for (int i = 0; i < n; i++) {
            for (int j = 1; j < n; j++) {
                dp[i][j] = Integer.MAX_VALUE;
            }
        }
        
        for (int i = 1; i < n; i++) {
            for (int j = 1; j < triangle.get(i).size(); j++) {
                dp[i][j] = Math.min(dp[i - 1][j - 1], dp[i - 1][j]) + triangle.get(i).get(j);
            }
        }
        
        int min = Integer.MAX_VALUE;
        for (int i  = 0; i < n; i++) {
            if (dp[n - 1][i] < min) {
                min = dp[n - 1][i];
            }
        }
        
        return min;
    }
}

1D DP solution:
dp[j]: the minimum path sum from bottom to the column j at current level
Transit function: dp[j] = dp[j] + dp[j + 1]
Initial state: Initialize dp[j] as the bottom number of the triangle
Final state: dp[0].

Code (java):

public class Solution {
    int min = Integer.MAX_VALUE;
    public int minimumTotal(List<List<Integer>> triangle) {
        if (triangle == null || triangle.size() == 0) {
            return 0;
        }
        
        int n = triangle.size();
        /// 1D DP solution
        int[] dp = new int[n];
        
        /// Initialization
        for (int i = 0; i < n; i++) {
            dp[i] = triangle.get(n-1).get(i);
        }
        
        for (int i = n - 2; i >= 0; i--) {
            for (int j = 0; j <= i; j++) {
                dp[j] = Math.min(dp[j], dp[j + 1]) + triangle.get(i).get(j);
            }
        }
        
        return dp[0];
    }
}

Thursday, August 14, 2014

Leetcode: Word Ladder

Given two words (start and end), and a dictionary, find the length of shortest transformation sequence from start to end, such that:

Only one letter can be changed at a time
Each intermediate word must exist in the dictionary

For example,

Given:
start = "hit"
end = "cog"
dict = ["hot","dot","dog","lot","log"]

As one shortest transformation is "hit" -> "hot" -> "dot" -> "dog" -> "cog",
return its length 5.

Note:

Return 0 if there is no such transformation sequence.
All words have the same length.
All words contain only lowercase alphabetic characters.

Understand the problem:

The question gives two words and a dictionary, find the length of the shortest transformation sequence from start to end. Note the two requirements.

Naive Solution:

One straight-forward solution is to use BFS + a queue. The idea is to add the start into the queue, dequeue, and check if it is one letter difference between the end. If yes, we are done, return the length 2; otherwise, we add all neighbors into the queue. The neighbors are defined as one letter difference between the element just dequeued. Then repeat to check if all its neighbours is one letter difference between the end. If not, add the neighbours again.

One thing needs to handle is to mark the words in the dictionary as visited. The easiest way is just to remove the visited words in the dictionary.

Code (Java):

public class Solution {
    public int ladderLength(String start, String end, Set<String> dict) {
        if (start == null || start.isEmpty() || end == null || end.isEmpty())
            return 0;
        
        int length = 0;
        Queue<String> queue = new LinkedList<String>();
        queue.offer(start);
        
        HashSet<String> shadow = new HashSet<String>();
        
        while (!queue.isEmpty()) {
            int size = queue.size();
            for (int i = 0; i < size; i++) {
                String curr = queue.poll();
                if (isAdj(curr, end)) {
                    return length + 1;
                } else {
                    for (String str : dict) {
                        if (isAdj(curr, str) && !shadow.contains(str)) {
                            queue.offer(str);
                            //dict.remove(str);
                            shadow.add(str);
                        }
                    }
                    length++;
                }
            }
        }
        return 0;
    }
    
    // determine if two strings have only one different letter
    private boolean isAdj(String str1, String str2) {
        int count = 0;
        for (int i = 0; i < str1.length(); i++) {
            if (str1.charAt(i) != str2.charAt(i)) {
                count++;
            }
        }
        
        return count == 1;
    }
}

Discussion:
Now let's analyze this solution. Note that instead of removing the dictionary, we used a shadow hash set to store the visited nodes. So we don't have to modify the input parameters. For each node dequeued in the queue, it takes O(n) time to find its neighbor. Since there will be totally n nodes enqueued into the queue, it takes totally O(n^2) time. Also consider the complexity of determining if two strings are adjacent, which has time of O(m), where m is the length of the string. The total time complexity is O(n^2 * m). The space complexity is O(n) + O(n) = O(n).

A Better Solution:
In the naive solution, for each node which is not adjacent to the end, we look for all its neighbors in the dictionary. This will take O(n * m) time, where n is the number of strings in the dictionary, and m is the length of a string. If we can do this in constant time, or at least proportional to m, the overall time complexity would be O(n * m). Note that

All words contain only lowercase alphabetic characters.

Does it give us a hint? We can change a character of the string at a time and traverse all possible changes, then compare with the end string. If yes, we are done and simply return the length. Else if the changed string equals to the dictionary, add the string into the queue. Since lowercase alphabetic characters have exactly 26 different characters, it takes O(26 * m) time to traverse all possible changes for a given string.

Code (Java):

public class Solution {
    public int ladderLength(String start, String end, Set<String> dict) {
        if (start == null || start.isEmpty() || end == null || end.isEmpty())
            return 0;
        
        int length = 1;
        Queue<String> queue = new LinkedList<String>();
        queue.offer(start);
        
        HashSet<String> visited = new HashSet<String>();
        
        while (!queue.isEmpty()) {
            int size = queue.size();
            for (int i = 0; i < size; i++) {
                String curr = queue.poll();
                for (int j = 0; j < curr.length(); j++) {
                    char[] charCurr = curr.toCharArray();
                    for (char c = 'a'; c < 'z'; c++) {
                        charCurr[j] = c;  // change one character at a time
                        String strCurr = new String(charCurr);
                        if (strCurr.equals(end)) {
                            return length + 1;
                        } else {
                            if (dict.contains(strCurr) && !visited.contains(strCurr)) {
                                queue.offer(strCurr);
                                visited.add(strCurr);
                            }
                        }
                    }
                }
            }
            length++;
        }
        return 0;
    }
}

Discussion:
Note the line 17 should be in the outer loop, since we can change at most a character at a time. So if we traversed all possible characters for a position, we should recover it back to the original string.

As we discussed before, the time complexity is O(n * m). The space complexity is O(n).

Summary:
This question can be categorized into the graph theory, where each node represents a word, and each edge connects two neighbors. For this question, it actually asks us to find the shortest path in a graph. In such case a BFS would be the best solution. Also note that BFS will usually come with a queue.

Thursday, August 7, 2014

Leetcode: Insert Interval

Given a set of non-overlapping intervals, insert a new interval into the intervals (merge if necessary).

You may assume that the intervals were initially sorted according to their start times.

Example 1:
Given intervals [1,3],[6,9], insert and merge [2,5] in as [1,5],[6,9].

Example 2:
Given [1,2],[3,5],[6,7],[8,10],[12,16], insert and merge [4,9] in as [1,2],[3,10],[12,16].

This is because the new interval [4,9] overlaps with [3,5],[6,7],[8,10].

Understand the Problem:

This problem is very similar to merge interval. The question asks for given a set of non-overlapping intervals, insert a new interval into the intervals, and merge them, if necessary.

So the question can be divided by two steps: insert and merge.

Also note that the intervals has already been sorted initially.

Solution:
Since the problem can be divided by two steps, insert and merge, we can do it step by step. Insertion is very simple, we just need to scan the input intervals and compare the start value until we found one is larger than the insert interval. Then we insert the token into the ArrayList. The merge process will be exactly the same as the Merge Interval problem.

Code (Java):

/**
 * Definition for an interval.
 * public class Interval {
 *     int start;
 *     int end;
 *     Interval() { start = 0; end = 0; }
 *     Interval(int s, int e) { start = s; end = e; }
 * }
 */
public class Solution {
    public ArrayList<Interval> insert(ArrayList<Interval> intervals, Interval newInterval) {
        ArrayList<Interval> result = new ArrayList<Interval>();
        
        // check if the list is empty or null
        if (intervals == null || newInterval == null) return result;
        
        if (intervals.isEmpty()) {
            result.add(newInterval);
            return result;
        }
        
        // find the insertion point and insert the new newInterval
        for (int i = 0; i < intervals.size(); i++) {
            if (intervals.get(i).start > newInterval.start) {
                intervals.add(i, newInterval);
                break;
            }
        }
        // insert at end of the list
        intervals.add(newInterval);
        
        // merge the overlapped intervals
        Interval prev = intervals.get(0);
        for (int i = 1; i < intervals.size(); i++) {
            Interval curr = intervals.get(i);
            if (isOverlapped(prev, curr)) {
                Interval temp = new Interval();
                temp.start = prev.start;
                temp.end = Math.max(prev.end, curr.end);
                prev = temp;
            } else {
                result.add(prev);
                prev = curr;
            }
        }
        result.add(prev);
        
        return result;
    }
    
    private boolean isOverlapped(Interval prev, Interval curr) {
        return curr.start <= prev.end;
    }
}

Discussion:
Note the method add(int index, E element) inserts the elements into the current index. Shifts the element currently at that position (if any) and any subsequent elements to the right (adds one to their indices). That's why we could use add() instance method and no worry it is replace the value in the insertion position. In addition, when we insert a value, take care if the value is greater than the largest one in the list. In this case, we have to insert the value at the end of the end, as is shown in Line 30 of the code. This is a tricky bug we are easily to forget.

Note that this solution has two downsides. First of all, intervals.add() will modify the input parameters, which is sometimes not allowed. Second, insert an number at middle will cause the shift of all its right elements.

A Better Solution:
Does there exist a better solution without needing to modify the input parameters? The answer is yes. Re-think the question. We are not actually required to insert an interval into the original. We are only asked to return the final merged list.

Following this idea, we could decide when we insert an internal, what needs to be inserted into the result list. There are three cases:
1. For the current interval is less than the newInterval, i.e, the end of current interval is less than the start of newInterval. Then there must have no overlapping. In this case, we only need to insert the current interval into the result list.
2. For the current interval is greater than the newInterval. That is, the start of the current interval is greater than the end of newInterval. In this case, we insert the newInterval into the result list, and update the newInterval as current interval.
3. For other cases where we have overlapped regions. We merge the two intervals and update the newInterval.

Code (Java):

/**
 * Definition for an interval.
 * public class Interval {
 *     int start;
 *     int end;
 *     Interval() { start = 0; end = 0; }
 *     Interval(int s, int e) { start = s; end = e; }
 * }
 */
public class Solution {
    public ArrayList<Interval> insert(ArrayList<Interval> intervals, Interval newInterval) {
        ArrayList<Interval> result = new ArrayList<Interval>();
        
        if (newInterval == null) return intervals;
        
        if (intervals == null || intervals.size() == 0) {
            result.add(newInterval);
            return result;
        }
        
        Interval prev = newInterval;
        
        for (Interval curr : intervals) {
            // if curr interval is greater than prev
            if (curr.start > prev.end) {
                result.add(prev);
                prev = curr;
            } else if (curr.end < prev.start) {
                result.add(curr);
            } else {
                prev.start = Math.min(prev.start, curr.start);
                prev.end = Math.max(prev.end, curr.end);
            }
        }
        result.add(prev);
        return result;
    }
}

Summary:
The problem is not hard at the first glance, but very tricky to come out the second solution. If you follow the hint of the question you will insert then merge. Then you have to modify the input array to insert the new inserted point. A smarter way is to compare each interval from the input array with the insertion interval, and determine if they overlapped. We only need to insert the interval when they are not overlapped. This is quite close to the merge interval question.

Leetcode: Merge Intervals

Given a collection of intervals, merge all overlapping intervals.

For example,
Given [1,3],[2,6],[8,10],[15,18],
return [1,6],[8,10],[15,18].

Understand the problem:
For question asks for given a collection of intervals, merge all overlapping intervals. So the crux of the problem is to understand "what is the overlapping intervals"? Let's take look at several examples:

For the intervals [1, 3], [2, 6], they are overlapped, because the left boundary of the second token is less than the right boundary of the first token.

For the intervals [2, 6], [8, 10], they are not overlapped, because again, the left boundary of the second token is greater than the right boundary of the first token.

Therefore, one method to know if two intervals are overlapped is to first sort the list of intervals according to the left boundary, then compare the left boundary of the second interval with the right boundary of the first interval.

Then, the next question is after we decided two intervals overlapped, how can we merge them together? A easy way is to choose the minimum value of the left boundary, and maximum value of the right boundary. For example, for [1, 3] and [2, 6], the merged interval is [1, 6].

Solution:
According to the analysis above, the solution is quite straight-forward. We first sort the array list according to the start value of the Interval. Then check if two intervals are overlapped, if yes, merge together.

Code (Java):


/**
 * Definition for an interval.
 * public class Interval {
 *     int start;
 *     int end;
 *     Interval() { start = 0; end = 0; }
 *     Interval(int s, int e) { start = s; end = e; }
 * }
 */
public class Solution {
    public ArrayList<Interval> merge(ArrayList<Interval> intervals) {
        ArrayList<Interval> result = new ArrayList<Interval>();
        if (intervals == null || intervals.size() == 0) {
            return result;
        }
        
        Collections.sort(intervals, new IntervalComparator());
        
        Interval prev = intervals.get(0);
        for (int i = 1; i < intervals.size(); i++) {
            Interval curr = intervals.get(i);
            if (isOverlapped(curr, prev)) {
                prev.start = prev.start;
                prev.end = Math.max(curr.end, prev.end);
            } else {
                result.add(prev);
                prev = curr;
            }
        }
        
        result.add(prev);
        return result;
    }
    
    private boolean isOverlapped(Interval curr, Interval prev) {
        return curr.start <= prev.end;
    }
    
    private class IntervalComparator implements Comparator<Interval> {
        public int compare(Interval a, Interval b) {
            return a.start - b.start;
        }
    }
}

Discussion:
Since we sort the list once before, the time complexity bounded by O(nlogn). For the space complexity, we only allocated a new list to store the results, so the space complexity at worst case is O(n). That is the best space complexity we can do required by this question.

Summary:
The take away message of the solution is to utilize the advantage of sorted array. Also, make sure you are familiar with sorting objects in Java.

Wednesday, August 6, 2014

Leetcode: Evaluate Reverse Polish Notation

Evaluate the value of an arithmetic expression in Reverse Polish Notation.

Valid operators are +, -, *, /. Each operand may be an integer or another expression.

Some examples:

  ["2", "1", "+", "3", "*"] -> ((2 + 1) * 3) -> 9
  ["4", "13", "5", "/", "+"] -> (4 + (13 / 5)) -> 6

Analysis:
First of all, we must figure out what is the "Reverse Polish Notation"? According to the wiki, RPN is a mathematical notation in which every operator follows all its operands. It is also known as postfix notation. E.g.
3 4 + is postfix notation
+ 3 4 is prefix notation
3 + 4 is infix notation

If there are multiple operations, the operator is given immediately after the second operand.
For e.g. 2 1 + 3 * - > (2 + 1) * 3 = 9
3 4 5 * - -> 3 - (4 * 5) = -17
3 4 - 5 * -> (3 - 4) * 5 = -5

Solution:
From the analysis, it's not hard to see that we can use stack to store the digital numbers.
For instance, for the expression of RPN 3 4 - 5 *, we first push 3 and 4 into the stack, when we see an operator - , we pop 3 and 4 and performs 3 - 4. Note that the first pop-out number will be the second operand. Then we push back the result into the stack. Now the stack has -1 and 5. When we see the operator * we repeat the same operations until we reach the last token.

Code (Java):

public class Solution {
    public int evalRPN(String[] tokens) {
        // if empty or null
        if (tokens == null || tokens.length == 0) return 0;
        
        Stack<Integer> stack = new Stack<Integer>();
        int result = 0;
        
        for (int i = 0; i < tokens.length; i++) {
            if (!isOperator(tokens[i])) {
                stack.push(Integer.valueOf(tokens[i]));
            } else {
                int operand2 = stack.pop();
                int operand1 = stack.pop();
                result = calculation(operand1, operand2, tokens[i]);
                
                // push back the intermediate results
                stack.push(result);
            }
        }
        return stack.pop();
    }
    
    private boolean isOperator(String str) {
        return (str.equals("+") || str.equals("-") || str.equals("*") || str.equals("/"));
    }
    
    private int calculation(int op1, int op2, String operator) {
        int result = 0;
        if (operator.equals("+")) {
             result = op1 + op2;
        } else if (operator.equals("-")) {
            result = op1 - op2;
        } else if (operator.equals("*")) {
            result = op1 * op2;
        } else if (operator.equals("/")) {
            result = op1 / op2;
        }
        return result;
    }
}

Discussion:
The solution is very straight forward. For the time complexity, it is O(n) because we only need to traverse the input array of string only once. Let's look at how much space is consumed. For input with n tokens, it needs push at most n / 2 tokens into the stack, as roughly the rest half is operators. So the space complexity for this solution is O(n).

Summary:
The crux of the problem is to get idea behind the RPN. In the code interview, the interviewer may not explain to you very details. Maybe they will only show you the definition and several examples. So before you jump into any solution, make the best clear of the problem. In this problem, it is critical to understand "If there are multiple operations, the operator is given immediately after the second operand". That is the key to come out the stack solution.

Tuesday, August 5, 2014

Leetcode: Longest Substring Without Repeating Characters

Given a string, find the length of the longest substring without repeating characters. For example, the longest substring without repeating letters for "abcabcbb" is "abc", which the length is 3. For "bbbbb" the longest substring is "b", with the length of 1.

Analysis:
The question asks for finding the longest substring without repeating characters, and return its length. So the problem has two requirements:

1. Find a substring without repeating characters, and keep its length
2. Return the longest length

We can give several examples from empty string to common cases:
a. For empty string, its substring is empty as well, so return the length 0
b. For string length equals to 1, the longest substring is itself, return 1;
c. For string "aab", its longest substring is "ab", return 2;
d. For string "abcabcbb", the longest substring is "abc", return 3.

Naive Solution:
The crux of this question is to find longest substring without repeating characters. So for each character in the string, we need to check if the character is repeated before. How to check efficiently? If we compare the character with all its previous characters, the time complexity would be O(n^2). That is, in worst case, the string contains no repeated characters, so the longest substring is the string itself.

How about using hash table? Specifically, we can use Java HashMap. The key stores the character while value stores the index. Why we need to store the index? Consider the string "abcdaef". When we encounter the second a, we need to reverse the index back starting from b. So the length of the longest substring is "bcdaef", i.e, 6.

Code (Java):

public class Solution {
    public int lengthOfLongestSubstring(String s) {
        // check if string is empty or length equals to 1
        if (s.length() < 2) return s.length(); 
        
        HashMap<Character, Integer> hashMap = new HashMap<Character, Integer>();
        int maxLen = 1;
        int len = 0;
        
        for (int i = 0; i < s.length(); i++) {
            if (!hashMap.containsKey(s.charAt(i))) {
                hashMap.put(s.charAt(i), i);
                len++;
            } else { // found repeated
                i = hashMap.get(s.charAt(i));
                maxLen = Math.max(maxLen, len);
                hashMap.clear();
                len = 0;
            }
        }
        return Math.max(maxLen, len);
     }
}

Discussion:
The solution looks straight-forward. There are several points need to take note. First of all, when we found a repeated character, we need to move the index back to the first occurrence of the character. In addition, we must clean the HashMap and the counter. That are tiny things likely to forget.

Now let's take a look at the complexity. It is a little bit complicated since we move back the index once we found a duplicate. In the best case, where string has no duplicates. There is no need to move back the index, and we only need to iterate the string once, so the time complexity is O(n). In average case, the time complexity is O(n * m), where m is the length of the longest substring. Why? For a string "abcabc", for each character, it has to move m steps until meets the duplicated character. Thus the complexity is O(nm). Regarding to the space complexity, since the hash table stores at most m key-value pairs, the space complexity is O(m).

A Better Solution:
The native solution has O(nm) complexity since we always moved back the index to the position next to the first occurrence of the duplicated character. Do we have a method that never move back the index? Then the time complexity would become O(n).

Remember we have two pointers solution. In the first round, move one pointer until we reach a duplicated character, then use another pointer points to the first occurrence of the duplicated element. Then instead of moving back the frond pointer, we continually iterate the first pointer in the second round and update the back pointer to the position next to the first occurrence of the duplicated element.

Code (Java):

public class Solution {
    public int lengthOfLongestSubstring(String s) {
        // if the string is empty or has only one character
        if (s.length() < 2) return s.length();
        
        HashMap<Character, Integer> hashMap = new HashMap<Character, Integer>();
        int j = 0;
        int maxLen = 1;
        
        for (int i = 0; i < s.length(); i++) {
            if (!hashMap.containsKey(s.charAt(i))) {
                hashMap.put(s.charAt(i), i);
            } else {
                maxLen = Math.max(maxLen, i - j);
                for (int k = j; k < i; k++) {
                    if (s.charAt(k) == s.charAt(i)) {
                        j = k + 1;
                        break;
                    }
                }
            }    
        }
        return Math.max(maxLen, s.length() - j);
    }
}

Discussion:
In this solution, since we never revert the front pointer back, the time complexity is O(n) and the space complexity is O(m). Note that When we found a duplicated element, we must update the corresponding j pointer. But why not just use hashMap.get(charAt(i)) + 1 ?
That is because we never update the position of the duplicated keys in the hash table. For example, for the string "abababc", when we meet the third a, the j pointer will be moved back to the first b, which is wrong.

A Wrong Solution:

public class Solution {
    public int lengthOfLongestSubstring(String s) {
        // if the string is empty or has only one character
        if (s.length() < 2) return s.length();
        
        HashMap<Character Integer> hashMap = new HashMap<Character Integer>();
        int j = 0;
        int maxLen = 1;
        
        for (int i = 0; i < s.length(); i++) {
            if (!hashMap.containsKey(s.charAt(i))) {
                hashMap.put(s.charAt(i), i);
            } else {
                maxLen = Math.max(maxLen, i - j);
                j = hashMap.get(s.charAt(i)) + 1;
                hashMap.put(s.charAt(i), i);
            }    
        }
        return Math.max(maxLen, s.length() - j);
    }
}

Why this solution is wrong? It updates the value for the duplicated key. Honestly, I haven't figured out the reason. I will come back later.

Update on 10/14:
The problem of the wrong solution is the left boundary might move back. For example,
abcdba, for the second a, the left boundary will move back to the first b, which wrongly get a longer substring.

Another Thought:
The bast solution above would have O(n) time and O(m) space. It seems that time complexity has reached the optimal, how about the space complexity? Can we have a constant space solution?

The answer is yes. Actually if we can assume that the string is encoded by ASCII code, it has at most 256 different characters in the string. Therefore, we could create a bit map with size of 256. If a character occurred before, we mark the corresponding map as true.

In this solution, the space complexity is constant, i.e. 256, no matter how long the string is.

Summary:
In this post, we learned how to find a longest substring without repeating characters. We introduced three methods: naive solution with O(nm) time complexity, linear solution with O(m) space complexity and linear solution with constant space complexity.

The problem itself is not very difficult. But it is tricky to devise the two pointers approach. It is very important to deduce your idea to interviewers instead of memorizing solution.

Update on 11/16/14:
Actually we don't need to maintain a hash map, but just a set is fine. That is because we only need to save the visited characters in the set.

Code (Java):

public class Solution {
    public int lengthOfLongestSubstring(String s) {
        if (s == null || s.length() == 0) {
            return 0;
        }
        
        int start = 0;
        int end = 0;
        int len = 1;
        Set<Character> set = new HashSet<Character>();
        
        for (end = 0; end < s.length(); end++) {
            if (!set.contains(s.charAt(end))) {
                set.add(s.charAt(end));
            } else {
                len = Math.max(len, end - start);
                
                // move the start pointer
                while (start < end) {
                    if (s.charAt(start) != s.charAt(end)) {
                        set.remove(s.charAt(start));
                    } else {
                        break;
                    }
                    start++;
                }
                
                start++;
            }
        }
        len = Math.max(len, end - start);
        return len;
    }
}

Update on 5/2/18:

class Solution {
    public int lengthOfLongestSubstring(String s) {
        if (s == null || s.length() == 0) {
            return 0;
        }
        
        int ans = 0;
        Set<Character> set = new HashSet<>();
        int j = 0;
        int i = 0;
        
        for (i = 0; i < s.length(); i++) {
            char c = s.charAt(i);
            
            if (!set.contains(c)) {
                set.add(c);
            } else {
                // First note the current length
                //
                ans = Math.max(ans, i - j);
                
                // Then move forward the j
                //
                while (s.charAt(j) != c) {
                    set.remove(s.charAt(j));
                    j++;
                }
                
                // move one more step for j
                //
                j++;
            }
        }
        
        ans = Math.max(ans, i - j);
        
        return ans;
    }
}

Monday, August 4, 2014

Leetcode: Implement strStr()

Implement strStr().

Returns a pointer to the first occurrence of needle in haystack, or null if needle is not part of haystack.

Analysis:
As usual, before we come out a solution, we must understand the question very well. Ask questions if there is any ambiguities in it.

In this question, we are actually asked for finding a substring from a string.
The question looks simple at first glance but very tricky.

First of all, let's look at the return value. The question asks for returning a pointer to the first occurrence of needle in haystack. And the method of the question defines a return value as a string. Therefore, we can return a substring(i) where i points to the first occurrence of the substring. For instance, for haystack = "abcde", needle = "cd", the return value is "cde"

Then we can take several examples. The crux of the question is to handle boundary condition well. Let's take a look at it from null string.

1. If either of the string null. We should return null. Note that you should always check null string before calling length() else it will return NullPointerException.

2. If both strings are empty string, we should return empty string instead of null. That is because empty string is substring of any string

3. If haystack is empty, while needle string is not, we should return null. To make it more generic, if length of haystack is less than needle, it must return null.

4. If haystack is not empty while needle is empty, we should return haystack. Again, that is because empty string is a substring of all strings. Therefore, we may merge the condition check 2 and 4 into one together, i.e, only check if needle is empty, if yes, return haystack.

Solution:
Use two pointers, one points i, the haystack string, while the other points j, to the need string. For each i, we check if charAt(i) == charAt(j) and its subsequent characters in j matches. Note that you should check if i is overflowed. For instance, in the condition haystack = "abcde", while needle = "efg", the first match happens at end of the haystack string, when we increase the pointer of i and j, we should make sure i is not overflow.

Code (Java):

public class Solution {
    public String strStr(String haystack, String needle) {
        // if two strings are null,  return null
        if (haystack == null || needle == null) return null;
        
        // if two strings are empty, return empty instead of NULL
        //if (haystack.isEmpty() && needle.isEmpty()) return "";
        
        // if needle string is empty, return haystack
        if (needle.isEmpty()) return haystack;
        
        int lenHaystack = haystack.length();
        int lenNeedle = needle.length();
        
        // if length of haystack less than needle, return empty string
        if (lenHaystack < lenNeedle) return null;
        
        for (int i = 0; i < lenHaystack; i++) {
             for (int j = 0; j < lenNeedle; j++) {
                // check if haystack is overflow
                if (i + j >= lenHaystack) return null;
                if (haystack.charAt(i + j) != needle.charAt(j)) break;
                else if (j == lenNeedle - 1) return haystack.substring(i);
            }
        }
        return null;
    }
}

Discussion:
For haystack length is n and needle length m, the worst case time complexity is O(mn). That is because for each character in haystack, we should check every character in needle until we find a match. In the worst case there is no matched substring, so the complexity in worst case is O(mn). Since we did not allocate additional space the space complexity is constant.

Summary:
This question is very simple but quite error-prone. It is actually very hard to write a bug-free implementation at once. The crux is you should consider every possible conditions and check. The rule of thumb for string problems is first to check if the input string(s) is null, and consider the return values. In most cases, null string is the most unacceptable string because it points to a null address. Then check if it is an empty string and consider the return value.
Take care if the string if is overflowed and make a condition to check.

Update on 11/19/2014:

Implement strStr().

Returns the index of the first occurrence of needle in haystack, or -1 if needle is not part of haystack.

Update (2014-11-02):
The signature of the function had been updated to return the index instead of the pointer. If you still see your function signature returns a char * or String, please click the reload button to reset your code definition.

Code (Java):

public class Solution {
    public int strStr(String haystack, String needle) {
     if (needle == null || needle.length() == 0) {
         return 0;
        }

        if (haystack == null || haystack.length() == 0) {
         return -1;
        }

        for (int i = 0; i < haystack.length() - needle.length() + 1; i++) {
         int j = 0;
         for (j = 0; j < needle.length(); j++) {
          if (haystack.charAt(i + j) != needle.charAt(j)) {
           break;
                } 
            }
            if (j == needle.length()) {
             return i;
            }
        }

        return -1;
    }
}

Update on 9/28/15:

public class Solution {
    public int strStr(String haystack, String needle) {
        for (int i = 0; i <= haystack.length() - needle.length(); i++) {
            int j = 0;
            int start = i;
            for (j = 0; j < needle.length(); j++) {
                if (start < haystack.length() && haystack.charAt(start) == needle.charAt(j)) {
                    start++;
                } else {
                    break;
                }
            }
            if (j == needle.length()) {
                return i;
            }
        }
        
        return -1;
    }
}

Update on 4/14/19:

public class Solution {
    /**
     * @param source: 
     * @param target: 
     * @return: return the index
     */
    public int strStr(String source, String target) {
        for (int i = 0; i < source.length() - target.length() + 1; i++) {
            int j = 0;
            while (j < target.length() && source.charAt(i + j) == target.charAt(j)) {
                j++;
            }
            
            if (j == target.length()) {
                return i;
            }
        }
        
        return -1;
    }
}