Wednesday, July 23, 2014

Leetcode: Two sum

Given an array of integers, find two numbers such that they add up to a specific target number.

The function twoSum should return indices of the two numbers such that they add up to the target, where index1 must be less than index2. Please note that your returned answers (both index1 and index2) are not zero-based.

You may assume that each input would have exactly one solution.

Input: numbers={2, 7, 11, 15}, target=9

Output: index1=1, index2=2


Analysis:

There are several points to note for this question:
1. We can assume that there is one and only one solution. Thus we don't need to handle the situation that there is no answer existed, or there might be multiple possible combinations
Follow up: What if the question may have no solution or multiple solutions?
2. Index 1 must be less than index 2. It means they cannot be equal, i.e, if one number is equal to the target number, the other must be 0.
3. The negative number is allowed in the input array.

4. Note that output index starts from 1
5.  How to return two numbers?
Check what is the return value of the method. Yes it is integer array, where arr[0] stores the index1, and arr[1] stores the index2.

Native Solution:
A straight-forward solution is to compare each pair of the array and check the sum equals to the target. So we can see that in the worst case the time complexity is O(n^2). The space complexity is constant because it doesn't need any temporary buffer to store the data.

Code (java):
public class Solution {
    public int[] twoSum(int[] numbers, int target) {
        // Native solution O(n^2)
        int size = numbers.length;
        int[] result = new int[2];
        
        for (int i = 0; i < size; i++) {
            for (int j = i + 1; j < size; j++) {
                if ((numbers[i] + numbers[j]) == target) {
                    result[0] = i + 1;
                    result[1] = j + 1;
                    break;
                }
            }
        }
        return result;
    }
}
Discussion:
Interaction is important during the code interview. No matter how complex the answer is, just speak out the native solution that comes up first in your mind. Then try to analyze the problem and optimize it.

Improved Solution:

Noted that the two-sum problem is defined as a[i] + a[j] == target, where i < j ? That equals to
check if a[i] == target - a[j] ? So for each array index i, we check if there is a number (target - a[j]) is equal to a[i]. It is similar to the native solution.

But how can we check if there is a target - a[j] equals to a[i] efficiently? We use hash table. At java, we use HsahMap. The key target  - a[j] where as the value stores the index of the array. As a result, for each a[i] we check if the hash table contains the key, if yes, we're done, and index1 is equal to the value of the key, and index2 is equal to loop index i. If not hit, we store the target-a[i] into the hash table as key, and loop index i as the value.

Code (Java): 
public class Solution {
    public int[] twoSum(int[] numbers, int target) {
        //Improved version using hash_map
        HashMap<Integer, Integer> map = new HashMap<Integer, Integer>();
        int[] result = new int[2];
        int size = numbers.length;
        
        for (int i = 0; i < size; i++) {
            if (map.containsKey(numbers[i])) {
                result[0] = map.get(numbers[i]) + 1;
                result[1] = i + 1;
            } else {
                map.put(target - numbers[i], i);
            }
        }
        return result;
    }
}

Discussion:
Since hash table has average access time O(1), and we only access the array once. The time complexity of t he solution is O(n). Since we use t he hash table as a temporary buffer, at worst case we need additional O(n) storage.
Note that in this solution we use Java HashMap, and used constant methods:  
  • boolean containsKey(Object key), which returns true if the hash map contains a mapping from the specified key
  • V get(Ojbect key), which returns the value to which the specified key is mapped, or null if this map contains no mapping for the key
  • put(K key, V value) associate the specified key and value in the hash map
  • Several other constant methods may be useful:
    • void clear(), removes all of the mapping from the map
    • boolean containValue(Object value), returns true if the map maps one or more keys to the value
    • boolean isEmpty() returns true if the map is empty
    • V remove(Object key) remove the mapping for the key if exists any
    • int size() returns the number of key-value pairs in the map

Another Solution:
First sort the array. Use two pointers, start and end, points to the start and end of the array. Check if a[start] + a[end] == target. If yes, we are done, break and return. If the sum is greater than the target, decrease the end index, end--, and compare the sum again. If the sum is less than the target, increase the start index, start++, and repeat again. It terminates when start > end. 

In this solution, we can see that the timing complexity is O(nlogn) + O(n), which is still bounded by O(nlogn). Since we don't need any space to store temporary data, the space complexity is O(1). For the data size is small, we can expect that this solution should be a bit faster than hash table solution, since get and put computations are relatively complex. For the large data size, however, sorting the array may not be desired. Furthermore, sorting may not be desirable if the array is immutable. 

Summary
There is no such an absolute best solution in a code interview. You could always interact with the interviewer. Usually an interview question is quite open; they do not expect to see the best solution. On the contrary, they would like to see how you analyze the problem. In this question, the first solution is brute force but very straight forward to understand. In the second solution, the time complexity is O(n) but requires additional O(n) space to store the data. If a space limited system, it may be a problem. In the third solution, the time is bounded by the sorting but does not requires additional space. If the input data is (nearly) sorted, we can expect that the third solution is the fastest. 

No comments:

Post a Comment