Thursday, June 2, 2016

Leetcode: Intersection of Two Arrays II

Given two arrays, write a function to compute their intersection.
Example:
Given nums1 = [1, 2, 2, 1]nums2 = [2, 2], return [2, 2].
Note:
  • Each element in the result should appear as many times as it shows in both arrays.
  • The result can be in any order.
Follow up:
  • What if the given array is already sorted? How would you optimize your algorithm?
  • What if nums1's size is small compared to num2's size? Which algorithm is better?
  • What if elements of nums2 are stored on disk, and the memory is limited such that you cannot load all elements into the memory at once?
Solution 1:
Sort the two arrays and iterate over to find out the intersections. So the overall time complexity is bounded by O(n logn), where n is the length of the longer array. The main body of the loop is bounded by O(m + n). 

Code (Java):
public class Solution {
    public int[] intersect(int[] nums1, int[] nums2) {
        if (nums1 == null || nums2 == null || nums1.length == 0 || nums2.length == 0) {
            return new int[0];
        }
        
        int i = 0;
        int j = 0;
        
        Arrays.sort(nums1);
        Arrays.sort(nums2);
        
        List<Integer> result = new ArrayList<>();
        
        while (i < nums1.length && j < nums2.length) {
            if (nums1[i] == nums2[j]) {
                result.add(nums1[i]);
                i++;
                j++;
            } else if (nums1[i] < nums2[j]){
                i++;
            } else {
                j++;
            }
        }
        
        // Convert list to array
        return listToArray(result);
    }
    
    private int[] listToArray(List<Integer> list) {
        int[] result = new int[list.size()];
        
        for (int i = 0; i < list.size(); i++) {
            result[i] = list.get(i);
        }
        
        return result;
    }
}

Solution 2:
Use a HashMap to store the first array, then check each element of the second array and see if it is in the map. Note that since we need to output all repeated elements, we also need to count the occurrence of each array element in the map, and consume it when we compare with the second array.

Code (Java):
public class Solution {
    public int[] intersect(int[] nums1, int[] nums2) {
        if (nums1 == null || nums2 == null || nums1.length == 0 || nums2.length == 0) {
            return new int[0];
        }
        
        Map<Integer, Integer> map = new HashMap<>();
        
        // step1: Put elements in nums1 into the map
        for (int num : nums1) {
            if (map.containsKey(num)) {
                int freq = map.get(num);
                map.put(num, freq + 1);
            } else {
                map.put(num, 1);
            }
        }
        
        // step 2: iterate the nums2 and get the result
        List<Integer> result = new ArrayList<>();
        for (int num : nums2) {
            if (map.containsKey(num) && map.get(num) > 0) {
                result.add(num);
                int freq = map.get(num);
                freq--;
                map.put(num, freq);
            }
        }
        
        return listToArray(result);
    }
    
    private int[] listToArray(List<Integer> list) {
        int[] result = new int[list.size()];
        for (int i = 0; i < list.size(); i++) {
            result[i] = list.get(i);
        }
        
        return result;
    }
}



Follow up:
  • What if the given array is already sorted? How would you optimize your algorithm?
    • Solution 1, i.e., sorting,  would be better since it does not need extra memory. 
  • What if nums1's size is small compared to num2's size? Which algorithm is better?
    • If two arrays are sorted, we could use binary search, i.e., for each element in the shorter array, search in the longer one. So the overall time complexity is O(nlogm), where n is the length of the shorter array, and m is the length of the longer array. Note that this is better than Solution 1 since the time complexity is O(n + m) in the worst case. 
  • What if elements of nums2 are stored on disk, and the memory is limited such that you cannot load all elements into the memory at once?
    • If the two arrays have relatively the same length, we can use external sort to sort out the two arrays in the disk. Then load chunks of each array into the memory and compare, by using the method 1. 
    • If one array is too short while the other is long, in this case, if memory is limited and nums2 is stored in disk, partition it and send portions of nums2 piece by piece. keep a pointer for nums1 indicating the current position, and it should be working fine~
    • Another method is, store the larger array into disk, and for each element in the shorter array, use "Exponential Search" and search in the longer array. 

4 comments:

  1. Can you elaborate how solution 2 ( using binary search) is better than solution 1.
    And how worst case complexity will be O(n + m) in solution 2 (using binary search).

    ReplyDelete
  2. Assume the lists are [1,2,3] & [4,5,6,7]. Method1 will traverse all elements at least once (3 +4 = 7 elements).
    Method2 would need 3 lookups from array1 in the larger array2. Each lookup is log (size of larger array=4), total lookups = 3 * log4 = 3 *2 = 6

    ReplyDelete
  3. We can find intersection of two arrays with hashmaps as well, here's how to do it.

    ReplyDelete
  4. Nice

    https://favtutor.com/blogs/sorting-algorithms-java

    ReplyDelete