Friday, June 17, 2016

Collatz Conjecture

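The Collatz conjecture (also known as the odd-even-to-one conjecture, the 3n+1 conjecture, the hailstone conjecture, the Kakutani conjecture, the Hasse conjecture, the Ulam conjecture, or the Syracuse conjecture) states that for every positive integer, if you repeatedly multiply it by 3 and add 1 when it is odd, and divide it by 2 when it is even, you will eventually reach 1.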

Take a positive integer:
  • For n = 6, the rule above gives the sequence 6, 3, 10, 5, 16, 8, 4, 2, 1. (The largest value reached is 16, and it takes 8 steps.)
  • For n = 11, the rule above gives the sequence 11, 34, 17, 52, 26, 13, 40, 20, 10, 5, 16, 8, 4, 2, 1. (The largest value reached is 52, and it takes 14 steps.)
  • For n = 27, the rule above gives the sequence
{ 27, 82, 41, 124, 62, 31, 94, 47, 142, 71, 214, 107, 322, 161, 484, 242, 121, 364, 182, 91, 274, 137, 412, 206, 103, 310, 155, 466, 233, 700, 350, 175, 526, 263, 790, 395, 1186, 593, 1780, 890, 445, 1336, 668, 334, 167, 502, 251, 754, 377, 1132, 566, 283, 850, 425, 1276, 638, 319, 958, 479, 1438, 719, 2158, 1079, 3238, 1619, 4858, 2429, 7288, 3644, 1822, 911, 2734, 1367, 4102, 2051, 6154, 3077, 9232, 4616, 2308, 1154, 577, 1732, 866, 433, 1300, 650, 325, 976, 488, 244, 122, 61, 184, 92, 46, 23, 70, 35, 106, 53, 160, 80, 40, 20, 10, 5, 16, 8, 4, 2, 1 } (The largest value reached is 9232, and it takes 111 steps.)
The conjecture states that every positive integer eventually reaches 1 after repeating these steps.
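
The three examples above can be reproduced with a short program. This is only a sketch: the class name CollatzTrace and the method sequence are made up for illustration, and long is used instead of int simply to give intermediate values more headroom.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical helper, not part of the original solution.
class CollatzTrace {
  // Returns the full sequence that starts at n and ends at 1.
  static List<Long> sequence(long n) {
    if (n < 1) {
      throw new IllegalArgumentException("n must be greater than or equal to 1");
    }
    List<Long> seq = new ArrayList<>();
    seq.add(n);
    while (n != 1) {
      // Even: halve. Odd: multiply by 3 and add 1.
      n = (n % 2 == 0) ? n / 2 : 3 * n + 1;
      seq.add(n);
    }
    return seq;
  }

  public static void main(String[] args) {
    for (long start : new long[] {6, 11, 27}) {
      List<Long> seq = sequence(start);
      long peak = Collections.max(seq);
      // The number of steps is one less than the number of terms.
      System.out.println("n = " + start + ": steps = " + (seq.size() - 1) + ", peak = " + peak);
    }
  }
}
```

For n = 6, 11 and 27 this prints 8, 14 and 111 steps with peaks 16, 52 and 9232, matching the figures quoted above.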

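Figure: the sequence for n = 27 (horizontal axis: step number; vertical axis: value)
Among starting values below 10,000, the one that takes the most steps is 6171 (261 steps); below 100 million, it is 63,728,127, with 949 steps; below 1 billion, it is 670,617,279, with 986 steps.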
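
The smallest of these records is cheap to check by brute force. The sketch below is an illustration under assumed names (LongestChain, steps, longestBelow); it scans every starting value below a limit and keeps the first one with the largest step count. The larger limits work the same way but take much longer without caching.

```java
// Hypothetical helper, not part of the original solution.
class LongestChain {
  // Number of steps needed to reach 1 from n (n >= 1 assumed).
  static int steps(long n) {
    int count = 0;
    while (n != 1) {
      n = (n % 2 == 0) ? n / 2 : 3 * n + 1;
      count++;
    }
    return count;
  }

  // Returns {start, steps} for the start in [1, limit) with the most steps.
  // On ties, the smallest such start is kept.
  static long[] longestBelow(long limit) {
    long bestStart = 1;
    int bestSteps = 0;
    for (long start = 1; start < limit; start++) {
      int c = steps(start);
      if (c > bestSteps) {
        bestSteps = c;
        bestStart = start;
      }
    }
    return new long[] {bestStart, bestSteps};
  }

  public static void main(String[] args) {
    long[] best = longestBelow(10_000);
    System.out.println(best[0] + " takes " + best[1] + " steps");
  }
}
```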

Question:
Given a positive integer n >= 1, return the number of steps needed to transform n into 1. E.g., for n = 6, return 8, because it takes 8 steps to transform 6 into 1.

Brute force solution:
import java.io.*;
import java.util.*;

public class Solution {
  public int collatzConjecture(int n) {
    if (n < 1) {
      throw new IllegalArgumentException("n must be greater than or equal to 1");
    }
    
    int count = 0;
    
    while (n != 1) {
      if ((n & 1) == 1) {
        n = n * 3 + 1;
      } else {
        n /= 2;
      }
      count++;
    }
    
    return count;
  }
  
  public int collatzConjectureRecursive(int n) {
    if (n < 1) {
      throw new IllegalArgumentException("n must be greater than or equal to 1");
    }
    
    if (n == 1) {
      return 0;
    }
    
    if (n % 2 == 0) {
      return 1 + collatzConjectureRecursive(n / 2);
    } else {
      return 1 + collatzConjectureRecursive(n * 3 + 1);
    }
  }
  
  public static void main(String[] args) {
    Solution sol = new Solution();
    int result = sol.collatzConjecture(27);
    System.out.println(result);
    
    result = sol.collatzConjectureRecursive(27);
    System.out.println(result);
  }
}

Discussion:
1. Be very careful about integer overflow. For a large odd n, 3 * n + 1 can easily overflow an int. Clarify this with the interviewer. We could also use long instead of int to extend the range of values n can take.
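
One way to make the overflow concern concrete is to do the arithmetic in long and let the JDK's exact-arithmetic helpers fail loudly instead of wrapping around silently. The sketch below is one possible approach, not the only one; the class name CollatzChecked is made up for illustration, while Math.multiplyExact and Math.addExact are standard java.lang.Math methods that throw ArithmeticException on overflow.

```java
// Hypothetical variant of the brute force solution that detects overflow.
class CollatzChecked {
  static int steps(long n) {
    if (n < 1) {
      throw new IllegalArgumentException("n must be greater than or equal to 1");
    }
    int count = 0;
    while (n != 1) {
      if ((n & 1) == 1) {
        // Throws ArithmeticException if 3 * n + 1 does not fit in a long.
        n = Math.addExact(Math.multiplyExact(3L, n), 1L);
      } else {
        n >>= 1;
      }
      count++;
    }
    return count;
  }

  public static void main(String[] args) {
    System.out.println(steps(27));
    try {
      steps(Long.MAX_VALUE); // odd, so the first step already overflows
    } catch (ArithmeticException e) {
      System.out.println("overflow detected: " + e.getMessage());
    }
  }
}
```

With a plain int or long and no check, an overflowing value wraps to a negative number and the loop may never reach 1, so failing fast is usually the safer behaviour to propose.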
Follow-up:
What if we call the function many times with different values of n? Can we do it faster?

The answer is yes. Implementing a recursive algorithm is probably the simplest way to calculate the length, but it seemed to me like an unnecessary waste of calculation time. Many sequences overlap; take, for example, the hailstone sequence of 3:
3 -> 10 -> 5 -> 16 -> 8 -> 4 -> 2 -> 1
This has length 7; more specifically, it takes 7 operations to get to 1. If we then take 6:
6 -> 3 -> ...
We notice immediately that we've already calculated this, so we just add on the sequence length of 3 instead of running through all those numbers again, considerably reducing the number of operations required to calculate the sequence length of each number.

I tried to implement this in Java using a HashMap (which seemed appropriate given its expected O(1) get/put complexity).

Code (Java):
import java.io.*;
import java.util.*;

public class Solution {
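  // Shared cache: starting value -> number of steps needed to reach 1.
  // Initialized once, so creating another Solution does not wipe it.
  private static final Map<Integer, Integer> map = new HashMap<>();
  
  static {
    map.put(1, 0); // NOTE that we need to put 1 into the cache as the base case
  }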
  
  public int collatzConjectureCached(int n) {
    if (n < 1) {
      throw new IllegalArgumentException("n must be greater than or equal to 1");
    }
    
    if (n == 1) {
      return 0;
    }
    
    int count = 0;
    int m = n;
    
    while (true) {
      if (map.containsKey(n)) {
        count += map.get(n);
        map.put(m, count);
        
        return count;
      } else if (n % 2 == 0) {
        n /= 2;
      } else {
        n = n * 3 + 1;
      }
      count++;
    }
  }
  
  public static void main(String[] args) {
    Solution sol = new Solution();

    System.out.println(sol.collatzConjectureCached(3));
    System.out.println(sol.collatzConjectureCached(6));
    System.out.println(sol.collatzConjectureCached(6));
    
    for (Integer key : map.keySet()) {
      System.out.println(key + ", " + map.get(key));
    }
    
  }
}
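
The version above caches only the starting value of each call, so the intermediate values it walks through are recomputed next time. A possible refinement, sketched here under assumed names (CollatzMemo, steps, cacheSize), is to remember the path and store a step count for every value on it once a cached value is reached. It uses long keys and exact arithmetic as a precaution against overflow.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.HashMap;
import java.util.Map;

// Hypothetical variant that memoizes every value visited, not just the start.
class CollatzMemo {
  private final Map<Long, Integer> cache = new HashMap<>();

  CollatzMemo() {
    cache.put(1L, 0); // base case
  }

  int steps(long n) {
    if (n < 1) {
      throw new IllegalArgumentException("n must be greater than or equal to 1");
    }
    // Walk forward until we hit a value whose step count is already known.
    Deque<Long> path = new ArrayDeque<>();
    long cur = n;
    while (!cache.containsKey(cur)) {
      path.push(cur);
      cur = (cur % 2 == 0) ? cur / 2 : Math.addExact(Math.multiplyExact(3L, cur), 1L);
    }
    // Unwind: each value on the path is one step further from 1 than its successor.
    int count = cache.get(cur);
    while (!path.isEmpty()) {
      count++;
      cache.put(path.pop(), count);
    }
    return count;
  }

  int cacheSize() {
    return cache.size();
  }

  public static void main(String[] args) {
    CollatzMemo memo = new CollatzMemo();
    System.out.println(memo.steps(3)); // caches 3, 10, 5, 16, 8, 4, 2
    System.out.println(memo.steps(6)); // only 6 is new; 3 is a cache hit
  }
}
```

The trade-off is memory: every value ever visited stays in the map, which is exactly the problem raised in the next follow-up.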
Follow-up:
What if we call this function many times with different values of n? The HashMap could then consume a lot of memory. If memory is not large enough to hold the entire HashMap, what can we do?

One possible solution is to replace the unbounded cache (implemented as a HashMap) with a fixed-size cache, for example an LRU cache whose capacity is chosen to fit within the machine's memory. We keep this LRU cache in memory, and numbers that have not been used recently may be evicted from it. This is a trade-off between time and space. How do we implement an LRU cache? Check out the LeetCode problem: LRU Cache.
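
As a rough illustration of the bounded-cache idea, Java's LinkedHashMap can act as an LRU cache when it is constructed in access order and removeEldestEntry is overridden. The sketch below assumes made-up names (CollatzLru, steps, size) and a capacity passed in by the caller. The base case 1 is handled by the loop rather than stored, since a stored entry could be evicted.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical bounded-cache variant; the capacity is chosen by the caller.
class CollatzLru {
  private final Map<Long, Integer> cache;

  CollatzLru(final int capacity) {
    // accessOrder = true: entries are ordered from least to most recently accessed.
    this.cache = new LinkedHashMap<Long, Integer>(16, 0.75f, true) {
      @Override
      protected boolean removeEldestEntry(Map.Entry<Long, Integer> eldest) {
        // Called after each put; evict the least recently used entry when over capacity.
        return size() > capacity;
      }
    };
  }

  int steps(long n) {
    if (n < 1) {
      throw new IllegalArgumentException("n must be greater than or equal to 1");
    }
    int count = 0;
    long cur = n;
    while (cur != 1) {
      Integer known = cache.get(cur); // a hit also marks the entry as recently used
      if (known != null) {
        count += known;
        break;
      }
      cur = (cur % 2 == 0) ? cur / 2 : Math.addExact(Math.multiplyExact(3L, cur), 1L);
      count++;
    }
    cache.put(n, count);
    return count;
  }

  int size() {
    return cache.size();
  }

  public static void main(String[] args) {
    CollatzLru lru = new CollatzLru(2);
    System.out.println(lru.steps(3));
    System.out.println(lru.steps(6));
    System.out.println(lru.steps(27)); // evicts the least recently used entry
    System.out.println(lru.size());
  }
}
```

An evicted number is simply recomputed the next time it is requested, so results stay correct and only the running time changes.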
