Thursday, July 31, 2014

Leetcode: String to Integer (atoi)

Implement atoi to convert a string to an integer.
Hint: Carefully consider all possible input cases. If you want a challenge, please do not see below and ask yourself what are the possible input cases.
Notes: It is intended for this problem to be specified vaguely (ie, no given input specs). You are responsible to gather all the input requirements up front.
Requirements for atoi:
The function first discards as many whitespace characters as necessary until the first non-whitespace character is found. Then, starting from this character, takes an optional initial plus or minus sign followed by as many numerical digits as possible, and interprets them as a numerical value.
The string can contain additional characters after those that form the integral number, which are ignored and have no effect on the behavior of this function.
If the first sequence of non-whitespace characters in str is not a valid integral number, or if no such sequence exists because either str is empty or it contains only whitespace characters, no conversion is performed.
If no valid conversion could be performed, a zero value is returned. If the correct value is out of the range of representable values, INT_MAX (2147483647) or INT_MIN (-2147483648) is returned.

Analysis:
The crux of this question to handle all the cases for the input string, which include:


  • Leading white spaces e.g. "   12" should be interpreted as 12
  • Optional initial plus or minus sign "-5" should be -5, where "+5" should be 5
  • The string starts by the optional sign and followed by all numerical digits until the non-digit character, and interprets them as a numerical value. The string can contain additional characters after that but will be ignored. For instance, " 3.1415926" will be interpreted as 3.
  • Return 0 if
    • The first sequence of non-whitespace characters in str is not a valid integral number
    • The string is empty
    • The string only contains whitespace characters
  • Out of the range of representable values, return INT_MAX or INT_MIN
After we make clear all of those input cases, then we figure out how to handle each of them
  • For leading white spaces, we can use Java trim() instance method of the String class to remove leading and trailing white spaces. Or we may use the static method isWhiteSpace(ch) in Character class to determine the white space and simply move the pointer until a non-whitespace character. 
  • To handle optional initial plus or minus sign, there are several conditions to check
    • For '+' character, use a flag to mark as plus. E.g. "+12" should be interpreted as 12
    • For '-' character, mark the flag as minus
    • For input if contains multiple signs, return 0. For e.g. "+++--4" return 0
  • Then we check if the character next to the sign is a digit. If yes, we continue scan it until a non-digit character and save it to a temp string. Else, return 0. E.g. "-abc4" will return 0
  • After that, we should get a string with only integers, e.g. "456". The next step is to convert the string into integers. We should also check the sign flag to get the number is positive or negative. 
  • The next is to check if the integer is overflowed. This is pretty tricky, because we are unable to check the overflow after the string has been transformed into integers. One easy way is to use a long integer to save the transformed string. But how about the number is greater to the max value of long as well? We could check the length. If t he length of the string is greater than 9, the output will be either INT_MAX or INT_MIN. 
Code (Java):
public class Solution {
    public int atoi(String str) {
        boolean neg = false; // flag to mark if the converted integer positive or negative. 
        StringBuilder buf = new StringBuilder(); // temp buffer to store the converted string
        
        // check if the string is null or empty
        if (str == null || str.isEmpty()) return 0;
        
        // trim the leading whitespaces
        str = str.trim();
        
        // if string contains only whitespace characters
        if (str.isEmpty()) return 0;
        
        // get length of the trimed string
        int length = str.length();
        
        // Check if the first character of the string
        if (isNeg(str.charAt(0))) neg = true;
        else if (isPos(str.charAt(0))) neg = false;
        else if (Character.isDigit(str.charAt(0))) buf.append(str.charAt(0));
        else return 0;
        
        // check the first sequence of digit characters in the string
        int start = 1;
        while (start < length && Character.isDigit(str.charAt(start))) {
            buf.append(str.charAt(start));
            start++;
        }
        
        // check if the buf is empty
        if (buf.length() == 0) return 0;
        
        // check if the converted integer is overflowed
        long result;
        if (buf.length() <= 10) {
            result = toInteger(buf, neg);
        } else if (neg) {
            return Integer.MIN_VALUE;
        } else
            return Integer.MAX_VALUE;
            
        // Post-processing the convert long to int
        if (neg && result <= Integer.MAX_VALUE) {
            return 0 - (int) result;
        } else if (!neg && result <= Integer.MAX_VALUE) {
            return (int) result;
        } else if (neg && result > Integer.MAX_VALUE) {
            return Integer.MIN_VALUE;
        } else return Integer.MAX_VALUE;
    }
    
    private boolean isPos(char ch) {
        return ch == '+';
    }
    
    private boolean isNeg(char ch) {
        return ch == '-';
    }
    
    private long toInteger(StringBuilder buf, boolean neg) {
        int len = buf.length();
        long result = 0;
        for (int i = 0; i < len; i++) {
            result += Character.getNumericValue(buf.charAt(i)) * Math.pow(10, len - i - 1);
        }
        
        return result;
    }
}

Discussion:
The code looks ugly but handled all the input cases as far as I can think of. Specifically, it handles:

  • Null or empty string
  • Leading white spaces
  • First character is sign 
  • Leading character is non-sign nor digit
  • First sequence of digit characters
  • Calculate real integer number
  • Handle overflow

Note that in the function toInteger() we convert a string with digital characters into an integer. 
For a specific character, how to convert to an integer? If the code above, I used an instance method getNumericValue(char ch). What if it is not allowed? Do we have a clever method? 

In all common character encodings, the values are sequential: '0' has a value one less than '1', which in turn followed by '2' and '3', which means:
'0' = ASCII_0;
'1' = 1 + ASCII_0;
'2' = 2 + ASCII_1;
 ... 
'9' = 9 + ASCII_9;
Therefore, to get a numeric value from a digital character, you just need to subtract from the '0'. For example, 9 = '9' - '0'. By this way you can avoid using getNumericValue(char ch) in your code. 

Secondly, to calculate the integer from a string we use  Math.pow() method. The idea is e.g. 123= 1 * 10^2 + 2 * 10^1 * 3 * 10^0. But power operation is usually quite costly in CPU cycles, so do we have a better idea without power function?

We still scan the string from left to right, but think about how 123 is calculated. When you see the first character , it will be 1. When you see the second character, it will be 1 x 10 + 2 = 12. When you see the first character , it will be 12 x 10 + 3 = 123. Now you must understand how to calculate a string from a string. 

Below is a new version of the Java code, besides the optimizations shown above, it does not need a temp buffer to store the digital string. On the contrary, it just calculated the integer once we got a digital character. 
public class Solution {
    public int atoi(String str) {
        boolean neg = false; // flag to mark if the converted integer positive or negative.
        
        // check if the string is null or empty
        if (str == null || str.isEmpty()) return 0;
        
        // trim the leading white spaces
        str = str.trim();
        
        // check if the string contains white spaces only
        if (str.isEmpty()) return 0;
        
        int i = 0;
        
        // check the sign position
        if (isNeg(str.charAt(0))) {
            neg = true;
            i++;
        } else if (isPos(str.charAt(0))) {
            neg = false;
            i++;
        }
        
        // calcuate the integer value of the digital string
        long result = 0;
        while (i < str.length() && i <= 11 && Character.isDigit(str.charAt(i))) {
            result *= 10;
            result += str.charAt(i) - '0';
            i++;
        }
        
        // check the sign flag
        if (neg) result = -result;
        
        // handle overflow
        if (result > Integer.MAX_VALUE) return Integer.MAX_VALUE;
        if (result < Integer.MIN_VALUE) return Integer.MIN_VALUE;
        
        return (int)result;
    }
    
    // check if the sign is negative
    private boolean isNeg(char ch) {
        return ch == '-';
    }
    
    // check if the sign is positive
    private boolean isPos(char ch) {
        return ch == '+';
    }
}

From Integer to String  
So far we learned how to convert from a string to integer. How about the reverse, i.e., convert from an integer to string? For example, given an integer 567, should be outputted a string "567". For integer -567, should be outputted as "-567". 

There are several things to consider: 
1. Convert an integer to string is easier because we don't need to consider the leading white spaces and overflow problems.. 
2. How to parse each digit from the integer. e.g. 567 should be parsed into 5 6 and 7. 
3. Given a digit value, how to convert to a corresponding character. 

For the second question, we can parse from either left to right or right to left. From left to right is easier but not very efficient. Now let's consider parse from right to left.  For the integer 732, we can do this way:

732 % 10 = 2
732 / 10 =  73

73 % 10 = 3
73 / 10 = 7

7 % 10 = 7 
7 / 10 = 0

So in this way we can find the digit 2, 3, 7 in a reversed order. We we have to use a temporary buffer to store the number reversely and put them into a string backward. 

For the third question, given an integer value, e.g. 7, how to get its character representation. Similar to the string to integer, we can add '0'. For instance, '7' = 7 + '0'. 

Below is the Java code:

public class IntToString {
  public static String intToString (int num) {
    boolean neg = false;

    if (num < 0) {
      neg = true;
      num = -num;
    }

    char buf[] = new char[11];
    
    int i = 0;
    do {
      buf[i++] = (char) (num % 10 + '0');
      num /= 10;
    } while (num != 0);

    StringBuilder result = new StringBuilder();

    if (neg) result.append('-');
    
    while (i > 0) {
      result.append(buf[--i]);
    }

    return result.toString();
  }
}

Summary:
In this post, we learned how to convert a string to an integer, and verse vera. 
Specifically, we learned:
1. How to get a digital value from the digital character? We used - '0'.
2. How to calculate the integer value from a string? We used "Horner's rule"
3. Howe to get a corresponding character from a digit? We used + '0'
4. How to get place value from an integer? We parsed the integer from right to left

Last but not least, this quest ion is not complicated. But the description of the question is intentionally vague. So it is very important to communicate with the interviewer to clarify the problem and ask as many questions as possible. 



Update on 10/14/14:
public class Solution {
    public int atoi(String str) {
        if (str == null || str.length() == 0) {
            return 0;
        }
        
        // Remove leading white spaces
        str = str.trim();
        
        char[] array = str.toCharArray();
        boolean neg = false;
        int i = 0;
        
        // Handle leading sign character
        if (str.charAt(i) == '-') {
            neg = true;
            i++;
        } else if (str.charAt(i) == '+') {
            i++;
        }
        
        // If leading character is not digital, return 0
        if (!isDigital(str.charAt(i))) {
            return 0;
        }
        
        // Find out the first sequence of digits
        ArrayList<Integer> digits = new ArrayList<Integer>();
        while (i < str.length()) {
            if (isDigital(str.charAt(i))) {
                digits.add(str.charAt(i) - '0');
                i++;
            } else {
                break;
            }
        }
        
        // Handle overflow
        if (digits.size() > 10 && neg == true) {
            return Integer.MIN_VALUE;
        } else if (digits.size() > 10 && neg == false) {
            return Integer.MAX_VALUE;
        }
        
        // Get results
        long result = 0;
        for (int j = 0; j < digits.size(); j++) {
            result = result * 10 + digits.get(j);
        }
        
        if (neg == true) {
            result = -result;
        }
        
        // Handle overflow
        if (result >= Integer.MAX_VALUE) {
            return Integer.MAX_VALUE;
        } else if (result <= Integer.MIN_VALUE) {
            return Integer.MIN_VALUE;
        } else {
            return (int) result;
        }
    }
    
    private boolean isDigital(char a) {
        return (a >= '0') && (a <= '9');
    }
}

Update on 9/29/15:

public class Solution {
    public int myAtoi(String s) {
        if (s == null || s.length() == 0) {
            return 0;
        }
        
        int i = 0;
        long result = 0;
        int n = s.length();

        // Step 1: skip the leading whitespaces, if any
        while (i < n && s.charAt(i) == ' ') {
            i++;
        }
        
        // Step 2: skip the '+' or '-' sign, if exists
        boolean isNeg = false;
        if (i < n && (s.charAt(i) == '+' || s.charAt(i) == '-')) {
            if (s.charAt(i) == '-') {
                isNeg = true;
            }
            i++;
        }
        
        // Step 3: parse the first sequence of numeric numbers
        int numDigits = 0;
        while (i < n && Character.isDigit(s.charAt(i))) {
            result = result * 10 + (s.charAt(i) - '0');
            i++;
            numDigits++;
        }
        
        if (numDigits > 10) {
            if (isNeg) {
                return Integer.MIN_VALUE;
            } else {
                return Integer.MAX_VALUE;
            }
        }
        
        if (isNeg) {
            result = -result;
        }
        
        if (result > Integer.MAX_VALUE) {
            return Integer.MAX_VALUE;
        }
        
        if (result < Integer.MIN_VALUE) {
            return Integer.MIN_VALUE;
        }
        
        return (int) result;
    }
}

No comments:

Post a Comment