Tuesday, July 22, 2014

Leetcode: Reverse words in a string

Given an input string, reverse the string word by word.
For example,
Given s = "the sky is blue",
return "blue is sky the".
Clarification:
  • What constitutes a word?
    A sequence of non-space characters constitutes a word.
  • Could the input string contain leading or trailing spaces?
    Yes. However, your reversed string should not contain leading or trailing spaces.
  • How about multiple spaces between two words?
    Reduce them to a single space in the reversed string.

The solution should handle three corner cases:
1. A string has leading spaces, which however should be removed in the reversed string
2. A string has trailing spaces, which however should be removed in the reversed string
3. A string multiple spaces between two words, but the reversed string should contain only one single space

A Naive Solution:
The crux of this problem is to split the string into words separated by the space, and return the words in a reversed order. How to do that?

A native solution is to scan the string from the end. Use one pointer points to ends of a string, iterate another one in a reversed order until reaches to a space character. Then iterate this word in forward order and store into a new string. 

How to handle the leading and trailing spaces? To remove the leading spaces, scan the string from beginning until reaches to a non-space character. Removing the trailing spaces has the similar idea: iterate the string from the end in a reversed order until reaches a non-space order. 

How to handle the multiple spaces between words? Once we copy a word into a new string, append a space character, iterate the pointer in reversed order until reaches to a non-space character.   

Code (Java):
public class Solution {
    public String reverseWords(String s) {
        // empty string
        if (s == null || s.length() == 0) return "";
        
        int length = s.length();
        StringBuilder r = new StringBuilder(); // buffer to store the reversed string
        int stringStart = 0;
        int tokenReadPos = length - 1;
        int wordEnd;
        
        // Check the leading spaces
        while (stringStart < length && s.charAt(stringStart) == ' ')
            stringStart++;
        
        // Determine the trailing spaces
        while (tokenReadPos >= stringStart && s.charAt(tokenReadPos) == ' ') {
            tokenReadPos--;
        }
        
        while (tokenReadPos >= stringStart) {
            if (s.charAt(tokenReadPos) == ' ') {
                r.append(' ');
                tokenReadPos--;
                // Handle multiple spaces
                while (tokenReadPos >= stringStart && s.charAt(tokenReadPos) == ' ')
                    tokenReadPos--;
            }
            else {
                wordEnd = tokenReadPos;
                while (tokenReadPos >= stringStart && s.charAt(tokenReadPos) != ' ') {
                    tokenReadPos--;
                }
                // Copy a word into the buffer
                for (int i = tokenReadPos + 1; i <= wordEnd; i++)
                    r.append(s.charAt(i));
            }
        }
        return r.toString();
    }
}

Analysis:
Removing the leading and trailing spaces take constant time. This solution only traverses the string once, thus the time complexity is O(n). It needs an additional string to store the reversed string, so the space complexity is O(n). 


An Improved Solution:
The naive solution looks very complicated and the code is not neat at all. Actually, the main part of the work for the solution is to parse the string into tokens. As a result, does there a method existed in Java support this functionality? The answer is yes, we can use split() method to parse the string into tokens, and then store the tokens in a reversed order.

Code (Java):
public class Solution {
    public String reverseWords(String s) {
        // Check if the string is empty
        if (s == null || s.length() == 0) return "";
        
        // Split the string into tokens
        String delim = "[ ]+";
        String[] arr = s.split(delim);
        
        // Create a temporary buffer to store the reversed string
        // Use StringBuilder since String is immutable
        StringBuilder sb = new StringBuilder();
        
        // Iterate the array in an reversed order
        for (int i = arr.length - 1; i >= 0; i--) {
            if (!arr[i].equals("")) {
                sb.append(arr[i]).append(" "); // leading empty string
            }
        }
        
        return sb.length() == 0 ? "" : sb.substring(0, sb.length() - 1);
    }
}

Analysis:
We can see that in this solution the complexity remains the same, but the code looks much more neat. The crux is to use the split method to parse a string, the split is defined as:
String[] split(String regex)
It divides the string into tokens based on the given delimiters. A token a piece of information, e.g. a "word" in this problem. The delimiter is defined as one or more characters to separate tokens.  Here we gave a very brief overview of common ways to use the split method.

Case 1: When there is just one character used as a delimiter:
Example 1:
We want to divide up a phrase into words where spaces are used to separate the words. E.g.
the music made     it     hard     to     concentrate
In this case, we have just one delimiter (space) and consecutive delimiters should be treated as one delimiter. To parse this string in Java, we do
String phrase = "the music made    it    hard     to        concentrate";
String delims = "[ ]+";
String[] tokens = phrase.split(delims);

Note that the general form for specifying the delimiters that we will use is "[delim_char]+". The "+" indicates that consecutive delimiters should be treated as only one.

The split method returns an array of string, to see the strings, just use a for loop in Java
System.out.println("Length of the string: " + tokens.length);
for(String n : tokens)
    System.out.println(n);

And the print out should be:
Length of the string: 7
the
music
made
it
hard
to
concentrate

Example2:
Suppose there is string contains an employee's last name, first name, ID#, and the number of hours worked for each day of work, separated by commas. So
Smith,Katie,3014, ,8.25,6.5, ,10.5

To parse this string we do
String employee = "Smith,Katie,3014, ,8.25,6.5, ,10.5";
String delim = "[,]";
String tokens = employee.split(delim);

So the output should be
Length of the string: 8
Smith
Katie
3014

8.25
6.5

10.5
Note the space in between. 
There is only one small wrinkle to be aware of: regardless how consecutive delimiters are handled, if the string starts with one or more delimiters, then the first token will be the empty string("").

This is very important to problem of reverse words in a string: if the string has leading spaces, using split([ ]+) will introduce another empty string at beginning.  That is exactly the reason at Line 16, we wanna check if the string is not empty, because we don't allow leading spaces in the reversed string.


For example, in the code below:
String str = "   Today is good";
String delim = "[ ]+";
String tokens = str.split(delim);

The output will be 4 strings:
Length of the string: 4

Today
is
good

Case 2: When there are several characters being used as delimiters
Example 3:
Suppose we have a string containing English sentences that uses only commas, periods, question marks, etc.. as punctuation. We wish to extract the individual words in the string without the punctuation. In this case we have several delimiters (the punctuation marks as well as spaces) and we want to treat consecutive delimiters as one:
String str = "This is a sentence.   This is a question, right? Yes!!";
String delim = "[ .,?!]+";
String[] tokens = str.split(delim);

The output will be: 
Length of the string: 10
This
is
a
sentence
This
is
a
question
right
Yes

Example 4:
Suppose we are representing arithmetic expressions using strings and with to parse out the operands. That is, the operators as delimiters. The arithmetic operators that we will allow are +, *, / and exponentiation(^). This situation is not as straight-forward as before. In java regular expression, there are several characters which has special meanings:
ConstructDescription
[abc]a, b, or c (simple class)
[^abc]Any character except a, b, or c (negation)
[a-zA-Z]a through z, or A through Z, inclusive (range)
[a-d[m-p]]a through d, or m through p: [a-dm-p] (union)
[a-z&&[def]]d, e, or f (intersection)
[a-z&&[^bc]]a through z, except for b and c: [ad-z] (subtraction)
[a-z&&[^m-p]]a through z, and not m through p: [a-lq-z] (subtraction

In order to use one of those characters, we need to put \\ in front of the character. For example, in the code below:
String expr = "c";
String delims = "[+\\-*/\\^ ]+";
String[] tokens = expr.split(delims);


Length of the string: 8
2
x
3
4
5
y
z
2

Another Interesting Solution:
In the previous solutions, the reversed string is out of the place of the input string, which needs additional O(n) space to store the reversed string. What if the system has limited space (memory, storage)? Does there a in-place solution existed?

The answer is yes. Here I only gave a brief idea. We all know how to reverse a string in-place. For example, for string "Today is good". If it is reversed in place, it becomes
"doog si yadoT". You must get the idea! Then we can reverse each token by token, then it becomes "good is Today". Then we are done!! In this solution, we don't need allocate additional space to store the reversed string. But the down side is we have to traverse the string twice where the time complexity is still O(n).


Summary
In this problem, we introduced three solutions. In a code interview, there is no absolute best solution; it totally depends on the requirements. So it is very important to communicate with the interviewer and clarify the requirements. Most of them they have no restricted answer yet, so it is you should provide all the possible solutions, and more importantly, analyze!!
In this question, the first solution is complex and error-prone, but does not require any langue feature, so the solution is portable to any language, e.g., C/C++, Java and Python. In the second solution, the code looks clean and straight-forward, but it requires Java regular expression. In the third one, the solution is tricky. But it does not require additional space. In the case you have limited space or you are allowed to modify the input string, the solution 3 is an option. 

Updates on 11/20/14:
public String reverseWords(String s) {
   StringBuilder reversed = new StringBuilder();
   int j = s.length();
   for (int i = s.length() - 1; i >= 0; i--) {
      if (s.charAt(i) == ' ') {
         j = i;
      } else if (i == 0 || s.charAt(i - 1) == ' ') {
         if (reversed.length() != 0) {
            reversed.append(' ');
         }
         reversed.append(s.substring(i, j));
      }
   }
   return reversed.toString();
}

How about in-place?
Two steps: 
Step 1: Swap the first character of the string with the last character, then second character with the second-to-last character ,and so on.

Step 2: Identify each token, and swap characters within each token. 


No comments:

Post a Comment