Sunday, August 23, 2015

Leetcode: Regular Expression Matching

Implement regular expression matching with support for '.' and '*'.
'.' Matches any single character.
'*' Matches zero or more of the preceding element.

The matching should cover the entire input string (not partial).

The function prototype should be:
bool isMatch(const char *s, const char *p)

Some examples:
isMatch("aa","a") → false
isMatch("aa","aa") → true
isMatch("aaa","aa") → false
isMatch("aa", "a*") → true
isMatch("aa", ".*") → true
isMatch("ab", ".*") → true
isMatch("aab", "c*a*b") → true

Naive Solution:
The key to the problem is to check whether p[j + 1] is a '*'. There are two cases:
1. If p[j + 1] is not a '*', the case is simple. Just check s.charAt(i) == p.charAt(j) || p.charAt(j) == '.'. If that fails, return false; otherwise s and p both advance to the next character, i.e. i + 1, j + 1.

2. If p[j + 1] is a '*', the case is a bit trickier.
Suppose s[i], s[i + 1], s[i + 2], ..., s[i + k] all match p[j] (each is equal to p[j], or p[j] is '.'). Then the '*' may consume any number of them, from zero up to all k + 1. So we need to check the rest at (i, j + 2), (i + 1, j + 2), (i + 2, j + 2), ..., (i + k + 1, j + 2).

Code (Java):
public class Solution {
    public boolean isMatch(String s, String p) {
        if (p == null || p.length() == 0) {
            return s == null || s.length() == 0;
        }
        
        // Case 1: p.length() == 1
        if (p.length() == 1) {
            if (s == null || s.length() == 0) {
                return false;
            }
            
            if (s.charAt(0) != p.charAt(0) && p.charAt(0) != '.') {
                return false;
            }
            
            return isMatch(s.substring(1), p.substring(1));
        }
        
        // Case 2: p.charAt(1) != '*'
        if (p.charAt(1) != '*') {
            if (s.length() == 0) {
                return false;
            }
            
            if (s.charAt(0) == p.charAt(0) || p.charAt(0) == '.') {
                return isMatch(s.substring(1), p.substring(1));
            } else {
                return false;
            }
        } else { // Case 3: p.charAt(1) == '*'
            if (isMatch(s, p.substring(2))) {
                return true;
            }
            
            int i = 0;
            while (i < s.length() && (s.charAt(i) == p.charAt(0) || p.charAt(0) == '.')) {
                if (isMatch(s.substring(i + 1), p.substring(2))) {
                    return true;
                }
                i++;
            }
            
            return false;
        }
    }
}
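
The recursion above re-solves the same (i, j) pairs many times and copies substrings on every call. As a minimal sketch (not the code from this post), the same two cases can be written directly on the indices i and j used in the explanation, with a cache so each pair is computed once. The class name MemoRegex and the memo table are hypothetical names chosen for illustration, and the pattern is assumed to be valid (no leading '*').

```java
class MemoRegex {
    private final String s;
    private final String p;
    // memo[i][j] caches whether s[i..] matches p[j..]; null means "not computed yet".
    private final Boolean[][] memo;

    private MemoRegex(String s, String p) {
        this.s = s;
        this.p = p;
        this.memo = new Boolean[s.length() + 1][p.length() + 1];
    }

    static boolean isMatch(String s, String p) {
        return new MemoRegex(s, p).match(0, 0);
    }

    private boolean match(int i, int j) {
        // Pattern exhausted: match only if the string is exhausted too.
        if (j == p.length()) {
            return i == s.length();
        }
        if (memo[i][j] != null) {
            return memo[i][j];
        }
        // Does the current string character match the current pattern character?
        boolean first = i < s.length()
                && (s.charAt(i) == p.charAt(j) || p.charAt(j) == '.');
        boolean result;
        if (j + 1 < p.length() && p.charAt(j + 1) == '*') {
            // Either skip "x*" entirely (zero copies), or consume s[i] and stay on "x*".
            result = match(i, j + 2) || (first && match(i + 1, j));
        } else {
            // No '*': both sides advance by one character.
            result = first && match(i + 1, j + 1);
        }
        memo[i][j] = result;
        return result;
    }
}
```

For example, MemoRegex.isMatch("aab", "c*a*b") returns true and MemoRegex.isMatch("aaa", "aa") returns false, in line with the examples listed at the top.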

A DP Solution:
Two-sequence problem:
 -- dp[s.length() + 1][p.length() + 1], where dp[i][j] means the first i characters of string s match the first j characters of pattern p. 
 -- Initial state: dp[0][0] = true, e.g. "" -> "", true. 
                        dp[i][0] = false, i >= 1: a non-empty string cannot match an empty pattern 
                        dp[0][j], j >= 1: if (p.charAt(j - 1) == '*'), dp[0][j] = dp[0][j - 2]; otherwise dp[0][j] = false 

-- Transition function: 
      -- If p.charAt(j - 1) != '*'. Then IF s.charAt(i - 1) == p.charAt(j - 1) || p.charAt(j - 1) == '.'. 
            -- dp[i][j] = dp[i - 1][j - 1]; otherwise dp[i][j] = false
      -- Else  // p.charAt(j - 1) == '*'
           -- If s.charAt(i - 1) != p.charAt(j - 2) && p.charAt(j - 2) != '.' 
               Then dp[i][j] = dp[i][j - 2] // zero matched, e.g. s = acdd, p = acb*dd. 
           -- Else 
                Then dp[i][j] = dp[i][j - 2]  ||  // zero matched
                                       dp[i][j - 1] || // 1 matched
                                       dp[i - 1][j] // more matched: the '*' also absorbs s.charAt(i - 1)
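
The initial state for the empty string (row 0 of the table) is the easiest part to get wrong, so here is a small sketch of just that row. The class name EmptyRow and the helper firstRow are hypothetical names used only for illustration, and the pattern is assumed to be valid (it never starts with '*').

```java
class EmptyRow {
    // Returns row 0 of the dp table: row[j] is true when the empty string
    // matches the first j characters of p.
    static boolean[] firstRow(String p) {
        boolean[] row = new boolean[p.length() + 1];
        row[0] = true; // "" matches ""
        // Start at j = 2 so row[j - 2] is always in range; row[1] stays false
        // because a single pattern character cannot match the empty string.
        for (int j = 2; j <= p.length(); j++) {
            if (p.charAt(j - 1) == '*') {
                // "x*" can match zero characters, so drop it and look two back.
                row[j] = row[j - 2];
            }
        }
        return row;
    }
}
```

For p = "c*a*b" this gives [true, false, true, false, true, false]: the empty string matches "", "c*" and "c*a*", but nothing that ends in a plain character.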

Code (Java):
public class Solution {
    public boolean isMatch(String s, String p) {
        if (p == null || p.length() == 0) {
            return s == null || s.length() == 0;
        }
        int rows = s.length();
        int cols = p.length();
        
        boolean[][] dp = new boolean[rows + 1][cols + 1];
        dp[0][0] = true;
        
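        // Row 0: the empty string matches p[0..j) only if it is made of "x*" pairs.
        // Start at j = 2 so that dp[0][j - 2] never goes out of bounds.
        for (int j = 2; j <= cols; j++) {
            if (p.charAt(j - 1) == '*') {
                dp[0][j] = dp[0][j - 2];
            }
        }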
        
        for (int i = 1; i <= rows; i++) {
            for (int j = 1; j <= cols; j++) {
                char sChar = s.charAt(i - 1);
                char pChar = p.charAt(j - 1);
                
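                if (pChar != '*') {
                    // Plain character or '.': it must match s[i - 1].
                    if (sChar == pChar || pChar == '.') {
                        dp[i][j] = dp[i - 1][j - 1];
                    }
                } else if (j >= 2) { // a leading '*' has no preceding element; leave false
                    if (sChar != p.charAt(j - 2) && p.charAt(j - 2) != '.') {
                        dp[i][j] = dp[i][j - 2]; // zero matched
                    } else {
                        dp[i][j] = dp[i][j - 2] || dp[i - 1][j] || dp[i][j - 1];
                    }
                }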
            }
        }
        return dp[rows][cols];
    }
}

