Tuesday, September 23, 2014

Leetcode: Distinct Subsequences

Given a string S and a string T, count the number of distinct subsequences of T in S.
A subsequence of a string is a new string which is formed from the original string by deleting some (can be none) of the characters without disturbing the relative positions of the remaining characters. (ie, "ACE" is a subsequence of "ABCDE" while "AEC" is not).
Here is an example:
S = "rabbbit"T = "rabbit"
Return 3.
Understand the problem:
The problem gives two strings, S and T, count the number of distinct subsequences of T in S. The problem is a bit of ambiguous. It actually asks that how many distinct subsequences of S which is equal to T. 

Be aware of subsequence and substring. A subsequence of a string is a subset of the string but couldn't disturb the relative positions of the original characters. The main difference between a substring and subsequence is substring must include continuous characters of the original string, whereas subsequence does not need to be, just maintain the relative order of the selected characters. 

DP Solution:
This is a classic DP problem, so we can think of solving this problem using the DP approach.
  1. Define a DP matrix, dp[S.length() + 1][T.length() + 1], whereas dp[i][j] means the number of distinct subsequences from string S[0, i] to string T[0, j].
  2. Initial state: dp[0][0] = 1, dp[0][j] = 0, dp[i][0] = 1
  3. Transit function: For S.charAt(i) == T.charAt(j), dp[i][j] = dp[i - 1][j] + dp[i - 1][j - 1]. For different, dp[i][j] = dp[i - 1][j]
  4. Final state: dp[S.length()][T.length()]


    #    r   a   b   b   i   t
#   1    0   0   0   0   0   0
r   1    1   0   0   0   0   0
a   1    1   1   0   0   0   0
b   1    1   1   1   0   0   0
b   1    1   1   2   1   0   0
b   1    1   1   3   3   0   0
i   1    1   1   3   3   3   0
t   1    1   1   3   3   3   3 

Code (Java):
public class Solution {
    public int numDistinct(String S, String T) {
        if (S == null || S.length() == 0 || T == null) {
            return 0;
        }
        
        int[][] dp = new int[S.length() + 1][T.length() + 1];
        dp[0][0] = 1;
        
        for (int i = 1; i <= S.length(); i++) {
            dp[i][0] = 1;
        }
        
        for (int i = 1; i <= S.length(); i++) {
            for (int j = 1; j <= T.length(); j++) {
                if (S.charAt(i - 1) == T.charAt(j - 1)) {
                    dp[i][j] = dp[i - 1][j - 1] + dp[i - 1][j];
                } else {
                    dp[i][j] = dp[i - 1][j];
                }
            }
        }
        
        return dp[S.length()][T.length()];
        
    }
}

Summary:
For DP related problems, it is crucial to figure out how to define the dp matrix and the transit function.

Update on 4/10/19:
public class Solution {
    /**
     * @param S: A string
     * @param T: A string
     * @return: Count the number of distinct subsequences
     */
    public int numDistinct(String S, String T) {
        if (T == null || T.length() == 0) {
            return 1;
        }
        
        int[] dp = new int[T.length() + 1];
        dp[0] = 1;
        
        for (int i = 1; i <= S.length(); i++) {
            dp[0] = 1;
            for (int j = T.length(); j >= 1; j--) {
                if (S.charAt(i - 1) == T.charAt(j - 1)) {
                    dp[j] = dp[j - 1] + dp[j];
                } 
            }
        }
        
        return dp[T.length()];
    }
}

No comments:

Post a Comment