Wednesday, September 17, 2014

Leetcode: Edit Distance

Given two words word1 and word2, find the minimum number of steps required to convert word1 to word2. (each operation is counted as 1 step.)
You have the following 3 operations permitted on a word:
a) Insert a character
b) Delete a character
c) Replace a character
Understand the problem:
The problem asks for a minimum edit distance from string 1 to string 2. E.g. 
a => b, the minimum edit distance is 1
ab => bc, the minimum edit distance is 2
abc => acd, the output is 2

Since solving this problem using recursion will take the runtime at exponential order, we should naturally think about the DP solution. The key of the DP solution is to define the dp array as well as the transit equation.

Solution:
We use a 2D DP dp[i][j] denotes the minimum edit distance from substring[0, i] to substring[0, j]. So we wanna check the last element of the DP, which stands for the minimum distance from string 1 to string 2. 

The initial state, dp[0][j] = j, dp[i][0] = i. Means the minimum edit distance from empty string to string j and i is length of j and i, respectively. 

For dp[i][j], if word1.charAt(i) == word2.charAt(j), dp[i][j] = dp[i - 1][j - 1]. 
Else, there are three cases you must consider:

  • Replace the word1.charAt(i) to word2.charAt(j), so dp[i][j] = dp[i - 1][j - 1] + 1. E.g. abd, abc, when i points to d and j points to c, its minimum edit distance equals to replacing d with c, which is the minimum edit distance from ab to ab plus 1. 
  • Insert at word1.charAt(i) to word2.charAt(j), which equals to delete at word2.charAt(j). So dp[i][j] = dp[i][j - 1] + 1. E.g. ab to abc
  • Delete at word1.charAt(i) to word2.charAt(j). So dp[i][j] = dp[i - 1][j]. E.g. abc to ab.
  • We choose the minimum number from the above three cases. 
Code (Java):
public class Solution {
    public int minDistance(String word1, String word2) {
        if (word1 == null || word1.length() == 0) {
            return word2.length();
        }
        
        if (word2 == null || word2.length() == 0) {
            return word1.length();
        }
        
        int m = word1.length();
        int n = word2.length();
        int[][] dp = new int[m + 1][n + 1];
        
        // update the first row
        for (int i = 1; i <= n; i++) {
            dp[0][i] = i;
        }
        
        // update the first column
        for (int j = 1; j <= m; j++) {
            dp[j][0] = j;
        }
        
        for (int i = 1; i <= m; i++) {
            for (int j = 1; j <= n; j++) {
                if (word1.charAt(i - 1) == word2.charAt(j - 1)) {
                    dp[i][j] = dp[i - 1][j - 1];
                } else {
                    int replace = dp[i - 1][j - 1] + 1;
                    int insert = dp[i][j - 1] + 1;
                    int delete = dp[i - 1][j] + 1;
                    
                    dp[i][j] = Math.min(replace, Math.min(insert, delete));
                }
            }
        }
        
        return dp[m][n];
    }
}

Summary:
The problem is a bit of trickier than other DP problems like unique path, but still could be categorized as one kind of problems. The challenge part of a DP problem is to construct the transit equation. 


Update on 3/25/19: O(n) space solution
public class Solution {
    /**
     * @param word1: A string
     * @param word2: A string
     * @return: The minimum number of steps.
     */
    public int minDistance(String word1, String word2) {
        if (word1.length() == 0) {
            return word2.length();
        }
        
        if (word2.length() == 0) {
            return word1.length();
        }
        
        int m = word1.length();
        int n = word2.length();
        int[][] dp = new int[2][n + 1];
        int old = 0;
        int cur = 1;
        
        for (int i = 0; i <= n; i++) {
            dp[0][i] = i;
        }
        
        for (int i = 1; i <= word1.length(); i++) {
            dp[cur][0] = i;
            for (int j = 1; j <= word2.length(); j++) {
                if (word1.charAt(i - 1) == word2.charAt(j - 1)) {
                    dp[cur][j] = dp[old][j - 1];
                } else {
                    dp[cur][j] = Math.min(dp[old][j - 1], Math.min(dp[old][j], dp[cur][j - 1])) + 1;
                }
            }
            
            old = cur;
            cur = 1 - cur;
            
        }
        
        return dp[old][n];
    }
}

1 comment:

  1. The insertion and delection should be exchanged or not? I think insertion should be dp[i-1][j] +1

    ReplyDelete