
Problem Statement
The task is to determine the length of the longest subsequence that can be found identically in two given strings, text1
and text2
. If there is no such subsequence that both strings share, the result should be 0
. A subsequence is a sequence that can be derived from another sequence by deleting some or no elements without changing the order of the remaining elements. In context, for any two strings, this problem seeks to identify the longest sequence of characters that appear left-to-right (but not necessarily consecutively) in both strings and measure its length.
Examples
Example 1
Input:
text1 = "abcde", text2 = "ace"
Output:
3
Explanation:
The longest common subsequence is "ace" and its length is 3.
Example 2
Input:
text1 = "abc", text2 = "abc"
Output:
3
Explanation:
The longest common subsequence is "abc" and its length is 3.
Example 3
Input:
text1 = "abc", text2 = "def"
Output:
0
Explanation:
There is no such common subsequence, so the result is 0.
Constraints
1 <= text1.length, text2.length <= 1000
text1
andtext2
consist of only lowercase English characters.
Approach and Intuition
To solve this problem and understand the involved steps, let's break down the given examples
and constraints.
Consider the dynamic programming approach, which uses a 2D array
dp
wheredp[i][j]
represents the length of the longest common subsequence oftext1[0...i-1]
andtext2[0...j-1]
.We incrementally build this
dp
array by comparing characters from both strings:- If
text1[i-1] == text2[j-1]
: It means the current characters match, hencedp[i][j]
can be obtained by taking the value fromdp[i-1][j-1]
and adding one. - If
text1[i-1] != text2[j-1]
: Here, the characters do not match, so we take the maximum of either excluding the current character fromtext1
ortext2
, i.e.,dp[i][j] = max(dp[i-1][j], dp[i][j-1])
.
- If
Initialize the
dp
array with base cases where:dp[i][0]
anddp[0][j]
are zero because a zero length of either string will have no common subsequence with the other.
Finally, the solution to our problem, the length of the longest common subsequence, will be stored in
dp[text1.length()][text2.length()]
.
Consider these important notes from the examples:
Example 1:
- Inputs:
text1 = "abcde", text2 = "ace"
- A clear point-by-point comparison leads us to find "ace" as the longest common subsequence, which has a length of 3.
- Inputs:
Example 2:
- Inputs:
text1 = "abc", text2 = "abc"
- Both strings are identical, making the whole string "abc" the longest common subsequence with a length of 3.
- Inputs:
Example 3:
- Inputs:
text1 = "abc", text2 = "def"
- There are no common characters between
text1
andtext2
, resulting in a longest common subsequence length of 0.
- Inputs:
Constraint Insights:
- The maximum length for
text1
andtext2
is 1000, which implies our approach needs to efficiently handle large inputs. - Both strings are composed of lowercase English letters, simplifying the problem as we deal with a known and limited character set.
Solutions
- Java
- Python
class Solution {
public int longestCommonSubsequence(String s1, String s2) {
// Swap strings if s2 is shorter than s1
if (s2.length() < s1.length()) {
String tempStr = s1;
s1 = s2;
s2 = tempStr;
}
// Use dynamic programming tables for LCS calculation
int[] oldRow = new int[s1.length() + 1];
int[] newRow = new int[s1.length() + 1];
// Process each character from the end to the beginning
for (int i = s2.length() - 1; i >= 0; i--) {
for (int j = s1.length() - 1; j >= 0; j--) {
if (s1.charAt(j) == s2.charAt(i)) {
newRow[j] = 1 + oldRow[j + 1];
} else {
newRow[j] = Math.max(oldRow[j], newRow[j + 1]);
}
}
// Swap oldRow and newRow for the next iteration
int[] temp = oldRow;
oldRow = newRow;
newRow = temp;
}
// Return the result from the computed LCS length
return oldRow[0];
}
}
Implement the "Longest Common Subsequence" algorithm in Java using Dynamic Programming. This approach facilitates efficient LCS calculation by considering all subsequences of two given strings to determine the longest common substrate. Essentially, the solution exploits a tabular framework to store the length of the longest subsequence that can be obtained from the input strings. Here's a streamlined overview of the solution's execution:
- If the second string is shorter than the first, swap the strings to optimize space usage because only one-dimensional arrays are used to store the previous and current lengths of LCS.
- Create two arrays,
oldRow
andnewRow
, initialized to the length of the shorter string plus one. These arrays are used to manage the current and previous states of LCS calculations. - Iterate from the last character to the first character of the longer string (s2) and simultaneously for the shorter string (s1). For each character pair comparison:
- If the characters match, update the
newRow
to1 + oldRow[j+1]
, which represents extending a subsequence. - If the characters do not match, update the
newRow
with the maximum value derived either fromoldRow[j]
ornewRow[j+1]
, representing skipping a character in one of the strings to find a better match.
- If the characters match, update the
- After each complete pass over the characters of s1, swap
newRow
andoldRow
to prepare for the next character in s2. This swap allows for reusing the arrays without allocating additional space. - After finishing all iterations,
oldRow[0]
contains the length of the longest common subsequence.
This algorithm efficiently reduces space complexity to O(min(m, n)) and time complexity to O(m*n), where m and n are the lengths of the two strings. This optimized approach is suited for situations where string lengths are considerably large, ensuring both performance and resource utilization are balanced.
class LCSCalculator:
def computeLCS(self, str1: str, str2: str) -> int:
if len(str1) > len(str2):
str1, str2 = str2, str1
prior = [0] * (len(str1) + 1)
curr = [0] * (len(str1) + 1)
for i in reversed(range(len(str2))):
for j in reversed(range(len(str1))):
if str2[i] == str1[j]:
curr[j] = 1 + prior[j + 1]
else:
curr[j] = max(prior[j], curr[j + 1])
prior, curr = curr, prior
return prior[0]
The provided Python code defines a class LCSCalculator
that computes the length of the longest common subsequence (LCS) between two strings. The algorithm utilizes dynamic programming to efficiently determine the LCS length. The method computeLCS
inside LCSCalculator
takes two string inputs, str1
and str2
. The implementation leverages a space optimization technique by using two lists (prior
and curr
) to keep track of the LCS lengths in a bottom-up manner.
First, the code checks and arranges
str1
andstr2
such thatstr1
is always the shorter string. This is done to reduce the memory footprint since only the length ofstr1
plus one is needed for creating the lists.Two lists,
prior
andcurr
, are initialized with zeros, where their length is one more than that of the shorter string. These lists serve to store intermediate LCS values.The main computation is performed inside a nested loop structure, where the outer loop iterates through the characters of
str2
in reverse, and the inner loop does the same forstr1
.For each character comparison:
- When characters from
str1
andstr2
match at certain indices, the value at the current position incurr
is updated to be 1 plus the diagonal value inprior
. - If there's no match, it picks the maximum value between the current indexed value in
prior
and the next indexed value incurr
.
- When characters from
After each inner loop iteration, the roles of
prior
andcurr
are swapped;curr
takes the role ofprior
for the next iteration of computations, which avoids reinitializing the array for each new row of calculations.Upon completion of loops,
prior[0]
contains the length of the LCS.
This approach ensures the LCS is determined in O(n*m) time complexity, where 'n' and 'm' are the lengths of str1
and str2
, and uses O(min(n,m)) space. This is both time and space efficient, particularly useful for comparing large strings where classic methods might become computationally expensive.
No comments yet.