
Problem Statement
Given two input strings, str1
and str2
, the task is to find and return the shortest possible string that contains both str1
and str2
as subsequences. A subsequence of a string t
is derived by deleting some characters (which could also be zero deletions) from t
without changing the order of the remaining characters. The problem requires not only finding such a string but ensuring that it is the shortest possible string which can encompass both the input strings as its subsequences. If multiple such shortest strings exist, returning any one of them is acceptable.
Examples
Example 1
Input:
str1 = "abac", str2 = "cab"
Output:
"cabac"
Explanation:
str1 = "abac" is a subsequence of "cabac" because we can delete the first "c". str2 = "cab" is a subsequence of "cabac" because we can delete the last "ac". The answer provided is the shortest such string that satisfies these properties.
Example 2
Input:
str1 = "aaaaaaaa", str2 = "aaaaaaaa"
Output:
"aaaaaaaa"
Constraints
1 <= str1.length, str2.length <= 1000
str1
andstr2
consist of lowercase English letters.
Approach and Intuition
To solve the problem of finding the shortest string containing two given strings as subsequences, we can utilize the concept similar to the "Shortest Common Supersequence". Below are some steps and considerations based on the given examples and constraints:
Understanding Subsequences and Superssequences:
- A subsequence is formed by deleting certain characters of a string, without rearranging the order of the remaining characters.
- A supersequence of two strings is a single string that contains each of the given strings as a subsequence.
Utilizing Dynamic Programming:
- We can use a dynamic programming approach akin to solving the "Longest Common Subsequence" (LCS) problem.
- Generally, the length of the shortest common supersequence can be determined by
len(str1) + len(str2) - LCS(str1, str2)
. - By finding the LCS, we identify the overlap or the common sequence between the two strings that need not be repeated in our supersequence.
Constructing the Shortest Supersequence:
- After calculating the LCS, construct the shortest supersequence by iterating through both strings.
- Add non-LCS characters while ensuring that the characters from the LCS are added in the correct sequence.
Consider Edge Cases:
- If either string is a subsequence of the other, then the longer string is the shortest supersequence.
- Ensure that each character is included from both strings in such a way that they remain subsequences in the resultant string.
Applying these principles will allow us to construct a solution that efficiently finds the shortest string in which both input strings are subsequences. The dynamic programming approach ensures that we handle strings of lengths up to the constraint limits efficiently.
Solutions
- C++
class Solution {
public:
string findShortestSupersequence(string firstStr, string secondStr) {
int lenFirst = firstStr.length();
int lenSecond = secondStr.length();
vector<vector<int>> dpMatrix(lenFirst + 1, vector<int>(lenSecond + 1, 0));
// Edge cases initialization
for (int i = 0; i <= lenFirst; i++) {
dpMatrix[i][0] = i;
}
for (int j = 0; j <= lenSecond; j++) {
dpMatrix[0][j] = j;
}
// Populating the DP matrix
for (int i = 1; i <= lenFirst; i++) {
for (int j = 1; j <= lenSecond; j++) {
if (firstStr[i - 1] == secondStr[j - 1]) {
dpMatrix[i][j] = dpMatrix[i - 1][j - 1] + 1;
} else {
dpMatrix[i][j] = min(dpMatrix[i - 1][j], dpMatrix[i][j - 1]) + 1;
}
}
}
string resultSequence = "";
int i = lenFirst, j = lenSecond;
// Tracing back to find the common supersequence
while (i > 0 && j > 0) {
if (firstStr[i - 1] == secondStr[j - 1]) {
resultSequence += firstStr[i - 1];
i--;
j--;
} else if (dpMatrix[i - 1][j] < dpMatrix[i][j - 1]) {
resultSequence += firstStr[i - 1];
i--;
} else {
resultSequence += secondStr[j - 1];
j--;
}
}
// Append any remaining characters from either string
while (i > 0) {
resultSequence += firstStr[i - 1];
i--;
}
while (j > 0) {
resultSequence += secondStr[j - 1];
j--;
}
reverse(resultSequence.begin(), resultSequence.end());
return resultSequence;
}
};
The goal is to determine the Shortest Common Supersequence (SCS) of two strings using C++. Here's how the provided code accomplishes this task effectively utilizing dynamic programming (DP):
Initialize Variables and DP Matrix:
- Capture the lengths of the two input strings,
firstStr
andsecondStr
. - Create a DP matrix
dpMatrix
sized (lenFirst+1
xlenSecond+1
) initialized to zero, to store the length of the SCS at each substring intersection.
- Capture the lengths of the two input strings,
Base Case Setup:
- Set up the base cases in the DP matrix. This involves setting the edges where one of the strings is of zero length. If
firstStr
orsecondStr
is empty, the length of the SCS is simply the length of the other string at that point.
- Set up the base cases in the DP matrix. This involves setting the edges where one of the strings is of zero length. If
Fill the DP Matrix:
- Iterate over each character of both strings. For each pair of indices (i, j):
- If characters from both strings match, set
dpMatrix[i][j]
asdpMatrix[i-1][j-1] + 1
. - If they don't match, choose the minimum length from either removing a character from
firstStr
orsecondStr
and add one to it.
- If characters from both strings match, set
- Iterate over each character of both strings. For each pair of indices (i, j):
Construct the SCS:
- Start from the bottom-right corner of the matrix (
lenFirst
,lenSecond
) and trace back to the top-left corner. - Append characters to the
resultSequence
based on the matrix values, prioritizing matches and then minimum edits.
- Start from the bottom-right corner of the matrix (
Handle Remaining Characters:
- After reaching one of the top rows or columns, append the remaining characters from either
firstStr
orsecondStr
directly to theresultSequence
.
- After reaching one of the top rows or columns, append the remaining characters from either
Reverse and Return:
- Since the sequence is constructed backwards, reverse
resultSequence
to get the final SCS. - Return the correct sequence as the result.
- Since the sequence is constructed backwards, reverse
This approach provides an efficient solution to finding the shortest common supersequence by leveraging dynamic programming to break the problem down into simpler subproblems and building up the solution from there.
- Java
class Solution {
public String findShortestSupersequence(String first, String second) {
int length1 = first.length();
int length2 = second.length();
int[][] dpTable = new int[length1 + 1][length2 + 1];
// Base cases setup
for (int i = 0; i <= length1; i++) {
dpTable[i][0] = i;
}
for (int j = 0; j <= length2; j++) {
dpTable[0][j] = j;
}
// Build the dpTable for supersequence lengths
for (int i = 1; i <= length1; i++) {
for (int j = 1; j <= length2; j++) {
if (first.charAt(i - 1) == second.charAt(j - 1)) {
dpTable[i][j] = dpTable[i - 1][j - 1] + 1;
} else {
dpTable[i][j] = Math.min(dpTable[i - 1][j], dpTable[i][j - 1]) + 1;
}
}
}
StringBuilder supersequenceBuilder = new StringBuilder();
int i = length1, j = length2;
// Construct the supersequence
while (i > 0 && j > 0) {
if (first.charAt(i - 1) == second.charAt(j - 1)) {
supersequenceBuilder.append(first.charAt(i - 1));
i--;
j--;
} else if (dpTable[i - 1][j] < dpTable[i][j - 1]) {
supersequenceBuilder.append(first.charAt(i - 1));
i--;
} else {
supersequenceBuilder.append(second.charAt(j - 1));
j--;
}
}
// Add remaining characters from first string
while (i > 0) {
supersequenceBuilder.append(first.charAt(i - 1));
i--;
}
// Add remaining characters from second string
while (j > 0) {
supersequenceBuilder.append(second.charAt(j - 1));
j--;
}
// Reverse StringBuilder content to get correct sequence
return supersequenceBuilder.reverse().toString();
}
}
The given Java program solves the problem of finding the shortest common supersequence of two strings. The overall strategy leverages dynamic programming to build a solution. Here’s a breakdown of the provided code:
Initialization:
- An integer matrix
dpTable
is created with dimensions based on the lengths of the two input strings,first
andsecond
. This table is used to store the lengths of the shortest common supersequences at different substrings.
- An integer matrix
Base Case Setting:
- The first row and first column of
dpTable
are filled with increasing integer values starting from 0, representing the scenario where one of the strings is empty.
- The first row and first column of
DP Table Buildup:
- The matrix is populated using a nested loop structure. For each pair (i, j) where
i
is an index intofirst
andj
is an index intosecond
, it checks if characters at these positions match. If they match, the celldpTable[i][j]
is set to the value of the diagonal celldpTable[i-1][j-1]
plus one. If they don't match, it is set to the minimum of the cell to the left or above plus one.
- The matrix is populated using a nested loop structure. For each pair (i, j) where
Supersequence Construction:
- Starting from the last position of the
dpTable
, the supersequence is constructed using aStringBuilder
. Based on the values in thedpTable
, either characters fromfirst
,second
, or both are appended to the builder. If the paths lead up or left in the dp matrix, the corresponding character fromfirst
orsecond
is appended.
- Starting from the last position of the
Handling Remaining Characters:
- After the primary loop completes, any remaining characters from either the
first
orsecond
string (indicated by non-zero indicesi
orj
) are appended to the builder.
- After the primary loop completes, any remaining characters from either the
Final Step:
- The contents of the
StringBuilder
are reversed before being returned to ensure the sequence is in the correct order.
- The contents of the
This approach efficiently compiles the shortest common supersequence, optimizing decisions recursively through matrix calculations, thus ensuring an optimal solution.
- Python
class SupersequenceSolver:
def findSCS(self, string1: str, string2: str) -> str:
len1 = len(string1)
len2 = len(string2)
dp = [[0] * (len2 + 1) for _ in range(len1 + 1)]
for i in range(len1 + 1):
dp[i][0] = i
for j in range(len2 + 1):
dp[0][j] = j
for i in range(1, len1 + 1):
for j in range(1, len2 + 1):
if string1[i - 1] == string2[j - 1]:
dp[i][j] = dp[i - 1][j - 1] + 1
else:
dp[i][j] = min(dp[i - 1][j], dp[i][j - 1]) + 1
scs = []
i, j = len1, len2
while i > 0 and j > 0:
if string1[i - 1] == string2[j - 1]:
scs.append(string1[i - 1])
i -= 1
j -= 1
elif dp[i - 1][j] < dp[i][j - 1]:
scs.append(string1[i - 1])
i -= 1
else:
scs.append(string2[j - 1])
j -= 1
while i > 0:
scs.append(string1[i - 1])
i -= 1
while j > 0:
scs.append(string2[j - 1])
j -= 1
return ''.join(reversed(scs))
The provided Python solution implements a method to find the Shortest Common Supersequence (SCS) of two strings. This method efficiently determines the shortest sequence that contains both the given strings as subsequences.
Here’s a breakdown of how the solution functions:
Initialize a 2-dimensional dynamic programming array
dp
to store the lengths of the shortest supersequence at each subproblem(i, j)
, wherei
andj
are the current indices instring1
andstring2
respectively.Populate the base cases in the
dp
array. When one of the strings is empty, the shortest supersequence length to that point is just the length of the other string.Use a nested loop to fill in the
dp
table by comparing characters ofstring1
andstring2
. If they match, copy the diagonal value fromdp
(representing the continuation of both strings without adding length). If they don’t match, take the minimum value from either adjoining to the left (excluding a character fromstring1
) or above (excluding a character fromstring2
) and add one.After constructing the
dp
table, trace back fromdp[len1][len2]
to the origin (0,0) to construct the supersequence. This is done by comparing the values in thedp
table to decide whether to take a character fromstring1
,string2
, or both in case of a match.Reverse the collected characters since the building of the supersequence starts from the end to the beginning of the strings.
Execute this method to generate the shortest sequence that embeds both input strings while retaining their original order. Use examples to validate the functionality and correctness of your implementation for practical understanding.
No comments yet.