
Problem Statement
The task is to count the distinct subsequences of a given string s that equal another string t. A subsequence of a string is a new string formed from the original by deleting some (possibly none) of its characters without changing the relative order of the remaining characters. For instance, "ace" is a subsequence of "abcde". Two subsequences are distinct when they use different sets of positions in s, even if they spell the same string. The test cases are generated so that the answer fits in a 32-bit signed integer.
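To make the definition of a subsequence concrete, here is a minimal Python sketch of a subsequence check. The helper name `is_subsequence` is chosen for illustration and is not part of the problem.

```python
def is_subsequence(candidate: str, source: str) -> bool:
    """Return True if candidate can be obtained from source by deleting characters."""
    remaining = iter(source)
    # `ch in remaining` consumes the iterator until it finds ch, so each
    # character of candidate must appear after the previous match.
    return all(ch in remaining for ch in candidate)


print(is_subsequence("ace", "abcde"))  # True
print(is_subsequence("aec", "abcde"))  # False: 'c' would have to come after 'e'
```

This only answers whether t occurs at all; the problem asks in how many ways it occurs.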
Examples
Example 1
Input:
s = "rabbbit", t = "rabbit"
Output:
3
Explanation:
As shown below, there are 3 ways you can generate "rabbit" from s.
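rabbbit (using indices 0, 1, 2, 3, 5, 6: the third "b" is skipped)
rabbbit (using indices 0, 1, 2, 4, 5, 6: the second "b" is skipped)
rabbbit (using indices 0, 1, 3, 4, 5, 6: the first "b" is skipped)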
Example 2
Input:
s = "babgbag", t = "bag"
Output:
5
Explanation:
As shown below, there are 5 ways you can generate "bag" from s.
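babgbag (using indices 0, 1, 3)
babgbag (using indices 0, 1, 6)
babgbag (using indices 0, 5, 6)
babgbag (using indices 2, 5, 6)
babgbag (using indices 4, 5, 6)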
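The ways counted in both examples can be listed by brute force. The following Python sketch tries every increasing choice of positions in s and keeps those that spell t. The function name `matching_index_tuples` is an illustrative assumption, and the approach is exponential, so it is only suitable for tiny inputs like these.

```python
from itertools import combinations


def matching_index_tuples(s: str, t: str) -> list[tuple[int, ...]]:
    """List every increasing tuple of indices into s whose characters spell t."""
    return [
        idx
        for idx in combinations(range(len(s)), len(t))
        if all(s[i] == c for i, c in zip(idx, t))
    ]


for s, t in [("rabbbit", "rabbit"), ("babgbag", "bag")]:
    found = matching_index_tuples(s, t)
    print(s, t, len(found), found)
```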
Constraints
1 <= s.length, t.length <= 1000
s and t consist of English letters.
Approach and Intuition
Dynamic programming fits this problem well:
- Define a 2D array dp where dp[i][j] is the number of distinct subsequences of s[0...i-1] (the prefix of s of length i) that equal t[0...j-1] (the prefix of t of length j).
- Initialize dp[0][0] to 1, because the empty prefix of s matches the empty prefix of t in exactly one way.
- Initialize dp[i][0] to 1 for all i >= 1; every prefix of s (including the whole string) contains the empty string exactly once as a subsequence.
- Initialize dp[0][j] to 0 for all j >= 1; the empty prefix of s cannot produce a non-empty string.
- For all other positions, compute dp[i][j] from two cases:
  - If s[i-1] and t[j-1] are the same, dp[i][j] = dp[i-1][j-1] + dp[i-1][j]. The first term counts the subsequences that use s[i-1] to match t[j-1]; the second counts those that ignore s[i-1].
  - If s[i-1] and t[j-1] are not the same, dp[i][j] = dp[i-1][j], because s[i-1] can only be ignored.
- The answer is dp[s.length][t.length].
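The recurrence above can be sketched directly in Python. This is a minimal illustration of the prefix-based table, not one of the solutions listed later; the function name `count_subsequences_prefix` is an assumption made for this example.

```python
def count_subsequences_prefix(s: str, t: str) -> int:
    """Count subsequences of s equal to t using the prefix table dp[i][j]."""
    rows, cols = len(s) + 1, len(t) + 1
    dp = [[0] * cols for _ in range(rows)]
    # Base case: the empty string t[:0] occurs exactly once in every prefix of s.
    for i in range(rows):
        dp[i][0] = 1
    for i in range(1, rows):
        for j in range(1, cols):
            # Ignore s[i-1].
            dp[i][j] = dp[i - 1][j]
            # Or use s[i-1] to match t[j-1].
            if s[i - 1] == t[j - 1]:
                dp[i][j] += dp[i - 1][j - 1]
    return dp[len(s)][len(t)]


print(count_subsequences_prefix("rabbbit", "rabbit"))  # 3
print(count_subsequences_prefix("babgbag", "bag"))     # 5
```

Python integers do not overflow, so no special handling of large intermediate counts is needed in this sketch.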
Let's walk through the examples to clarify:
- In Example 1 with s = "rabbbit" and t = "rabbit": the letters r, a, i, and t each occur once in s, so each can be matched in only one way. The two b's of t must be chosen from the three consecutive b's of s, which can be done in three ways, so there are three distinct ways to form "rabbit".
- In Example 2 with s = "babgbag" and t = "bag": "bag" can be formed by skipping different characters in s while keeping the relative order of the rest, which gives five different selections of positions.
This approach computes each dp[i][j] once from previously computed entries, so filling the table takes O(m * n) time and O(m * n) space, where m = s.length and n = t.length. With both lengths at most 1000, that is about one million table entries. Because each row depends only on the previous one, the space can be reduced to O(n) by keeping a single row, as the Java, JavaScript, and Python solutions below do.
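The reuse of previously computed values can also be seen in a top-down form. The sketch below is an illustrative variant, not one of the listed solutions: it expresses the same recurrence recursively and caches each (i, j) result with `functools.lru_cache`. The names `count_subsequences_memo` and `ways` are assumptions. Because the recursion depth grows with the length of s, this version is meant for short inputs; at the upper constraint it would likely exceed Python's default recursion limit.

```python
from functools import lru_cache


def count_subsequences_memo(s: str, t: str) -> int:
    """Top-down version of the prefix recurrence, for short inputs only."""

    @lru_cache(maxsize=None)
    def ways(i: int, j: int) -> int:
        # Number of ways t[:j] occurs as a subsequence of s[:i].
        if j == 0:
            return 1  # the empty string always matches once
        if i == 0:
            return 0  # nothing left in s to match a non-empty t[:j]
        total = ways(i - 1, j)  # ignore s[i-1]
        if s[i - 1] == t[j - 1]:
            total += ways(i - 1, j - 1)  # use s[i-1] to match t[j-1]
        return total

    return ways(len(s), len(t))


print(count_subsequences_memo("babgbag", "bag"))  # 5
```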
Solutions
- C++
- Java
- C
- JavaScript
- Python
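#include <string>
#include <vector>
using namespace std;

class Solution {
public:
    int countSubsequences(string str1, string str2) {
        int len1 = str1.length();
        int len2 = str2.length();
        // dpTable[i][j]: number of ways str2[j..] occurs as a subsequence of str1[i..].
        // unsigned int wraps on overflow instead of invoking undefined behavior;
        // intermediate entries may exceed the 32-bit range even when the answer does not.
        vector<vector<unsigned int>> dpTable(len1 + 1, vector<unsigned int>(len2 + 1));
        // Base case: the empty suffix of str2 matches every suffix of str1 exactly once.
        for (int i = 0; i <= len1; i++) {
            dpTable[i][len2] = 1;
        }
        for (int j = len2 - 1; j >= 0; j--) {
            for (int i = len1 - 1; i >= 0; i--) {
                if (str1[i] == str2[j]) {
                    // Either use str1[i] to match str2[j], or skip str1[i].
                    dpTable[i][j] = dpTable[i + 1][j + 1] + dpTable[i + 1][j];
                } else {
                    // str1[i] cannot match str2[j], so it must be skipped.
                    dpTable[i][j] = dpTable[i + 1][j];
                }
            }
        }
        return dpTable[0][0];
    }
};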
This C++ solution counts the distinct subsequences of str1 that equal str2, using dynamic programming.
It uses a 2D vector, dpTable, where dpTable[i][j] is the number of ways the suffix of str2 starting at index j can be formed as a subsequence of the suffix of str1 starting at index i. The process is as follows:
- Initialize dpTable with dimensions (len1 + 1) x (len2 + 1), where len1 and len2 are the lengths of str1 and str2 respectively. All entries start at 0.
- Set the last column of dpTable to 1, covering the case where the remainder of str2 is empty.
- Fill dpTable iteratively, moving backwards from len1 - 1 and len2 - 1. For each pair (i, j), check whether str1[i] and str2[j] match:
  - If they match, set dpTable[i][j] to the sum of dpTable[i + 1][j + 1] and dpTable[i + 1][j]. This covers the subsequences in which str1[i] is used and those in which it is not.
  - If they do not match, copy the value from dpTable[i + 1][j].
Finally, dpTable[0][0] contains the total number of distinct subsequences of str1 that equal str2. Each entry is computed once from previously computed results, so the solution runs in O(len1 * len2) time and uses O(len1 * len2) space.
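class Solution {
    public int distinctSubsequences(String s, String t) {
        int sLen = s.length();
        int tLen = t.length();
        // dpTable[j]: number of ways t[j..] occurs as a subsequence of the
        // suffix of s processed so far.
        int[] dpTable = new int[tLen];
        int previous = 1;
        for (int i = sLen - 1; i >= 0; i--) {
            // Stands in for the base case: the empty suffix of t matches once.
            previous = 1;
            for (int j = tLen - 1; j >= 0; j--) {
                // Remember the value from the previous row before updating it.
                int temp = dpTable[j];
                if (s.charAt(i) == t.charAt(j)) {
                    // Use s[i] to match t[j]: add the old value of dpTable[j + 1].
                    dpTable[j] += previous;
                }
                previous = temp;
            }
        }
        return dpTable[0];
    }
}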
The Java solution counts how many times the string t appears as a subsequence of the string s. It uses dynamic programming with a single array instead of a full table.
- Obtain the lengths of s (sLen) and t (tLen).
- Create an integer array dpTable of length tLen, initialized to zeros. After the outer loop has processed index i, dpTable[j] holds the number of ways the suffix t[j...tLen-1] can be formed from s[i...sLen-1].
- Use a helper integer previous to carry the old value of dpTable[j + 1], that is, the value it had before the current outer iteration changed it.
The core processing iterates backwards through both strings s and t:
- Set previous to 1 at the start of each outer iteration over s. This stands in for the base case: the empty suffix of t can always be formed in exactly one way.
- For each character in s, iterate backwards over t. During this:
  - Store the current value of dpTable[j] in a temporary variable temp.
  - If s[i] matches t[j], add previous to dpTable[j]: every way of forming t[j+1...] from s[i+1...] extends to a way of forming t[j...] that uses s[i].
  - Set previous to temp, so that the next step (index j - 1) sees the value dpTable[j] had before this iteration.
At the end of the iterations, dpTable[0] contains the total number of distinct subsequences of s that equal t, which is the desired output of the function. Java int arithmetic wraps around on overflow, and the wrapping is consistent modulo 2^32, so the result is still correct whenever the true answer fits in an int.
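The single-array update is easier to follow with a trace. The Python sketch below mirrors the loop structure described above and records a copy of the array after each character of s has been processed. The function name `rolling_rows` is an illustrative assumption.

```python
def rolling_rows(s: str, t: str) -> list[list[int]]:
    """Return a snapshot of the 1-D array after each outer iteration (i from the end of s)."""
    dp = [0] * len(t)
    snapshots = []
    for i in range(len(s) - 1, -1, -1):
        previous = 1  # base case: the empty suffix of t matches once
        for j in range(len(t) - 1, -1, -1):
            temp = dp[j]  # value before this iteration's update
            if s[i] == t[j]:
                dp[j] += previous
            previous = temp
        snapshots.append(dp.copy())
    return snapshots


for snapshot in rolling_rows("babgbag", "bag"):
    print(snapshot)
# The last snapshot is [5, 3, 2]; its first entry is the answer for Example 2.
```

Each snapshot corresponds to one row of the full two-dimensional suffix table without its last column, which is why one array is enough.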
#include <stdlib.h>
#include <string.h>

int countDistinctSubsequences(char* string1, char* string2) {
    int len1 = (int)strlen(string1);
    int len2 = (int)strlen(string2);
    int cols = len2 + 1;
    /* ways[i * cols + j]: number of ways string2[j..] occurs as a subsequence
       of string1[i..]. calloc zero-fills the table; unsigned arithmetic wraps
       on overflow instead of invoking undefined behavior. */
    unsigned long long* ways = calloc((size_t)(len1 + 1) * cols, sizeof *ways);
    if (ways == NULL) return -1; /* allocation failed */
    /* Base case: the empty suffix of string2 matches exactly once. */
    for (int i = 0; i <= len1; i++) ways[i * cols + len2] = 1;
    for (int i = len1 - 1; i >= 0; i--) {
        for (int j = len2 - 1; j >= 0; j--) {
            /* Skip string1[i]. */
            ways[i * cols + j] = ways[(i + 1) * cols + j];
            if (string1[i] == string2[j]) {
                /* Or use string1[i] to match string2[j]. */
                ways[i * cols + j] += ways[(i + 1) * cols + j + 1];
            }
        }
    }
    int result = (int)ways[0];
    free(ways);
    return result;
}
This C solution uses dynamic programming to count the distinct subsequences of string1 that equal string2. Here's an overview of the method:
- Store the lengths of string1 and string2 in len1 and len2, respectively.
- Allocate a zero-initialized table ways with (len1 + 1) x (len2 + 1) entries on the heap, stored row by row, so that entry (i, j) lives at ways[i * cols + j] with cols = len2 + 1. At the maximum input size the table holds about one million 8-byte entries, which is too large to place safely on the stack as a local array.
- Use unsigned long long entries and no modulus. Intermediate counts can be far larger than the final answer; unsigned arithmetic wraps around on overflow, which is well defined in C and leaves the final result correct as long as the true answer fits in an int. Reducing modulo 1000000007 would be wrong here, because the problem asks for the exact count and an answer that fits in a 32-bit signed integer can still exceed 1000000007.
- Fill the base case: for every i, entry (i, len2) is set to 1, because there is exactly one way to match the empty suffix of string2.
- Iterate through the characters of string1 from end to start. For each character, iterate through string2 from end to start:
  - Set entry (i, j) to entry (i + 1, j). Every match that starts after string1[i] is still valid when string1[i] is skipped.
  - If string1[i] and string2[j] are the same, add entry (i + 1, j + 1). Every way of matching the rest of string2 in the rest of string1 can be extended by pairing string1[i] with string2[j].
- Copy entry (0, 0), free the table, and return the copied value, which is the count of distinct subsequences of string1 that equal string2. The function returns -1 if the allocation fails.
The method breaks the problem into smaller subproblems and solves each one once, so it runs in O(len1 * len2) time and uses O(len1 * len2) memory.
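On overflow handling: the problem guarantees only that the final count fits in a signed 32-bit integer, and that range extends beyond 1000000007. The following Python sketch (the function and constant names are illustrative assumptions) uses a case whose exact answer falls in between, so reducing it modulo 1000000007 would change the result.

```python
from math import comb

INT32_MAX = 2**31 - 1
MODULO = 1_000_000_007


def count_subsequences(s: str, t: str) -> int:
    """Exact count using Python's arbitrary-precision integers."""
    dp = [0] * len(t)
    for i in range(len(s) - 1, -1, -1):
        last = 1
        for j in range(len(t) - 1, -1, -1):
            before = dp[j]
            if s[i] == t[j]:
                dp[j] += last
            last = before
    return dp[0] if t else 1


# Choosing any 16 of 33 identical letters gives C(33, 16) subsequences.
exact = count_subsequences("a" * 33, "a" * 16)
print(exact == comb(33, 16))       # True
print(MODULO < exact <= INT32_MAX) # True: fits in int32 but exceeds the modulus
print(exact % MODULO == exact)     # False: a modulus would alter the answer
```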
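var countSubsequences = function (source, target) {
    let srcLength = source.length;
    let tgtLength = target.length;
    // dp[j]: number of ways target[j..] occurs as a subsequence of the
    // suffix of source processed so far.
    let dp = new Array(tgtLength).fill(0);
    let lastVal = 1;
    for (let i = srcLength - 1; i >= 0; i--) {
        // Stands in for the base case: the empty suffix of target matches once.
        lastVal = 1;
        for (let j = tgtLength - 1; j >= 0; j--) {
            // Remember the value from the previous row before updating it.
            let temp = dp[j];
            if (source.charAt(i) === target.charAt(j)) {
                dp[j] += lastVal;
            }
            lastVal = temp;
        }
    }
    return dp[0];
};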
This solution counts the distinct subsequences of a given source string that equal a given target string, using dynamic programming. The steps below explain the JavaScript function countSubsequences.
- Initialize variables for the lengths of the source (srcLength) and target (tgtLength) strings.
- Create an array dp of size tgtLength and initialize its elements to 0. After the outer loop has processed index i, dp[j] holds the number of ways the suffix of target starting at j can be formed from the suffix of source starting at i.
- Iterate over the source string from the end to the beginning. Processing in reverse builds each result from the results for shorter suffixes of source.
- Use a nested loop to iterate over the target string from the end to the start as well, comparing each character of source with each character of target.
- For each pair of characters, lastVal holds the value dp[j + 1] had before the current outer iteration (or 1 when j is the last index, for the empty suffix of target). If the characters match, add lastVal to dp[j], which counts the matches that use source[i] for target[j]. Then set lastVal to the value dp[j] had before the update.
- When both loops finish, dp[0] contains the total count of distinct subsequences that equal the target string.
The function returns dp[0] as the total number of distinct subsequences of the source that equal the target, using O(tgtLength) memory.
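class Solution:
    def countDistinctSubsequences(self, source: str, target: str) -> int:
        source_len, target_len = len(source), len(target)
        # subsequences_count[j]: number of ways target[j:] occurs as a
        # subsequence of the suffix of source processed so far.
        subsequences_count = [0] * target_len
        for i in range(source_len - 1, -1, -1):
            # Stands in for the base case: the empty suffix of target matches once.
            last = 1
            for j in range(target_len - 1, -1, -1):
                # Remember the value from the previous row before updating it.
                previous_value = subsequences_count[j]
                if source[i] == target[j]:
                    subsequences_count[j] += last
                last = previous_value
        return subsequences_count[0]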
The given Python code defines a method countDistinctSubsequences that counts the distinct subsequences of the source string that equal the target string. Here's a concise summary of how the solution works:
- Store the lengths of both the source and target strings.
- Create an array subsequences_count initialized to zero with the same length as target. Entry j counts the ways the suffix target[j:] can be formed from the part of source processed so far.
- Traverse the source string in reverse.
  - For each character in source, set last to 1 and start another reverse traversal over the target string. The initial 1 represents the single way to match the empty suffix of target.
  - During the nested loop over the target:
    - Save subsequences_count[j] in previous_value before any update.
    - If source[i] matches target[j], add last to subsequences_count[j]. Here last is the number of ways the rest of target, target[j+1:], can be formed from source[i+1:].
    - Set last to previous_value, the value subsequences_count[j] had before the update.
- After processing all characters, the first element of subsequences_count holds the result: the total number of distinct subsequences of source that form the entire target.
This method utilizes dynamic programming principles by storing intermediate results to avoid redundant calculations and efficiently computing the desired count of subsequences.
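One way to gain confidence in the single-array recurrence is to compare it with brute-force enumeration on small random inputs. The sketch below is an illustrative test harness, not part of the solutions above; the names `count_dp`, `count_brute`, and `cross_check` are assumptions.

```python
import random
from itertools import combinations


def count_dp(s: str, t: str) -> int:
    """Single-array dynamic programming count, as in the solutions above."""
    dp = [0] * len(t)
    for i in range(len(s) - 1, -1, -1):
        last = 1
        for j in range(len(t) - 1, -1, -1):
            before = dp[j]
            if s[i] == t[j]:
                dp[j] += last
            last = before
    return dp[0]


def count_brute(s: str, t: str) -> int:
    """Count index selections that spell t; exponential, for tiny inputs only."""
    return sum(
        1
        for idx in combinations(range(len(s)), len(t))
        if all(s[i] == c for i, c in zip(idx, t))
    )


def cross_check(trials: int = 200, seed: int = 0) -> bool:
    """Return True if both counters agree on every random trial."""
    rng = random.Random(seed)
    for _ in range(trials):
        s = "".join(rng.choice("ab") for _ in range(rng.randint(1, 8)))
        t = "".join(rng.choice("ab") for _ in range(rng.randint(1, 4)))
        if count_dp(s, t) != count_brute(s, t):
            return False
    return True


print(cross_check())  # True
```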