
Problem Statement
Given two strings called s
and part
, we are tasked with removing all occurrences of the substring part
from the string s
. This process involves iteratively searching for the leftmost occurrence of the substring part
in s
and removing it each time it is found. This operation is repeated until no more occurrences of part
can be found within s
. The final form of the string s
, after all such occurrences have been removed, is what we return. This problem is mainly about manipulating strings based on the presence of certain substrings, ensuring all instances are diligently removed.
Examples
Example 1
Input:
s = "daabcbaabcbc", part = "abc"
Output:
"dab"
Explanation:
The following operations are done: - s = "daabcbaabcbc", remove "abc" starting at index 2, so s = "dabaabcbc". - s = "dabaabcbc", remove "abc" starting at index 4, so s = "dababc". - s = "dababc", remove "abc" starting at index 3, so s = "dab". Now s has no occurrences of "abc".
Example 2
Input:
s = "axxxxyyyyb", part = "xy"
Output:
"ab"
Explanation:
The following operations are done: - s = "axxxxyyyyb", remove "xy" starting at index 4 so s = "axxxyyyb". - s = "axxxyyyb", remove "xy" starting at index 3 so s = "axxyyb". - s = "axxyyb", remove "xy" starting at index 2 so s = "axyb". - s = "axyb", remove "xy" starting at index 1 so s = "ab". Now s has no occurrences of "xy".
Constraints
1 <= s.length <= 1000
1 <= part.length <= 1000
s
andpart
consists of lowercase English letters.
Approach and Intuition
- Start with the provided strings
s
andpart
. - The goal is to continuously search for and remove the substring
part
froms
. The key function to utilize here might be a string search which locates the leftmost occurrence ofpart
ins
. - Once
part
is found, it should be removed, and this adjusted version ofs
should be the new string on which to continue searching. - This process repeats until
part
can no longer be found withins
.
Examples Breakdown:
Example 1:
- Start with
s = "daabcbaabcbc"
andpart = "abc"
. - In each step, locate
abc
, remove it, and modifys
accordingly, untilabc
can no longer be found. - The modifications through steps show the progressive removal of the substring:
daabcbaabcbc
->dabaabcbc
->dababc
->dab
- Start with
Example 2:
- Starting with
s = "axxxxyyyyb"
andpart = "xy"
. - A similar approach is taken where
xy
is sought and sequentially removed. - Progressive updates:
axxxxyyyyb
->axxxyyyb
->axxyyb
->axyb
->ab
- Starting with
This iterative approach ensures that every instance of part
is removed before the final string is derived. Such a strategy is comprehensively interpreted from the examples which exhibit all the steps from initial string s
to the final result after all transformations.
Solutions
- C++
- Java
- Python
class Solution {
public:
string eliminateSubsequence(string str, string sub) {
vector<int> lps = buildLPS(sub);
stack<char> stk;
vector<int> matchPos(str.length() + 1, 0);
for (int i = 0, j = 0; i < str.length(); i++) {
char current = str[i];
stk.push(current);
if (current == sub[j]) {
matchPos[stk.size()] = ++j;
if (j == sub.length()) {
for (int k = 0; k < sub.length(); k++) stk.pop();
j = stk.empty() ? 0 : matchPos[stk.size()];
}
} else {
if (j != 0) {
i--;
j = lps[j - 1];
stk.pop();
} else {
matchPos[stk.size()] = 0;
}
}
}
string res;
while (!stk.empty()) {
res = stk.top() + res;
stk.pop();
}
return res;
}
private:
vector<int> buildLPS(string p) {
vector<int> lpsArray(p.length(), 0);
int len = 0;
for (int i = 1; i < p.length();) {
if (p[i] == p[len]) {
lpsArray[i++] = ++len;
} else {
if (len != 0) {
len = lpsArray[len - 1];
} else {
lpsArray[i++] = 0;
}
}
}
return lpsArray;
}
};
The solution provided here is a C++ function to remove all occurrences of a substring from a given string. It uses the KMP (Knuth-Morris-Pratt) algorithm to efficiently search and remove the specified substring. Here’s a concise explanation of the code:
The
eliminateSubsequence
function handles the main logic. It takes two strings,str
andsub
, wherestr
is the main string andsub
is the substring to remove fromstr
.It initializes a stack to keep track of the characters and an array
matchPos
to manage positions within the matched substring.The function iterates over each character in the main string. For each character that matches a character in
sub
, it updates a counterj
and thematchPos
array. If the characters do not match andj
is not zero, the function rolls back the index to recheck with the previous longest prefix that also suffixes without the character.If
j
equals the length ofsub
, indicating a full match, the function removes the characters corresponding tosub
from the stack. The removal is continuous for successive and overlapped occurrences ofsub
.Once the entire string is processed, characters remaining in the stack are popped to form the resultant string without the occurrences of
sub
.The
buildLPS
function is a helper used to precompute the longest prefix which is also a suffix (LPS) array for substringsub
, which is crucial for the KMP algorithm to function properly. This precomputation helps in jumping back to the longest matching position when a mismatch happens during the scanning ofstr
.
The implementation is efficient in terms of space and time complexity, enabling the function to handle large strings and substrings effectively. Consider using this approach when the requirement is to remove repeatedly overlapping substrings from a text efficiently.
class Manipulator {
public String deleteSubstrings(String mainStr, String subStr) {
int[] lps = kmpPreprocess(subStr);
Stack<Character> stack = new Stack<>();
int[] matchedIndices = new int[mainStr.length() + 1];
for (int i = 0, j = 0; i < mainStr.length(); i++) {
char c = mainStr.charAt(i);
stack.push(c);
if (c == subStr.charAt(j)) {
matchedIndices[stack.size()] = ++j;
if (j == subStr.length()) {
for (int k = 0; k < subStr.length(); k++) {
stack.pop();
}
j = stack.isEmpty() ? 0 : matchedIndices[stack.size()];
}
} else {
if (j != 0) {
i--;
j = lps[j - 1];
stack.pop();
} else {
matchedIndices[stack.size()] = 0;
}
}
}
StringBuilder result = new StringBuilder();
while (!stack.isEmpty()) {
result.append(stack.pop());
}
return result.reverse().toString();
}
private int[] kmpPreprocess(String pattern) {
int[] lps = new int[pattern.length()];
for (int i = 1, len = 0; i < pattern.length();) {
if (pattern.charAt(i) == pattern.charAt(len)) {
lps[i++] = ++len;
} else if (len != 0) {
len = lps[len - 1];
} else {
lps[i++] = 0;
}
}
return lps;
}
}
The given Java solution provides a method to remove all occurrences of a specific substring from a string using the Knuth-Morris-Pratt (KMP) algorithm for pattern matching. Here's a breakdown of how the solution works:
KMP Preprocessing: The
kmpPreprocess
method is used to compute the longest prefix which is also suffix (LPS) array for the substring. This array is crucial for efficient backtracking in the KMP algorithm, allowing the algorithm to skip unnecessary comparisons.Main Processing Logic:
- A
Stack<Character>
is used to manage the characters of the main string as they are processed. - An array
matchedIndices
keeps track of the indices to which each character of the substring has been matched. - For every character in the main string, it is pushed onto the stack and checked against the current character of the substring.
- If a character matches the corresponding character in the substring, the match index is updated.
- If a full match of the substring is found (
j == subStr.length()
), characters are popped from the stack to effectively remove the substring from the main string, and the process continues with the next characters. - If a mismatch happens after some matches, the algorithm uses the LPS array to backtrack to the previous possible match without unnecessary comparison or removing characters from the stack.
- A
Post-Processing: After processing all characters, the stack might still contain the characters of the main string that do not form the substring. These are then popped from the stack and appended to a
StringBuilder
. The resulting string, which has all occurrences of the substring removed, is obtained by reversing theStringBuilder
content.
This method ensures that all occurrences of the substring are efficiently removed from the main string, even if the substring overlaps with itself or appears multiple times in quick succession. Each character of the main string is processed in a linear pass, making the approach efficient in terms of time complexity, leveraging the KMP algorithm's preprocessing steps. The use of stack and the LPS array enable handling complex cases of substring removal in an optimal manner.
class Solution:
def deleteSubsequence(self, string: str, subsequence: str) -> str:
# Compute the KMP table for the subsequence
kmp_table = self._create_kmp(subsequence)
# Use a list to act as a stack to manage characters
character_stack = []
# Track positions of subsequence matches
match_positions = [0] * (len(string) + 1)
idx = 0
subseq_idx = 0
while idx < len(string):
char = string[idx]
character_stack.append(char)
if char == subsequence[subseq_idx]:
# Monitor the position for current match count
match_positions[len(character_stack)] = subseq_idx + 1
subseq_idx += 1
# Check if the subsequence is fully matched
if subseq_idx == len(subsequence):
# Remove the matched subsequence from stack
for _ in range(len(subsequence)):
character_stack.pop()
# Reset the matching index based on remaining characters in stack
subseq_idx = (
0
if not character_stack
else match_positions[len(character_stack)]
)
else:
# When characters mismatch
if subseq_idx != 0:
# Utilize the KMP table to skip unnecessary checks
idx -= 1
subseq_idx = kmp_table[subseq_idx - 1]
character_stack.pop()
else:
# Reset match position when mismatch at the first character
match_positions[len(character_stack)] = 0
idx += 1
# Build the result from the stack content
return "".join(character_stack)
def _create_kmp(self, seq: str) -> list:
# List for the KMP "partial match" table
lps = [0] * len(seq)
# Variables to track the position and the length of current prefix
length = 0
i = 1
while i < len(seq):
if seq[i] == seq[length]:
length += 1
lps[i] = length
i += 1
else:
if length != 0:
# Use previously computed lps value for optimization
length = lps[length - 1]
else:
lps[i] = 0
i += 1
return lps
Here's a solution summary for removing all occurrences of a substring from a given string using Python:
- Define a class
Solution
with the methoddeleteSubsequence
that accepts astring
and asubsequence
to be removed. - Utilize a KMP (Knuth-Morris-Pratt) algorithm to efficiently search for the substring within the string. Start by creating a KMP table with a helper function
_create_kmp
. - Use a list as a character stack to build the result of the string after all occurrences of the subsequence are removed.
- Initialize a list
match_positions
to track the positions of the subsequence matches while iterating through the string. - Iterate through each character in
string
, pushing characters to thecharacter_stack
and updatingmatch_positions
based on matches with the subsequence. - If the subsequence is found (when the length of the current match equals the length of the subsequence), reset the
subseq_idx
based on remaining characters incharacter_stack
and remove the matched subsequence from the stack. - If a mismatch is detected and it's not at the start, adjust the iterator based on the KMP table to skip unnecessary checks.
- Continue iterating until all characters of
string
have been processed. - Finally, construct and return the result by joining the characters left in the
character_stack
.
This approach efficiently removes all occurrences of a subsequence by using the KMP string matching algorithm to avoid redundant checks and optimize the search within the main string.
No comments yet.