
Problem Statement
The challenge is to find the count of strings within an array words
that can be formed as a subsequence from a given string s
. A subsequence in this context means you can derive a string by removing some or no characters from the string s
without changing the order of the remaining characters. Importantly, each string within words
must maintain its internal character order to be considered a valid subsequence of s
.
Examples
Example 1
Input:
s = "abcde", words = ["a","bb","acd","ace"]
Output:
3
Explanation:
There are three strings in words that are a subsequence of s: "a", "acd", "ace".
Example 2
Input:
s = "dsahjpjauf", words = ["ahjpjau","ja","ahbwzgqnuk","tnmlanowax"]
Output:
2
Constraints
1 <= s.length <= 5 * 104
1 <= words.length <= 5000
1 <= words[i].length <= 50
s
andwords[i]
consist of only lowercase English letters.
Approach and Intuition
The task involves determining if each word in the words
array is a subsequence of the string s
. Let’s break down how we might approach finding a solution:
Understand Subsequence: For a string to be a subsequence of another, each character of the former should appear in the latter and in the same order, though not necessarily consecutively.
Iterative Comparison:
- Iterate through each word in the
words
array. For each word, try to match it againsts
. - Use pointers or indices to trace the position in both the word and
s
. Move throughs
until all characters of the current word are found in order.
- Iterate through each word in the
Counting:
- For each word, if you can trace all characters in
s
while maintaining order, count it as a valid subsequence. - Keep a total count of such valid subsequences.
- For each word, if you can trace all characters in
Efficiency Consideration:
- Given the potential size of
s
and number of words, an efficient approach would be crucial. - Avoid re-checking any part of the
s
unnecessarily by using the dual-pointer technique that advances ons
for comparisons and only resets for the next word inwords
.
- Given the potential size of
Edge Cases:
- Consider matching with a single character.
- Handle cases where elements in
words
could be an empty string theoretically (not explicitly counter-indicated given constraints but good to consider).
By following this method, you systematically verify each word against s
, ensuring computational efficiency and adherence to the subsequence definition.
Solutions
- Java
class Solution {
public int matchSubsequences(String mainString, String[] wordList) {
int count = 0;
ArrayList<Node>[] bucketArray = new ArrayList[26];
for (int i = 0; i < 26; ++i)
bucketArray[i] = new ArrayList<Node>();
for (String word : wordList)
bucketArray[word.charAt(0) - 'a'].add(new Node(word, 0));
for (char ch : mainString.toCharArray()) {
ArrayList<Node> currentBucket = bucketArray[ch - 'a'];
bucketArray[ch - 'a'] = new ArrayList<Node>();
for (Node node : currentBucket) {
node.index++;
if (node.index == node.word.length()) {
count++;
} else {
bucketArray[node.word.charAt(node.index) - 'a'].add(node);
}
}
currentBucket.clear();
}
return count;
}
}
class Node {
String word;
int index;
public Node(String word, int idx) {
this.word = word;
index = idx;
}
}
This Java solution is designed to determine how many words from a given list are subsequences of a main string. The implemented method, matchSubsequences
, efficiently processes through each character of the main string and utilizes a bucket array concept to manage potential subsequences dynamically.
Implement a bucket array of lists that corresponds to each letter of the alphabet.
For every word in the provided
wordList
, place the word into the bucket that corresponds to its starting character.Iterate over each character in
mainString
. For the current character:- Move all nodes in the corresponding bucket to a new processing list.
- Clear the original bucket for reuse.
- For each node in the processing list:
- Increment the index to point to the next character in the node's word.
- Check if the node's index matches the word length, which indicates a complete subsequence match and increment the count.
- If the word has more characters to match, place the node back into the bucket that corresponds to the current character at the node's new index.
Return the
count
of total matches found, which represents how many words in thewordList
are subsequences of themainString
.
This approach leverages the efficiency of bucket sorting and direct indexing to minimize unnecessary comparisons and optimizes checking for subsequences within the main string.
No comments yet.