
Problem Statement
You are given an array of strings words
. Your task is to identify all strings from this array that appear as a substring in any other string within the same array.
Return the list of such strings. The output can be in any order.
Examples
Example 1
Input:
words = ["mass", "as", "hero", "superhero"]
Output:
["as", "hero"]
Explanation:
"as" is a substring of "mass", and "hero" is a substring of "superhero". ["hero", "as"] is also a valid answer.
Example 2
Input:
words = ["vultrcloud", "ul", "cloud"]
Output:
["ul", "cloud"]
Explanation:
Both "ul" and "cloud" are substrings of "vultrcloud".
Example 3
Input:
words = ["blue", "green", "bu"]
Output:
[]
Explanation:
None of the strings are substrings of any other string.
Constraints
1 <= words.length <= 100
1 <= words[i].length <= 30
words[i]
consists only of lowercase English letters.- All strings in
words
are unique.
Approach and Intuition
Iterate Over All Pairs: Since the number of strings is small (
≤ 100
), a brute-force approach is acceptable. For every words
in the list, compare it with every other wordt
to see ifs
is a proper substring oft
.Skip Self Comparison: Because all strings are unique, a word is never a substring of itself — so you can skip checking
s in s
.Substring Check: Use the
in
operator (e.g.,s in t
in Python) to check whethers
is a substring oft
.Avoid Duplicates: Use a set to collect the results or ensure that each string is added to the result list only once.
Return Result: Once all comparisons are complete, return the list of substrings.
Time Complexity
- O(n^2 × m) — where
n
is the number of strings andm
is the maximum string length. Since both are small (n ≤ 100
,m ≤ 30
), this is efficient for the given constraints.
Solutions
- C++
- Java
- Python
class SubstringMatcher {
public:
vector<string> findSubstrings(vector<string>& wordsList) {
vector<string> foundWords;
Trie* root = new Trie(); // Initialize the Trie root node.
// Append all word suffixes into the Trie structure.
for (const auto& word : wordsList) {
for (int startIdx = 0; startIdx < word.size(); startIdx++) {
// Insert each starting suffix from `startIdx` into the Trie.
addWordSuffix(root, word.substr(startIdx));
}
}
// Determine if any word is a substring within the trie modeled data.
for (auto word : wordsList) {
if (checkSubstring(root, word)) {
foundWords.push_back(word);
}
}
return foundWords;
}
private:
class Trie {
public:
// Count occurrences of substrings from the Trie.
int count;
// Child nodes mapped by character keys.
unordered_map<char, Trie*> children;
};
// Insertion of word or a suffix into the Trie structure.
void addWordSuffix(Trie* root, const string& wordSuffix) {
Trie* node = root;
for (char ch : wordSuffix) {
// Move througn the Trie nodes, creating new nodes if necessary.
if (node->children.find(ch) != node->children.end()) {
node = node->children[ch];
// Increment frequency count for this node.
node->count++;
} else {
// Create a new node if character node doesn't exist.
Trie* newNode = new Trie();
newNode->count = 1;
node->children[ch] = newNode;
node = newNode;
}
}
}
// Method to verify if a word is present as a substring in the Trie.
bool checkSubstring(Trie* root, string& word) {
Trie* node = root; // Begin the traverse at root.
for (char ch : word) {
// Move through each character's node.
node = node->children[ch];
}
// The word is a substring or overlapping if seen more than once.
return node->count > 1;
}
};
The provided C++ program is designed to identify strings within an array that appear as substrings within other strings of the same array. This solution employs a data structure called Trie, also known as a prefix tree, which effectively handles string manipulations and searches.
- Convert the list of words into a Trie structure where each node represents a possible substring of the input words.
- Insert every suffix of each word into the Trie. This ensures that all possible substrings are accounted for.
- For each word in the list, check if it exists as a substring in the Trie populated with suffixes from all the words.
- If a word is found as a substring, add it to the result list.
Key operations are:
- Adding word suffix to the Trie: Iterate over each word, creating suffix starting from each index, and inserting it into the Trie. Each node in the Trie keeps track of the frequency of the appearances of the substring.
- Checking for substrings in the Trie: For each word, traverse through the Trie based on its characters and determine if the substring appears more than once, indicating overlap or substring presence in the array.
This approach is efficient by leveraging the Trie data structure for fast insertion and search operations, significantly enhancing the process over brute forcing through all possible pairs of substrings. The use of a Trie also handles overlaps and repeated substrings elegantly, making the solution robust for varied inputs.
class Finder {
class TrieNode {
int count; // Number of times this node is visited
Map<Character, TrieNode> children;
TrieNode() {
count = 0;
children = new HashMap<>();
}
}
public List<String> findSubstrings(String[] words) {
List<String> result = new ArrayList<>();
TrieNode root = new TrieNode(); // Root of the Trie
// Building the Trie with all possible suffixes
for (String word : words) {
for (int start = 0; start < word.length(); start++) {
addSuffix(root, word.substring(start));
}
}
// Finding all substrings in Trie that appear in the list more than once
for (String word : words) {
if (checkSubstring(root, word)) {
result.add(word);
}
}
return result;
}
private void addSuffix(TrieNode root, String suffix) {
TrieNode current = root;
for (char ch : suffix.toCharArray()) {
current.children.putIfAbsent(ch, new TrieNode());
current = current.children.get(ch);
current.count++;
}
}
private boolean checkSubstring(TrieNode root, String str) {
TrieNode current = root;
for (char ch : str.toCharArray()) {
if (current.children.get(ch) == null) return false;
current = current.children.get(ch);
}
return current.count > 1;
}
}
This Java solution for the "String Matching in an Array" problem involves constructing a Trie (prefix tree) to efficiently store and query substrings of provided words. The process can be highlighted as follows:
Initialize a
TrieNode
class to represent each node in the Trie, where each node contains acount
indicating the number of times the node is visited, and a mapchildren
to store child nodes.Construct a method
findSubstrings
that:- Initializes an empty list
result
to hold the final substrings. - Constructs the root of the Trie.
- Iteratively adds all possible suffixes of each string from the input array to the Trie using
addSuffix
method. - Checks if each word in the input array is a substring in the other words by utilizing the
checkSubstring
method, adding it to the result list if true.
- Initializes an empty list
The
addSuffix
method takes a root and a suffix string, navigating through or building the Trie by:- Iterating over each character in the suffix.
- For each character, either fetch or create a new child node in the Trie, and increment the count associated with the node.
In the
checkSubstring
method:- Navigates through the Trie based on each character of the input string.
- Verifies the presence of the string in Trie and ensures that the count of the last node of this string is more than one, indicating substring inclusion in other strings.
By utilizing the Trie structure, this approach efficiently handles substrings, leveraging the count property of each node to determine the repetition of substrings within the input string list. This results in a comprehensive yet efficient solution to the substring matching challenge.
class Solution:
class Node:
def __init__(self):
self.count = 0
self.children = {}
def findSubstrings(self, words: List[str]) -> List[str]:
substrs_found = []
trie_root = self.Node() # Root node of the Trie
# Build trie by adding all word suffixes
for word in words:
for i in range(len(word)):
self._add_suffix_to_trie(trie_root, word[i:])
# Determine if each word is a substring more than once
for word in words:
if self._check_for_deps(trie_root, word):
substrs_found.append(word)
return substrs_found
def _add_suffix_to_trie(self, root: "Node", suffix: str) -> None:
node = root
for ch in suffix:
if ch not in node.children:
node.children[ch] = self.Node()
node = node.children[ch]
node.count += 1
def _check_for_deps(self, root: "Node", item: str) -> bool:
node = root
for ch in item:
node = node.children.get(ch)
return node.count > 1
The Python solution provided is designed to identify substrings from an array of words that appear more than once using a Trie data structure. Here's a concise summary of the method implemented:
- Initialize an empty list called
substrs_found
to store the result. - Create a root node for the Trie using the nested
Node
class. - Populate the Trie with all possible suffixes of each word in the input array
words
. - For each word, check if it exists as a substring in other words based on its count in the Trie.
- The
_add_suffix_to_trie
function is responsible for adding each suffix of a word to the Trie, incrementing the count of occurrences each time. - The
_check_for_deps
function is used to check if a word appears more than once as a substring by traversing the Trie and checking the count of the node associated with the last character of the word. - The list
substrs_found
gets appended with words that qualify as repeated substrings and is returned as the output.
This solution effectively utilizes the Trie to provide an efficient method for substring detection in an array of words.
No comments yet.