
Problem Statement
Given two strings, s
(the main string) and p
(the pattern), the task is to find all starting indices within s
where a substring is an anagram of p
. An anagram of a string is another string that contains the same characters, only the order of characters can be different. The function must return a list of starting indices for each substring of s
that is an anagram of p
. The output list doesn't need to be in any particular order.
Examples
Example 1
Input:
s = "cbaebabacd", p = "abc"
Output:
[0,6]
Explanation:
The substring with start index = 0 is "cba", which is an anagram of "abc". The substring with start index = 6 is "bac", which is an anagram of "abc".
Example 2
Input:
s = "abab", p = "ab"
Output:
[0,1,2]
Explanation:
The substring with start index = 0 is "ab", which is an anagram of "ab". The substring with start index = 1 is "ba", which is an anagram of "ab". The substring with start index = 2 is "ab", which is an anagram of "ab".
Constraints
1 <= s.length, p.length <= 3 * 104
s
andp
consist of lowercase English letters.
Approach and Intuition
The problem asks for finding all start positions where the anagram of the pattern p
appears in the string s
. Here's the overall approach:
Character Count Comparison: At the core, the problem can be solved by comparing character frequencies between the pattern
p
and substrings ofs
of the same length asp
.Sliding Window Technique: This problem is efficiently handled using a sliding window and a hashmap (or array) to count characters:
- Initialize a fixed size window equal to the length of
p
that slides overs
. - Use a hash map to count the frequency of each character in the pattern
p
. - As the window slides over
s
, compare if the character count of the window matches the character count ofp
.
- Initialize a fixed size window equal to the length of
Handle Window Slide:
- When the window slides right, decrease the count of the character that is left out of the window and increase the count of the new character that is included in the window.
- After adjusting counts, compare the new window's character count with
p
's count.
Efficiency: Each comparison of the counts has a time complexity of O(1) if implemented using fixed-size arrays (given the constraint of lowercase English letters), ensuring overall time complexity remains manageable even as input size grows.
These simple steps illustrate the methodology to solve the problem, acknowledging the constraints and nature of input (all lowercase English letters, making fixed-size frequency arrays a feasible solution). This solution does not need to check character counts across the entire window every time but only updates the counts as needed, which is more efficient in contrast to naive approaches.
Solutions
- Java
class Solution {
public List<Integer> searchAnagrams(String str, String pattern) {
int strLen = str.length(), patLen = pattern.length();
if (strLen < patLen) return new ArrayList<>();
int [] patFrequency = new int[26];
int [] strFrequency = new int[26];
for (char c : pattern.toCharArray()) {
patFrequency[c - 'a']++;
}
List<Integer> resultList = new ArrayList<>();
for (int i = 0; i < strLen; ++i) {
strFrequency[str.charAt(i) - 'a']++;
if (i >= patLen) {
strFrequency[str.charAt(i - patLen) - 'a']--;
}
if (Arrays.equals(patFrequency, strFrequency)) {
resultList.add(i - patLen + 1);
}
}
return resultList;
}
}
In the provided Java solution, the task is to identify all start indices of anagrams of a given pattern within a larger string. To achieve this, two frequency arrays are utilized to compare characters in the string with the pattern.
Here's a breakdown of the steps the code follows:
- Initialize the length of the input string (
strLen
) and the pattern (patLen
). Return an empty list if the input string is shorter than the pattern. - Create frequency arrays for both the pattern (
patFrequency
) and the substring of the string being examined (strFrequency
), each of size 26 to account for each letter of the alphabet. - Populate the
patFrequency
array with the frequency of each character in the pattern. - Iterate over the string while updating the
strFrequency
array to reflect the frequency of characters in the current sliding window of size equal to the pattern's length. - Each time a character is added to the window, check if the window size exceeds the pattern's length; if so, decrease the frequency of the character that just moved out of the window.
- Compare
patFrequency
withstrFrequency
after each update. If they match, it signifies that the substring starting from the current index minus the pattern length plus one is an anagram of the pattern. Add this starting index to the result list. - Finally, return the list of starting indices of all anagrams found in the string.
This method efficiently checks every possible substring of the specified length in the input string, utilizing the constant time complexity of array comparison once the window size matches the size of the pattern, leading to an optimal solution for finding all anagrams in the string.
No comments yet.