
Problem Statement
Given a binary string s
and an integer k
, the task is to determine whether every possible binary code of length k
exists as a substring within s
. If s
contains all such combinations, the function should return true
; otherwise, it should return false
. The challenge lies in checking for the presence of each combination effectively, especially given the constraints where the length of s
can be quite large, up to 500,000 characters.
Examples
Example 1
Input:
s = "00110110", k = 2
Output:
true
Explanation:
The binary codes of length 2 are "00", "01", "10" and "11". They can be all found as substrings at indices 0, 1, 3 and 2 respectively.
Example 2
Input:
s = "0110", k = 1
Output:
true
Explanation:
The binary codes of length 1 are "0" and "1", it is clear that both exist as a substring.
Example 3
Input:
s = "0110", k = 2
Output:
false
Explanation:
The binary code "00" is of length 2 and does not exist in the array.
Constraints
1 <= s.length <= 5 * 105
s[i]
is either'0'
or'1'
.1 <= k <= 20
Approach and Intuition
Understanding the problem:
- We need to check if all possible binary strings of length
k
are substrings of a given binary strings
. - The total number of distinct binary combinations of length
k
is ( 2^k ).
- We need to check if all possible binary strings of length
Initial considerations:
- If ( k ) is large, specifically close to the length of ( s ), ( s ) might not be long enough to contain all possible combinations.
- Conversely, if ( k ) is small, ( s ) has a higher chance of containing all combinations due to the small number of combinations.
Utilizing a sliding window approach:
- We can employ a sliding window of size ( k ) that moves through ( s ).
- As the window moves, it captures substrings of length ( k ).
Storing observed patterns:
- Use a data structure, such as a set, to store unique substrings of length ( k ) as we slide the window.
- This aids in checking off observed combinations against the ( 2^k ) possible combinations.
Efficiency checks:
- Once the set of substrings reaches the size ( 2^k ) (since we are only looking for that many distinct substrings), the function can return
true
. - If the end of the string is reached and we haven’t observed all possible substrings, return
false
.
- Once the set of substrings reaches the size ( 2^k ) (since we are only looking for that many distinct substrings), the function can return
Edge case considerations:
- If the length of ( s ) is less than ( k ), it’s impossible for ( s ) to contain all combinations, so return
false
immediately. - For very small values of ( k ) (e.g., ( k = 1 )), a simple check for the presence of '0' and '1' might suffice.
- If the length of ( s ) is less than ( k ), it’s impossible for ( s ) to contain all combinations, so return
By applying this methodical approach, which efficiently checks for the presence of all required substrings without repetitive calculations, the solution can handle high constraints effectively, staying within feasible execution time limits.
Solutions
- Java
- Python
class Solution {
public static boolean containsAllKLengthSubstrings(String input, int k) {
int required = 1 << k;
boolean[] found = new boolean[required];
int mask = required - 1;
int hashValue = 0;
for (int index = 0; index < input.length(); index++) {
hashValue = ((hashValue << 1) & mask) | (input.charAt(index) - '0');
if (index >= k - 1 && !found[hashValue]) {
found[hashValue] = true;
required--;
if (required == 0) {
return true;
}
}
}
return false;
}
}
This Java solution determines if a given string contains all possible substrings of binary codes of size k
. To achieve this, follow the established methodologies:
- Calculate
required
, the total number of possible binary codes of sizek
, using bit shifting (1 << k
). - Utilize a
boolean[] found
to keep track of which binary codes have been found. - Use
mask
(computed asrequired - 1
) to restrict thehashValue
to the lastk
bits. - Iterate over each character of the input string, updating the
hashValue
by left shifting it, applying themask
, and adding the current binary value of the character. - Check if the
k
-length substring represented byhashValue
has not been found yet; if true, mark it as found and decreaserequired
by 1. - If
required
becomes 0, returntrue
indicating all binary codes of lengthk
are present in the string. - If the loop completes without finding all codes, return
false
.
Ensure correct input parameters with the string being potentially very large and k
being an appropriate integer. This method effectively tracks the occurrence of binary codes using a space-efficient approach.
class Solution:
def containsAllBinaryCodes(self, sequence: str, length: int) -> bool:
required = 1 << length
seen = [False] * required
full_mask = required - 1
current_hash = 0
for j in range(len(sequence)):
# Calculate the bit hash for current window
current_hash = ((current_hash << 1) & full_mask) | (int(sequence[j]))
# Mark as seen if the current hash index marks first occurrence
if j >= length - 1 and not seen[current_hash]:
seen[current_hash] = True
required -= 1
if required == 0:
return True
return False
The Python3 solution defines a method containsAllBinaryCodes
, which checks if a string contains all possible binary codes of a given size k
. The approach utilizes bit manipulation and a sliding window technique to efficiently solve the problem. Here’s a breakdown of the implementation:
- Initialize
required
to be (2^k) (number of unique binary codes of size k). - Create a
seen
list of boolean values to track which binary codes are identified in thesequence
. - Calculate
full_mask
as (2^k - 1). This mask helps manage the hash to be within the scope of k-bits. - Initialize
current_hash
to track the current bit pattern of the window of size k in the sequence.
For each character sequence[j]
:
- The
current_hash
is updated using bit manipulation. Left shift to make room for the new bit and use the OR operation to add the current character as the least significant bit. - The sliding window (denoted by
current_hash
) checks if the current bit pattern has been seen. If it's a fresh pattern and its index was not seen, the seen index is marked True andrequired
is decremented. - Reducing
required
each time a new code is encountered ensures that once it reaches zero, all possible codes of length k are found in the sequence.
Should all binary codes be encountered before the end of the string, the function returns True indicating the string contains all binary codes of size k. If the loop completes without finding all codes, it returns False.
No comments yet.