
Problem Statement
The task is to transform any given string into a "good" string, where a "good" string is defined as one in which no two different characters share the same frequency. The frequency of a character is measured by the number of times it appears within the string. For example, in the string "aab"
, character 'a'
repeats twice and character 'b'
once. This problem requires determining the minimum number of deletions needed from the string to achieve the "good" string criteria, and returning that minimum number.
Examples
Example 1
Input:
s = "aab"
Output:
0
Explanation:
`s` is already good.
Example 2
Input:
s = "aaabbbcc"
Output:
2
Explanation:
You can delete two 'b's resulting in the good string "aaabcc". Another way it to delete one 'b' and one 'c' resulting in the good string "aaabbc".
Example 3
Input:
s = "ceabaacb"
Output:
2
Explanation:
You can delete both 'c's resulting in the good string "eabaab". Note that we only care about characters that are still in the string at the end (i.e. frequency of 0 is ignored).
Constraints
1 <= s.length <= 105
s
contains only lowercase English letters.
Approach and Intuition
To solve the problem of transforming a given string s
into a "good" string:
Calculate Frequencies:
First, compute how many times each character appears in the string. This can be done using a frequency dictionary where each key is a character and the value is its occurrence count ins
.Identify Conflicts:
Next, to understand the conflicts, map each frequency to the number of characters that have that frequency. This helps identify which frequencies are shared by multiple characters.Resolve Conflicts:
The main part involves adjusting the frequencies to eliminate any duplication:- Sort characters by their frequency.
- For each character, if its frequency conflicts with others (i.e., at least one other character has the same frequency), decrease the frequency of that character (remove instances of that character) until no conflict remains or the frequency becomes zero.
- Keep a count of how many deletions are made during this adjustment.
Return Deletions Count:
After resolving all frequency conflicts, the total number of deletions made will be the number required to make the string "good".
Examples
Example 1
Input:
s = "aab"
Output:
0
Explanation:
`s` is already good.
Example 2
Input:
s = "aaabbbcc"
Output:
2
Explanation:
You can delete two 'b's resulting in the good string "aaabcc". Another way it to delete one 'b' and one 'c' resulting in the good string "aaabbc".
Example 3
Input:
s = "ceabaacb"
Output:
2
Explanation:
You can delete both 'c's resulting in the good string "eabaab". Note that we only care about characters that are still in the string at the end (i.e. frequency of 0 is ignored).
Constraints
1 <= s.length <= 105
s
contains only lowercase English letters.
Example Cases Explanation Based on the Approach
Example 1:
"aab"
The frequencies are{a: 2, b: 1}
. There are no shared frequencies, so the string is already "good". Therefore, no deletions are necessary (0
deletions).Example 2:
"aaabbbcc"
The initial frequencies are{a: 3, b: 3, c: 2}
. The charactersa
andb
share the same frequency. By deleting twob
s (or alternative deletions yielding the same result), the frequencies update to{a: 3, b: 1, c: 2}
which has no conflicts. This requires2
deletions.Example 3:
"ceabaacb"
Starting with frequencies{c: 2, e: 1, a: 3, b: 2}
, to resolve shared frequencies betweenc
andb
, and adjusta
which conflicts no more after other adjustments, deleting twoc
s can be one optimal way, resulting in adjusted frequencies{c: 0, e: 1, a: 3, b: 2}
. This involves2
deletions.
The critical insight lies in efficiently handling and resolving frequency conflicts to minimize the number of deletions.
Solutions
- C++
- Java
- Python
class Solution {
public:
int minimumDeletions(string str) {
vector<int> charFreq(26, 0);
for (char ch : str) {
charFreq[ch - 'a']++;
}
sort(charFreq.begin(), charFreq.end(), greater<int>());
int removals = 0;
int allowableFreq = str.length();
for (int idx = 0; idx < 26 && charFreq[idx] > 0; idx++) {
if (charFreq[idx] > allowableFreq) {
removals += charFreq[idx] - allowableFreq;
charFreq[idx] = allowableFreq;
}
allowableFreq = max(0, charFreq[idx] - 1);
}
return removals;
}
};
The solution provided is written in C++ and is designed to address the problem of reducing the frequency of characters in a string so that no two characters have the same frequency, with the goal of achieving this by making the least number of deletions.
The implementation follows these steps:
- First, initialize a frequency array
charFreq
to store the count of each character in the input stringstr
. - Traverse the string and increment the corresponding index in the
charFreq
based on the character value. - Sort the
charFreq
array in decreasing order to handle the most frequent characters first. - Set
removals
to zero to count the number of deletions required and setallowableFreq
to the length of the string, which represents the maximum frequency a character can have after sorting. - Using a loop, traverse each element in the sorted frequency array. If the current frequency is greater than
allowableFreq
, calculate the difference, add this difference toremovals
, and adjust thecharFreq
at the current index toallowableFreq
. - Decrease
allowableFreq
by one for the next character, ensuring it does not go below zero. - After the loop, return the count of
removals
which gives the minimum deletions needed.
This method effectively ensures that the character frequencies are unique by the end of the operation while minimizing the number of deletions required. It handles edge cases such as all characters being the same and characters with already different frequencies efficiently.
class Solution {
public int minimumDeletions(String word) {
// Array to count the frequency of each alphabet character
int[] freq = new int[26];
for (int idx = 0; idx < word.length(); idx++) {
freq[word.charAt(idx) - 'a']++;
}
Arrays.sort(freq);
int deletions = 0;
// Define the highest frequency that is currently allowed
int allowedMaxFrequency = word.length();
// Loop through the sorted frequencies from highest to lowest
for (int j = 25; j >= 0 && freq[j] > 0; j--) {
// Ensure the frequency does not exceed the allowed maximum
if (freq[j] > allowedMaxFrequency) {
deletions += freq[j] - allowedMaxFrequency;
freq[j] = allowedMaxFrequency;
}
// Decrease the next allowed maximum frequency
allowedMaxFrequency = Math.max(0, freq[j] - 1);
}
return deletions;
}
}
This program in Java provides a solution to the problem of making character frequencies in a string unique by determining the minimum number of deletions required. The detailed workings of this solution are as follows:
- Initialize an array
freq
to keep track of the frequency of each character from 'a' to 'z' within the given string. - Loop through the string, and for each character, update its corresponding frequency in the
freq
array. - Sort the
freq
array in ascending order. This helps in processing the frequencies from highest to lowest effectively. - Establish a variable
deletions
to count the total deletions required, and another variableallowedMaxFrequency
set initially to the length of the string. This variable represents the maximum frequency any character can have without needing deletions at that point. - Iterate backwards through the
freq
array, from highest to lowest frequency:- If the current character frequency exceeds
allowedMaxFrequency
, calculate the excess and increment thedeletions
counter by this excess. Adjust the frequency of the character toallowedMaxFrequency
. - Reduce
allowedMaxFrequency
by one after processing each character's frequency, ensuring that the next character's maximum allowable frequency is one less than the previous, thereby avoiding duplicate frequencies.
- If the current character frequency exceeds
- The method returns the total number of deletions required.
This algorithm ensures efficient frequency management by sorting the array and processing from the most frequent to the least, adjusting the allowable frequencies to ensure all are unique while minimizing the number of deletions.
class Solution:
def minimizeDeletions(self, string: str) -> int:
# Initialize character frequency array
char_freq = [0] * 26
for char in string:
char_freq[ord(char) - ord('a')] += 1
char_freq.sort(reverse=True)
deletions_needed = 0
# Define the upper bound for frequency of any character
allowable_freq = len(string)
# Process character frequencies
for current_freq in char_freq:
if current_freq > allowable_freq:
deletions_needed += current_freq - allowable_freq
current_freq = allowable_freq
# Decrease the allowable frequency each step
allowable_freq = max(0, current_freq - 1)
return deletions_needed
The solution provided is a Python function aimed at ensuring each character of a string has a unique frequency by possibly deleting some characters. The function minimizeDeletions
counts the frequency of each character, sorts these frequencies in descending order, and then determines the number of deletions required. Here's a breakdown of how the solution works:
Initialize Frequency Count: A list
char_freq
of size 26 (for each letter of the alphabet) is created to keep track of how many times each character appears in the input string.Sort Frequencies: The list
char_freq
is sorted in reverse order to handle the highest frequencies first, which are more likely to conflict.Calculate Deletions: The function iterates over each frequency in the sorted list. For each frequency, if it's higher than the
allowable_freq
(initially set to the length of the string), it calculates how many deletions are needed to make this frequency unique and reduces it to the allowable limit. After handling one frequency, it ensures that the next highest allowable frequency is either zero or one less than the current adjusted frequency.Return Results: Finally, the total number of deletions needed to achieve the goal of unique frequencies is returned.
This approach ensures efficient management of character frequencies and minimizes the number of deletions necessary by addressing the most common characters first.
No comments yet.