Lexicographically Smallest Equivalent String

Updated on 04 June, 2025
Lexicographically Smallest Equivalent String header image

Problem Statement

You are provided with two strings, s1 and s2, of identical length, and an additional string baseStr. In this problem, characters from s1 and s2 at the same index are considered equivalent. For instance, if s1 = "abc" and s2 = "cde", then 'a' from s1 is equivalent to 'c' from s2, 'b' from s1 to 'd' from s2, and so on.

Equivalence of characters adheres to properties typically associated with equivalence relations:

  • Reflexivity: Every character is equivalent to itself.
  • Symmetry: If character X is equivalent to Y, then Y is also equivalent to X.
  • Transitivity: If character X is equivalent to Y, and Y is equivalent to Z, then X is equally equivalent to Z.

Using the rules of equivalence derived from s1 and s2, the objective is to transform baseStr into its lexicographically smallest form by substituting each of its characters with the smallest character that it is equivalent to based on the mapping from s1 and s2.

Examples

Example 1

Input:

s1 = "parker", s2 = "morris", baseStr = "parser"

Output:

"makkek"

Explanation:

Based on the equivalency information in s1 and s2, we can group their characters as [m,p], [a,o], [k,r,s], [e,i].
The characters in each group are equivalent and sorted in lexicographical order.
So the answer is "makkek".

Example 2

Input:

s1 = "hello", s2 = "world", baseStr = "hold"

Output:

"hdld"

Explanation:

Based on the equivalency information in s1 and s2, we can group their characters as [h,w], [d,e,o], [l,r].
So only the second letter 'o' in baseStr is changed to 'd', the answer is "hdld".

Example 3

Input:

s1 = "vultrcode", s2 = "programs", baseStr = "sourcecode"

Output:

"aauaaaaada"

Explanation:

We group the equivalent characters in s1 and s2 as [a,o,e,r,s,c], [l,p], [g,t] and [d,m], thus all letters in baseStr except 'u' and 'd' are transformed to 'a', the answer is "aauaaaaada".

Constraints

  • 1 <= s1.length, s2.length, baseStr <= 1000
  • s1.length == s2.length
  • s1, s2, and baseStr consist of lowercase English letters.

Approach and Intuition

  1. Establish Equivalence Groups:

    • Initialize a union-find data structure or a simple map to track which characters are considered equivalent based on transitivity, symmetry, and reflexivity.
    • Iterate through each index of the strings s1 and s2 adding pairs to the data structure to maintain the equivalence relationship.

    For instance, from the example input s1 = "parker", s2 = "morris", baseStr = "parser", we create equivalence groups such as [m,p], [a,o], [k,r,s], [e,i].

  2. Construction of Lexicographically Smallest String:

    • For each character in baseStr, find its group and get the smallest lexicographical character from that group.
    • Replace the character in baseStr with this smallest character.
  3. Optimization and Edge Cases:

    • Handle cases where there are no equivalences directly between s1 and s2 but they are implied due to transitivity.
    • Account for the varying lengths of baseStr, which may be up to 1000 characters long, ensuring that the solution remains efficient.
    • Ensure the solution adheres to all given constraints, especially the consistency in the length of s1 and s2.

Utilizing the union-find or an efficient mapping system allows for rapid determination of equivalence classes which in turn speeds up the translation of baseStr into its smallest lexicographical equivalent.

Solutions

  • C++
  • Java
cpp
class Solution {
public:
    array<int, 26> parent;

    int findSet(int index) {
        if (parent[index] == index) {
            return index;
        }
        
        return parent[index] = findSet(parent[index]);
    }
    
    void unionSets(int first, int second) {
        first = findSet(first);
        second = findSet(second);
        
        if (first == second) {
            return;
        }
        
        if (first < second) {
            parent[second] = first;
        } else {
            parent[first] = second;
        }
    }
    
    string minimumEquivalentString(string s1, string s2, string baseStr) {
        for (int i = 0; i < 26; i++) {
            parent[i] = i;
        }
        
        for (int i = 0; i < s1.size(); i++) {
            unionSets(s1[i] - 'a', s2[i] - 'a');
        }
        
        string result;
        for (char c : baseStr) {
            result += (char)(findSet(c - 'a') + 'a');
        }

        return result;
    }
};

The provided C++ code solves the problem of finding the lexicographically smallest equivalent string given two strings (s1 and s2) which describe pairs of equivalent characters and a baseStr which needs to be transformed according to these equivalencies using the union-find data structure.

  • Start by initializing a parent array that will be used to keep track of the root of each character. Initially, each character is its own root.
  • The findSet function recursively finds the root of a given character. It also employs path compression to flatten the structure for efficient future queries.
  • The unionSets function takes two characters. If they are not already in the same set, it merges their sets by linking the root of one to the root of the other. It ensures that the smaller root becomes the parent to preserve the lexicographical order.
  • In the minimumEquivalentString function:
    • Start by setting each character in the alphabetic range as its own set leader.
    • Iterate over the characters of s1 and s2 simultaneously and union their respective characters.
    • For each character in the baseStr, find its representative character using the findSet function and append it to the result string, thus translating baseStr into its lexicographically smallest equivalent.

This function collectively transforms the given baseStr into its smallest equivalent form by applying the described character equivalences. The utilization of the union-find algorithm with path compression ensures that the function is efficient, even for large strings. This makes it suited for applications where performance and minimal generated string size are critical.

java
class UnionFind {
    int parent[] = new int[26];

    // Locate the root of the set
    int getRoot(int x) {
        if (parent[x] == x) {
            return x;
        }

        return parent[x] = getRoot(parent[x]);
    }

    // Unify sets containing x and y
    void unionSets(int x, int y) {
        int rootX = getRoot(x);
        int rootY = getRoot(y);

        if (rootX == rootY) {
            return;
        }

        // Point the larger root to the smaller root
        if (rootX < rootY) {
            parent[rootY] = rootX;
        } else {
            parent[rootX] = rootY;
        }
    }

    public String smallestStringWithSwaps(String first, String second, String target) {
        // Initialize each element as its own parent
        for (int i = 0; i < 26; i++) {
            parent[i] = i;
        }

        // Unify corresponding character indices
        for (int i = 0; i < first.length(); i++) {
            unionSets(first.charAt(i) - 'a', second.charAt(i) - 'a');
        }

        StringBuilder result = new StringBuilder();
        // Build the resulting transformed string
        for (char c : target.toCharArray()) {
            result.append((char) (getRoot(c - 'a') + 'a'));
        }

        return result.toString();
    }
}

In the provided Java solution for generating the lexicographically smallest equivalent string, the implementation leverages the Union-Find (or Disjoint Set Union, DSU) data structure to optimize groupings of characters based on equivalence relations derived from two input strings first and second. The process followed in the code can be summarily described as follows:

  1. Union-Find Class Definition and Initialization:

    • The class UnionFind contains an array parent for representing the disjoint sets, where each set is identified by a unique parent node.
    • Initial setup involves making each character its own parent, indicating no initial grouping or connection.
  2. Finding Root and Union Operations:

    • The getRoot method determines the representative or root of the set to which a particular character belongs, compressing paths in the process to speed up future queries.
    • The unionSets method merges the sets of characters based on their equivalence in the first and second strings. This is done to ensure that if two characters are considered equal in any mapping, they belong to the same group.
  3. Building the Output String:

    • The method smallestStringWithSwaps initializes the parent array and performs union operations based on equivalences found between corresponding positions in the first and second strings.
    • For each character in the target string, its smallest equivalent character (based on the getRoot function) is appended to the result, ultimately forming the transformed string.

By the end of the method smallestStringWithSwaps, the target string is transformed into its lexicographically smallest equivalent by utilizing mapped relations from the first and second strings, thereby achieving the intended transformation effectively. This method is particularly efficient for cases where direct character replacements are required to achieve the smallest lexicographical order based on provided mappings.

Comments

No comments yet.