Shortest Way to Form String

Updated on 09 July, 2025
Problem Statement

You are given two strings, source and target. A subsequence of a string is a new string formed from the original by deleting some (possibly none) of its characters without changing the order of the remaining characters. The task is to find the minimum number of subsequences of source whose concatenation equals target exactly. If target cannot be built from subsequences of source, the function should return -1. Both the characters that occur in source and the order in which they occur determine how target can be split into such pieces.
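To make the definition of a subsequence concrete, here is a minimal Python sketch of a membership check. The helper name is_subsequence is an assumption made for illustration and is not part of the solutions in this article.

```python
def is_subsequence(candidate: str, source: str) -> bool:
    """Return True if candidate can be obtained from source by deleting characters."""
    remaining = iter(source)
    # `ch in remaining` consumes the iterator until it finds ch,
    # so later characters can only match later positions in source.
    return all(ch in remaining for ch in candidate)


print(is_subsequence("bc", "abc"))  # True: delete "a"
print(is_subsequence("cb", "abc"))  # False: the order would have to change
```

Each piece used to build the target must pass this kind of check against the source.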

Examples

Example 1

Input:

source = "abc", target = "abcbc"

Output:

2

Explanation:

The target "abcbc" can be formed by "abc" and "bc", which are subsequences of source "abc".

Example 2

Input:

source = "abc", target = "acdbc"

Output:

-1

Explanation:

The target cannot be constructed from subsequences of the source because the character "d" in the target does not appear in the source.

Example 3

Input:

source = "xyz", target = "xzyxz"

Output:

3

Explanation:

The target string can be constructed as follows: "xz" + "y" + "xz".

Constraints

  • 1 <= source.length, target.length <= 1000
  • source and target consist of lowercase English letters.

Approach and Intuition

The goal is to use subsequences from the source string to match the target string exactly, requiring an understanding of how subsequences work and how to efficiently combine them. Here’s a breakdown of how one might approach solving it:

  1. Traverse the target from left to right, matching its characters against the source in order.
  2. Start with the first character of the target and find its first occurrence in the source. If it does not occur in the source at all, return -1 immediately, because the task is impossible.
  3. Continue scanning the target, matching each character at or after the current position in the source. If the rest of the source does not contain the next target character, start again from the beginning of the source.
  4. Count each restart from the beginning of the source as a new subsequence. The first pass counts as one.
  5. Whenever a target character does not appear anywhere in the source, the task is impossible, so return -1.
  6. The problem checks not only whether each character is available but also the order in which characters appear, because a subsequence must preserve the order of the source.

By applying this step-by-step constructive approach, you can determine whether the target string can be formed and how many subsequences are required. Matching as many target characters as possible in each pass over the source is what keeps the count minimal. The examples cover different aspects of the problem: a straightforward construction, a character that does not appear in the source, and a target that forces several restarts.
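The steps above can be sketched directly with two pointers, before any table is introduced. This is a minimal illustration rather than one of the article's solutions; the function name shortest_way_two_pointer is an assumption, and the scan costs O(len(source) * len(target)) time.

```python
def shortest_way_two_pointer(source: str, target: str) -> int:
    """Greedy scan: each pass over source consumes as much of target as it can."""
    available = set(source)
    count = 0
    t = 0  # index of the next unmatched character in target
    while t < len(target):
        if target[t] not in available:
            return -1  # this character never occurs in source
        count += 1  # start a new pass, i.e. a new subsequence
        for ch in source:
            if t < len(target) and ch == target[t]:
                t += 1
    return count


print(shortest_way_two_pointer("abc", "abcbc"))  # 2
print(shortest_way_two_pointer("abc", "acdbc"))  # -1
print(shortest_way_two_pointer("xyz", "xzyxz"))  # 3
```

Every pass advances t by at least one position, because the character at target[t] is known to occur in source, so the loop always terminates.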

Solutions

  • C++
cpp
class Solution {
public:
    int minimumSubsequences(string src, string tgt) {
    
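        // Prepare next possible character index table
        // (std::vector is used because variable-length arrays are not standard C++)
        vector<vector<int>> nextPosition(src.size(), vector<int>(26, -1));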
    
        // Set default as not found
        for (int letter = 0; letter < 26; letter++) {
            nextPosition[src.size() - 1][letter] = -1;
        }
        nextPosition[src.size() - 1][src.back() - 'a'] = src.size() - 1;
    
        // Populate the table with indices of characters from the back
        for (int i = src.size() - 2; i >= 0; i--) {
            for (int letter = 0; letter < 26; letter++) {
                nextPosition[i][letter] = nextPosition[i + 1][letter];
            }
            nextPosition[i][src[i] - 'a'] = i;
        }
    
        // Traversal pointer in src
        int srcIndex = 0;
    
        // Holds the number of required subsequences
        int numSubsequences = 1;
    
        // Handle each character of tgt
        for (char ch : tgt) {
    
            // Early return if character is absent in src
            if (nextPosition[0][ch - 'a'] == -1) {
                return -1;
            }
    
            // Check if new subsequence is required
            if (srcIndex == src.size() || nextPosition[srcIndex][ch - 'a'] == -1) {
                numSubsequences++;
                srcIndex = 0;
            }
    
            // Move source index to the next needed character
            srcIndex = nextPosition[srcIndex][ch - 'a'] + 1;
        }
    
        // Final count of subsequences
        return numSubsequences;
    }
};

The C++ code provided demonstrates how to compute the minimum number of subsequences from a source string (src) required to form a target string (tgt). Here's a breakdown of how the solution is implemented:

  • Data Structure Initialization:

    • A table nextPosition is created in which nextPosition[i][c] holds the smallest index at or after i where the letter c occurs in src, or -1 if there is none. This allows constant-time lookups while the subsequences are formed.
  • Initial Setup:

    • The row for the last index of src is initialized first: every letter from a to z is set to -1 (absent), and then the entry for the last character of src is set to that last index.
  • Populate Next Occurrences:

    • The remaining rows of nextPosition are filled in reverse order, from the end of src back to its start. Each row i copies row i + 1 and then records i as the next occurrence of src[i].
  • Subsequences Formation:

    • The algorithm traverses the tgt string character by character, checking if it can map consecutively onto src without a break. If a break is needed (i.e., the character in tgt is not found going forward in src starting from the current index), a new subsequence is started.
    • This updates the total number of subsequences (numSubsequences) needed to form tgt.
  • Edge Case Handling:

    • If any character from tgt is completely absent in src, the function immediately returns -1, indicating that it's impossible to form tgt from src.
  • Result Compilation:

    • The total count of subsequences required is returned.

The solution centers on finding, as quickly as possible, where the next required character of tgt occurs in src. The pre-computed table removes the need for repeated searches, so each character of tgt is handled in constant time. Building the table takes O(26 · |src|) time and space, which gives O(26 · |src| + |tgt|) time overall.
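To see what the table contains, the following Python sketch builds it for a small source and prints a few lookups. The function name build_next_table and the sample string "abca" are assumptions chosen for illustration.

```python
from typing import List


def build_next_table(src: str) -> List[List[int]]:
    """table[i][c] is the smallest index >= i where letter c occurs in src, else -1."""
    table = [[-1] * 26 for _ in src]
    for i in range(len(src) - 1, -1, -1):
        if i + 1 < len(src):
            table[i] = table[i + 1][:]  # inherit everything visible from i + 1
        table[i][ord(src[i]) - ord("a")] = i  # src[i] itself is the nearest match
    return table


table = build_next_table("abca")
a, b = 0, 1  # column indices for the letters 'a' and 'b'
print(table[0][a])  # 0: 'a' occurs at index 0
print(table[1][a])  # 3: from index 1 onward, the next 'a' is at index 3
print(table[3][b])  # -1: no 'b' at or after index 3
```

A value of -1 in row 0 means the letter is absent from the whole source, which is exactly the early-return check used in the solutions.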

  • Java
java
class Solution {
    public int minimumTransforms(String fromText, String toText) {
    
        // To track next occurrence from a specific index for each character
        int[][] nextLoc = new int[fromText.length()][26];
    
        // Initializing last characters
        for (int c = 0; c < 26; c++) {
            nextLoc[fromText.length() - 1][c] = -1;
        }
        nextLoc[fromText.length() - 1][fromText.charAt(fromText.length() - 1) - 'a'] = fromText.length() - 1;
    
        // Fill array with next locations
        for (int i = fromText.length() - 2; i >= 0; i--) {
            for (int c = 0; c < 26; c++) {
                nextLoc[i][c] = nextLoc[i + 1][c];
            }
            nextLoc[i][fromText.charAt(i) - 'a'] = i;
        }
    
        // Iterator for the fromText
        int fromIndex = 0;
    
        // Count of transformations
        int transformations = 1;
    
        // Process each character in toText
        for (char ch : toText.toCharArray()) {
    
            // Check if character exists in the fromText
            if (nextLoc[0][ch - 'a'] == -1) {
                return -1;
            }
    
            // Reset from index if needed or continue to next
            if (fromIndex == fromText.length() || nextLoc[fromIndex][ch - 'a'] == -1) {
                transformations++;
                fromIndex = 0;
            }
    
            // Move to the next occurrence
            fromIndex = nextLoc[fromIndex][ch - 'a'] + 1;
        }
    
        // Total number of transformations required
        return transformations;
    }
}

The provided Java method minimumTransforms determines the minimum number of subsequences required from a string fromText to form another string toText. The solution effectively utilizes dynamic programming to store and retrieve locations of characters, optimizing the process of searching through fromText.

The method operates as follows:

  1. An array nextLoc tracks the next occurrence of every character from the English alphabet within fromText starting from a certain index. This array is pivotal for quickly determining where a character can be found after a specific index.

  2. Initialize the row for the last index of fromText. Every character is set to -1 except the last character of fromText, whose entry is set to the last index. This row is the starting point for filling the rest of the table.

  3. Populate the nextLoc array with indices. This step involves iterating backward through fromText to ensure that every entry in nextLoc correctly represents the nearest future occurrence of each character, optimizing searches during transformation steps.

  4. Initialize variables to track the current position in fromText and to count the transformations (subsequences). fromIndex starts at 0, and transformations starts at 1, because a non-empty toText always needs at least one subsequence.

  5. For each character in toText, the algorithm:

    • Checks if the character exists in fromText by looking at the first occurrence stored in nextLoc. If it's -1, the character doesn't exist, and hence toText can't be formed, returning -1.

    • Resets fromIndex if it’s either out of range or the next occurrence of the current character is -1, implying the need to start a new subsequence and increment the transformation count.

    • Updates fromIndex to the position just after the next occurrence of the current character in fromText, so that the following character is matched later in fromText.

  6. Conclude by returning the transformations count which now represents the minimal number of subsequences required to form toText from fromText.

This implementation is time-efficient because of the preparatory indexing: the table is built in O(26 · n) time for a fromText of length n, and each character of toText is then handled in constant time. The 2D array takes O(26 · n) space, which is small under the stated constraints (n <= 1000).
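The 2D table stores 26 integers per source position. A different trade-off, shown here as a hedged Python sketch rather than as the article's method, keeps one sorted list of indices per character and uses binary search for each lookup. This swaps the table for O(n) space and O(log n) time per target character. The function name shortest_way_bisect is an assumption.

```python
from bisect import bisect_left
from typing import Dict, List


def shortest_way_bisect(source: str, target: str) -> int:
    # positions[ch] lists, in increasing order, every index where ch occurs in source.
    positions: Dict[str, List[int]] = {}
    for i, ch in enumerate(source):
        positions.setdefault(ch, []).append(i)

    count = 1
    src_index = 0  # next usable position in source for the current subsequence
    for ch in target:
        if ch not in positions:
            return -1  # ch never occurs in source
        indices = positions[ch]
        k = bisect_left(indices, src_index)  # first occurrence at or after src_index
        if k == len(indices):
            count += 1  # nothing left in this pass: start a new subsequence
            k = 0
        src_index = indices[k] + 1
    return count


print(shortest_way_bisect("xyz", "xzyxz"))  # 3
```

For sources of at most 1000 characters either version is fast enough; the list-based variant mainly matters when the alphabet is large or memory is tight.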

  • C
c
#include <string.h>

int minSequencesToForm(char * src, char * tgt) {
    // Calculate length of the source
    int lenSrc = strlen(src);
    
    // Array for tracking next occurrences of each character
    int nextCharIndex[lenSrc][26];
    
    // Initial setup for the last character occurrences
    for (int i = 0; i < 26; i++) {
        nextCharIndex[lenSrc - 1][i] = -1;
    }
    nextCharIndex[lenSrc - 1][src[lenSrc - 1] - 'a'] = lenSrc - 1;
    
    // Populate the next occurrence table
    for (int j = lenSrc - 2; j >= 0; j--) {
        for (int i = 0; i < 26; i++) {
            nextCharIndex[j][i] = nextCharIndex[j + 1][i];
        }
        nextCharIndex[j][src[j] - 'a'] = j;
    }
    
    // Source index for tracking position in source
    int srcIdx = 0;
    
    // Count of sequences needed
    int sequenceCount = 1;
    
    // Evaluate all characters in target using source
    for (int idx = 0; tgt[idx] != '\0'; idx++) {
    
        // If the current character of target is absent from the entire source
        if (nextCharIndex[0][tgt[idx] - 'a'] == -1) {
            return -1;
        }
    
        // If end of source is reached or character is not in source starting from current index
        if (srcIdx == lenSrc || nextCharIndex[srcIdx][tgt[idx] - 'a'] == -1) {
            sequenceCount++;
            srcIdx = 0;
        }
    
        // Move source index to the next occurrence of current target character
        srcIdx = nextCharIndex[srcIdx][tgt[idx] - 'a'] + 1;
    }
    
    // Return the total sequences needed
    return sequenceCount;
}

The provided C code addresses the problem of determining the minimum number of subsequences from a source string (src) that are required to form a target string (tgt). The code adopts a strategy of using an auxiliary array nextCharIndex to keep track of the next occurrence index for each character in the source string. This approach facilitates efficient lookups while iterating through the target string.

Here's the operational breakdown of the code:

  • Initializes nextCharIndex such that for each character of the alphabet, it indicates where the next occurrence of that character can be found in src starting from a given index.
  • Iterates backwards from the end of src to populate nextCharIndex with indices referring to where each character will next appear.
  • Utilizes nextCharIndex to efficiently navigate through src while checking against characters in tgt. This assists in determining where and if tgt characters appear in src.
  • If at any point a character in tgt cannot be found in src, the function immediately returns -1, indicating the target string cannot be formed.
  • If the end of src is reached or a character in tgt doesn't appear in src from the current starting index srcIdx, the sequence count is incremented. srcIdx is reset to 0 to start a new subsequence.
  • Continuously updates srcIdx based on where characters in tgt appear in src.

By employing an intelligent preprocessing step with nextCharIndex, the solution ensures a direct and fast mapping from tgt characters to their positions in src, hence optimizing the process of counting the minimum number of subsequences needed to form tgt.

The function finally returns the count of such subsequences, providing a clear metric to solve the problem effectively.
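The restart rule is easier to follow when the pieces themselves are visible. The Python sketch below returns the subsequences instead of only counting them. It uses str.find in place of the precomputed table, and the function name split_into_subsequences is an assumption made for illustration. It also assumes a non-empty target, as the constraints guarantee.

```python
from typing import List, Optional


def split_into_subsequences(source: str, target: str) -> Optional[List[str]]:
    """Return the greedy pieces whose concatenation is target, or None if impossible."""
    if not set(target) <= set(source):
        return None  # some target character never occurs in source
    pieces: List[str] = []
    current: List[str] = []
    src_index = 0
    for ch in target:
        pos = source.find(ch, src_index)  # next occurrence at or after src_index
        if pos == -1:
            pieces.append("".join(current))  # close the current subsequence
            current = []
            pos = source.find(ch)  # restart from the beginning of source
        current.append(ch)
        src_index = pos + 1
    pieces.append("".join(current))
    return pieces


print(split_into_subsequences("abc", "abcbc"))  # ['abc', 'bc']
print(split_into_subsequences("xyz", "xzyxz"))  # ['xz', 'y', 'xz']
print(split_into_subsequences("abc", "acdbc"))  # None
```

The length of the returned list equals the count produced by the solutions above whenever the target can be formed.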

  • JavaScript
js
var minimumSequences = function(src, tgt) {
    const lengthSrc = src.length;
      
    // nextCharIndex[i][c]: smallest index >= i where letter c occurs in src, or -1.
    // fill(-1) already marks every letter as absent, so no separate reset loop is needed.
    const nextCharIndex = Array.from({length: lengthSrc}, () => Array(26).fill(-1));

    nextCharIndex[lengthSrc - 1][src[lengthSrc - 1].charCodeAt(0) - 'a'.charCodeAt(0)] = lengthSrc - 1;
    
    for (let i = lengthSrc - 2; i >= 0; i--) {
        for (let charCode = 0; charCode < 26; charCode++) {
            nextCharIndex[i][charCode] = nextCharIndex[i + 1][charCode];
        }
        nextCharIndex[i][src[i].charCodeAt(0) - 'a'.charCodeAt(0)] = i;
    }
    
    let currentIdx = 0;
    let sequenceCount = 1;
    
    for (let i = 0; i < tgt.length; i++) {
        if (nextCharIndex[0][tgt[i].charCodeAt(0) - 'a'.charCodeAt(0)] == -1) {
            return -1;
        }
    
        if (currentIdx == lengthSrc || nextCharIndex[currentIdx][tgt[i].charCodeAt(0) - 'a'.charCodeAt(0)] == -1) {
            sequenceCount++;
            currentIdx = 0;
        }
    
        currentIdx = nextCharIndex[currentIdx][tgt[i].charCodeAt(0) - 'a'.charCodeAt(0)] + 1;
    }
    
    return sequenceCount;
};

The provided JavaScript function minimumSequences is focused on finding the minimum number of subsequences required to form a target string tgt using the source string src. This algorithm makes use of dynamic programming to preprocess the source string to optimize the formation of the target string.

  • Initialize a 2D array nextCharIndex to keep track of the nearest index of each character from 'a' to 'z' in the source string at every position. This helps in navigating through the source string efficiently when forming the target string.

  • Populate the nextCharIndex by starting from the end of the source string going backwards. This way, for each character at position i, and for every possible character from 'a' to 'z', store the next position in the source string where this character occurs. If a character does not occur, mark it with -1.

  • Process the target string by moving through each character. Use nextCharIndex to find the next occurrence of the current character of the target string in the source:

    • If the current character of the target doesn't exist in the source at all (checked using nextCharIndex[0] for that character), return -1, as it’s impossible to form the target.
    • If the current position in the source string is out of bounds or if there is no occurrence of the current character in the remaining substring (from the current index to the end), increment sequenceCount (a new subsequence must start from the beginning of the source string) and reset the current index to 0.
    • In either case, set the current index to one past the next occurrence of the current character, so that the following target character is matched further along in the source.
  • At the end, the value contained in sequenceCount will provide the minimum number of subsequences required to form the target string from the source string.

This is an efficient approach for problems that ask for the smallest number of subsequences of a source string whose concatenation equals a target string. The same source may be reused for every subsequence, so the pieces do not need to be disjoint. Similar greedy matching may be useful in applications such as patching files or streams of text, where fewer patching operations are preferable.

  • Python
python
class Solution:
    def minSequencesRequired(self, src: str, tgt: str) -> int:
        # Calculating the length of the source string
        len_src = len(src)
    
        # Initialize next character occurrence mapping: one plain dict per index
        # (plain dicts avoid the missing defaultdict import and accidental key creation)
        next_char_index = [{} for _ in range(len_src)]
    
        # Set the very last character position as its only occurrence
        next_char_index[len_src - 1][src[len_src - 1]] = len_src - 1
    
        # Fill the table for next character occurrences
        for i in range(len_src - 2, -1, -1):
            next_char_index[i] = next_char_index[i + 1].copy()
            next_char_index[i][src[i]] = i
    
        # Initialize the source index and count of paths needed
        src_index = 0
        required_paths = 1
    
        # Check characters in target against the source
        for character in tgt:
            # If the character is missing even from index 0, it is absent from the source
            if character not in next_char_index[0]:
                return -1
    
            # If the source is exhausted or the character does not occur at or after the current index
            if src_index == len_src or character not in next_char_index[src_index]:
                required_paths += 1
                src_index = 0
    
            # Move to the next occurrence of the character
            src_index = next_char_index[src_index][character] + 1
    
        # Return the total paths required to form target from sequences in source
        return required_paths

The provided Python solution aims at counting the minimum number of subsequences from a source string that can be concatenated together to form a target string.

  • Start by defining necessary structures and calculating the length of the source string.
  • A list of dictionaries, next_char_index, is created with one dictionary per source index. The dictionary at index i maps each character that occurs at or after i to its nearest such position. The list is filled from the end of the source towards the beginning, which makes finding the next occurrence of a character efficient.
  • Initialize counters src_index and required_paths to manage the current position in the source during the checking process and the count of subsequences used, respectively.
  • Traverse each character of the target string:
    • If a character in the target doesn't exist in the source from the beginning, return -1 indicating it's impossible to form the target.
    • If the current source index (src_index) reaches the end or character is not ahead in the source, increment the required_paths counter and reset the source index to start from the beginning again.
    • Update the src_index to the position after the next occurrence of the current target character.
  • Return the count required_paths which represents the minimum number of subsequences needed to form the target string from the source.

The solution efficiently constructs the target string by combining a greedy left-to-right scan of the target with a precomputed next-occurrence index of the source.
