14. Longest Common Prefix - Solutions and Explanation

Problem Statement

The task involves writing a function that determines the longest common prefix (LCP) across a set of strings contained within an array. The LCP is the initial segment of characters that is shared by all the strings in the input array. For example, from the strings "flower", "flow", and "flight", the LCP is "fl". This problem specifies that if the strings do not share any common prefix, the function should return an empty string "". The procedure should efficiently handle varying lengths and contents of strings while adhering to specific constraints.

Examples

Example 1

Input:

strs = ["flower","flow","flight"]

Output:

"fl"

Example 2

Input:

strs = ["dog","racecar","car"]

Output:

""

Explanation:

There is no common prefix among the input strings.

Constraints

1 <= strs.length <= 200
0 <= strs[i].length <= 200
strs[i] consists of only lowercase English letters if it is non-empty.

Approach and Intuition

To solve the problem of finding the longest common prefix amongst an array of strings, we can consider several strategies. Let's break down the intuitive approach using the given examples and constraints:

Horizontal Scanning:
- Start by assuming the first string in the array is the common prefix.
- Compare this prefix with the next string in the array, shortening the prefix from the end until it matches the start of this string.
- Continue this process with each string in the array until you either find a common prefix for all or determine there is none.
Vertical Scanning:
- Look at each character position across all strings simultaneously, confirming all strings have the same character at each position until you encounter a mismatch.
- This method is efficient since it stops comparing as soon as a mismatch is found, potentially reducing the number of total comparisons.
Binary Search Approach:
- Use binary search on the length of the shortest string to find the maximum length of the LCP.
- Check incrementally (using a divide and conquer approach) whether a substring is the common prefix of all strings.

Given the constraints:

Each array contains at least 1 and a maximum of 200 strings.
Each string can vary in length from 0 to 200 and contains only lowercase English letters.

From these constraints, both horizontal and vertical scanning are viable. Vertical scanning might be particularly efficient since you can stop as soon as any string does not contain a particular character at a checked index, optimizing character comparisons based on the shortest string’s length.

Solutions

C++
Java
C
JavaScript
Python

                            cpp
                            
                        
class Node {
public:
    Node* links[26];
    bool isTerminal;
    int childCount;
    
    Node() : isTerminal(false), childCount(0) {
        for (int i = 0; i < 26; ++i) {
            links[i] = nullptr;
        }
    }
    
    void insertChild(char ch, Node* node) {
        if (links[ch - 'a'] == nullptr) {
            links[ch - 'a'] = node;
            childCount++;
        }
    }
    
    bool hasChar(char ch) { return links[ch - 'a'] != nullptr; }
    
    int childLinks() const { return childCount; }
};
    
class TrieStructure {
public:
    Node* root;
    
    TrieStructure() { root = new Node(); }
    
    void addWord(string word) {
        Node* currentNode = root;
        for (char c : word) {
            if (!currentNode->hasChar(c)) {
                currentNode->insertChild(c, new Node());
            }
            currentNode = currentNode->links[c - 'a'];
        }
        currentNode->isTerminal = true;
    }
    
    string findLongestPrefix(string word) {
        Node* currentNode = root;
        string longPrefix;
        for (char c : word) {
            if (currentNode->hasChar(c) && currentNode->childLinks() == 1 && !currentNode->isTerminal) {
                longPrefix += c;
                currentNode = currentNode->links[c - 'a'];
            } else {
                break;
            }
        }
        return longPrefix;
    }
};
    
class Solution {
public:
    string longestCommonPrefix(string query, vector<string>& strings) {
        if (strings.empty()) return "";
        if (strings.size() == 1) return strings[0];
    
        TrieStructure trie;
        for (int i = 1; i < strings.size(); i++) {
            trie.addWord(strings[i]);
        }
        return trie.findLongestPrefix(query);
    }
};

The provided C++ code implements a solution to find the longest common prefix among a set of strings using a trie. The implementation involves several classes that manage the trie data structure and the search logic.

Classes used in the solution:

Node: Represents a single node in the trie. Each node contains:
- An array of Node pointers representing the 26 lowercase English letters.
- A boolean isTerminal to indicate if the node marks the end of a word.
- An integer childCount to track the number of children a node has.
TrieStructure: Manages the overall trie operations. Key functions include:
- addWord(string): Inserts a word into the trie.
- findLongestPrefix(string): Determines the longest prefix for a given word that matches prefixes in the trie.
Solution: Contains the longestCommonPrefix function which uses the TrieStructure to compute the longest common prefix for a query string compared against a list of other strings.

Operational Steps:

Initialize the trie.
Insert each string from the provided list into the trie, excluding the first string.
Use the trie to extract the longest common prefix for the first string in the list relative to the other strings.

Edge Cases Considered:

If the list of strings is empty, the function returns an empty string.
If there's only one string in the list, it returns that string as the longest common prefix.

This implementation efficiently builds a trie with the strings and uses it to determine the common prefix, ensuring that the prefix is shared across all provided strings. The use of the trie structure optimizes the search and check operations for the common prefix.

                            java
                            
                        
class PrefixSolver {
    public String findLongestCommonPrefix(String base, String[] words) {
        if (words == null || words.length == 0) return "";
        if (words.length == 1) return words[0];
        PrefixTree prefixTree = new PrefixTree();
        for (int i = 1; i < words.length; i++) {
            prefixTree.insert(words[i]);
        }
        return prefixTree.searchLongestPrefix(base);
    }
}
    
class PrefixTreeNode {
    private PrefixTreeNode[] children;
    private final int ALPHABET_SIZE = 26;
    private boolean terminal;
    private int childCount;
    
    public PrefixTreeNode() {
        children = new PrefixTreeNode[ALPHABET_SIZE];
    }
    
    public boolean contains(char c) {
        return children[c - 'a'] != null;
    }
    
    public PrefixTreeNode get(char c) {
        return children[c - 'a'];
    }
    
    public void put(char c, PrefixTreeNode node) {
        children[c - 'a'] = node;
        childCount++;
    }
    
    public int getChildrenCount() {
        return childCount;
    }
    
    public void markEnd() {
        terminal = true;
    }
    
    public boolean isEnd() {
        return terminal;
    }
}
    
class PrefixTree {
    private PrefixTreeNode root;
    
    public PrefixTree() {
        root = new PrefixTreeNode();
    }
    
    public void insert(String word) {
        PrefixTreeNode current = root;
        for (int i = 0; i < word.length(); i++) {
            char letter = word.charAt(i);
            if (!current.contains(letter)) {
                current.put(letter, new PrefixTreeNode());
            }
            current = current.get(letter);
        }
        current.markEnd();
    }
    
    public String searchLongestPrefix(String word) {
        PrefixTreeNode node = root;
        StringBuilder prefix = new StringBuilder();
        for (int i = 0; i < word.length(); i++) {
            char letter = word.charAt(i);
            if (node.contains(letter) && node.getChildrenCount() == 1 && !node.isEnd()) {
                prefix.append(letter);
                node = node.get(letter);
            } else {
                break;
            }
        }
        return prefix.toString();
    }
}

The provided Java solution incorporates a trie (prefix tree) data structure to solve the "Longest Common Prefix" problem. The approach builds a trie from a list of words and then utilizes this trie to determine the longest common prefix starting from a given base word. Below is an outline of the code's functionality:

Classes and Methods:
- PrefixSolver class:
  - findLongestCommonPrefix() method: Initializes the process by constructing a PrefixTree and inserting words from the list into this tree, except for the first word which serves as the 'base'. It then searches for the longest prefix in the base word that matches prefixes in the constructed tree.
- PrefixTreeNode class: Represents each node in the trie with pointers to child nodes, indication of end of word, and the count of children.
  - Methods like contains(), get(), put(), markEnd(), isEnd(), and a method to count children help manage node properties.
- PrefixTree class:
  - insert() method: Facilitates the insertion of a word into the trie.
  - searchLongestPrefix() method: Searches for the longest common prefix given the base word utilizing characteristics of the trie such as single children and non-terminal nodes to ensure the prefix is common among words.
Flow of Execution:
1. A PrefixTree is instantiated and all words, except the base word, are inserted into this trie.
2. For each character in the base word, the trie is checked:
  - If the node for the character exists, has only one child, and isn't terminal, it adds the character to the prefix result.
  - The search stops when either of these conditions fails.
3. The accumulated characters form the longest common prefix which is then returned.
Edge Cases Handled:
- The function returns an empty string if the list of words is null or empty.
- Directly returns the single word if only one word exists in the list.

This solution efficiently identifies the longest common prefix using a trie structure, optimizing searches and insertions, and is particularly effective for large sets of words or when multiple prefix queries are required.

                            c
                            
                        
typedef struct TrieElement {
    struct TrieElement *childNodes[26];
    bool endsWord;
    int childCount;  // Number of non-NULL child nodes
} TrieElement;
    
TrieElement *newTrieNode() {
    TrieElement *element = (TrieElement *)malloc(sizeof(TrieElement));
    if (element) {
        memset(element, 0, sizeof(TrieElement));
        element->childCount = 0;
    }
    return element;
}
    
void addItem(TrieElement *root, char *item) {
    TrieElement *current = root;
    for (int pos = 0; item[pos] != '\0'; pos++) {
        int index = item[pos] - 'a';
        if (current->childNodes[index] == NULL) {
            current->childNodes[index] = newTrieNode();
            current->childCount++;
        }
        current = current->childNodes[index];
    }
    current->endsWord = true;
}
    
char *findLongestPrefix(TrieElement *root, char *item) {
    TrieElement *current = root;
    static char longestPrefix[101];
    int length = 0;
    for (int pos = 0; item[pos] != '\0'; pos++) {
        int index = item[pos] - 'a';
        if (current->childNodes[index] != NULL && current->childCount == 1 &&
            !current->endsWord) {
            longestPrefix[length++] = item[pos];
            current = current->childNodes[index];
        } else {
            break;
        }
    }
    longestPrefix[length] = '\0';
    return longestPrefix;
}
    
char *longestCommonPrefix(char *query, char **strings, int count) {
    if (count == 0) return "";
    if (count == 1) return strings[0];
    
    TrieElement *root = newTrieNode();
    for (int i = 1; i < count; i++) {
        addItem(root, strings[i]);
    }
    return findLongestPrefix(root, query);
}

The provided C code demonstrates an algorithm to find the longest common prefix among a list of strings by using a Trie (prefix tree) data structure. Here's a breakdown of the solution:

Trie Structure:
- The code defines a TrieElement struct, representing each node in the trie. Each node contains pointers to child nodes (childNodes), a boolean indicating if a word ends at this node (endsWord), and a count of non-null child nodes (childCount).
Trie Functions:
- newTrieNode(): Initializes and returns a new trie node with all values set to defaults.
- addItem(TrieElement *root, char *item): Inserts a string item into the trie rooted at root. It traverses the trie character by character, adding new nodes as necessary.
- findLongestPrefix(TrieElement *root, char *item): Starting at the root, this function checks through each character of item and traverses accordingly. The function captures the longest sequence of nodes with exactly one child and not marking the end of any other string, effectively determining the longest common prefix relative to item.
Longest Common Prefix Calculation:
- longestCommonPrefix(char *query, char **strings, int count): Handles edge cases such as empty or single string lists. For other cases, it constructs the trie with all strings except the first, and then it determines the longest common prefix to query using the trie.

Utilize the functions provided to effectively manage and query data stored in the trie structure for various applications, especially in scenarios where prefix matching is frequently required, such as in auto-complete systems.

                            js
                            
                        
class Node {
    constructor() {
        this.children = {};
        this.endOfWord = false;
        this.childLinks = 0; // Track number of unique children
    }
}
    
class TrieStructure {
    constructor() {
        this.root = new Node();
    }
    
    insert(word) {
        let currentNode = this.root;
        for (let character of word) {
            if (!currentNode.children[character]) {
                currentNode.children[character] = new Node();
                currentNode.childLinks++;
            }
            currentNode = currentNode.children[character];
        }
        currentNode.endOfWord = true;
    }
    
    findLongestCommonPrefix(standard) {
        let currentNode = this.root;
        let commonPrefix = "";
        for (let character of standard) {
            if (currentNode.children[character] && currentNode.childLinks == 1 && !currentNode.endOfWord) {
                commonPrefix += character;
                currentNode = currentNode.children[character];
            } else {
                break;
            }
        }
        return commonPrefix;
    }
}
    
var longestCommonPrefix = function (query, stringList) {
    if (stringList.length === 0) return "";
    if (stringList.length === 1) return stringList[0];
    let trie = new TrieStructure();
    for (let i = 1; i < stringList.length; i++) {
        trie.insert(stringList[i]);
    }
    return trie.findLongestCommonPrefix(query);
};

The solution implements a function in JavaScript to find the longest common prefix among a list of strings by using a Trie data structure. This approach involves creating a framework with two classes: Node and TrieStructure.

Node class:
- Manages individual nodes within the Trie, each holding:
  - A dictionary of child nodes.
  - A boolean flag to indicate the end of a word.
  - A count of unique child nodes directly connected to the current node, childLinks.
TrieStructure class:
- Manages the entire Trie with operations:
  - insert: Takes a word and inserts it into the Trie. For each character in the word, if it doesn't exist as a child of the current node, it is added, and childLinks is incremented.
  - findLongestCommonPrefix: Accepts a reference string (standard). It iterates over each character of the string, advancing through the Trie while there is a matching child and there is only one unique child (childLinks equals 1) and the end of a word has not been reached. It accumulates matched characters as the common prefix.
longestCommonPrefix function:
- Handles edge cases and employs the Trie to ascertain the longest common prefix.
  - Returns an empty string if the provided list of strings (stringList) is empty.
  - Directly returns the single string if stringList contains only one item.
  - Iterates over the list (excluding the first string assumed to be the query string for comparison), inserts them into the Trie, and invokes findLongestCommonPrefix with the query.

The provided solution optimizes the process of finding common prefixes by structurally mapping each string once into the Trie and subsequently leveraging the Trie's properties to efficiently discover the longest common sequence that appears at the start of the query and other strings in stringList.

                            python
                            
                        
class TrieNode:
    def __init__(self):
        self.edges = {}
        self.endOfString = False
        self.countLinks = 0
    
    def addNode(self, letter):
        if letter not in self.edges:
            self.edges[letter] = TrieNode()
            self.countLinks += 1
    
    
class PrefixTrie:
    def __init__(self):
        self.head = TrieNode()
    
    def insertWord(self, string):
        currentNode = self.head
        for char in string:
            if char not in currentNode.edges:
                currentNode.addNode(char)
            currentNode = currentNode.edges[char]
        currentNode.endOfString = True
    
    def longestMatchingPrefix(self, string):
        currentNode = self.head
        prefixCharacters = []
        for letter in string:
            if letter in currentNode.edges and currentNode.countLinks == 1 and not currentNode.endOfString:
                prefixCharacters.append(letter)
                currentNode = currentNode.edges[letter]
            else:
                break
        return "".join(prefixCharacters)
    
    
class PrefixSolution:
    def longestPrefixMatch(self, query, words):
        if not words:
            return ""
        if len(words) == 1:
            return words[0]
        trie = PrefixTrie()
        for word in words[1:]:
            trie.insertWord(word)
        return trie.longestMatchingPrefix(query)

The solution provided uses a Trie (prefix tree) data structure to efficiently determine the longest common prefix among a group of strings. The solution is implemented in Python and involves the following classes and methods:

TrieNode: This class represents each node in the prefix tree. It contains:
- edges: a dictionary to store links to child nodes.
- endOfString: a Boolean flag indicating if the node corresponds to the end of a string in the trie.
- countLinks: an integer that counts the child nodes. This helps in determining the common prefix during traversal.
PrefixTrie: This class manages the trie operations. It includes:
- head: the root node of the trie.
- insertWord: a method to insert words into the trie. It adds nodes for each character of the string if not already present.
- longestMatchingPrefix: a method to determine the longest common prefix by traversing the trie from the root node and collecting characters that have exactly one link and are not the end of a string.
PrefixSolution: This class is designed to utilize the PrefixTrie for finding the longest common prefix of a query string compared to other words:
- longestPrefixMatch: a method to set up the trie with the given list of words and then determine the longest prefix that matches the query.

Note the following steps to utilize the code for solving the problem:

Create an instance of PrefixSolution.
Call the longestPrefixMatch with the query string and the list of words.

This approach ensures that each query runs in O(m) time, where m is the length of the string, after an initial setup time of O(n * l) for inserting n words of average length l into the trie. This is significantly efficient compared to brute force solutions, especially for a large number of queries and words.

Comments

No comments yet.

Longest Common Prefix

Problem Statement

Examples

Example 1

Example 2

Constraints

Approach and Intuition

Solutions

Comments

Products

Features

Solutions

Marketplace

Resources

Company

Tech Talks

Vultr Blogs