Longest Word With All Prefixes

Updated on 05 June, 2025
Longest Word With All Prefixes header image

Problem Statement

In this problem, we are given an array of strings words. Our goal is to find the longest string in the array such that each of its prefixes also appears as separate strings within the array. A prefix of a string is a substring that starts at the beginning of the string and continues up to any point within the string. For instance, in the array ["a", "app", "ap"], the string app and all its prefixes (ap, a) are in the array, making it a valid candidate.

The solution requires checking each string and validating all its prefixes against the provided list. If multiple strings fulfill the criteria of having all prefixes present and are of the same length, the string that is lexicographically smaller (i.e., alphabetically ordered as in a dictionary) should be selected. If no such string exists, the function should return an empty string "".

Examples

Example 1

Input:

words = ["k","ki","kir","kira", "kiran"]

Output:

"kiran"

Explanation:

"kiran" has prefixes "kira", "kir", "ki", and "k", and all of them appear in words.

Example 2

Input:

words = ["a", "banana", "app", "appl", "ap", "apply", "apple"]

Output:

"apple"

Explanation:

Both "apple" and "apply" have all their prefixes in words.
However, "apple" is lexicographically smaller, so we return that.

Example 3

Input:

words = ["abc", "bc", "ab", "qwe"]

Output:

""

Constraints

  • 1 <= words.length <= 105
  • 1 <= words[i].length <= 105
  • 1 <= sum(words[i].length) <= 105
  • words[i] consists only of lowercase English letters.

Approach and Intuition

To find the longest string s in words where every prefix of s is also in words, we can follow a systematic approach leveraging data structures for efficient checking:

  1. First, store all the given strings in a set. This will allow for quickly checking the existence of any string (or prefix) in the array. Using a set is advantageous due to its average O(1) time complexity for look-up operations.

  2. Then, for each string, generate all potential prefixes and verify their existence in the set:

    • This can be done by iteratively considering sub-strings. For a string of length n, its prefixes will be substrings starting at the first character and ending at the 1st, 2nd, ..., nth character respectively.
    • If all prefixes of a string are found in the set, this string is a candidate.
  3. Keep track of the longest candidate found so far. If another string with the same length but lexicographically smaller is found, update the stored answer.

  4. Continue this process for all strings. The string stored after evaluating all potential candidates is the required result.

This outlined approach will ensure that we consider all potential strings efficiently while checking for prefix existence using the set structure for fast look-ups, thus making the solution optimal within the given constraints.

Solutions

  • C++
  • Java
  • Python
cpp
class Solution {
public:
    string findLongestWord(vector<string>& wordsList) {
        WordTrie trie;
        string bestWord = "";

        for (string& word : wordsList) {
            trie.addToTrie(word);
        }

        for (string& word : wordsList) {
            if (trie.checkAllPrefixes(word)) {
                if (word.size() > bestWord.size() ||
                    (word.size() == bestWord.size() && word < bestWord)) {
                    bestWord = word;
                }
            }
        }

        return bestWord;
    }

private:
    class WordTrie {
    public:
        struct Node {
            Node* branches[26] = {nullptr};
            bool endsWord = false;
        };

        Node* rootNode;

        WordTrie() : rootNode(new Node()) {}

        void addToTrie(string& word) {
            Node* currentNode = rootNode;
            for (char letter : word) {
                int idx = letter - 'a';
                if (currentNode->branches[idx] == nullptr) {
                    currentNode->branches[idx] = new Node();
                }
                currentNode = currentNode->branches[idx];
            }
            currentNode->endsWord = true;
        }

        bool checkAllPrefixes(string& word) {
            Node* currentNode = rootNode;
            for (char letter : word) {
                currentNode = currentNode->branches[letter - 'a'];
                if (currentNode == nullptr || !currentNode->endsWord) {
                    return false;
                }
            }
            return true;
        }

        ~WordTrie() { clearMemory(rootNode); }

    private:
        void clearMemory(Node* currentNode) {
            if (currentNode == nullptr) return;
            for (int i = 0; i < 26; i++) {
                clearMemory(currentNode->branches[i]);
            }
            delete currentNode;
        }
    };
};

In this solution, the task is to find the longest word from a list where every prefix of the word also appears in the list. The solution leverages the WordTrie class, an implementation of the Trie data structure, to efficiently manage the words and their prefixes.

Follow these steps for the implementation:

  1. Define the WordTrie class with a nested Node structure that contains an array for child nodes and a boolean to indicate if a word ends at this node.
  2. Implement the addToTrie method, which iterates through each letter of a word, navigates or creates the necessary nodes in the trie, and marks the end of the word.
  3. Implement the checkAllPrefixes method to ensure that each prefix of a word, up to and including the word itself, terminates at a node indicating the end of a word.
  4. In the main findLongestWord function, for each word in the input list, use addToTrie to populate the trie.
  5. Check each word with checkAllPrefixes; update bestWord if the current word is longer than the previously stored longest appropriate word, or if it's of the same length but lexicographically smaller.
  6. Finally, bestWord is returned as the result, providing the longest word fitting the prefix condition.

The solution is efficient because the trie structure allows for fast insertion and prefix checking, crucial for optimal performance with large lists of words or long words. Moreover, memory management is handled by a destructor in the WordTrie to ensure no memory leaks.

java
class Solution {

    public String findLongestWord(String[] dictionary) {
        Trie wordTrie = new Trie();
        String maxValidWord = "";

        // Insert entries into trie
        for (String entry : dictionary) {
            wordTrie.insert(entry);
        }

        // Determine the longest word based on certain conditions
        for (String entry : dictionary) {
            if (wordTrie.checkAllPrefixes(entry)) {
                if (
                    entry.length() > maxValidWord.length() ||
                    (entry.length() == maxValidWord.length() &&
                    entry.compareTo(maxValidWord) < 0)
                ) {
                    maxValidWord = entry;
                }
            }
        }

        return maxValidWord;
    }

    private static class Trie {

        private static class TrieNode {

            TrieNode[] childNodes = new TrieNode[26];
            boolean isCompleteWord;
        }

        private final TrieNode rootNode = new TrieNode();

        // Inserting individual word into trie
        public void insert(String word) {
            TrieNode node = rootNode;
            for (char c : word.toCharArray()) {
                int index = c - 'a';
                if (node.childNodes[index] == null) {
                    node.childNodes[index] = new TrieNode();
                }
                node = node.childNodes[index];
            }
            node.isCompleteWord = true;
        }

        // Verifying if every prefix of word exists
        public boolean checkAllPrefixes(String word) {
            TrieNode node = rootNode;
            for (char c : word.toCharArray()) {
                node = node.childNodes[c - 'a'];
                if (node == null || !node.isCompleteWord) {
                    return false;
                }
            }
            return true;
        }
    }
}

The provided Java program aims to find the longest word in a given array of strings (dictionary) such that every prefix of the word is also a word in the dictionary. This task is efficiently achieved using a Trie data structure, which is especially suited for quickly querying and managing prefixes of strings. Here's how the solution works:

  • Trie Structure and Insertion:

    • The Trie consists of nested TrieNode objects, each containing an array of 26 TrieNode references (one for each letter of the alphabet) and a boolean indicating if the node marks the completion of a valid word present in the dictionary.
    • The insert method populates the Trie with words from the dictionary. For each character in the word, a node is created if it does not already exist in the correct position, thereby building a path through the Trie corresponding to each word.
  • Word Validation:

    • The checkAllPrefixes method validates whether all prefixes of a given word are complete words in the dictionary. It traverses the Trie following the path dictated by the word's characters. If it encounters a node that either doesn't exist or isn't marked as a complete word, the method returns false.
  • Finding the Longest Word:

    • After the Trie is populated, the program checks each word in the dictionary to see if all its prefixes exist using the checkAllPrefixes method.
    • During this check, words meeting the prefix criteria are compared against each other to find the longest one. In case of ties (words of the same length), lexicographical order determines the preferred word.

With this approach, the program efficiently identifies the longest word with all prefixes present in the dictionary without redundant checks, making optimal use of the Trie's structure to minimize unnecessary work. This method is far more efficient than brute-force checking all possible prefixes of all words directly against the dictionary.

python
class Solution:
    def findLongestWord(self, word_list: List[str]) -> str:
        trie_structure = TrieStructure()
        longest_word = ""

        # Inserting all words
        for word in word_list:
            trie_structure.insertWord(word)

        # Validate each word for longest with all prefixes
        for word in word_list:
            if trie_structure.checkAllPrefixes(word):
                if len(word) > len(longest_word) or (len(word) == len(longest_word) and word < longest_word):
                    longest_word = word

        return longest_word


class TrieStructure:
    class Node:
        def __init__(self):
            self.children = {}
            self.terminal = False

    def __init__(self):
        self.root = self.Node()

    # Adding a word
    def insertWord(self, word):
        node = self.root
        for letter in word:
            if letter not in node.children:
                node.children[letter] = self.Node()
            node = node.children[letter]
        node.terminal = True

    # Prefix Check
    def checkAllPrefixes(self, word):
        node = self.root
        for letter in word:
            if letter not in node.children or not node.children[letter].terminal:
                return False
            node = node.children[letter]
        return True

The provided Python code defines a solution to extract the longest word from a list that contains all its prefixes, using a Trie structure for efficient query and insertion operations.

Here’s a breakdown of the solution:

  • The Solution class has a method findLongestWord that accepts word_list as input and returns the longest word that includes all its prefixes using the TrieStructure.

  • Each word from word_list is first inserted into the Trie using the insertWord method of the TrieStructure.

  • The Trie verifies each word to see if all its prefixes exist using the checkAllPrefixes method, keeping track of the longest word that satisfies this condition.

  • Each word is also checked to ensure that it is lexicographically smaller in cases where words of the same length exist.

  • The TrieStructure class uses an inner Node class:

    • Each Node has a dictionary children for storing characters and a boolean flag terminal to mark the end of a word.

    • insertWord creates a path in the Trie for each character of the word. If a character isn't present, it initializes a new Node.

    • checkAllPrefixes verifies if each character while traversing the word exists and checks if it marks a terminal of a prefix word.

This use of a Trie significantly improves the search efficiency, especially beneficial when working with a large set of words to handle multiple prefix checks quickly. With the given implementation, you are equipped to efficiently find the longest word meeting the criteria from an extensive list.

Comments

No comments yet.