Trie is an efficient information retrieval data structure. Using trie, search complexities can be brought to optimal limit (key length). If we store keys in binary search tree, a well balanced BST will need time proportional to M * log N, where M is maximum string length and N is number of keys in tree. Using trie, we can search the key in O(M) time. However the penalty is on trie storage requirements.
Every node of trie consists of multiple branches. Each branch represents a possible character of keys. We need to mark the last node of every key as leaf node. A trie node field value will be used to distinguish the node as leaf node (there are other uses of the value field). A simple structure to represent nodes of English alphabet can be as following,
struct
trie_node
{
int
value; /* Used to mark leaf nodes */
trie_node_t *children[ALPHABET_SIZE];
};
Searching for a key is similar to insert operation, however we only compare the characters and move down. The search can terminate due to end of string or lack of key in trie. In the former case, if the value field of last node is non-zero then the key exists in trie. In the second case, the search terminates without examining all the characters of key, since the key is not present in trie.
The following picture explains construction of trie using keys given in the example below,
root / \ \ t a b | | | h n y | | \ | e s y e / | | i r w | | | r e e | r
Insert and search costs O(key_length), however the memory requirements of trie isO(ALPHABET_SIZE * key_length * N) where N is number of keys in trie. There are efficient representation of trie nodes (e.g. compressed trie, ternary search tree, etc.) to minimize memory requirements of trie.
Code (Java):
From this website: http://pathakalgo.blogspot.com/2012/11/trie-data-structure-implementation-in.html
public class PrefixTree { static TrieNode createTree() { return(new TrieNode('\0', false)); } static void insertWord(TrieNode root, String word) { int offset = 97; int l = word.length(); char[] letters = word.toCharArray(); TrieNode curNode = root; for (int i = 0; i < l; i++) { if (curNode.links[letters[i]-offset] == null) curNode.links[letters[i]-offset] = new TrieNode(letters[i], i == l-1 ? true : false); curNode = curNode.links[letters[i]-offset]; } } static boolean find(TrieNode root, String word) { char[] letters = word.toCharArray(); int l = letters.length; int offset = 97; TrieNode curNode = root; int i; for (i = 0; i < l; i++) { if (curNode == null) return false; curNode = curNode.links[letters[i]-offset]; } if (i == l && curNode == null) return false; if (curNode != null && !curNode.fullWord) return false; return true; } static void printTree(TrieNode root, int level, char[] branch) { if (root == null) return; for (int i = 0; i < root.links.length; i++) { branch[level] = root.letter; printTree(root.links[i], level+1, branch); } if (root.fullWord) { for (int j = 1; j <= level; j++) System.out.print(branch[j]); System.out.println(); } } public static void main(String[] args) { TrieNode tree = createTree(); String[] words = {"an", "ant", "all", "allot", "alloy", "aloe", "are", "ate", "be".}; for (int i = 0; i < words.length; i++) insertWord(tree, words[i]); char[] branch = new char[50]; printTree(tree, 0, branch); String searchWord = "all"; if (find(tree, searchWord)) { System.out.println("The word was found"); } else { System.out.println("The word was NOT found"); } } } class TrieNode { char letter; TrieNode[] links; boolean fullWord; TrieNode(char letter, boolean fullWord) { this.letter = letter; links = new TrieNode[26]; this.fullWord = fullWord; } }
No comments:
Post a Comment