C++ Trie Datastructure














































C++ Trie Datastructure



Tries:

Tries are an extremely special and useful data-structure that are based on the prefix of a string. They are used to represent the "Retrieval" of data and thus the name Trie.

Prefix : What is prefix:

The prefix of a string is nothing but any n letters n%u2264|S| that can be considered beginning strictly from the starting of a string. For example , the word "abacaba" has the following prefixes:

a
ab
aba
abac
abaca
abacab

A Trie is a special data structure used to store strings that can be visualized like a graph. It consists of nodes and edges. Each node consists of at max 26 children and edges connect each parent node to its children. These 26 pointers are nothing but pointers for each of the 26 letters of the English alphabet A separate edge is maintained for every edge.

Strings are stored in a top to bottom manner on the basis of their prefix in a trie. All prefixes of length 1 are stored at until level 1, all prefixes of length 2 are sorted at until level 2 and so on.

For example , consider the following diagram : enter image description here

Now, one would be wondering why to use a data structure such as a trie for processing a single string? Actually, Tries are generally used on groups of strings, rather than a single string. When given multiple strings , we can solve a variety of problems based on them. For example, consider an English dictionary and a single string s, find the prefix of maximum length from the dictionary strings matching the string s. Solving this problem using a naive approach would require us to match the prefix of the given string with the prefix of every other word in the dictionary and note the maximum. The is an expensive process considering the amount of time it would take. Tries can solve this problem in much more efficient way.

Before processing each Query of the type where we need to search the length of the longest prefix, we first need to add all the existing words into the dictionary. A Trie consists of a special node called the root node. This node doesn't have any incoming edges. It only contains 26 outgoing edfes for each letter in the alphabet and is the root of the Trie.

So, the insertion of any string into a Trie starts from the root node. All prefixes of length one are direct children of the root node. In addition, all prefixes of length 2 become children of the nodes existing at level one.

The pseudo code for insertion of a string into a tire would look as follows:

void insert(String s)
{
for(every char in string s)
{
if(child node belonging to current char is null)
{
child node
=new Node();
}
current_node
=child_node;
}
}

The pseudo code to check wether a single word exists in a dictionary of words or not is as follows:

boolean check(String s)
{
for(every char in String s)
{
if(child node is null)
{
return false;
}
}
return true;
}
Lets us see one program on trie

We're going to make our own Contacts application! The application must perform two types of operations:

  1. add name, where name is a string denoting a contact name. This must store name as a new contact in the application.
  2. find partial, where partial is a string denoting a partial name to search the application for. It must count the number of contacts starting with partial and print the count on a new line.

Given n sequential add and find operations, perform each operation in order.

Input Format:

The first line contains a single integer, n, denoting the number of operations to perform. 
Each line i of the n subsequent lines contains an operation in one of the two forms defined above.

Output Format

For each find partial operation, print the number of contact names starting with partial on a new line.

Below is the implementation in C++ :

#include<bits/stdc++.h>
#include<iostream>
using namespace std;
const int alpha=26;
struct node{
node* child[alpha];
bool end;
int countchild;
};
node* getnode(){
node* pnode=new node;
pnode->end=false;
pnode->countchild=0;
for(int i=0;i<alpha;i++){
pnode->child[i]=NULL;
}
return pnode;
}
void insert(node* root,string key){
node* pcrawl=root;
int size=key.length();
for(int i=0;i<size;i++){
int index=key[i]-'a';
if(!pcrawl->child[index]){
pcrawl->child[index]=getnode();
}
pcrawl->countchild+=1;
pcrawl=pcrawl->child[index];
}
pcrawl->end=true;
}

bool search(node* root,string key){
node* pcrawl=root;
int size=key.length();
for(int i=0;i<size;i++){
int index=key[i]-'a';
if(!pcrawl->child[index]){
return false;
}
pcrawl=pcrawl->child[index];
}
return (pcrawl!=NULL and pcrawl->end);
}
int searchchild(node* root,string key){
node* pcrawl=root;
int size=key.length();
for(int i=0;i<size;i++){
int index=key[i]-'a';
if(!pcrawl->child[index]){
return 0;
}
pcrawl=pcrawl->child[index];
}
if (pcrawl==NULL){
return 0;
}
else{
return pcrawl->countchild;
}
}
int main(){
int n;
string s,s1,s2;
node* root=getnode();
cin>>n;
cin.sync();
for(int i=0;i<n;i++){
getline(cin,s);
bool find=false;
s1="";
s2="";
for(unsigned int i=0;i<s.length();i++){
if(s[i]==' '){
find=true;
continue;
}
if(!find){
s1+=s[i];
}
else{
s2+=s[i];
}
}
if(s1.compare("add")==0){
insert(root,s2);
}
else{
cout<<searchchild(root,s2)<<endl;
}

}
}

Sample Output:

4
add hack
add hackerrank
find hac
find hak

  2

0



More Articles of M Mounika:

Name Views Likes
C++ Segmented Sieve (Print Primes In a Range) 162 0
C++ Sieve Of Erastosthenes 135 0
C++ Gold Mine Problem 295 0
C++ Merge K Sorted Arrays 117 0
C++ K Centers Problem 240 0
C++ Find Nth Catalan Number 311 0
C++ Inplace Rotate square matrix by 90 degrees 285 0
C++ Find Non Repeating Elements in Array 87 0
C++ Merge Two Binary Trees 120 0
C++ Sum of Numbers From Root To Leaf Paths 89 0
C++ Meta Strings 91 0
C++ Flood Fill Algorithm 402 0
C++ smallest substring with maximum distinct characters 199 0
C++ Smallest window with all characters in string 93 0
C++ Minimum Removal of Characters from string to make its permutation as palindrome 86 0
C++ Minimum characters added at front of string in palindrome conversion 69 0
C++ Number of Bracket Reversals needed to make expression Balanced 72 0
C++ String to Palindrome with Append Function 83 0
C++ WildCard pattern matching 75 0
C++ Anagram substring Search 72 0
C++ Manachars Algorithm 74 0
C++ Search String in Grid 83 0
C++ String Matching(Z Algorithm) 67 0
C++ String Matching(Naive Algorithm) 113 0
C++ String Matching(KMP Algorithm) 140 0
C++ Remove Duplicates From String 110 0
C++ Basics of String Manipulation 85 1
C++ Disjoint Data Structure Cycle Detection 86 0
C++ Problem On Disjoint Data Structures 94 0
C++ Disjoint Data Structures Part3 79 0
Disjoint Data Structures Part2 90 0
Disjoint Data Structures 93 1
C++ Segment Trees 321 2
C++ Trie Cost of Data 290 1
C++ Trie Datastructure 279 1
C++ Greedy Approach Minimum number of coins 525 0
C++ Greedy Approach Maximum height Pyramid 328 1
C++ Greedy Approach String lexicographically largest subsequence 246 0
C++ Greedy Approach Lexicographically largest subsequence 364 0
C++ Greedy Approach Prims MST 398 1
C++ Greedy Approach Krushkals MST 458 1
C++ Greedy Approach N-array maximum sum 333 1
C++ Greedy Approach Policemen Catch Thieves 563 1
C++ Greedy Approach Maximum product Subset 546 1
C++ Greedy Approach Minimum Product Subset 348 1
C++ Greedy Approach Fractional Knapsack 737 1
C++ Greedy Approach-Activity Selection Problem 745 1
C++ Greedy Approach-Egyptian Fractions 639 0
C++ Greedy Approach-Huffman Codes 1031 1
C++ Introduction to Greedy Approach 955 2

Comments