Tokenize text using NLTK in Python.

Description:

NLTK (Natural Language Toolkit) is one of the leading platforms for working with human language data in Python. The nltk module is used for natural language processing.

Follow this link if you want to know more about NLTK.



Tokenize Words:

A sentence, or any other piece of text, can be split into words using the word_tokenize() function.

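# Needs NLTK's Punkt tokenizer data; if it is missing, NLTK raises a
# LookupError naming the resource to fetch with nltk.download().
from nltk.tokenize import word_tokenize

data = "Hey! How are You ?"
print(word_tokenize(data))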

Output:
['Hey', '!', 'How', 'are', 'You', '?']

All of the tokens are words except the exclamation mark and the question mark. Punctuation marks are treated as separate tokens.
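
Since punctuation comes back as separate tokens, a common next step is to separate the words from the punctuation. The following is a minimal sketch of one way to do that with plain string methods. The token list is hard-coded from the output above so the snippet runs without NLTK, and the names words and punctuation are only illustrative.

```python
# Token list copied from the word_tokenize() output shown above,
# hard-coded so this snippet runs without NLTK installed.
tokens = ['Hey', '!', 'How', 'are', 'You', '?']

# Keep tokens made up entirely of letters.
words = [tok for tok in tokens if tok.isalpha()]

# Collect tokens that contain no letters or digits at all.
punctuation = [tok for tok in tokens if not any(ch.isalnum() for ch in tok)]

print(words)        # ['Hey', 'How', 'are', 'You']
print(punctuation)  # ['!', '?']
```

Note that isalpha() also drops tokens such as numbers or words containing apostrophes, so the filter may need adjusting for other text.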

Tokenize Sentences:

The same principle can be applied to sentences. We can split a text into sentences using the sent_tokenize() function.

from nltk.tokenize import sent_tokenize

data = "Hey! How are You ? Hey! How are You ? "
print(sent_tokenize(data))

Output:

['Hey!', 'How are You ?', 'Hey!', 'How are You ?']
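
To get a feel for what sentence splitting involves, here is a rough stand-in written with the standard re module. It is not how NLTK works: sent_tokenize() relies on the pre-trained Punkt model, while the hypothetical helper naive_sent_split() below simply assumes that a sentence ends at '.', '!' or '?' followed by whitespace.

```python
import re


def naive_sent_split(text):
    """Split text after '.', '!' or '?' when whitespace follows.

    Illustrative helper only (an assumption for this sketch),
    not NLTK's algorithm.
    """
    return re.split(r"(?<=[.!?])\s+", text.strip())


data = "Hey! How are You ? Hey! How are You ? "
sentences = naive_sent_split(data)
print(sentences)  # ['Hey!', 'How are You ?', 'Hey!', 'How are You ?']

# The naive rule breaks on abbreviations:
broken = naive_sent_split("Dr. Smith arrived. He sat down.")
print(broken)  # ['Dr.', 'Smith arrived.', 'He sat down.']
```

The second call shows why a simple rule is not enough for real text: the abbreviation "Dr." is wrongly treated as a sentence end. Cases like this are what a trained sentence tokenizer is designed to handle.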

NLTK and Lists:

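Both functions return ordinary Python lists, so the results can be stored in variables and reused.

from nltk.tokenize import sent_tokenize, word_tokenize

data = "Hey! Are you okay? Hey! Are you okay?"
sen = sent_tokenize(data)      # list of sentences
words = word_tokenize(data)    # list of word and punctuation tokens
print(sen)
print(words)

Output: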

['Hey!', 'Are you okay?', 'Hey!', 'Are you okay?']
['Hey', '!', 'Are', 'you', 'okay', '?', 'Hey', '!', 'Are', 'you', 'okay', '?']
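
To illustrate working with these results, here is a minimal sketch that indexes, counts, and de-duplicates the two lists. The values are hard-coded from the output above so the snippet runs without NLTK. Counter and dict.fromkeys() are standard-library choices made for this sketch, not part of the original example.

```python
from collections import Counter

# Values copied from the output above (hard-coded, so NLTK is not required).
sen = ['Hey!', 'Are you okay?', 'Hey!', 'Are you okay?']
words = ['Hey', '!', 'Are', 'you', 'okay', '?',
         'Hey', '!', 'Are', 'you', 'okay', '?']

# Ordinary list operations work on the results.
print(len(sen))    # 4
print(len(words))  # 12
print(sen[1])      # Are you okay?

# Count how often each token occurs.
counts = Counter(words)
print(counts['Hey'])  # 2

# Remove duplicate sentences while keeping their original order.
unique_sentences = list(dict.fromkeys(sen))
print(unique_sentences)  # ['Hey!', 'Are you okay?']
```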

