Python re Introduction

A regular expression (RegEx) is a sequence of characters that defines a search pattern for finding a string or set of strings. It can tell us whether a substring is present in a given string and where it occurs. It can also perform other operations, such as splitting a string into one or more substrings on a given delimiter.
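As a minimal sketch of these three operations, the snippet below uses re.search to test for a substring and report its position, and re.split to break a string on delimiters. The sample text and variable names are made up for illustration.

```python
import re

# Hypothetical sample text with two different delimiters
text = "apple,banana;cherry"

# search() returns a match object for the first occurrence, or None if absent
match = re.search(r"banana", text)
found = match is not None

# span() gives the (start, end) position of the match within the string
position = match.span() if match else None

# split() breaks the string wherever the pattern (a comma or semicolon) matches
parts = re.split(r"[,;]", text)

print(found)     # True
print(position)  # (6, 12)
print(parts)     # ['apple', 'banana', 'cherry']
```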

Regular expressions and raw strings

Regular expressions use the backslash character ('\') to indicate special forms or to allow a special character to be used without invoking its special meaning. This collides with Python's use of the same character for the same purpose in string literals. For example, the string "\n" is a single newline character, whereas r"\n" is a two-character string containing '\' and 'n'. For this reason, patterns are usually written as raw strings.
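The difference can be checked directly. The sketch below compares the lengths of the two literals and then shows one case where the collision changes the result of a match: the \b escape, which Python reads as a backspace character in a normal string but which the re module reads as a word boundary. The sample text is an assumption chosen for illustration.

```python
import re

normal = "\n"   # one character: a newline
raw = r"\n"     # two characters: a backslash followed by 'n'

print(len(normal), len(raw))  # 1 2

# Hypothetical sample text
text = "cat concat"

# Without a raw string, Python turns "\b" into a backspace character,
# so the pattern looks for backspace + "cat" + backspace and finds nothing.
without_raw = re.findall("\bcat\b", text)

# With a raw string, the backslash reaches the re module intact,
# where \b means "word boundary".
with_raw = re.findall(r"\bcat\b", text)

print(without_raw)  # []
print(with_raw)     # ['cat']
```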

To use RegEx in Python, import the re module. It is part of the standard library, so it needs no separate installation.

History of Regular Expressions:

Regular expressions originated in 1951, when the mathematician Stephen Cole Kleene described regular languages using his mathematical notation called regular events. From 1968 they came into common use for two purposes: pattern matching in a text editor and lexical analysis in a compiler. Regular expressions first appeared in a program when Ken Thompson built Kleene's notation into the editor QED as a means of matching patterns in text. More complicated regexes later came into use in Perl, whose implementation was derived from a regex library written by Henry Spencer. Today RegEx is used in many programming languages, text-processing programs (mainly lexers), advanced text editors, and other programs. Regex support is now part of the standard library of many programming languages, such as Java and Python.

Applications of Regular Expressions:

1. Extracting email addresses from text documents.

2. Web scraping (data collection).

3. Working with date and time strings.

4. Text processing for natural language processing (NLP).
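Two of these applications, extracting emails and working with dates, can be sketched with re.findall. The patterns below are deliberately simplified assumptions for illustration: they do not cover every valid email address or date format, and the sample text is made up.

```python
import re

# Hypothetical sample text
text = "Contact alice@example.com or bob.smith@mail.example.org before 2024-03-15."

# Simplified email pattern: local part, '@', then dot-separated domain labels
email_pattern = r"[\w.+-]+@[\w-]+(?:\.[\w-]+)+"
emails = re.findall(email_pattern, text)

# Simplified ISO-style date pattern (YYYY-MM-DD); each group captures one field.
# It checks only the shape of the date, not whether the date is valid.
date_pattern = r"\b(\d{4})-(\d{2})-(\d{2})\b"
dates = re.findall(date_pattern, text)

print(emails)  # ['alice@example.com', 'bob.smith@mail.example.org']
print(dates)   # [('2024', '03', '15')]
```

Because the date pattern contains capturing groups, findall returns a tuple of the groups for each match rather than the whole matched string.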