Introduction to the difflib module in Python















































The difflib module in Python, as the name suggests, provides
classes and functions for comparing sequences. It can report the differences
between files in several formats, including HTML.
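As a rough illustration of the output formats mentioned above, the sketch below builds a unified diff and an HTML report from two small lists of lines. The file names before.txt and after.txt and the sample lines are assumptions made up for this example; no real files are read.

```python
import difflib

# Hypothetical file contents, already split into lines (no trailing newlines).
before = ["Welcome to cppsecrets", "a handbook for python"]
after = ["Welcome to cppsecrets.com", "a handbook for python"]

# Plain-text unified diff; lineterm="" because the lines carry no newlines.
unified = list(
    difflib.unified_diff(
        before, after, fromfile="before.txt", tofile="after.txt", lineterm=""
    )
)
print("\n".join(unified))

# The same comparison rendered as a complete HTML page (returned as a string).
html_report = difflib.HtmlDiff().make_file(
    before, after, fromdesc="before.txt", todesc="after.txt"
)
print(len(html_report), "characters of HTML generated")
```

The HTML string could be written to a file and opened in a browser to get a side-by-side view.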



The difflib module is mainly used for comparing strings, but it can also compare
sequences of other data types, as long as the elements are hashable, i.e. their
hash value stays the same during their lifetime.
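To make the hashability point concrete, here is a small sketch that compares lists of integers and lists of tuples instead of strings. The variable names and sample values are assumptions chosen for this example.

```python
from difflib import SequenceMatcher

# Integers are hashable, so lists of them can be compared directly.
numbers_a = [1, 2, 3, 4]
numbers_b = [1, 2, 4, 5]
number_similarity = SequenceMatcher(None, numbers_a, numbers_b).ratio()
print(number_similarity)

# Tuples of hashable values are hashable as well.
pairs_a = [("x", 1), ("y", 2)]
pairs_b = [("x", 1), ("z", 3)]
pair_similarity = SequenceMatcher(None, pairs_a, pairs_b).ratio()
print(pair_similarity)
```

Elements that are not hashable, such as lists nested inside the sequences, would not work here, because SequenceMatcher uses the elements as dictionary keys.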

Understanding the most commonly used classes in the difflib module:

I] SequenceMatcher



The SequenceMatcher class in difflib compares a pair of sequences that
are provided by the user and returns data that describes how similar the two
sequences are.


Commonly used methods of the SequenceMatcher class:

1. ratio()

2. get_matching_blocks()

3. find_longest_match()
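The three methods listed above can be sketched on a pair of short strings as follows. The strings "abxcd" and "abcd" are sample values picked for this illustration and are not taken from the original example.

```python
from difflib import SequenceMatcher

first = "abxcd"
second = "abcd"

# The first argument is an optional "junk" filter; None disables it.
matcher = SequenceMatcher(None, first, second)

# Similarity score between 0.0 and 1.0.
score = matcher.ratio()
print(score)

# All matching blocks as (a, b, size) triples; the last one is a
# zero-length sentinel marking the end of both sequences.
blocks = [tuple(block) for block in matcher.get_matching_blocks()]
print(blocks)

# Longest matching block within the given index ranges of both strings.
longest = matcher.find_longest_match(0, len(first), 0, len(second))
print(longest)
```

Here the matcher finds "ab" and "cd" in both strings, so four of the nine characters in each direction are counted as matching.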



Let's take an example:


CODE




OUTPUT




The above code takes two input sequences from the user:


String1: Welcome to cppsecrets.com

String2: Welcome to cppsecrets.com a professional handbook for python


It compares them using the SequenceMatcher class and then prints the value
returned by the ratio() method. ratio() measures how similar the two strings
are: it returns 2*M/T as a float between 0 and 1, where M is the number of
matching characters and T is the total number of characters in both strings.
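The original listing is not reproduced here, so the following is only a minimal sketch of what the example described above might look like. The variable names are assumptions, and the two strings are hard-coded rather than read from the user so that the snippet runs on its own.

```python
from difflib import SequenceMatcher

# The two strings from the example, hard-coded instead of using input().
string1 = "Welcome to cppsecrets.com"
string2 = "Welcome to cppsecrets.com a professional handbook for python"

# None means no characters are treated as junk.
matcher = SequenceMatcher(None, string1, string2)

# ratio() returns 2*M/T: M matching characters, T total characters.
similarity = matcher.ratio()
print("Similarity ratio:", similarity)
```

All 25 characters of the first string also appear at the start of the second one, so M is 25 and T is 25 + 60 = 85, which gives a ratio of 50/85, roughly 0.59.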


II] Differ

The Differ class of the difflib module works the other way round from the
SequenceMatcher class: rather than reporting how similar two inputs are, it
returns the differences between the sequences of text lines that are provided
by the user. The Differ class presents its result as a delta, which makes the
differences easy for humans to spot.

Each line of a Differ delta begins with a
two-character code:

    '- '    line unique to sequence 1

    '+ '    line unique to sequence 2

    '  '    line common to both sequences

    '? '    line not present in either input sequence

Lines beginning with '? ' point at the characters that differ within a line.


Let's take an example:


CODE




OUTPUT




The above code takes two string inputs from the user:


String 1: Welcome to cppsecrets

String 2: Welcome to cppsecrets.com


The splitlines() function is then applied to both strings to turn each of them
into a list of lines, because Differ compares its inputs line by line.

The dif variable in the above code holds an instance of the Differ class, and
another variable, difference, holds the result of dif.compare(), which takes
the two lists of lines as parameters.

Then the result is printed. In the line starting with '?', a '+' sign is
printed below each newly added character of the string.
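Since the original listing is not reproduced here, the following is a minimal sketch of the steps described above. The names dif and difference follow the description; the remaining names are assumptions, and the strings are hard-coded instead of being read from the user.

```python
from difflib import Differ

# The two strings from the example, hard-coded instead of using input().
string1 = "Welcome to cppsecrets"
string2 = "Welcome to cppsecrets.com"

# Differ works on sequences of lines, so split each string into lines.
lines1 = string1.splitlines()
lines2 = string2.splitlines()

dif = Differ()

# compare() returns a generator of delta lines; list() collects them.
difference = list(dif.compare(lines1, lines2))

for line in difference:
    # Guide lines starting with '? ' already end in a newline, so strip it.
    print(line.rstrip("\n"))
```

The delta contains the old line prefixed with '- ', the new line prefixed with '+ ', and a '? ' guide line in which '+' signs sit below the added ".com".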





