Understanding the unified_diff Function of the difflib Module

unified_diff

unified_diff() is one of the functions of the difflib module in Python's standard library. The unified_diff and context_diff functions work in the same way; the only difference between the two is the format of the output.

unified_diff compares two sequences of lines (lists of strings) and returns a delta, as a generator that yields the delta lines, in unified diff format. The output shows just the lines that have changed plus a few lines of context, in a compact way. The changes are shown inline (instead of in separate before/after blocks).
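To make the comparison with context_diff concrete, here is a minimal sketch that runs both functions on the same input. The sample lines and the file names "old" and "new" are made up for illustration only.

```python
import difflib
import types

# Hypothetical sample input: two versions of a three-line text.
before = ["alpha\n", "beta\n", "gamma\n"]
after = ["alpha\n", "BETA\n", "gamma\n"]

unified = difflib.unified_diff(before, after, fromfile="old", tofile="new")
context = difflib.context_diff(before, after, fromfile="old", tofile="new")

# Both calls return lazy generators; nothing is computed until we iterate.
is_generator = isinstance(unified, types.GeneratorType)
print(is_generator)

unified_lines = list(unified)
context_lines = list(context)

# Unified format: changes shown inline with "-" and "+" prefixes.
print("".join(unified_lines))

# Context format: separate "before" and "after" blocks, changes marked "! ".
print("".join(context_lines))
```

The unified version marks the change as "-beta" followed by "+BETA" in a single block, while the context version prints the old block and the new block one after the other, with "! " in front of the changed line in each.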

SYNTAX:

  unified_diff(a, b, fromfile='', tofile='', fromfiledate='',
tofiledate='', n=3, lineterm='\n')


  1. a & b: The two lists of strings (lines) to be compared.
  2. fromfile & tofile: The file names shown in the diff header. If not specified, the strings default to blanks.
  3. fromfiledate & tofiledate: The modification times of the files being compared, normally expressed in ISO 8601 format. If not specified, the strings default to blanks.
  4. n: The number of context lines. The default value is 3.
  5. lineterm: The line terminator appended to the control lines. For inputs that do not have trailing newlines, set lineterm to "" (an empty string) so that the output is uniformly newline free.
By default, the diff control lines (those with ---, +++, or @@) are created with a trailing newline. Inputs created with io.IOBase.readlines() therefore produce diffs suitable for io.IOBase.writelines(), since both the inputs and the outputs have trailing newlines.
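As a small illustration of the lineterm and date parameters, the sketch below compares two lists whose items have no trailing newline. The file names and timestamps are hypothetical values chosen for the example.

```python
import difflib

# Hypothetical input lines WITHOUT trailing newlines.
old = ["red", "green", "blue"]
new = ["red", "yellow", "blue"]

diff_lines = list(difflib.unified_diff(
    old, new,
    fromfile="colors_old.txt",           # hypothetical file names
    tofile="colors_new.txt",
    fromfiledate="2024-01-01T10:00:00",  # hypothetical ISO 8601 timestamps
    tofiledate="2024-01-02T10:00:00",
    n=1,                                 # one line of context around each change
    lineterm="",                         # keep the control lines newline free too
))

# No line carries a newline, so we join them ourselves when printing.
print("\n".join(diff_lines))
```

Each date is placed after its file name in the header, separated by a tab. Because lineterm is "", none of the generated lines ends in a newline, which matches the input.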

Let's look at some examples.

CODE:


import sys
from difflib import unified_diff

# Two versions of the same list of lines; every line ends with a newline.
str1 = ['DeepLearning\n', 'NLP\n', 'Algorithms\n',
        'Artificial Intelligence\n', 'Machine Learning\n']
str2 = ['DeepLearning\n', 'NLP\n', 'Robotics\n',
        'Artificial Intelligence\n', 'ComputerVision\n', 'Machine Learning\n']

# n=3 is the default number of context lines; no file names are given.
difference = unified_diff(str1, str2, fromfile='', tofile='', n=3)
sys.stdout.writelines(difference)

In the above code we import unified_diff from the difflib module, along with sys. We define two variables holding the two input lists of strings. The unified_diff function then compares the two lists and reports which lines would have to be removed from the first list and which lines would have to be added to turn it into the second. It does not modify either list.

In this code the number of context lines is left at the default, n=3, and no file names or modification times are given for the header.

OUTPUT:


The output shows the removed lines with a "-" prefix and the added lines with a "+" prefix. Lines present in both lists are shown with a leading space and no sign.
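For reference, the sketch below reruns the first example and collects the generated lines in a list, so the result can be inspected line by line. The expected output in the comments is what the standard difflib produces for these inputs. Since no file names are given, the two header lines consist of "---" and "+++" followed by a single space.

```python
import difflib

str1 = ['DeepLearning\n', 'NLP\n', 'Algorithms\n',
        'Artificial Intelligence\n', 'Machine Learning\n']
str2 = ['DeepLearning\n', 'NLP\n', 'Robotics\n',
        'Artificial Intelligence\n', 'ComputerVision\n', 'Machine Learning\n']

# Materialize the generator so the lines can be printed and inspected.
diff_lines = list(difflib.unified_diff(str1, str2))
print("".join(diff_lines), end="")

# Expected output:
# ---
# +++
# @@ -1,5 +1,6 @@
#  DeepLearning
#  NLP
# -Algorithms
# +Robotics
#  Artificial Intelligence
# +ComputerVision
#  Machine Learning
```

The hunk header "@@ -1,5 +1,6 @@" says that the hunk covers 5 lines starting at line 1 of the first list and 6 lines starting at line 1 of the second.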


EXAMPLE 2:


import sys
from difflib import unified_diff

# Same input lists as in the first example.
str1 = ['DeepLearning\n', 'NLP\n', 'Algorithms\n',
        'Artificial Intelligence\n', 'Machine Learning\n']
str2 = ['DeepLearning\n', 'NLP\n', 'Robotics\n',
        'Artificial Intelligence\n', 'ComputerVision\n', 'Machine Learning\n']

# n=0 suppresses context lines; fromfile and tofile label the header.
difference = unified_diff(str1, str2, fromfile='Before', tofile='After', n=0)
sys.stdout.writelines(difference)

Here the number of context lines is changed to n=0, and fromfile and tofile are set to 'Before' and 'After'.

As a result, the output contains only the lines that were changed, with no surrounding context.
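The effect of n=0 is easiest to see by listing the generated lines. The sketch below repeats the second example and shows, in comments, the output the standard difflib produces for these inputs.

```python
import difflib

str1 = ['DeepLearning\n', 'NLP\n', 'Algorithms\n',
        'Artificial Intelligence\n', 'Machine Learning\n']
str2 = ['DeepLearning\n', 'NLP\n', 'Robotics\n',
        'Artificial Intelligence\n', 'ComputerVision\n', 'Machine Learning\n']

diff_lines = list(difflib.unified_diff(
    str1, str2, fromfile='Before', tofile='After', n=0))
print("".join(diff_lines), end="")

# Count the hunks: each one starts with an "@@" control line.
hunk_count = sum(1 for line in diff_lines if line.startswith("@@"))
print(hunk_count)

# Expected output:
# --- Before
# +++ After
# @@ -3 +3 @@
# -Algorithms
# +Robotics
# @@ -4,0 +5 @@
# +ComputerVision
# 2
```

With no context lines to join them, the two changes end up in separate hunks. "@@ -3 +3 @@" means line 3 is replaced by line 3 (a count of 1 is omitted), and "@@ -4,0 +5 @@" means zero lines are taken from the first list after its line 4, while line 5 of the second list is added.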



