When comparing two files,
diff finds sequences of lines common to
both files, interspersed with groups of differing lines called
hunks. Comparing two identical files yields one sequence of
common lines and no hunks, because no lines differ. Comparing two
entirely different files yields no common lines and one large hunk that
contains all lines of both files. In general, there are many ways to
match up lines between two given files.
diff tries to minimize
the total hunk size by finding large sequences of common lines
interspersed with small hunks of differing lines.
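
The idea of common runs separated by hunks can be sketched in a few lines of Python. This is only an illustration: it uses the standard library's difflib.SequenceMatcher, whose matching algorithm is not the one diff uses, and the helper name hunks is hypothetical.

```python
from difflib import SequenceMatcher

def hunks(a, b):
    """Return one (lines_from_a, lines_from_b) pair per group of differing lines.

    Illustration only: difflib's matcher stands in for diff's own algorithm.
    """
    matcher = SequenceMatcher(None, a, b, autojunk=False)
    return [
        (a[i1:i2], b[j1:j2])
        for tag, i1, i2, j1, j2 in matcher.get_opcodes()
        if tag != "equal"  # "equal" opcodes are the common runs, not hunks
    ]
```

Under this sketch, two identical lists of lines give an empty list of hunks, and two lists with no line in common give a single hunk holding every line of both.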
For example, suppose the file F contains the three lines
a, b, c, and the file G contains the same
three lines in reverse order c, b, a. If
diff finds the line c as common, then the command
diff F G produces this output:

    1,2d0
    < a
    < b
    3a2,3
    > b
    > a

But if diff notices the common line
b instead, it produces this output:

    1c1
    < a
    ---
    > c
    3c3
    < c
    ---
    > a

It is also possible to find
a as the common line.
diff does not always find an optimal matching between the files; it takes
shortcuts to run faster. But its output is usually close to the
shortest possible. You can adjust this tradeoff with the
--minimal option (see diff Performance).
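
To see how the choice of common line shapes the output in the F and G example, here is a minimal Python sketch that prints a normal-format diff for a matching chosen by hand. It does not search for a matching the way diff does. The names fmt_range and normal_diff are hypothetical, and matches is assumed to be a list of (index in F, index in G) pairs, 0-based and increasing in both positions.

```python
def fmt_range(start, end):
    """Format the 0-based half-open range [start, end) as 1-based line numbers."""
    if end - start == 1:
        return str(start + 1)
    return f"{start + 1},{end}"

def normal_diff(a, b, matches):
    """Emit normal-format diff lines for a hand-picked list of matched pairs."""
    out = []
    i = j = 0
    # A sentinel pair past the end of both files flushes the final hunk.
    for mi, mj in list(matches) + [(len(a), len(b))]:
        dels, adds = a[i:mi], b[j:mj]
        if dels and adds:
            out.append(f"{fmt_range(i, mi)}c{fmt_range(j, mj)}")
        elif dels:
            out.append(f"{fmt_range(i, mi)}d{j}")
        elif adds:
            out.append(f"{i}a{fmt_range(j, mj)}")
        out += [f"< {line}" for line in dels]
        if dels and adds:
            out.append("---")
        out += [f"> {line}" for line in adds]
        i, j = mi + 1, mj + 1  # step past the matched line
    return out

F = ["a", "b", "c"]
G = ["c", "b", "a"]
```

Calling normal_diff(F, G, [(2, 0)]) treats c as the common line, [(1, 1)] treats b as common, and [(0, 2)] treats a as common. Each choice leaves four differing lines in total, so none of the three matchings is shorter than the others here.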