Diff Programs Diffed

In my very first post, I mentioned that a lot of tools for programmers are readily adapted to the task of writing. When writing long, complex documents, it is sometimes necessary to compare versions. If you have embraced the power of plain text, a diff program can help. Diff programs display differences between files. This post will examine a number of different diff programs from a writer’s perspective.

As our sample text we will use the following passage from Moby Dick (available at Project Gutenberg):

While Daggoo and Queequeg were stopping the strained planks; and as the whale swimming out from them, turned, and showed one entire flank as he shot by them again; at that moment a quick cry went up. Lashed round and round to the fish’s back; pinioned in the turns upon turns in which, during the past night, the whale had reeled the involutions of the lines around him, the half torn body of the Parsee was seen; his sable raiment frayed to shreds; his distended eyes turned full upon old Ahab.

The harpoon dropped from his hand.

and this slightly modified version:

While Dagoo and Queequeg were stopping the strained planks; and as the whale swimming out from them, turned, and showed one entir flank as he shot by them again; at that moment a quick crie went up. Lashed round and round to the fish’s back; pinioned in the turns upon turns in which, during the past night, the whale had reeled the involutions of the lines around him, the half torn body of the Parsee was seen; his sable raiment frayed to shreds; his distended eyes turned full upon old Ahab.

The harpoon dropped from his hand.

I saved these as text files called, respectively, passageone.txt and passagetwo.txt.

First up is GNU diff. See the man page for diff’s options. A more extensive manual is available in Texinfo format. Simply type:

info diff

in the terminal. The command:

diff passageone.txt passagetwo.txt

yields the following output:

1c1
< While Daggoo and Queequeg were stopping the strained planks; and as the whale swimming out from them, turned, and showed one entire flank as he shot by them again; at that moment a quick cry went up. Lashed round and round to the fish's back; pinioned in the turns upon turns in which, during the past night, the whale had reeled the involutions of the lines around him, the half torn body of the Parsee was seen; his sable raiment frayed to shreds; his distended eyes turned full upon old Ahab. 
---
> While Dagoo and Queequeg were stopping the strained planks; and as the whale swimming out from them, turned, and showed one entir flank as he shot by them again; at that moment a quick crie went up. Lashed round and round to the fish's back; pinioned in the turns upon turns in which, during the past night, the whale had reeled the involutions of the lines around him, the half torn body of the Parsee was seen; his sable raiment frayed to shreds; his distended eyes turned full upon old Ahab. 
3c3
<    The harpoon dropped from his hand.
---
> The harpoon dropped from his hand.
\ No newline at end of file

One limitation, from the writer’s perspective, is immediately apparent. GNU diff compares files line by line. Because a paragraph of prose is usually one long line, diff reports that the whole paragraph has changed without showing where the individual differences are. A representation of word differences would be more useful.
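One rough way to get word-level output from plain diff is to put every word on its own line before comparing. The following is only a sketch under stated assumptions: the file names sample_one.txt and sample_two.txt and their one-sentence contents are hypothetical stand-ins for the passages above, and the files are created in a temporary directory.

```shell
# Work in a temporary directory so no existing files are touched.
tmpdir=$(mktemp -d)

# Two hypothetical sample files that differ by a single word.
printf 'a quick cry went up\n' > "$tmpdir/sample_one.txt"
printf 'a quick crie went up\n' > "$tmpdir/sample_two.txt"

# Turn every run of whitespace into one newline: one word per line.
tr -s '[:space:]' '\n' < "$tmpdir/sample_one.txt" > "$tmpdir/words_one.txt"
tr -s '[:space:]' '\n' < "$tmpdir/sample_two.txt" > "$tmpdir/words_two.txt"

# diff exits with status 1 when the files differ, hence the "|| true".
word_diff=$(diff "$tmpdir/words_one.txt" "$tmpdir/words_two.txt" || true)
printf '%s\n' "$word_diff"
```

Each reported "line" is now a single word, so the output pinpoints the changed word. The cost is that the surrounding sentence is no longer visible, which makes this awkward for long passages.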

Fortunately, there is a frontend for GNU diff that displays word differences: wdiff. It is available from Fink and MacPorts. The command:

wdiff passageone.txt passagetwo.txt

yields the following output:

While [-Daggoo-] {+Dagoo+} and Queequeg were stopping the strained planks; and as the whale swimming out from them, turned, and showed one [-entire-] {+entir+} flank as he shot by them again; at that moment a quick [-cry-] {+crie+} went up. Lashed round and round to the fish's back; pinioned in the turns upon turns in which, during the past night, the whale had reeled the involutions of the lines around him, the half torn body of the Parsee was seen; his sable raiment frayed to shreds; his distended eyes turned full upon old Ahab. 

The harpoon dropped from his hand.

Here [-foo-] marks a word that occurs only in passageone.txt, and {+foo+} marks a word that occurs only in passagetwo.txt.
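In a long paragraph the markers can be hard to spot. As a hedged sketch, the changed words can be pulled out of wdiff-style output with grep. The variable wdiff_line below is a hypothetical stand-in holding an abridged copy of the output above; in practice the output of wdiff would be piped in instead. The -o option used here is supported by GNU and BSD grep but is not part of POSIX.

```shell
# An abridged line of wdiff-style output (stand-in for real wdiff output).
wdiff_line='While [-Daggoo-] {+Dagoo+} and Queequeg were stopping the strained planks; and showed one [-entire-] {+entir+} flank'

# Print only the marked words, one per line:
#   \[-[^]]*-\]    matches a deletion marker such as [-foo-]
#   \{\+[^}]*\+\}  matches an insertion marker such as {+foo+}
changes=$(printf '%s\n' "$wdiff_line" | grep -oE '\[-[^]]*-\]|\{\+[^}]*\+\}')
printf '%s\n' "$changes"
```

This prints the four markers in the order they appear, giving a quick checklist of edits to review.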

In an earlier post I observed that the GUI has its place even in the manipulation of text. If you install the developer tools on Mac OS X, one gem that you will get is FileMerge, a descendant of NEXTSTEP’s Merge utility. FileMerge provides a visual comparison of text files. It can be launched from the command line with opendiff, for example: opendiff passageone.txt passagetwo.txt. A screenshot of the output is below:

[Screenshot: FileMerge]

Notice that FileMerge, like GNU diff, captures line differences but, like wdiff, also highlights word differences.

FileMerge is Apple software, so it is simple and easy to use. If you need a more powerful GUI diff program, kdiff3 might be for you. Here’s a screenshot:

[Screenshot: kdiff3]
