LaTeX, Word Count, and TextMate

I love LaTeX. But some things that are easy to do are just not obvious, especially to the uninitiated. Like word count. Someone recently asked me how to determine the word count of a LaTeX document. The problem is that we want to ignore the LaTeX markup, so just counting ‘words’ with:

wc -w 

is going to give an inflated estimate. Explicitly filtering the LaTeX markup is impractical: there are too many packages, not to mention user-defined commands. So what are we to do?
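To see the inflation concretely, here is a minimal sketch. The function name raw_word_count and the four-line sample document are made up for illustration; the only tools assumed are wc, tr, and mktemp.

```shell
# Count whitespace-separated tokens in a file, markup included.
# tr strips the padding that some versions of wc put before the number.
raw_word_count() {
    wc -w < "$1" | tr -d ' '
}

# A tiny sample document (hypothetical, just for the demonstration).
sample=$(mktemp)
printf '%s\n' \
    '\documentclass{article}' \
    '\begin{document}' \
    'Hello \textbf{brave} new world.' \
    '\end{document}' > "$sample"

raw_word_count "$sample"
rm -f "$sample"
```

The sample contains four words of prose, but the count comes back as 7, because \documentclass{article}, \begin{document}, and \end{document} each count as a word.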

Fortunately, the utility:

ps2ascii

can help. It converts PostScript and PDF files into plain text. By typesetting the LaTeX document we, in effect, strip out the markup. So if we typeset our LaTeX document with

pdflatex

running the following command in the terminal will return the word count:

ps2ascii mydocument.pdf | wc -w
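If you would rather typeset and count in one step, the two commands can be wrapped in a small shell function. This is only a sketch: the name latex_word_count is made up here, and it assumes pdflatex and ps2ascii are both on your PATH.

```shell
# Typeset a .tex file and print the word count of the resulting PDF.
# Usage: latex_word_count mydocument.tex
latex_word_count() {
    tex_file=$1
    dir=$(dirname "$tex_file")
    base=$(basename "$tex_file" .tex)

    # Typeset in the document's own directory, without interactive prompts.
    ( cd "$dir" && pdflatex -interaction=nonstopmode -halt-on-error "$base.tex" >/dev/null ) || {
        echo "pdflatex failed for $tex_file" >&2
        return 1
    }

    # Convert the PDF to text and count the words; tr strips wc's padding.
    ps2ascii "$dir/$base.pdf" | wc -w | tr -d ' '
}
```

A document with cross-references may need more than one pdflatex run to settle, but that should make little difference to a rough word count.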

That was easy. Let’s make it easier. If you are lucky enough to be writing your LaTeX document in TextMate, you might want to check the word count of your document as you are writing it. You could use the statistics command, ⌃ ⇧ N, but that would give the inflated estimate. It would be better to check the LaTeX document’s directory for the typeset PDF, if any, and then run the above command on it. Here is a command that does just that:

# Derive the PDF's path from the file being edited.
NAME="${TM_FILENAME}"
BASENAME="${NAME%.*}"
PDF="$TM_DIRECTORY/$BASENAME.pdf"

# Count words only if the typeset PDF exists.
if [ -f "$PDF" ]; then
    ps2ascii "$PDF" | wc -w
else
    echo "You must typeset your document before a word count can be determined."
fi
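One thing the command cannot tell you is whether the PDF is up to date. If you have edited the source since you last typeset it, the count will be stale. Here is a sketch of a variant that warns about that. The function name count_from_pdf is made up, and the -nt ("newer than") test is supported by bash and most other shells but is not strictly POSIX.

```shell
# Print the word count for a .tex file from its already-typeset PDF,
# warning on stderr when the source is newer than the PDF.
count_from_pdf() {
    tex_file=$1
    pdf_file="${tex_file%.*}.pdf"

    if [ ! -f "$pdf_file" ]; then
        echo "You must typeset your document before a word count can be determined."
        return 1
    fi

    if [ "$tex_file" -nt "$pdf_file" ]; then
        echo "Warning: the source is newer than the PDF; the count may be stale." >&2
    fi

    ps2ascii "$pdf_file" | wc -w | tr -d ' '
}

# In a TextMate command the call would look like this
# (TM_FILEPATH is set by TextMate to the full path of the current file):
# count_from_pdf "$TM_FILEPATH"
```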

Here is a screenshot of the command in the Bundle Editor:

[Screenshot: Bundle Editor]

You can download the command here. Now go count some words.

Trackbacks

  1. benedict o'neill | April 16, 2007 at 1:35 pm | Permalink

    […] quickly Google’d and found a blog post on how to do it […]

  2. LaTeX and Word Count Revisited at Excursus | December 30, 2007 at 4:41 am | Permalink

    […] an earlier post I described a TextMate command for determining the word count for a LaTeX document. Simpling […]

