
Flashbake or Git Gateway Technology?

Flashbake aims to bring version control to writers, or at least to writers who have harnessed the power of plain text. Flashbake is a simplified front end to Git that runs in the background, automatically committing changes and recording various ambient information as you write (such as what you were listening to when the commit was made).

Written by commandline (aka Thomas Gideon) at the request (behest?) of Cory Doctorow, Flashbake was meant to address the problem of retaining an archival record of the production of digital texts. Cory Doctorow explains:

I was prompted to do this after discussions with several digital archivists who complained that, prior to the computerized era, writers produced a series of complete drafts on the way to publications, complete with erasures, annotations, and so on. These are archival gold, since they illuminate the creative process in a way that often reveals the hidden stories behind the books we care about. By contrast, many writers produce only a single (or a few) digital files that are modified right up to publication time, without any real systematic records of the interim states between the first bit of composition and the final draft.

The problem is genuine; I have written about it before. Moreover, I agree that version control has a role to play in its solution. However, I have doubts about the utility of Flashbake. Its simplicity is its virtue, but it is too simple. No commit messages? A record of ambient information is no real substitute. And Flashbake’s users are supposed to be geeky enough to use a command line tool, but not geeky enough to master the following simplified workflow?

$ git init
$ git add mynovel.txt
$ git commit -m "initial commit"
write write write
$ git commit -a -m "new commit message"

I am not sure I get it. Still, any version control is better than none. And maybe Flashbake will function as a Git gateway technology. If you are interested in a less puzzled reaction to Flashbake, see the Lifehacker article. But if you want to be a really nerdy writer, just use Git neat.
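For what it is worth, the automatic part of the idea can be roughly approximated with plain Git. The following is a minimal sketch, not Flashbake's actual implementation: `autocommit` is a hypothetical helper name, and the timestamp stands in for the ambient information Flashbake records.

```shell
#!/usr/bin/env bash
# Hypothetical helper, not part of Flashbake: commit any pending changes
# in the current repository with a timestamped message.
autocommit() {
    # Stage everything: modifications, deletions, and new files.
    git add -A || return 1
    # "git diff --cached --quiet" exits 0 when nothing is staged.
    if git diff --cached --quiet; then
        return 0
    fi
    git commit -q -m "autocommit: $(date '+%Y-%m-%d %H:%M:%S')"
}
```

Run from a scheduler or a loop inside the repository, this leaves a trail of interim states, though still without meaningful commit messages.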

Typography and Poetry

I wonder how many typographers are closet poets. My last post presented a charming verse of doggerel by Berton Braley. Apparently he is not the only poet-typographer.

By the way, in a previous post when I wrote:

I would like to support small type foundries by buying their fonts. There are some brilliant type designers out there, and they should be rewarded. Unfortunately, since the main thing I want these fonts for is for web distributed PDFs, I can’t do that without violating licensing restrictions. And that’s not support.

one font that I had in mind was Hoefler & Frere-Jones’s Mercury Text:

Mercury Text

Not that I am at all unhappy with the Mac native Hoefler Text.

On the Cognitive Utility of Typography

For art in printing is not the way
Of wild extravagance, weird display,
But rather the unobtrusive thrall
Of type that gives you no shock at all,
But draws your eye to the page with zest
And holds your mind to the thought expressed;
We must keep ourselves to this simple creed,
Type was made - and is meant - to READ!

Unfortunately, the source provided no citation.

Update

On a Need to Know Basis

John Cook Wilson

Advanced features of LaTeX are best learned on a need to know basis. The LaTeX Companion can be daunting in its length. If you thought you needed to master all of that material before writing a LaTeX document you would never do so. So start simple, and learn how to do more advanced things as and when you need to.

Learn on a need to know basis, and when you do learn, hack. Don’t feel like you need to know whether the code will work before running it. The worst thing that will happen is that LaTeX will throw an error. LaTeX errors are instructive. Learn to love them.

These reflections arose today when I needed to do something which should have been simple. It was simple—once I learned how to do it. I was preparing a handout for my seminar. On the handout is a summary of the material that I presented on Cook Wilson and the nature of representation and notes on the ensuing discussion. It would be useful if these were visually distinguished.

My first thought was to use the color package. It has a command \colorbox that specifies the background color of a box. However, the results were suboptimal: small strips of white separate the colored lines. Disappointed, I thought, well, perhaps I can just color the relevant text with \textcolor. The discussion, however, could run for several paragraphs, and it turns out that \textcolor chokes when it wraps multiple paragraphs. It was obvious to me, then, that there was no straightforward, out of the box solution.

I decided to define a new environment, discussion:

\begin{discussion}
    
\end{discussion}

It requires the color package, and you need to define a color that will be used for the background. I used \definecolor{myblue}{rgb}{0.8,0.8,1}. (Readers of the wonderful TikZ and PGF manual will recognize this as the background color of the code blocks.) The strategy was simple: wrap a minipage in a box. The minipage environment defines, well, a small page:

\begin{minipage}[position]{width}
  
\end{minipage}

The minipage is wrapped in a box. Boxes are usually created by commands, but there is a box environment lrbox:

\begin{lrbox}{<cmd>}
    
\end{lrbox}

The text within this environment is saved in the box <cmd>. Here is the discussion environment with all the bells and whistles.

\makeatletter\newenvironment{discussion}{%
   \noindent\begin{lrbox}{\@tempboxa}\begin{minipage}{\columnwidth}\setlength{\parindent}{1em}}{\end{minipage}\end{lrbox}%
   \colorbox{myblue}{\usebox{\@tempboxa}}
}\makeatother

Two comments. The initial \noindent is required; otherwise the entire box will be indented, which would be awkward since the width of the minipage is set to the width of the column. Another issue is that there is no paragraph indentation within the minipage environment. In effect, within that environment, LaTeX sets \parindent to 0. This, however, can be overridden with the \setlength command:

\setlength{\parindent}{1em}

Here is the result: Discussion Environment

The “Blog” of “Unnecessary” Quotation Marks

More evidence of typographic rage: The “Blog” of “Unnecessary” Quotation Marks. I have posted on this phenomenon before, here, here, here, and here. Perhaps I need a new tag for this issue.

However, it’s the difference wherein the interest lies. Unlike obsessing about straight versus typographic quotes, this is more straightforwardly a “semantic” issue about the significance of embedding material within quotations. But the actual usage of quotation is quite complex, as anyone familiar with the recent philosophical literature on quotation will attest. Far more complex than the simple semantics implicit in the charge of rampant use/mention conflation. Personally I abhor the use of quotation marks for emphasis, but my distaste for it is simply that, distaste. It’s vulgar and largely confined to cheap marketing.

Semantic drift is a reality. Don’t fear change. It will only result in needless rage.

Dirty Prompts

Playing with your bash prompt can seem like nothing more than an idle diversion. It is an idle diversion; it is just the “nothing more” bit that I would argue with. In a previous post I discussed how the bash prompt can reflect what git branch you are on. Now that’s useful. Seriously. But what about the “dirty state” of the branch, that is, whether or not there are any uncommitted changes? I have gotten into the habit of running git status before I do anything, in large part to check the dirty status of the branch. Couldn’t this reflex be automated? And reflected in the bash prompt? Yes, yes it can. Inspired by this post and this, I decided to update my bash prompt once again.

Here is a screenshot illustrating the clean and then dirty state of the development branch.

Dirty prompt
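
A prompt along these lines can be sketched as follows. This is one possible approach under stated assumptions, not the exact prompt in the screenshot: the function name `git_prompt_info` and the `*` dirty marker are illustrative choices, and untracked files are counted as dirty.

```shell
# Print the current branch, with a trailing "*" when the working tree is dirty.
# Prints nothing outside a Git repository or on a detached HEAD.
git_prompt_info() {
    local branch dirty=""
    # symbolic-ref fails on a detached HEAD or outside a repository.
    branch=$(git symbolic-ref --short HEAD 2>/dev/null) || return 0
    # --porcelain prints one line per changed or untracked file.
    if [ -n "$(git status --porcelain 2>/dev/null)" ]; then
        dirty="*"
    fi
    printf '(%s%s)' "$branch" "$dirty"
}

# Single quotes delay the command substitution until each prompt is drawn.
PS1='\w $(git_prompt_info)\$ '
```

Placed in ~/.bashrc, this shows something like (development) on a clean branch and (development*) once there are uncommitted changes.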

Font Restrictions

As I have remarked before, good typography does not merely have aesthetic virtue. Importantly, it has cognitive virtue as well. Good typesetting makes your work easier to understand. A good font is but one element of typesetting, and a font may be appropriate to one context but not others. Still, font choice is one of those important decisions in typesetting your documents that you are forced to make.

Legislation that has not kept abreast of changing technology can make the choice difficult.

As a philosopher, I write research papers, drafts of which are distributed on the web as PDFs. Open access to evolving research is important, and I am committed to it. Since I want to give my work the best chance of being understood, I take the time to properly typeset the PDFs with XeLaTeX. There is a problem, however, with distributing PDFs over the web.

PDF files can contain font information in a way that is easily extractable from the file. While the licensing of some type foundries allows embedded fonts in PDFs, many (especially smaller type foundries) do not. Indeed the ones that did probably did so at Adobe’s urging when PDF distribution on the web was relatively small and so not that great of a risk.

I would like to support small type foundries by buying their fonts. There are some brilliant type designers out there, and they should be rewarded. Unfortunately, since the main thing I want these fonts for is for web distributed PDFs, I can’t do that without violating licensing restrictions. And that’s not support.

There are of course open source fonts. Some of them are fine pieces of work. But the choice is limited, and important design decisions should not be so constrained.

DRM is not the answer, as the recent history of music distribution online sadly reveals.

I don’t know how to resolve this problem. It is partly technological, partly legal. But I thought I would highlight it for other academics who distribute their work online.

Upon finishing this post, I came across this essay that has more information about the legal and technological obstacles with some discussion of potential solutions.

Gist-ing from TextMate

Well that didn’t take long. In an earlier post, I remarked that with command line support for Gist, the git powered pastebin service, TextMate support for Gist was now within reach. There is now a gist command in the GitHub bundle. You can post either private or public gists. The gist that figured in the previous post was posted from within TextMate.

To install the GitHub bundle do the following:

$ sudo gem install git
$ cd ~/"Library/Application Support/TextMate/Bundles/"
$ git clone git://github.com/drnic/github-tmbundle.git "GitHub.tmbundle"
$ osascript -e 'tell app "TextMate" to reload bundles'

LaTeX TODO

The Problem

One of the great features of using TextMate to produce LaTeX documents is the TODO Bundle. The TODO Bundle lets you insert TODOs into comments and displays these in a nicely formatted HTML window with links to the lines where the TODOs occur.

There are two limitations with the TODO Bundle, however, which made me look for an alternative (I still use the TODO Bundle, the alternative is merely a supplement):

  1. While I try to proof my documents as much as I can onscreen, sometimes I need to proofread the hardcopy. Proofreading hardcopy is easier, and any incremental decrease in distraction is a real boon in proofreading since it requires a lot of attention.
  2. Suppose you are collaborating on a LaTeX document and your collaborator isn’t using TextMate. Of course, they can still follow the TODOs in the comments, but it would be great if these could be made more salient.

The Solution

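The solution to both of these problems is to use LaTeX to generate the TODO list for you. To do this, I used the index package. The index package reimplements the internal LaTeX index macros adding functionality. Of particular interest is its support for multiple indexes. Add the following to your preamble:

\usepackage{color}
\usepackage{index}
\newindex{todo}{tod}{tnd}{TODO List} % start todo list
\newindex{fixme}{fix}{fnd}{FIXME List} % start fixme list
\newcommand{\todo}[1]{\textcolor{blue}{TODO: #1}\index[todo]{#1}} % macro for todo entries
\newcommand{\fixme}[1]{\textcolor{red}{FIXME: #1}\index[fixme]{#1}} % macro for fixme entries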

The first line loads the color package (since our TODOs will be colored to make them stand out from the surrounding text). The second line loads the index package. The next two lines:

\newindex{todo}{tod}{tnd}{TODO List} % start todo list
\newindex{fixme}{fix}{fnd}{FIXME List} % start fixme list

define two indexes. The \newindex command takes four arguments. These arguments correspond to the four pieces of information required to generate the index:

  1. A short, unique tag that identifies the index.
  2. The extension of the output file where the raw index information will be put by LaTeX.
  3. The extension of the input file where the processed information created by MakeIndex will be stored to be read in later by LaTeX.
  4. The title of the index.

The next two lines:

\newcommand{\todo}[1]{\textcolor{blue}{TODO: #1}\index[todo]{#1}} % macro for todo entries
\newcommand{\fixme}[1]{\textcolor{red}{FIXME: #1}\index[fixme]{#1}} % macro for fixme entries

define two new commands: \todo and \fixme. To add a TODO simply add the following at the appropriate place in the text:

\todo{My TODO entry}

Similarly, for FIXMEs.

New commands can be added on a similar pattern. So, for example, suppose you want to add a CHANGE command. To do this, be sure to also define a new index:

\newindex{change}{chg}{cnd}{CHANGE List} % start change list
\newcommand{\change}[1]{\textcolor{green}{CHANGE: #1}\index[change]{#1}} % macro for change entries

Notice that the commands are color coded, so be sure to change the color. Since the previous two commands were blue and red, I made the CHANGE command green.

If you want to generate a list of the TODOs, FIXMEs, etc, at the end of your document you need to use the \printindex command. This takes as an option the name of the index, so for TODOs and FIXMEs we would use:

\printindex[todo]
\printindex[fixme]

Before we typeset these indexes, we need to run the makeindex command. Suppose that the name of your LaTeX document is foo.tex. Then, in the terminal we would run:

$ makeindex -o foo.tnd foo.tod
$ makeindex -o foo.fnd foo.fix

The -o option is used to specify the name of the file generated by makeindex that is used, in turn, by LaTeX to typeset the index. Having run makeindex, if we now typeset the document, the indexes will be printed at the end of the document after a pagebreak. And if you are using the hyperref package, these will have links to the pages where the TODOs and FIXMEs are inserted. This step can be automated with a build tool such as latexmk.
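
Short of a full build tool, the two makeindex calls can be wrapped in a small shell function. This is a minimal sketch: `build_indexes` and the `MAKEINDEX` override are hypothetical names, and the extension pairs simply repeat those passed to \newindex above.

```shell
# Run makeindex for each auxiliary index of a document.
# Usage: build_indexes foo   (for foo.tex)
# MAKEINDEX can be overridden, e.g. MAKEINDEX=echo for a dry run.
build_indexes() {
    local base=$1 pair raw processed
    # Each pair is "raw extension:processed extension", as given to \newindex.
    for pair in tod:tnd fix:fnd; do
        raw=${pair%%:*}
        processed=${pair##*:}
        # Skip indexes that LaTeX has not written yet.
        [ -f "$base.$raw" ] || continue
        "${MAKEINDEX:-makeindex}" -o "$base.$processed" "$base.$raw" || return 1
    done
}
```

A typical cycle would then be to typeset foo.tex, run build_indexes foo, and typeset again so that the processed indexes are read in.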

iPhone Blogging

Would you really want to blog from your iPhone? While Twitter apps really come into their own on mobile devices, blogging is a longer form not well suited for text input on an iPhone. Nevertheless, part of me is glad that it can be done. This post is being written on my iPhone thanks to the Wordpress app. It seems well designed, but, you know, I need a keyboard for my thoughts to flow. So I can’t say that I will be doing this too often. I miss my text editor too much and the power it invests in me.

Progress

Git commits by day and hour on the Philosophy BibTeX project.

Been working on some scripts to clean up the BibTeX file, to normalize cite keys, to render consistent author and journal names, to strip out local URLs, etc. So look forward to a new development branch and a directory of utilities.

Donald Knuth no Ringo Starr

Donald Knuth is renowned for offering a bounty for bugs found in TeX. Many of these checks remain forever uncashed, the recipients rightly regarding the signed check an honor greater than the money it represents. Sadly, this practice has come to an end. No, Donald Knuth has not died, nor is he, like Ringo Starr, refusing to respond further to inquiries. Rather, the relative ease of financial fraud and the fact that he has been the victim of such fraud has forced him to give way to prudence and end the tradition. The bounty still exists, but signed checks will no longer be forthcoming.

Google Book Search

Google Book Search, a surprisingly controversial if welcome Google app, has reached a groundbreaking settlement. See here. A highlight: US users (alas not me, an expatriate American) will have access to out of print but not out of copyright books as well as the ability to buy these. Of course there is more. See also the Google blog. Google Book Search has been a tremendous boon for scholars. Even though the preview has been crippled (something that will improve under the new agreement), just being able to get a glimpse at some material has been a real benefit. This is huge. We can only hope that this is but a first step to wider access to our literary heritage.

Akismet Stats

Akismet, Matt Mullenweg’s anti-spam WordPress plugin, now provides statistics. These statistics are displayed in useful graphics. In checking them out, I was struck by the following graph:

Spam Graph

That’s a sharp downturn in spam. I know that this is a little-read technical blog by an academic, but there has been no corresponding downturn in traffic that would explain this. Could times be tough, not only for investment bankers, but for spammers as well?

Naming Tabs in Leopard Terminal

Leopard’s terminal was a huge improvement, but issues remain. One of the welcome additions to the terminal was tabs, but there is no way to name them. With a number of tabs open, this can make navigation tedious. If only there were a convenient way to name tabs. Thanks to Erik Anderson there is. Terminal.app Tab Namer is a SIMBL plugin that allows you to assign a name to an open tab with command-shift-T. Just install the plugin in /Library/Application Support/SIMBL/Plugins.

Tabs in Leopard

GMU drops Endnote Support

In what is widely regarded as a nuisance suit, Thomson Reuters, the maker of Endnote, is suing GMU for their support in developing the open source bibliographic software Zotero. For more on the controversy see here, here, and here.

In a recent announcement, GMU reports that they will be dropping their Endnote license:

With litigation pending between Thomson and Mason, we’re letting our campus site license for EndNote expire at the end of November. When it lapses, any copy of EndNote that was downloaded and installed under the terms of that license will have to be uninstalled and removed.

In addition, GMU has provided a helpful website explaining how to migrate from Endnote to Zotero.

This disturbing incident is further evidence, if evidence were needed, of the perils of keeping your data in proprietary formats. Keeping your data in formats that comply with open standards is really the only guaranteed way to control and reliably share your data.

Command Line Gist

As I posted earlier, gist is a Git powered pastebin service. Very handy. Handier still would be a command line interface to gist. Thanks to GitHub’s own Chris Wanstrath, aka defunkt, a command line interface to gist is now a reality.

To install:

curl "http://github.com/defunkt/gist/tree/master%2Fgist.rb?raw=true" > gist &&
chmod 755 gist &&
sudo mv gist /usr/local/bin/gist

Some usage examples:

cat file.txt | gist
echo hi | gist
gist 1234 > something.txt

Using gist from within TextMate should now be trivial.

Great Wall of China

China has blocked access to GitHub. See here

SyncTeX: Why it Matters

One of the features of MacTeX 2008 that I was looking forward to was its inclusion of SyncTeX, Jérôme Laurens’s replacement for pdfsync.

There are two ways to call SyncTeX, from the command line and in source. From the command line simply use the argument -synctex=1, and from source include \synctex=1 in the preamble. The former may be preferable if you want to distribute your LaTeX source—that way it will run without errors even by those whose TeX distributions do not include SyncTeX support.
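
As a minimal sketch of the command line route, the call can be wrapped in a shell function. The names `typeset_with_synctex` and `TEX_ENGINE` are hypothetical; only the -synctex=1 argument comes from the description above.

```shell
# Typeset a document with SyncTeX enabled from the command line.
# TEX_ENGINE is a hypothetical override; it defaults to pdflatex.
typeset_with_synctex() {
    "${TEX_ENGINE:-pdflatex}" -synctex=1 "$1"
}
```

For example, TEX_ENGINE=xelatex typeset_with_synctex paper.tex would typeset paper.tex with XeLaTeX. With an engine that supports SyncTeX, the synchronization data is written to a .synctex.gz file alongside the output.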

Here is how synchronization worked with pdfsync. A word in your source is associated with a file name and a line number where it occurs. The file name and line number together constituted the input record. That same word as it occurs in the pdf generated from the source is associated with a page number and a location on that page. Together these constituted the output record. Synchronization was achieved by linking the input record with the output record. Moreover, this was done by assigning each of these records a unique tag. This was necessary since, while the TeX engine will place a word in the source on some page in the generated pdf and at some location on that page, by the time it has done this, it has “forgotten” the file name and line number. Unique tags were thus required to preserve this information for synchronization. These tags were stored in special data nodes in every paragraph, math display, etc.

There were three problems, however, with this synchronization solution:

  1. The data nodes used by pdfsync could interfere with TeX’s line breaking mechanism and thus affect the layout of the page.
  2. pdfsync was incompatible with certain LaTeX packages. While some could be rewritten to accommodate pdfsync, others could not.
  3. The mapping from the input record to the output record was not one-one, but one-many. So developers of pdf viewers supporting pdfsync had to choose the right mapping from the input record to the output record, and this was not always easy to do.

Instead of using special data nodes, SyncTeX exploits kern nodes and glue nodes that are determined early on in TeX’s processing of the source to track the location of a word in the pdf output.

Another advantage of SyncTeX over pdfsync is that SyncTeX also supports dvi and xdv outputs (not that useful for me, but your situation may differ).

The TeXLive distribution, which MacTeX is built upon, is the first implementation of SyncTeX, and both PdfTeX and XeTeX in TeXLive now have SyncTeX support embedded deep within these TeX engines.

Philosophy and Microblogging

Ryan Paul in an Ars Technica article, Byte-sized stories: Twittering a tiny tale, wonders about philosophy and microblogging:

Microblogging can clearly work with fiction, but what about more substantive works, like philosophical treatises? In a moment of intoxication inspiration, I came up with a quick Python one-liner [1] to compute how many lines in an English translation of Ludwig Wittgenstein’s celebrated “Tractatus Logico-Philosophicus” are 140 characters in length or less. I discovered that a bit under half of the lines are short enough to be Twittered (and, as the philosopher would say, what we cannot Twitter, we must pass over in silence).

In my own case, twittering philosophy has so far wholly consisted in mocking philosophers, gossiping about them, and git commit messages, thanks to GitHub’s service hooks. Perhaps others can offer tweets with more substantive philosophical content. After all, aphorism is not unknown to philosophy—the pre-Socratics and Nietzsche come to mind. For now, teh interwebs await a latter day Heraclitus.


  1. The Python one-liner: len([l.strip() for l in open("tractatus.txt").readlines() if len(l.strip()) <= 140 and l.strip() != ""])
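
For those who would rather stay in the shell, roughly the same count can be had with awk. This is an approximate equivalent rather than a faithful port: `count_tweetable` is a hypothetical name, and only spaces and tabs are trimmed, whereas Python's strip() removes all whitespace.

```shell
# Count non-blank lines of at most 140 characters, ignoring leading and
# trailing spaces and tabs. The limit is an optional second argument.
count_tweetable() {
    awk -v max="${2:-140}" '
        { gsub(/^[ \t]+|[ \t]+$/, "") }
        length($0) > 0 && length($0) <= max { n++ }
        END { print n + 0 }
    ' "$1"
}
```

Calling count_tweetable tractatus.txt prints the number of lines that would fit in a tweet.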
