<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	>

<channel>
	<title>Excursus &#187; OOXML</title>
	<atom:link href="http://markelikalderon.com/category/ooxml/feed/" rel="self" type="application/rss+xml" />
	<link>http://markelikalderon.com</link>
	<description>Philosophy and Text</description>
	<lastBuildDate>Sat, 04 Sep 2010 13:22:58 +0000</lastBuildDate>
	<language>en</language>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
	<generator>http://wordpress.org/?v=3.0.1</generator>
		<item>
		<title>Archival Formats, The Third Way</title>
		<link>http://markelikalderon.com/2008/03/29/archival-formats-the-third-way/</link>
		<comments>http://markelikalderon.com/2008/03/29/archival-formats-the-third-way/#comments</comments>
		<pubDate>Sat, 29 Mar 2008 21:32:22 +0000</pubDate>
		<dc:creator>Mark Eli Kalderon</dc:creator>
				<category><![CDATA[Markup]]></category>
		<category><![CDATA[ODF]]></category>
		<category><![CDATA[OOXML]]></category>
		<category><![CDATA[PDF]]></category>
		<category><![CDATA[version control]]></category>

		<guid isPermaLink="false">http://markelikalderon.com/blog/2008/03/29/archival-formats-the-third-way/</guid>
		<description><![CDATA[I have been meaning to blog about this for a while. File this under &#8220;Better Late Than Never&#8221;. What&#8217;s the best archival format for your important documents? In a previous post I suggested parchment might be&#8212;but that&#8217;s impractical. All joking aside, the issue is a serious one for anyone who is going to spend the better part [...]]]></description>
			<content:encoded><![CDATA[<p>I have been meaning to blog about this for a while. File this under &#8220;Better Late Than Never&#8221;.</p>

<p>What&#8217;s the best archival format for your important documents? In a previous <a href="http://markelikalderon.com/blog/2007/04/28/parchment-and-archival-formats/">post</a> I suggested parchment might be&#8212;but that&#8217;s impractical. All joking aside, the issue is a serious one for anyone who is going to spend the better part of their life writing and needs reliable access to this material.</p>

<p>James King of <a href="http://www.adobe.com/" title="Adobe">Adobe</a> has a blog, <a href="http://blogs.adobe.com/insidepdf/" title="Inside PDF">Inside PDF</a>, that&#8217;s well worth checking out. Besides covering ISO 32000, the <a href="http://www.iso.org/" title="ISO - International Organization for Standardization">ISO</a> standard based on PDF 1.7, James has made an interesting <a href="http://blogs.adobe.com/insidepdf/2007/10/archiving_documents.html">case for PDF/A as an archival format</a>.</p>

<p>Part of the case is a case <em>against</em> <a href="http://en.wikipedia.org/wiki/XML" title="XML - Wikipedia, the free encyclopedia">XML</a>-based alternatives such as <a href="http://en.wikipedia.org/wiki/OpenDocument" title="OpenDocument - Wikipedia, the free encyclopedia">ODF</a> and <a href="http://en.wikipedia.org/wiki/Office_Open_XML" title="Office Open XML - Wikipedia, the free encyclopedia">OOXML</a>. One problem is false advertising: while they do contain XML subfiles, they are in fact <a href="http://en.wikipedia.org/wiki/ZIP_(file_format)" title="ZIP (file format) - Wikipedia, the free encyclopedia">ZIP archives</a> that contain binary files besides. Not only false advertising, but false promises as well:</p>

<blockquote>
  <p>There is what I think is a rather technically shallow belief that XML files are easier to work with and will survive the passing of time, even great periods of time, better than other formats. The text held within XML files can usually be viewed with any generic text editor and I guess that gives people a warm feeling that it will therefore also be easier to retrieve with a program. Fair enough. But what is glossed over way too much is that that text is enveloped within XML for [something]. (<a href="http://blogs.adobe.com/insidepdf/2007/09/xml_for.html">See my earlier blog entry.</a>) The envelopes (schemas) offered by ODF and OOXML are different. Different enough that a simple program cannot extract just the raw text from either. And is that all I really want from a document in the future, the raw text. Because when you get to the layout and the images and the color space definitions and the fonts, these things do not lend themselves well to XML and are often stored within the ZIP archives as binary data. So tell me again where the advantage to XML is for this purpose?</p>
</blockquote>
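
<p>The point about packaging is easy to check for oneself. What follows is only a minimal sketch in Python, not anyone&#8217;s actual tooling: it builds a small stand-in ZIP archive in memory (the member names <code>content.xml</code>, <code>styles.xml</code>, and <code>Pictures/figure1.png</code> are assumptions chosen for illustration, as are the helper names) and then sorts its members into XML parts and everything else. Pointing <code>classify_members</code> at a real document should work the same way, though what any particular file contains is not something this sketch can tell you in advance.</p>

```python
import io
import zipfile


def build_sample_package():
    # Build a tiny in-memory ZIP whose layout loosely mimics an office
    # document. Member names and contents are hypothetical placeholders.
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as package:
        package.writestr("content.xml", "placeholder for the marked-up text")
        package.writestr("styles.xml", "placeholder for style definitions")
        package.writestr("Pictures/figure1.png", bytes([137, 80, 78, 71]))
    buffer.seek(0)
    return buffer


def classify_members(package_file):
    # Split the member names of a ZIP package into XML parts and the rest.
    with zipfile.ZipFile(package_file) as package:
        names = package.namelist()
    xml_parts = [name for name in names if name.lower().endswith(".xml")]
    other_parts = [name for name in names if not name.lower().endswith(".xml")]
    return xml_parts, other_parts


xml_parts, other_parts = classify_members(build_sample_package())
print("XML parts:", xml_parts)
print("Other parts:", other_parts)
```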

<p>To these two criticisms, let me add a third. The structure encoded in the XML subfiles of ODF and OOXML is not the logical structure of the document but the functions of the word processor. But that&#8217;s <em>not</em> what needs preserving.</p>

<p>Rob Weir at <a href="http://www.robweir.com/blog/">An Antic Disposition</a>, not surprisingly, had an alternative view. Rob observes that not all the goals one might have in archiving are well served by PDF. Reflection on these goals raises a number of questions, none of which is answered by PDF:</p>

<ol>
<li>What was the nature of collaboration that led to this document? How many people worked on it? Who contributed what?</li>
<li>How did the document evolve from revision to revision?</li>
<li>In the case of a spreadsheet, what was the underlying model and assumptions? In other words, what are the formulas behind the cells?</li>
<li>In the case of a presentation, how did the document interact with embedded media such as audio, animation, video?</li>
<li>How was technology used to create this document? In what way did the technology help or impede the author&#8217;s expression? (Note that researchers in the future may be as interested in the technology behind the document as the contents of the document itself.)</li>
</ol>

<p>Nevertheless, Rob is not blind to the attractions of PDF, only sensitive to the way it offers a partial solution to the problem of archiving. In the end he entertains a hybrid approach:</p>

<blockquote>
  <p>An intriguing idea is whether we can have it both ways. Suppose you are in an ODF editor and you have a &#8220;Save for archiving&#8230;&#8221; option that would save your ODF document as normal, but also generate a PDF version of it and store it in the zip archive along with ODF&#8217;s XML streams. Then digitally sign the archive along with a time stamp to make it tamper-proof. You would need to define some additional access conventions, but you could end up with a single document that could be loaded in an ODF editor (in read-only mode) to allow examination of the details of spreadsheet formulas, etc., as well as loaded in a PDF reader to show exactly how it was formated.</p>
</blockquote>

<p>There is a third way.</p>

<p>It may not be the final answer (especially given its current incarnation). But it has the advantages of both approaches and the deficits of neither: structural markup of plain text files kept under version control. Plain text is the <em>lingua franca</em> of computers and will remain that way for the foreseeable future. Any given file will remain editable, but any given commit will be preserved. Moreover, the version control system will preserve a wealth of metadata about the development of the document, the contribution of collaborators, etc. There are choices in implementation concerning both the markup&#8212;be it <a href="http://www.latex-project.org/" title="LaTeX project: LaTeX &ndash; A document preparation system">LaTeX</a>, <a href="http://www.pragma-ade.nl/">ConTeXt</a>, or XML variants such as <a href="http://www.docbook.org/" title="DocBook.org">DocBook</a>&#8212;and the version control system&#8212;be it <a href="http://subversion.tigris.org/" title="subversion.tigris.org">Subversion</a>, <a href="http://git.or.cz/" title="Git - Fast Version Control System">Git</a>, or <a href="http://www.selenic.com/mercurial/" title="Mercurial - Mercurial">Mercurial</a>. None are perfect. But for now, my bet is on the third way.</p>
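
<p>To make the appeal of plain text a little more concrete, here is a minimal sketch in Python. It is not how Subversion, Git, or Mercurial work internally; it merely uses the standard library&#8217;s <code>difflib</code> to compare two hypothetical revisions of a LaTeX file (the file name <code>paper.tex</code>, the helper <code>describe_change</code>, and the text of both revisions are invented for illustration) and prints the kind of line-by-line record of change that a version control system can keep for every commit.</p>

```python
import difflib

# Two hypothetical revisions of a short LaTeX source file.
revision_1 = [
    "\\section{Archival Formats}",
    "Parchment is the best archival format.",
]
revision_2 = [
    "\\section{Archival Formats}",
    "Plain text under version control is a good archival format.",
    "Parchment is impractical.",
]


def describe_change(old_lines, new_lines):
    # Return a unified diff as a list of lines, one entry per output line.
    return list(
        difflib.unified_diff(
            old_lines,
            new_lines,
            fromfile="paper.tex (revision 1)",
            tofile="paper.tex (revision 2)",
            lineterm="",
        )
    )


for line in describe_change(revision_1, revision_2):
    print(line)
```

<p>Because the source is plain text, every line that was added or removed between revisions is directly readable, which is just the kind of question about a document&#8217;s evolution that a rendered PDF cannot answer.</p>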
]]></content:encoded>
			<wfw:commentRss>http://markelikalderon.com/2008/03/29/archival-formats-the-third-way/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>OOXML is a Plucked Chicken</title>
		<link>http://markelikalderon.com/2007/08/02/ooxml-is-a-plucked-chicken/</link>
		<comments>http://markelikalderon.com/2007/08/02/ooxml-is-a-plucked-chicken/#comments</comments>
		<pubDate>Thu, 02 Aug 2007 22:25:32 +0000</pubDate>
		<dc:creator>Mark Eli Kalderon</dc:creator>
				<category><![CDATA[OOXML]]></category>
		<category><![CDATA[Word]]></category>

		<guid isPermaLink="false">http://markelikalderon.com/blog/2007/08/02/ooxml-is-a-plucked-chicken/</guid>
		<description><![CDATA[Over at An Antic Disposition, Rob Weir compares Microsoft&#8217;s attempt to pass off OOXML as a standard to Diogenes the Cynic&#8217;s response to Plato&#8217;s definition of man. Plato, teaching in the Akademia grove, defined Man as &#8220;a biped, without feathers.&#8221; This was answered by the original smart-ass, Diogenes of Sinope, aka Diogenes the Cynic, who [...]]]></description>
			<content:encoded><![CDATA[<p><img src='http://markelikalderon.com/wp-content/uploads/2007/08/diogenes.jpg' alt='Diogenes the Cynic' /></p>

<p>Over at <a href="http://www.robweir.com/blog/2007/08/two-feet-no-feathers.html">An Antic Disposition</a>, Rob Weir compares Microsoft&#8217;s attempt to pass off OOXML as a standard to Diogenes the Cynic&#8217;s response to Plato&#8217;s definition of man.</p>

<blockquote>
  <p>Plato, teaching in the Akademia grove, defined Man as &#8220;a biped, without feathers.&#8221; This was answered by the original smart-ass, Diogenes of Sinope, aka Diogenes the Cynic, who showed up shortly after with a plucked chicken, saying, &#8220;Here is Plato&#8217;s Man.&#8221;</p>
</blockquote>

<p>Was Diogenes the Cynic a Microsoft shill?</p>
]]></content:encoded>
			<wfw:commentRss>http://markelikalderon.com/2007/08/02/ooxml-is-a-plucked-chicken/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>noooxml</title>
		<link>http://markelikalderon.com/2007/06/26/noooxml/</link>
		<comments>http://markelikalderon.com/2007/06/26/noooxml/#comments</comments>
		<pubDate>Tue, 26 Jun 2007 10:59:11 +0000</pubDate>
		<dc:creator>Mark Eli Kalderon</dc:creator>
				<category><![CDATA[ODF]]></category>
		<category><![CDATA[OOXML]]></category>

		<guid isPermaLink="false">http://markelikalderon.com/blog/2007/06/26/noooxml/</guid>
		<description><![CDATA[There is a petition against OOXML becoming an ISO standard here. Besides the fact that there already is an ISO document standard, ODF, a real standard is easy to implement and naturally has a variety of implementations. But there is no working implementation of OOXML, and none is likely to be produced by anyone other than [...]]]></description>
			<content:encoded><![CDATA[<p><a href="http://www.noooxml.org/petition"><img src="http://www.noooxml.org/local--files/banners/banner-OOXMLnoApto_en.gif" border="0"></a></p>

<p>There is a petition against OOXML becoming an ISO standard <a href="http://www.noooxml.org/petition">here</a>. Besides the fact that there already is an ISO document standard, ODF, a real standard is easy to implement and naturally has a variety of implementations. But there is no working implementation of OOXML, and none is likely to be produced by anyone other than Microsoft. Support open standards and sign the <a href="http://www.noooxml.org/petition">petition</a>.</p>
]]></content:encoded>
			<wfw:commentRss>http://markelikalderon.com/2007/06/26/noooxml/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Why Give Up Word? Part&#8230;Oh I Give Up</title>
		<link>http://markelikalderon.com/2007/06/09/why-give-up-word-partoh-i-give-up/</link>
		<comments>http://markelikalderon.com/2007/06/09/why-give-up-word-partoh-i-give-up/#comments</comments>
		<pubDate>Sat, 09 Jun 2007 00:46:47 +0000</pubDate>
		<dc:creator>Mark Eli Kalderon</dc:creator>
				<category><![CDATA[LaTeX]]></category>
		<category><![CDATA[Markup]]></category>
		<category><![CDATA[OOXML]]></category>
		<category><![CDATA[Word]]></category>

		<guid isPermaLink="false">http://markelikalderon.com/blog/2007/06/09/why-give-up-word-partoh-i-give-up/</guid>
		<description><![CDATA[Nature has announced that it cannot accept OOXML documents: We currently cannot accept files saved in Microsoft Office 2007 formats. Equations and special characters (for example, Greek letters) cannot be edited and are incompatible with Nature&#8217;s own editing and typesetting programs. And so has Science: Because of changes Microsoft has made in its recent Word [...]]]></description>
			<content:encoded><![CDATA[<p><img src='http://markelikalderon.com/wp-content/uploads/2007/06/benedict_biscop.jpg' alt='Benedict Biscop' /></p>

<p><a href="http://www.nature.com/" title="Nature Publishing Group : science journals, jobs, and information">Nature</a> has <a href="http://www.nature.com/nature/authors/submissions/template/index.html">announced</a> that it cannot accept OOXML documents:</p>

<blockquote>
  <p><strong>We currently cannot accept files saved in Microsoft Office 2007 formats. Equations and special characters (for example, Greek letters) cannot be edited and are incompatible with Nature&#8217;s own editing and typesetting programs.</strong></p>
</blockquote>

<p>And so has <a href="http://www.sciencemag.org/about/authors/prep/docx.dtl">Science</a>:</p>

<blockquote>
  <p>Because of changes Microsoft has made in its recent Word release that are incompatible with our internal workflow, which was built around previous versions of the software, <em>Science</em> <strong>cannot at present accept any files in the new .docx format produced through Microsoft Word 2007</strong>, either for initial submission or for revision. Users of this release of Word should convert these files to a format compatible with Word 2003 or Word for Macintosh 2004 (or, for initial submission, to a PDF file) before submitting to <em>Science</em>.</p>
</blockquote>

<p>Though why anyone would be using Word for scientific document processing is beyond me. Indeed, the scientific community&#8217;s support for LaTeX during the ascendancy of word-processing&#8212;<a href="http://markelikalderon.com/blog/2006/09/27/the-word/">during the triumph of the Image over the Word</a>&#8212;is heroic, and for this we are indebted to them, even if a latter-day <a href="http://en.wikipedia.org/wiki/Benedict_Biscop">St Benedict</a> of document processing has yet to arrive and establish editing with structural markup as the norm.</p>
]]></content:encoded>
			<wfw:commentRss>http://markelikalderon.com/2007/06/09/why-give-up-word-partoh-i-give-up/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Pot Calls Kettle Black</title>
		<link>http://markelikalderon.com/2007/02/17/pot-calls-kettle-black/</link>
		<comments>http://markelikalderon.com/2007/02/17/pot-calls-kettle-black/#comments</comments>
		<pubDate>Sat, 17 Feb 2007 03:34:33 +0000</pubDate>
		<dc:creator>Mark Eli Kalderon</dc:creator>
				<category><![CDATA[ODF]]></category>
		<category><![CDATA[OOXML]]></category>
		<category><![CDATA[Text]]></category>

		<guid isPermaLink="false">http://markelikalderon.com/blog/blog/2007/02/17/pot-calls-kettle-black/</guid>
		<description><![CDATA[Microsoft has issued an open letter on interoperability in which they criticize IBM for opposing Open XML as an open standard. In an Ars Technica article, a Microsoft spokesperson has some harsh words for IBM: &#8220;Microsoft has determined that it is important to shine a bright light on IBM&#8217;s activities that will have a negative [...]]]></description>
			<content:encoded><![CDATA[<p>Microsoft has issued an <a href="http://www.microsoft.com/interop/letters/choice.mspx">open letter</a> on interoperability in which they criticize IBM for opposing Open XML as an open standard.</p>

<p>In an <a href="http://arstechnica.com/">Ars Technica</a> article, a Microsoft spokesperson has some harsh words for IBM:</p>

<blockquote>
  <p>&#8220;Microsoft has determined that it is important to shine a bright light on IBM&#8217;s activities that will have a negative impact on the IT industry and customers, including taking concrete steps to prevent customer choice, engaging in hypocrisy, and working against the industry and against customer needs,&#8221; said the spokesperson. &#8220;Microsoft will continue to be public in identifying the ways that IBM is trying to prevent customer choice.&#8221;</p>
</blockquote>

<p>Wow.</p>

<p>Black is white. Up is down. Left is right.</p>

<p>Read the <a href="http://arstechnica.com/news.ars/post/20070215-8851.html">article</a> for insightful commentary on what might be transpiring.</p>
]]></content:encoded>
			<wfw:commentRss>http://markelikalderon.com/2007/02/17/pot-calls-kettle-black/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Why Give Up Word? Part Three</title>
		<link>http://markelikalderon.com/2007/01/25/why-give-up-word-part-three/</link>
		<comments>http://markelikalderon.com/2007/01/25/why-give-up-word-part-three/#comments</comments>
		<pubDate>Thu, 25 Jan 2007 22:52:30 +0000</pubDate>
		<dc:creator>Mark Eli Kalderon</dc:creator>
				<category><![CDATA[ODF]]></category>
		<category><![CDATA[OOXML]]></category>
		<category><![CDATA[Text]]></category>
		<category><![CDATA[Word]]></category>

		<guid isPermaLink="false">http://markelikalderon.com/blog/blog/2007/01/25/why-give-up-word-part-three/</guid>
		<description><![CDATA[Rob Weir over at An Antic Disposition has a good discussion of Microsoft&#8217;s Office Open XML&#8212;its new document format&#8212;and the Open Document Format developed by the Organization for the Advancement of Structured Information Standards, which was based on the XML format originally implemented by the OpenOffice.org office suite. Some of the problems with storing data in proprietary [...]]]></description>
			<content:encoded><![CDATA[<p>Rob Weir over at <a href="http://www.robweir.com/blog/index.html">An Antic Disposition</a> has a good discussion of Microsoft&#8217;s Office Open XML&#8212;its new document format&#8212;and the Open Document Format developed by the <a href="http://en.wikipedia.org/wiki/Organization_for_the_Advancement_of_Structured_Information_Standards">Organization for the Advancement of Structured Information Standards</a>, which was based on the XML format originally implemented by the <a href="http://www.openoffice.org/">OpenOffice.org</a> office suite.</p>

<p>Some of the problems with storing data in proprietary binaries that motivated the development of ODF are familiar to readers of this blog:</p>

<ul>
<li>I want to own my data.</li>
<li>I do not want access to my data controlled by a single commercial entity.</li>
<li>I do not want to require that people go out and purchase a particular application in order to read my documents.</li>
<li>I want my documents to be in a format that has long-term stability and understandability.</li>
<li>I want my documents to be in a format that lends itself to processing by a range of tools, both commercial and free.</li>
<li>I want my documents to be in a format that everyone can understand.</li>
<li>I want to break out of the cycle of having to constantly upgrade my software every time my vendor decides to change formats on me.</li>
</ul>

<p>Amen, brother, amen.</p>

<p>While I no longer word-process, if I did, ODF would be the way to go. Almost every major word processor now supports this open standard, and many that don&#8217;t are planning to implement it.</p>

<p>One of the advantages touted by Microsoft of their new XML standard is its compatibility with legacy binary formats. Rob shrewdly observes the irony of this:</p>

<blockquote>
  <p>So now, today, Microsoft is pushing their Office Open XML standard, &#8220;old wine in new wine skins&#8221;, not so much a new format as a new ploy. What should enrage every thoughtful person is that they are using compatibility with the legacy binary formats as the main selling point of the OOXML format. Think about it. Compatibility with the binary format that they withdrew from the public seven years ago when they cemented their monopoly, is now being touted as their unique advantage. Said differently, Microsoft is selling OOXML as the solution to an interoperability problem that they themselves created and carefully orchestrated.</p>
  
  <p>&#8230;</p>
  
  <p>So what prevents Microsoft from doing the same thing again?</p>
</blockquote>

<p>Indeed.</p>

<p>Check out Rob&#8217;s <a href="http://www.robweir.com/blog/2007/01/foolish-inconsistency.html#links">post</a>. And geeks who are writers (or at least, writers who are geeks about the technology of writing) will enjoy <a href="http://www.robweir.com/blog/index.html">An Antic Disposition</a>&#8217;s informative and insightful discussion of document formats.</p>
]]></content:encoded>
			<wfw:commentRss>http://markelikalderon.com/2007/01/25/why-give-up-word-part-three/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
	</channel>
</rss>
