As one of the most important and comprehensive languages for encoding text, XML has long been a popular foundation for structured writing techniques and technologies. Yet even with continuous refinement and the broad adoption of data models like DITA for authoring and publishing, XML poses major challenges, especially when compared to lightweight plain-text formatting languages such as Markdown. Critics such as The Content Wrangler’s Mark Baker fault XML for its perceived limitations.
“XML’s complexity makes it hard to author native content.”
A ‘verbose’ language
In a post bluntly titled “Why XML Sucks,” Baker says that although XML performs a vital function as the basis for structured writing systems, its tagging, which he says makes XML “verbose,” inhibits author productivity.
“If you write in raw XML you are constantly having to type opening and closing tags, and even if your [XML] editor [application] helps you, you still have to think about tags all the time, even when just typing ordinary text structures like paragraphs and lists,” said Baker.
“And when you read, all of those tags get in the way of easily scanning or reading the text. Of course, the people you are writing for are not expected to read the raw XML, but as a writer, you have to read what you wrote.”
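To make the verbosity point concrete, here is a minimal sketch in Python that writes the same paragraph and two-item list once in XML and once in Markdown, then counts how many non-whitespace characters are markup rather than content. The sample text and the element names (`section`, `p`, `ul`, `li`) are illustrative assumptions, not taken from Baker's post, and the regex-based tag stripping is a rough measure that only suits simple samples like this one.

```python
import re

# Illustrative sample content; element names are assumed for this sketch.
xml_source = """<section>
<p>Install the tool.</p>
<ul>
<li>Download the package.</li>
<li>Run the installer.</li>
</ul>
</section>"""

markdown_source = """Install the tool.

- Download the package.
- Run the installer.
"""


def non_whitespace_length(text: str) -> int:
    """Count characters, ignoring spaces and line breaks."""
    return len("".join(text.split()))


def strip_tags(text: str) -> str:
    """Remove anything that looks like an XML tag (rough, sample-only)."""
    return re.sub(r"<[^>]+>", "", text)


# The prose itself is identical in both versions.
content_length = non_whitespace_length(strip_tags(xml_source))

# Whatever remains after subtracting the prose is markup.
xml_markup = non_whitespace_length(xml_source) - content_length
markdown_markup = non_whitespace_length(markdown_source) - content_length

print(f"content characters:  {content_length}")
print(f"XML markup:          {xml_markup}")
print(f"Markdown markup:     {markdown_markup}")
```

In this sample the Markdown version spends only its two list hyphens on markup, while the XML version spends an opening and closing tag on every structure, which is the overhead Baker describes having to type and read around.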
The absence of absence
Baker draws particular attention to the issue of whitespace. He cites the original purpose of XML (“XML was designed as a data transport layer for the Web. It was supposed to replace HTML and perform the function now performed by JSON. It was for machines to talk to machines…”) as the reason why whitespace carries no structural meaning in XML.
And what’s the big deal about whitespace? Says Baker, “…in actual writing, whitespace is the basic building block of structure. Hitting return to create a new paragraph is an ingrained behavior in all writers….”
He goes on. “This failure [of XML] to use whitespace to mean what whitespace means in ordinary documents is a major contributor to the verbosity of XML markup. It is why we need so many elements for ordinary text structures and why we need end tags for everything.”
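The whitespace point can be sketched with Python's standard `xml.etree.ElementTree` parser. The element names `body` and `p` are hypothetical, chosen only for illustration. The sketch assumes nothing beyond the standard library: a blank line inside an element survives as character data but creates no structure, so paragraphs exist only where tags say so, whereas a plain-text convention can treat the blank line itself as the paragraph break.

```python
import xml.etree.ElementTree as ET

# Hypothetical element names; not taken from DITA or any schema Baker cites.
whitespace_only = "<body>First paragraph.\n\nSecond paragraph.</body>"
explicit_tags = "<body><p>First paragraph.</p><p>Second paragraph.</p></body>"


def child_count(source: str) -> int:
    """Return the number of child elements directly under the root."""
    return len(ET.fromstring(source))


def plain_text_paragraphs(text: str) -> list[str]:
    """Split on blank lines, the way plain-text formats mark paragraphs."""
    return [block.strip() for block in text.split("\n\n") if block.strip()]


# The blank line is kept as text but produces no child elements.
print(child_count(whitespace_only))

# Only explicit start and end tags produce paragraph structure in XML.
print(child_count(explicit_tags))

# The same blank line is enough structure for a plain-text convention.
print(len(plain_text_paragraphs("First paragraph.\n\nSecond paragraph.")))
```

The first XML string parses to a single element whose text happens to contain two line breaks; the second needs four tags to say what the blank line says implicitly in plain text.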
“XML performs a vital function…”
While all this talk of verbosity and whitespace may seem fairly damning for the future of XML, the language serves a fundamental purpose, a “vital function,” as Baker puts it, that many people rely on and that contributes to its longevity. As Charles Gordon of NetSilicon said in 2001, XML is “…a tool that concisely and unambiguously defines the format of data records.”
The “unambiguous” aspect is particularly important. While Baker may lament the loss of readability when viewing XML-encoded content in its raw form, the fact that XML requires authors to make conscious decisions about the structure of what they’re writing, even the placement of whitespace, makes every line purposeful. XML is well suited to communicating with unambiguous intent, which is the precise purpose of structured writing systems and rule-based content architecture. Raw XML is indeed verbose, but its underlying simplicity has made it a building block of so many improvements in technical communication that its use endures and even flourishes to this day.