Most frequently used content creation and editing tools

In the world of content creation and editing, tracking the tools used across the entire industry can be tricky. For example, tools at one enterprise that facilitate collaboration while preserving authorship may be less valuable to another enterprise that needs integrated formatting and the ability to embed rich media.

In its exploration of content creation and management trends in 2016, the Center for Information-Development Management issued a survey to 328 individuals across the entire content creation spectrum. Writers, managers, information architects, content strategists, editors and a small contingent of IT support, customer service and publishing professionals were represented – with the overwhelming majority of respondents representing computer software companies.

The survey sought to answer a few basic questions: What tools do you use to create and manage content? What kind of content do you most frequently develop? How will this content be published in years to come?

Tools of the trade 
As one might imagine, DITA played a significant role in content creation across all respondents. Roughly 74 percent of those surveyed reported using some kind of DITA-capable XML editor as their primary content creation tool, far exceeding other tools. Following that, 66 percent reportedly used MadCap Flare, 53 percent unstructured Adobe FrameMaker, 43 percent Adobe InDesign and finally, at 38 percent, Microsoft Word.

Microsoft Word's fall from preeminence for content creators is somewhat predictable. Content experts across the industry have been predicting the end of generic simple document creators, with many saying that basic text-to-HTML conversion tools like Markdown will render Word virtually obsolete among professional content creators.

One of the more fascinating insights in this data is the role native HTML authorship plays in content creation: While few survey respondents (25 percent) claimed an HTML editor as a primary tool, it was overwhelmingly the favorite secondary tool across all categories – coming in at a total of 52 percent. From this, we can extrapolate that content creators:

  • Are shifting away from creating HTML first/only content.
  • Still require HTML editing tools to fully leverage content production and publishing.

Where is content being published?
This seems to follow data insights related to falling use of HTML-based delivery: While still the preferred means of publishing for almost 75 percent of survey respondents, mobile is coming up rapidly – albeit with content creators seemingly confused as to how to fully leverage it.

"We were interested to learn how organizations are approaching publishing to mobile devices, since we advocate designing content differently for mobile devices," the authors of the CIDM survey stated. "Fully 38 percent report that their content is the same on all devices. Some publish more content on mobile devices (only 4 percent); more publish less content (24 percent)."

The one not mentioned: Localization
This points to the fact that mobile content creation is still not being considered separately from traditional content creation. One facet the CIDM survey did not mention is localization, which seems to overlook one of the fundamental tenets of the mobile experience: localized UX is a crucial element of consumer engagement and must be taken into account in the creation of specific content. Tech.Co emphasizes that, for mobile experiences related to e-commerce, localization tools that go beyond simple translation are key as well.

"If you're doing this, be sure to use widely accepted localization packages or hire an expert to work on the content for you as there will be nuances across languages that even Google Translate doesn't quite get yet," wrote Tech.Co's Joe Liebkind. While mainstream content creators may be focused on issues related to format conversions, the greater topic of authoring content for diverse audiences seems to be underrepresented.

Style Guides: Internal or external?

The endless capabilities of an open-standard XML vocabulary like DITA mean that you can design and automate the creation of content modules with minimal loss of usefulness across different platforms and applications.

However, cleverly implemented content automation does not necessarily imply good content marketing. In fact, it's through content marketing that an enterprise, business, organization or brand bridges the gap between data and consumer. In other words, there must be rules to configure data to meet specific brand, linguistic and cultural guidelines so that people want to read what is produced. This is where the style guide comes into play.

"Content management goes hand in hand with content marketing."

Where did authors traditionally encounter style guides?
Style guides have long served as the way we achieve commonality and consistency from a particular institution. Two long-standing guides, the AP Stylebook and the Chicago Manual of Style, have been in heavy use by editors and journalists for decades. Some brands have developed their own variations along the way to distinguish themselves in the market and ensure readability across different audience demographics.

With manual authorship, the interaction with style guides is simple: An author writes a piece of content with the style guide in mind. Editors may verify or adjust style guide usage, but the process still remains relatively contained to authorship.

In this way, style guides are primarily an internal tool, put into action by the person creating the content. Yet in the age of automation – where content can be created, reconfigured and managed without ever being touched by human hands – where should style guides live?

CMS style guide implementation: Internal versus external
With a CMS that enables automated content creation, there are essentially two scenarios where a style guide can be put in place.

The first is the code of the content generation module itself. The data module creates new content automatically and formats it according to the style guide, which has been implemented algorithmically within the module. In other words, the style guide is woven into the content creation software, a process that might be termed "internal" implementation. Internal implementations are relatively simple to install and activate, at the expense of added complexity in the CMS architecture – effectively making it less agile and leaving room for configuration errors.

The second scenario is implementing a style guide outside of the data-creation module. This is a more "external" process and is akin to the traditional copy-editing function. It could involve the work of a human editor/author reading "copy" with the style guide in mind, or it could be an automated, rule-driven package applied in a "secondary" content configuration process. The human approach limits the complexity of the CMS at the expense of low-cost scalability.

Creating your style guide
Choosing which of these two implementation styles is right for your organization and CMS design is something that depends entirely on your resources and needs. However, one way to gauge the best approach is to explore what makes up your style guide.

The three basic elements of a style guide are:

  • Content attributes
  • Tone and voice
  • Rules

Content attributes typically involve the basic building blocks of the content, i.e. what data you will be feeding into the CMS. From there, tone and voice tie closest to the core of hypertext and data tagging. If your voice, for instance, is casual yet authoritative, having data tagged according to these descriptions can help guide the way content is subsequently assembled. Finally, rules dictate the parameters of the content – what phrases or structures must be avoided.
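To make the "rules" element concrete, here is a minimal sketch of a rule-driven style check of the kind an external, automated implementation might apply. The rule names and banned phrases below are invented for illustration; a real style guide would supply its own.

```python
import re

# Hypothetical style rules: each pairs a rule name with a pattern
# the style guide forbids. A production checker would load these
# from the organization's actual guide.
STYLE_RULES = [
    ("avoid-passive-marker", re.compile(r"\bis being\b", re.IGNORECASE)),
    ("no-exclamations", re.compile(r"!")),
    ("banned-phrase", re.compile(r"\bworld[- ]class\b", re.IGNORECASE)),
]

def check_style(text):
    """Return a list of (rule_name, matched_text) violations."""
    violations = []
    for name, pattern in STYLE_RULES:
        for match in pattern.finditer(text):
            violations.append((name, match.group(0)))
    return violations

print(check_style("Our world-class widget is being improved!"))
```

A checker like this can run as a "secondary" configuration step after content generation, which is exactly the external scenario described above.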

Through exploring the full scope of your style guide, you can more clearly see whether or not it can be integrated into CMS architecture without causing future issues.

Parsing the difference between HTML, XHTML and HTML5

Astoria’s support for the DITA Open Toolkit allows users to transform their DITA-style XML content into various flavors of HTML, the cornerstone technology for creating web pages known formally as HyperText Markup Language. While HTML has gone through many permutations and evolutions, the three main variations for web markup – HTML, XHTML and HTML5 – are all currently in use by developers. The following is a quick summary of the history and distinctions that give each language its own character and capabilities.


The first internet markup language, HTML is the basis for every subsequent web design language. HTML’s enduring utility is its simplicity: a small set of elements to describe the structure and content of a web page. Layout and appearance rely on the more advanced capabilities of JavaScript, CSS or Flash to make a site more interactive. However, this can often lead to frustration for designers, since these more dynamic elements are difficult to construct natively in HTML.

The Extensible Hypertext Markup Language, XHTML, began as a reformulation of HTML 4.01 using XML 1.0.  XHTML was designed to allow for greater cross-browser compatibility and the construction of more dynamic, detailed sites. As XHTML evolved, it lost most of its compatibility with HTML 4.01.  Today, XHTML is relegated to specialized projects where HTML output is not desired.
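XHTML’s strictness comes down to XML well-formedness: every element must be closed and properly nested. A quick sketch illustrates the difference, using Python’s standard-library XML parser as a stand-in for an XHTML well-formedness check.

```python
import xml.etree.ElementTree as ET

def is_well_formed(markup):
    """True if the markup parses as XML -- the baseline XHTML demands."""
    try:
        ET.fromstring(markup)
        return True
    except ET.ParseError:
        return False

# Lax HTML: browsers tolerate an unclosed <br>.
html_snippet = '<p>Hello<br></p>'
# XHTML: every element must be explicitly closed.
xhtml_snippet = '<p>Hello<br/></p>'

print(is_well_formed(html_snippet))
print(is_well_formed(xhtml_snippet))
```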

As the latest permutation of HTML, HTML5 is a combination of three families of code: HTML, CSS, and JavaScript. It is significantly more versatile than previous HTML iterations, and it enjoys much more support than XHTML. Its cross-platform capabilities and native integration of what were once third-party plugin features (e.g., drawing, video playback and drag-and-drop effects) make it a favorite of web designers.

IRS turning to XML to handle tax exemption forms

XML is not just an encoding format for technical documentation.  In a move that touches on XML's original purpose, United States Internal Revenue Service Form 990 – detailing tax-exempt organizations' financial information – is shedding its paper-based roots to go digital as its native format. The IRS announced that Form 990 will now be available in the machine-readable XML format.

"This will have an impact on the speed and efficiency of requests."

"The publicly available information on the Form 990 series is vital to those interested in the tax-exempt community," IRS Commissioner John Koskinen wrote in a statement regarding the transition, as quoted by AccountingWEB. "The IRS appreciates the feedback we've received from a variety of outside partners as we've worked together to explore improvements to make this data more easily accessible."

With more than 60 percent of Forms 990 filed electronically, according to FCW, the move to making data – with relevant redactions – available in a native machine-readable format is intuitive. Covered forms include electronically filed Form 990, Form 990-EZ and Form 990-PF from 2011 to the present.
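To illustrate what "machine-readable" buys consumers of this data, here is a short sketch of pulling fields from a filing. The element names below are invented for illustration and do not reflect the actual IRS e-file schema, which defines its own elements and namespaces.

```python
import xml.etree.ElementTree as ET

# Hypothetical filing fragment -- element names are illustrative only.
sample_filing = """
<Return>
  <Filer>
    <BusinessName>Example Charity</BusinessName>
    <EIN>000000000</EIN>
  </Filer>
  <TotalRevenue>125000</TotalRevenue>
</Return>
"""

root = ET.fromstring(sample_filing)
# With a known structure, fields come out directly -- no scraping
# of scanned paper forms required.
name = root.findtext("Filer/BusinessName")
revenue = int(root.findtext("TotalRevenue"))
print(name, revenue)
```

This direct field access is what makes backend integration into XML-based content management systems straightforward.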

"The IRS' move is a very good thing," Hudson Hollister, executive director of the pro-transparency Data Coalition, told FCW. "There is no reason why public information that the government already collects in a machine-readable format can't be published in that same format!"

Some industry experts, like The Sunlight Foundation's Alex Howard, emphasize that requesting and obtaining information still remains difficult for the public and that improvement in accessibility should be an ongoing focus, given the fact that public requests for non-profit or tax-exempt organizations' filing information are commonplace.  Nevertheless, the IRS's announcement will presumably have a tangible impact on the speed and efficiency of compliance with requests. It will also offer benefits to backend integration of IRS Form 990 data into XML-based content management systems, such as Astoria.

Understanding and utilizing adaptive content modeling

The Astoria Portal gives end-users the ability to interact with content managed by the Astoria Content Management System.  The Astoria Portal is a web site customized to match the client's expectations for user experience.  Not long ago, discussions about site design and user experience would involve arcane terminology and technical jargon.  Today, the world of web architecture has become increasingly democratic, so discussions about the Astoria Portal use terms and concepts that are increasingly common knowledge. With the increased emphasis on user-friendly web interfaces, the average consumer may have a solid grasp on the concept of "responsive design." Yet adaptive web design – and its sibling, adaptive content – remains a relatively unknown aspect of new technology.

In truth, adaptive design is one of the most important drivers of innovation within the world of content management. Rather than simply offering the ability to flip between mobile and desktop optimization, adaptive design allows for content to be reconfigured at will, taking the burden off designers as the focus shifts to more impactful content.

"Adaptive content is completely flexible on the back end."

What's the difference between responsive and adaptive?

While there is overlap between the concepts of responsive and adaptive design, "responsive" suggests design that fluctuates between a fixed number of outcomes and focuses on fluid grids and scaling. With adaptive design, the possibilities are virtually limitless. This is enabled by a fundamentally modular approach to content and data, allowing for a completely device-agnostic content model.

"[Web site] CMS tools have largely been built on a page model, not on data types," Aaron Gustafson, coiner of the phrase "adaptive web design," told CMSWire. "We need to be thinking more modularly about content. We need to design properties of content types rather than how it's designed."

Embracing omnichannel 
What does this mean? Adaptive content is completely flexible on the back end, with a solid model able to publish across an infinite number of channels – a highly desirable ability. This is because most enterprises are fundamentally operating in an omnichannel world already, with the final hurdle being effective personalization. And what's the end game of adaptive? SES Magazine recently found that eCommerce sites featuring personalized content were able to increase conversion by up to 70 percent.

With responsive design, some content may be reconfigured in response to the device it is viewed on, but in general the content is static. Adaptive design allows for new levels of personalization, with machine learning giving an enterprise the ability to analyze a user's habits and taste and present customized, on-demand content matching their preferences. The key isn't just content that looks different – it's content that is different, depending on the device and the viewer. This is a valuable capability since device usage itself implies different behavioral patterns. 
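One way to picture content that is different per device is a modular content model with channel rules deciding which components each device receives. The field names and channel rules below are illustrative, not a real CMS API.

```python
# A device-agnostic content module: content is data, not a page.
article = {
    "headline": "Adaptive content in practice",
    "summary": "Content modeled as data can be reassembled per channel.",
    "body": "Long-form body copy intended for desktop readers...",
    "video_url": "https://example.com/clip",
}

# Channel rules: which components each delivery target receives.
CHANNEL_FIELDS = {
    "desktop": ["headline", "summary", "body", "video_url"],
    "mobile": ["headline", "summary"],   # publish less, not just smaller
    "smartwatch": ["headline"],
}

def assemble(module, channel):
    """Select the components a given channel should receive."""
    return {field: module[field] for field in CHANNEL_FIELDS[channel]}

print(assemble(article, "mobile"))
```

The same model could be extended with per-user rules, which is where the personalization described above comes in.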

"The key isn't just content that LOOKS different – it's content that IS different."

A 'multi-year journey'
The issue for many enterprises in embracing adaptive content, of course, is implementation. Not all portal systems driven by an XML content management system can easily convert from static to dynamic content delivery.

"For many organizations, especially those in business-to-business or those with large, complex or regulated content sets, implementation will be a multi-year journey, with many iterations and evolutions along the way," wrote Noz Urbina in Content Marketing Institute. "Organizations struggle to transform themselves to keep pace with communications options and customer demand. Delivering major changes in two years might mean having gotten started two years ago."

The major thrust of this conversion is the ability to create data hierarchies that can exist independently of the eventual design functions. Rather than creating content with the end in mind, this requires strong hypertext conversion and structuring, as well as the integration of analytics and content-building apps. Yet in committing to this conversion, the possibilities of how content is presented and its effectiveness could be limitless.

Essential vocabulary: Transclusion

Transclusion is one of the foundational concepts of DITA. Coined by hypertext pioneer Ted Nelson, the term "transclusion" refers to the inclusion of part or all of an electronic document into one or more other documents by hypertext reference.

"Transclusion allows content to be reused far more efficiently."

The concept of transclusion took form in Mr. Nelson's 1965 description of hypertext. However, widespread understanding of transclusion was limited by the slow adoption of markup languages, including Standard Generalized Markup Language (whose origins date to the 1960s), Hypertext Markup Language (released in 1993), and eXtensible Markup Language (drafted in 1996 and standardized in 1998).  In fact, it wasn't until DITA, an XML vocabulary donated to the open-source community in 2004 by IBM, that the power of transclusion enjoyed broader reception.

Transclusion differs from traditional referencing. According to The Content Wrangler's Eliot Kimber, traditional content had "…to be reused through the problematic copy-and-paste method." With transclusion, a hyperlink inserts content by reference at the point where the hyperlink is placed. Robert Glushko adds, "Transclusion is usually performed when the referencing document is displayed, and is normally automatic and transparent to the end user." In other words, the result of transclusion appears to be a single integrated document, although its parts were assembled on-the-fly from various separate sources.
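A toy resolver makes the mechanism concrete: references are replaced by the referenced source content when the document is assembled. The {{id}} syntax here is invented for illustration; DITA performs the same job with conref and keyref attributes.

```python
import re

# A pool of reusable components, each usable in many documents.
sources = {
    "warning-voltage": "Disconnect power before servicing.",
    "brand-name": "Acme Widget Pro",
}

def transclude(document, sources):
    """Resolve {{id}} references against the component pool, so the
    reader sees one integrated document assembled from parts."""
    return re.sub(r"\{\{(\w[\w-]*)\}\}",
                  lambda m: sources[m.group(1)], document)

doc = "The {{brand-name}} manual: {{warning-voltage}}"
print(transclude(doc, sources))
```

Because the warning text lives in one place, correcting it there corrects every document that transcludes it.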

"In the information management sense, transclusion makes content easy to track, removes redundant information, eliminates errors, and so on," writes Kimber. "Use-by-reference serves the creators and managers of content by allowing a single instance to be used in multiple places and by maintaining an explicit link between the reused content and all of the places it is used, which supports better tracking and management."

Transclusion is not without its limitations. It's rarely used in web pages, where the processing of transclusion links can become cumbersome or can fail when the page is displayed.  For that reason, people writing content for the Web "…do the processing in the authoring environment and deliver the HTML content with the references already resolved." However, transclusion, which doesn't rely directly on metadata, is superior to conditional preprocessing when working with content that has a large number of variations.

Content gets predictive with analytics

Within the world of customer engagement, predictive analytics have revolutionized the ability for enterprises to match materials with the audiences most suited to appreciate them. In regards to content, this has traditionally meant creating channels based on customer profiles and then funneling content to the appropriate market. Now, with the increasing sophistication of analytic algorithms – combined with a component-based, hypertext approach to content creation that XML vocabularies such as DITA enable – content can be configured on demand, customized to match the consumer profile.

"Content can now be configured on demand, customized to match the consumer profile."

Descriptive versus predictive

The key to this evolution is the transition from descriptive to predictive and on to prescriptive marketing. In the traditional customer profile, hindsight is 20/20 in the eyes of the marketer: existing materials are evaluated based on how they did previously. This is a simple descriptive process, but it limits the ability to better match content to customer needs except by small evolutions or by accident.

With predictive analytics, marketers can build "a fluid and multi-dimensional map of prospect interests," according to Ilan Mintz, Marketing Coordinator at Penguin Strategies. Mintz describes how predictive content analytics aggregates data within the pieces a user reads, then builds and catalogs a topic composite akin to a word cloud. From there, the composite is tied to the profile of that user or user group.
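Mintz's word-cloud-style composite can be sketched in a few lines: count the topic terms across the pieces a user has read. The tokenizer and stopword list below are simplified stand-ins for a real analytics pipeline.

```python
from collections import Counter

# Minimal stopword list -- a real pipeline would use a fuller one.
STOPWORDS = {"the", "a", "and", "of", "to", "for"}

def topic_composite(articles_read):
    """Aggregate term counts across everything a user has read,
    producing a word-cloud-like profile of their interests."""
    counts = Counter()
    for text in articles_read:
        for word in text.lower().split():
            word = word.strip(".,!?")
            if word and word not in STOPWORDS:
                counts[word] += 1
    return counts

read = [
    "Choosing a CMS for structured content",
    "Structured content and the DITA standard",
]
print(topic_composite(read).most_common(3))
```

Tying a composite like this to a user or user group profile is what enables the insights listed below.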

Mintz claims that this approach to content marketing allows for a graphical view of content-related interest. This in turn facilitates new insights, such as:

  • Content personalization.
  • Competitor analysis.
  • Anticipation of trends.
  • Lead nurturing and tracking/predicting sales cycles.

In all, Mintz points to the increased ability to target audiences with personalized content as the return on investment in content data analysis. 

More content than ever
This has given new power to marketers and content authors, particularly in an environment that is already awash in materials. Digital content is being produced in greater volume than ever, according to the Content Marketing Institute, with 70 percent of B2B content marketers in 2014 saying they created more content than the previous year, and no end in sight for the trend. However, increased volume isn't a meaningful measure of success for marketers. The impact of the content, which in and of itself is defined by the goals of those who run lines of business, must be measured and interpreted – and, according to Tjeerd Brenninkmeijer and Arjé Cahn, co-founders of Hippo, engagement metrics are a "notoriously fluffy" and increasingly unhelpful way to appraise successful content.

"Over the next two years, predictive content analytics will provide smart businesses a means of gaining better insight into customer's interactions with content," the Hippo co-founders told CMSWire. "And by equipping their marketers with better access to analytics and more decision-making power, businesses will reap the benefits."

Through a deepened understanding of the role that predictive analytics can play in modern content marketing, authors and marketers have more effective tools to affect customer engagement.

Is XML too ‘verbose’?

As one of the most important and comprehensive languages for encoding text, XML has enjoyed great popularity as the basis for the creation of structured writing techniques and technologies. Yet even with continuous refinement and the broad adoption of data models like DITA for authoring and publishing, XML poses major challenges, especially when compared to so-called plain-text-formatting languages such as Markdown.  Many, like The Content Wrangler’s Mark Baker, are criticizing XML for its perceived limitations.

“XML’s complexity makes it hard to author native content.”

A ‘verbose’ language
In a post bluntly titled “Why XML Sucks,” Baker says that, while performing a vital function as the basis for structured writing systems, XML’s tagging – which he says makes XML “verbose” – inhibits author productivity.

“If you write in raw XML you are constantly having to type opening and closing tags, and even if your [XML] editor [application] helps you, you still have to think about tags all the time, even when just typing ordinary text structures like paragraphs and lists,” said Baker.

“And when you read, all of those tags get in the way of easily scanning or reading the text. Of course, the people you are writing for are not expected to read the raw XML, but as a writer, you have to read what you wrote.”

The absence of absence
Baker hangs a lantern on the issue of whitespace. He cites the original purpose of XML (“XML was designed as a data transport layer for the Web. It was supposed to replace HTML and perform the function now performed by JSON. It was for machines to talk to machines…”) as the reason why whitespace has no meaning in XML.

And what’s the big deal about whitespace? Says Baker, “…in actual writing, whitespace is the basic building block of structure. Hitting return to create a new paragraph is an ingrained behavior in all writers….”

He goes on.  “This failure [of XML] to use whitespace to mean what whitespace means in ordinary documents is a major contributor to the verbosity of XML markup.  It is why we need so many elements for ordinary text structures and why we need end tags for everything.”
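Baker’s whitespace point is easy to demonstrate: to an XML parser, the blank line a writer uses to start a new paragraph is just character data, which is why XML needs explicit elements for every text structure. A small sketch using Python’s standard-library parser:

```python
import xml.etree.ElementTree as ET

# Two "paragraphs" separated by a blank line, as a writer would type them.
raw = "<doc>First paragraph.\n\nSecond paragraph.</doc>"
root = ET.fromstring(raw)
print(repr(root.text))   # one undifferentiated text node
print(len(list(root)))   # no child elements: the parser saw no structure

# Only explicit markup produces structure the parser can see.
explicit = "<doc><p>First paragraph.</p><p>Second paragraph.</p></doc>"
print(len(list(ET.fromstring(explicit))))
```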

“XML performs a vital function…”

No ambiguity
While all this talk of verbosity and whitespace may seem fairly damning to the future of XML, the truth is that it serves a fundamental purpose, a “vital function,” as Baker puts it, one that lots of people rely on and that contributes to its longevity. As one Charles Gordon of NetSilicon said in 2001, XML is “…a tool that concisely and unambiguously defines the format of data records.”

The “unambiguous” aspect is particularly important. While Baker may lament the loss of readability when viewing XML-encoded content in its raw form, the fact that XML requires authors to make conscious decisions about the structure of what they’re writing – even the placement of whitespace – makes every line purposeful. XML is ideal for communicating with unambiguous intent, which is the precise purpose of structured writing systems and rule-based content architecture. Raw XML is indeed verbose, but its general simplicity has made it a building block of so many improvements in technical communication that its use endures and even flourishes to this day.

3 possible pitfalls in a content management system

Building a streamlined curation process allows owners, authors, editors and even users to fully engage with content in a manner best suited to each person’s needs. However, even the best-laid plans will go awry if oversight responsibilities are murky or absent. The problem simply gets worse as the volume of data-rich hypertext content increases. Curators must maintain thorough reviewing processes to verify that the underlying data is of value. Here are just a few of the flaws that inhibit effective content curation.

“Underpinning each pitfall is a missing aspect of oversight.”

Unclear ownership
As a foundation for establishing acceptability, authority, and proper editing privileges, a robust curation strategy requires a system that maintains ownership of content on a granular level. Otherwise, the system gives rise to “orphaned” content; i.e., content exposed to an editorial gap (because it has no owner) that can result in inaccuracies.

Lack of coherent review stage
While it may seem obvious that a review stage is needed for effective content curation, a significant issue is where review should occur. Should individual data components be subject to review and approval? How much should generated content be subject to peer review if those reviewing it have editing privileges similar to the author’s? The placement of review stages in your curation processes is the essence of content “management” at every phase of the content lifecycle.

Inattention to design and formatting
Content curation programs put extensive thought into information architecture but pay comparatively little attention to the end-user’s experience with the content. This can lead to the selection of a component content management system (CCMS) that does everything expected of it while producing content that is fundamentally not user-friendly. Unless the CCMS integrates end-user presentation into its operating capability, even the most complex CCMS can miss the mark.

Putting together a content lifecycle strategy

For marketers, creating compelling content that connects with the intended audience is the main push of their daily work. But once this content is created, what happens next? How will it be disseminated, redeveloped and warehoused for future use?

“Marketers who have developed a strong content lifecycle have a leg up.”

Content: No longer disposable
Marketers who have developed a strong content lifecycle have a leg up when it comes to managing their materials and potentially reusing them for later campaigns. Columnist Robert Norris recommends the development of lifecycles to help craft content that resonates with different groups of customers and can remain effective across a variety of channels. To do this, he advocates moving away from treating content as a disposable material and toward viewing content as a living, evolving entity worthy of attention and careful consideration.

“Critically, we realize that these audiences have very specific needs for which we have the expertise—if not yet the processes —to craft and maintain targeted knowledge base resources,” Norris writes in The Content Wrangler. “Moreover, we recognize that the task of creating and publishing these resources must receive the same diligent attention to detail that we apply to our goods and services because poor publishing reflects upon our credibility just as harmfully as does a poor product or service.”

The content lifecycle
To ensure that content reaches its full potential, Norris proposes a lifecycle based on constant evaluation and redevelopment. The steps he puts forth include:

  • Production – Where content is developed, based on existing data components.
  • Approval – Content is reviewed and vetted by editors and administrators before being slated for release.
  • Publish – Content is configured and fully optimized for a publishing platform, as well as made discoverable by adding metadata and setting prominence.
  • Curate – Ancillary resources are integrated into the content.
  • Improve – Feedback, telemetry and analytics are used to identify and address successful aspects as well as deficiencies in the content. Once identified, the content is tweaked to address these pain points.
  • Re-certify – An often-missed step: data used in content must be reverified periodically to ensure it is still relevant and accurate based on more recent findings.
  • Update – Aside from recertification, consideration of timeliness and cultural relevancy can warrant changes from minor updates to major revisions.
  • Retire – Once a piece of content has reached the end of its relevancy, archiving it is warranted. Make sure the content and its metadata are tagged for ease in locating it later.

With a hypertext-based content paradigm like DITA, this lifecycle is made even simpler by the ability to evaluate and repurpose content at the XML component level. Analytics can show the efficacy of a single data element, and automation driven by content tagging can streamline campaign variations to audience segments to gauge impact. From there, each element of the lifecycle is a chance to refresh metadata and swap in more compelling content.