Astoria Software and WittyParrot Initiate Quantum Shift in Structured Authoring

The Astoria-WittyParrot integration allows multiple departments to reuse corporate Intellectual Property curated by technical documentation without using XML-based tools

SAN FRANCISCO, CA and CUPERTINO, CA—November 6, 2017 – Astoria Software, a division of TransPerfect, today announced the release of its integration with WittyParrot. The combined product allows non-XML content creators in Marketing, Sales, and Customer Support departments to reuse corporate intellectual property encoded in XML by the Technical Documentation department. The result is an unprecedented level of content sharing that allows the entire organization to speak with one voice without the burden of using a common set of authoring tools. Astoria Software and WittyParrot will showcase the solution, and a customer deploying it, at the LavaCon Conference in Portland, Oregon.

A Quantum Shift in Structured Authoring

Structured Authoring, most commonly associated with authoring in XML, suffers from restricted adoption for one principal reason: it asks all users who wish to be content creators to also be conversant with the bewildering details of XML elements, attributes, cross-document linkages, document-type descriptions, and so on. Companies seeking to surmount this barrier have chosen one of two approaches. The first is to force everyone to use non-XML tooling, which is easier for creators outside of Technical Documentation to understand, but which also sacrifices the well-defined content structures and standard content-reuse mechanisms that make content useful and resilient. The second approach is to refactor all content into a lightweight markup language such as Markdown, which makes it easy for subject-matter experts to put ideas, concepts, thoughts, and descriptions into written form. The problem with languages like Markdown is their inability to enforce either content structure or the in-context reuse that would otherwise optimize information sharing.

The Astoria-WittyParrot integration takes a better approach. It allows creators in the Technical Documentation department to describe an organization’s intellectual property with all of the purposeful structure, rich metadata, and classification taxonomy found in XML-based authoring. Astoria provides all of the tooling and capability for version control, branching, revision, translation, and distribution. Approved content flows from Astoria to WittyParrot, where its seamless integration with Microsoft Office products allows Marketing and Sales users to consume such content within Word, PowerPoint, Excel, Teams, and Outlook. Sales and Customer Support users working in customer-relationship management (CRM) packages such as Microsoft Dynamics or Salesforce have a similar experience. Where organizations employ artificial intelligence to drive automated customer-service agents (i.e., “chat bots”), WittyParrot ingests micro-content managed in Astoria to provide content for its “Intelligent Q&A” module. Intelligent Q&A allows users, including an organization’s customers, to make natural-language requests and receive accurate, approved answers automatically.

The result is a quantum shift in the way the Technical Documentation team disseminates corporate intellectual property across the organization and to its customers. Each team uses the tooling best suited to its method of operation, and the content—be it multi-paragraph topics or sentence-level micro-content—retains all of the relationships, metadata, and taxonomy information encoded by the structured-authoring team.

Making the Quantum Shift

Attendees at the LavaCon Conference in Portland, Oregon, will be able to speak with representatives of an Astoria Software customer that has incorporated the Astoria-WittyParrot integration. As one such representative describes it,

“One of the things we have had to overcome is the challenge of sharing content between departments. The appeal for us is that it allows other departments to use vetted content, content that is accurate for a customer-facing audience, without having the technical skills to author in DITA, and that’s huge for us.”

The documentation team at this company has pushed easily shareable tasks, like those you might find in a frequently-asked-questions list, from Astoria to WittyParrot so that Customer Support and Sales Engineering personnel can reuse that content in Microsoft Outlook and Word. In another project, PowerPoint slides for instructor-led learning will reuse DITA content authored and stored in Astoria. According to another representative of the same company, the benefit is tremendous because, “When you update the DITA material in Astoria, it’s automatically updated in PowerPoint!”

The Astoria Software Market Approach

In describing the transformation afforded by the Astoria-WittyParrot integration, Michael Rosinski, President and CEO of Astoria Software, commented, “The CCMS application vertical is dominated by vendors that focus on XML technology and related standards. Astoria Software is proposing an entirely new approach that allows the widely adopted Microsoft Office application to use content originally authored in XML. This means that content developers trained in Microsoft Office—and that’s millions of people in thousands of organizations—will reap the benefits of rich, structured content without having to learn anything about XML.” Marketing, Sales, and Customer Support organizations are usually shut out from reusing the intellectual property described by the Technical Documentation department. Yet personnel in these first three groups spend far more time talking to customers than any technical writer. Mr. Rosinski observed that, “The focus over the last decade on customer-relationship management can now be leveraged with WittyParrot. Sales and Marketing teams building customer-specific content can reuse the highly valuable content in Astoria, thereby increasing the return on investment made in CRM and technical documentation.”

Speaking with One Voice

Organizations of all sizes face too many barriers when trying to find and reuse approved, accurate descriptions of intellectual property. Intranet portals and CRM tools offer some relief when such descriptions are otherwise trapped within documents or scattered across repositories and local drives. However, these systems still require users to sift through search results; and, in large organizations with lots of portals and CRM systems, users cannot be certain that the results are showing approved, accurate content. Anil Jwalanna, Founder and CEO of WittyParrot, explains, “WittyParrot collates the information you want to share with customers into a single knowledge-automation, collaboration, and communication platform.” The WittyParrot solution makes accessing information easy. Through its built-in “bot” engine or with a few clicks in its user-interface, users bypass complex portal navigation to discover, classify, assemble, deliver and track all the content needed to communicate quickly and accurately with customers. As Mr. Jwalanna puts it, “WittyParrot centralizes the approved versions of all your messaging and best-practices communication, enabling your entire organization to speak with one voice.”

About Astoria Software

Astoria Software is the world’s most successful Enterprise solution for XML Component Content Management to help companies engage with their customers. Cisco Systems, Xylem, ITT, Siemens Healthcare, Northrop Grumman, Kohler, GE Digital, and other Forbes Global 2000 organizations rely on the Astoria platform to build meaningful, purposeful customer experiences around complex, business-critical content and documents. The Astoria platform’s reach extends to its web-based portal and to its mobile-device apps, forming an end-to-end solution that includes authoring, content management, and rendering systems, fully integrated and delivered on public or private clouds. Astoria Software, a division of TransPerfect, Inc., is based in San Francisco. For more information, visit

About WittyParrot

WittyParrot is a disruptive, intelligent micro-content automation, collaboration and communication platform for Marketing, Sales, and Support organizations. WittyParrot improves knowledge-worker consistency in communication, productivity and responsiveness by making information nuggets available through bots and widgets. The company’s investments in artificial intelligence, machine learning, and data science enable it to automate both effectiveness and messaging consistency in all knowledge-worker communications. WittyParrot is fully integrated with Microsoft Office and Microsoft Office 365, several CRM platforms and various chat-bot technologies. WittyParrot has offices in Silicon Valley, California and in Bangalore, India. For more information, visit

LavaCon Conference 2017 Portland

The LavaCon Content Strategy Conference: Spanning Silos, Building Bridges
November 5-8, 2017
Hilton Downtown Portland
Portland, OR

Join Astoria Software in the launch of the Astoria-WittyParrot solution in front of content curation colleagues from around the world. With its highest level of registration ever, LavaCon 2017 Portland promises to be three full days of career-empowering knowledge, practices, networking, and practical solutions.

As part of the Astoria-WittyParrot launch activities at LavaCon, Astoria Software will deliver a lecture entitled “Making a Quantum Shift in Structured Authoring”.

Make a Quantum Shift in Structured Authoring

Eric Kuhnen and Michael Rosinski join Ed Marsh to talk about their presentation at LavaCon, Making a Quantum Shift in Structured Authoring.

According to Eric, one of the key challenges in the content industry has become the inability of multiple groups within a department to share content while using a common set of tools. The technical documentation team works with structured content, and the content repository is often not available to those outside the team. Astoria Software now provides integration with WittyParrot to enable “rich sharing” and ensure that XML-based content is available to non-XML content creators.

Julie Newcome of Ultimate Software, an Astoria Software customer, immediately saw the appeal of the integration:

When we first saw the demo with WittyParrot, it really excited us. One of the things we have had to overcome is the challenge of sharing content between departments. The benefit for us is that [Astoria with WittyParrot] allows other departments to use vetted content, content that is accurate for a customer-facing audience without having the technical skills to author in DITA, and that’s huge for us.

Ultimate Software is scheduled to go live with their integration of WittyParrot soon after the LavaCon Conference. You can see a demonstration of their implementation at the Astoria booth; the demonstration includes:

  • Pulling technical content, such as a task or FAQ, from the content repository and sharing it with a customer
  • Generating instructor slides for a training class directly from the source DITA and creating an updated course manual, which is a faster, more efficient, and better managed process

On this Podcast

  • Michael Rosinski: President and CEO of Astoria Software, Inc.
  • Julie Newcome: Content Management Analyst at Ultimate Software.
  • Eric Kuhnen: an expert in product research, development and management.
  • Ed Marsh: Creator and host of the Content Content podcast.


Castaways: Dealing with orphaned content

Even the most robust, expertly maintained content management system will inevitably face the challenge of orphaned content. Regardless of how small or seemingly insignificant the content block may be, when an important piece of your written intellectual property loses its link to its original author, that piece of IP loses the chain of provenance that gave rise to the content in the first place. The effect is a disrupted chain of linkages, rendering many related content blocks essentially useless and degrading the value of the IP itself.

It takes careful and regular monitoring to avoid orphaned content and the subsequent increase in resources needed to rectify the condition.  Let’s take a look at a few issues surrounding orphaned content, starting with its genesis.

What makes content ‘orphaned’?
Content is orphaned when it loses its link to authorship and, therefore, its link to an authoritative source. This can occur if a CMS user/author account is deleted or updated without the content itself being updated. The content subsequently can become a “problem resource” – disconnected from clear authorship permissions and only able to be updated or deleted by a system administrator.

“You can’t verify the veracity of orphaned content.”

When content is orphaned, it can send ripples through the entire CMS. Every piece of linked content that refers back to data owned by the orphaned content is affected by its change in status and is left either broken or uneditable, since the original author no longer exists in the CMS. This can be a serious problem, particularly if a CMS has significant user turnover or the system purges its authors regularly.


The impact of orphaned content
The challenge that heavily linked orphaned content creates is a considerable one. In addition to the manifold software errors it can prompt, the lack of author roles can undermine the authority of the content. Without author accountability, it becomes impossible to verify the veracity of data underpinning the piece of content – or even whose job it is to keep the content updated. Deleting it may only make the matter worse – doing so can further break links in related files and folders.

“As content wranglers accustomed to dealing with orphaned content, we know from firsthand experience that it is unrealistic to rely upon the availability of original authors as the backbone of our quality system,” Robert Norris wrote in The Content Wrangler. “Far too often we’ve been left wondering who is going to fix the problem…and how…and when.”

Repairing orphaned content
To avoid the operational hassle of orphaned content, Norris urges CMS designers to build a mechanism that acts as a “self-examination” for a system, combing through content, flagging issues of quality and authorship, and funneling these issues into a repair feed.

“[It] makes sense to assign topical content ownership at the upper-management level to establish accountability with a role that has authority,” Norris said. “Since every resource we publish incurs a burden of maintenance, this principle places that burden on the shoulders of someone with the resources needed to prioritize and execute the task.”

What this means is that orphaned content ideally needs to be repaired rather than purged. As previously stated, regular author turnover means that the task of repairing orphaned content defaults to a system administrator. The best practice, though, is for the self-examination algorithm to assign ownership to a widely accessible dummy account from which qualified authors can claim ownership and reestablish the chain of provenance.
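The self-examination pass and the claimable dummy account described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not how any particular CMS works: the `ContentBlock` record, the `unclaimed-content` account name, and the shape of the repair feed are all hypothetical.

```python
from dataclasses import dataclass, field

# Hypothetical placeholder account from which qualified authors can claim content.
UNCLAIMED_OWNER = "unclaimed-content"


@dataclass
class ContentBlock:
    block_id: str
    owner: str  # account name of the current owner
    links_to: list[str] = field(default_factory=list)  # ids of blocks this one references


def self_examine(blocks, active_accounts):
    """Reassign orphaned blocks to the placeholder account and return a repair feed."""
    repair_feed = []
    for block in blocks:
        if block.owner in active_accounts or block.owner == UNCLAIMED_OWNER:
            continue
        # Blocks that reference the orphan are affected too, so list them for review.
        dependents = sorted(b.block_id for b in blocks if block.block_id in b.links_to)
        repair_feed.append(
            {"block": block.block_id, "former_owner": block.owner, "dependents": dependents}
        )
        block.owner = UNCLAIMED_OWNER
    return repair_feed


def claim(block, author, active_accounts):
    """Let an active author take ownership of an unclaimed block."""
    if block.owner != UNCLAIMED_OWNER:
        raise ValueError(f"{block.block_id} is already owned by {block.owner}")
    if author not in active_accounts:
        raise ValueError(f"{author} is not an active account")
    block.owner = author


# Example: the account "jsmith" has been deleted, so its block is orphaned.
blocks = [
    ContentBlock("install-task", owner="jsmith"),
    ContentBlock("faq-entry", owner="adoe", links_to=["install-task"]),
]
feed = self_examine(blocks, active_accounts={"adoe"})
claim(blocks[0], "adoe", active_accounts={"adoe"})
```

Running the pass on a schedule, rather than waiting for an audit, keeps the repair feed short, and the content itself is never deleted, so links from dependent blocks stay intact.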

By taking a tactical, strategic approach and flagging content as problems arise – rather than only discovering a buildup of orphaned content after an audit – CMS managers can ensure their systems are clean and efficient.

DITA Europe 2017

2017 Content Management Strategies/DITA Europe conference
October 30-31, 2017
The Radisson Blu Hotel, Berlin

Join Astoria Software and content curation colleagues from around the world for two days of career-empowering knowledge, practices, networking, and practical solutions.

Astoria Software will deliver a lecture entitled “Your Connected Content—From CCMS to Mobile Devices”.

Information Development World 2017

Astoria Software will venture south from its offices in San Francisco to the heart of Silicon Valley venture-capital: Sand Hill Road in Menlo Park, California, and the latest gathering of the Information Development World. The theme of the conference is Preparing Content for the Coming Artificial Intelligence Revolution.

There’s nothing like IDW. Not even close. It’s a three-day, laser-focused, guided journey designed for technical, marketing, and product information managers—the folks responsible for managing the people, processes, and technologies involved in creating exceptional customer experiences with content. This year’s program features:

  • Innovators, artists, scientists, engineers, academics, and business leaders prepared to take you step-by-step through the topics that matter
  • One single room, an intimate setting for collaborating with colleagues across multiple disciplines while focused on a common discussion topic

IDW is going for a more intimate venue this year: the Quadrus Conference Center.

Is a Document Management System a half measure?

While the direction of this blog is forward-looking, it is instructive at times to consider the history of technologies and techniques. One such technology is document management, which, together with its predecessor, electronic document imaging, is a precursor to modern content management. This is not to say that document management is dead as a technology or a solution; in fact, in some operational circles document management is very much alive and useful.

The earliest document management systems addressed the problem of paper proliferation.  "Electronic document imaging systems" combined document scanning with database-driven storage, indexing, and retrieval to form libraries of what were once reams of paper files. "Document management" became a solution in its own right as vendors added support for digital file formats generated by word processors, spreadsheets, and other office-productivity products. The descendants of those earlier systems are the document-based enterprise content management systems of today, such as Microsoft SharePoint, OpenText Documentum, and Hyland OnBase.

When is DMS useful?
One question to consider: is a document management system (DMS) relevant in the modern world of digital content management? At its core, a DMS knows nothing about the information within a document; that is, users don't link to content within a document managed in a DMS. Instead, users tag whole documents and link one whole document to another whole document; the DMS simply maintains the inter-document links.  Hence, in the context of digitized content, a DMS is something of a half-measure because each document under management exists as a static element.
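The difference in granularity can be made concrete with a small data-model sketch in Python. The class and field names here are hypothetical and chosen only for illustration; they do not reflect the schema of any real DMS or CMS product.

```python
from dataclasses import dataclass, field


@dataclass
class ManagedDocument:
    """A DMS record: the file is opaque, so only whole-document metadata exists."""
    doc_id: str
    tags: set[str] = field(default_factory=set)
    linked_docs: set[str] = field(default_factory=set)  # ids of other whole documents


@dataclass
class Component:
    """A component-level record: one addressable piece of content inside a document."""
    component_id: str
    text: str
    reused_in: set[str] = field(default_factory=set)  # ids of documents that pull it in


def documents_tagged(documents, tag):
    """The finest-grained answer a DMS can give: which whole documents carry a tag."""
    return sorted(d.doc_id for d in documents if tag in d.tags)


manual = ManagedDocument("pump-manual.pdf", tags={"pump", "safety"})
bulletin = ManagedDocument(
    "service-bulletin.docx", tags={"pump"}, linked_docs={"pump-manual.pdf"}
)
# A component-level system can address and reuse a single sentence instead.
warning = Component(
    "warn-001", "Disconnect power before servicing.", reused_in={"pump-manual", "quick-start"}
)
```

In this sketch a query against the DMS records can only ever return `pump-manual.pdf` as a whole, whereas the `Component` record lets the same warning be tracked and reused wherever it appears.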

"DMS is closer to DAM rather than CMS."

This may be sufficient for some organizations and in some applications. If the document itself is significant – either supplementary to or alongside the data it contains – then a DMS represents what could be a supremely useful permutation of content management. For instance, it's one thing to have a database containing the collected works of William Shakespeare intricately tagged and linked via hypertext. It's an entirely different concern, though, to digitize a specific document written in Shakespeare's own hand.

In a way, a DMS is closer in function to a digital asset management system than to a content management system, especially in its ability to protect and preserve the original form of a document. A DMS can also be a very low-cost solution given the dozens of open-source document management solutions available today. Enterprises looking to achieve organization and clarity when dealing with large physical archives of documents may choose from a wide variety of free and fee-based DMS solutions. Using existing hardware and software like cloud computing, scanners and simple image editing and management software, an enterprise can digitize its documents without having to build or acquire a more complicated CMS.

The limits of DMS
However, by leaning on a DMS, enterprises may find themselves running up against the lack of sophistication innate to the software. Since the tagged data is essentially referential to the document itself, it is easy to miss valuable insight contained within the document. Documents cannot easily be interrelated with similar content or data recombined into something new.

Enterprises have found value in linking digital asset management with content management, so it's likely that a DMS working in conjunction with a CMS is the ideal solution. If the physical document itself – or at least the visual representation of it – is of value, the ability to tag and separate the data within the document while still preserving it in a static form will lead to more agile, comprehensive information.

The importance of simplicity in content languages

Everyone agrees: When designing a content or markup language, simple is better. Yet as intuitive as this may seem, the development arc of technology runs counter to this imperative – always evolving in terms of complexity. If we are looking to get more done with content, why do we want a relatively unsophisticated language to assist us?

Building blocks of complexity
First, it helps to understand the rationale for the evolving complexity. As technological capabilities expand – combined with users coming of age with expanded capabilities – innovation naturally pushes the boundaries of current content languages, particularly as we find ourselves needing to express and support more complex and dynamic content. The companies at the fore of innovation have subsequently made developing new programming languages to support expanding infrastructure an imperative. This has led to a boom in different programming languages of varying levels of complexity, according to Viral Shah, one of the creators of the programming language Julia.

"Lightweight languages have endured for years."

"Big tech companies tend to have their own programming languages — Go at Google, Hack at Facebook, Swift at Apple, Java at Oracle [sic; Sun developed Java for different reasons], C# at Microsoft, or Rust at Mozilla," Shah told VentureBeat. "If you think about it, this makes sense: Software is the core competency of traditional tech companies — they can afford to have their legions of professional programmers use 'hard' languages like C++ and Java, which are great for performance and deployment, but less good for exploration and prototyping."

What Shah is pointing out is one of the key principles that makes lightweight markup languages crucial: While these companies have the capabilities to design their own languages that suit development needs, at the core of each are less sophisticated languages. In this respect Java and C++ are the building blocks supporting increased complexity.

The lasting power of markup
Similarly, when it comes to content – including tagging and metadata – lightweight languages have endured for years alongside more complex, proprietary solutions. Something like Markdown has been a favorite of bloggers, web writers and editors, developers, academics, technical writers and scientists looking for simple ways to translate simple text into HTML and XML, acting essentially as shorthand.

"Years ago, I started coding websites with HTML and then structuring documentation with XML, but Markdown allows me to use plain text for similar purposes," Carlos Evia, Ph.D., director of professional and technical writing and associate professor of technical communication in the Department of English and Center for Human-Computer Interaction at Virginia Tech told The Content Wrangler. "My Markdown files can become HTML and XML deliverables with one or two lines of commands or a few keystrokes."

Lightweight markup languages like Markdown thrive on their simplicity, bringing with them built-in constraints. As such, they are rarely the be-all, end-all for content creators and instead act as a vital component in a more sophisticated authoring tool chain. But this is the key to their staying power: With no end in sight for innovation and development of new languages, being able to author content in a simple language allows that content to be more portable to a future iteration. Rather than having to parse artifacts of an outmoded language when transferring in older content, with simpler languages, the content remains relatively "pure" and thus more easily repurposed.

"The constraints of markup languages are one of their virtues."

Constrained, yet free
Mark Baker, writing for Every Page Is Page One, points out that the constraints of markup languages are one of their primary virtues. These constraints essentially translate into a style guide, limiting the possibility for errors or deviations from house style. He adds that simpler languages can interface with software and algorithms more easily, supporting automation and creating more naturalistic content.

"Every markup language has at least one program to process it and turn it into output (at a minimum, HTML)," Baker writes. "Those programs work because they know the constraints of the language. They know all the structures that are allowed to exist in the content, and all the combinations they are allowed to exist in, and they know how to format each of them."

He goes on to outline how this can extend well beyond formatting into API documentation, allowing for more sophisticated source tracking, combining sources into a single reference entry, error checks and validating the written content to make sure it conforms with the actual function definitions in the code.
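Baker's observation that processors work because they know every structure the language allows can be illustrated with a toy example in Python. The three-structure language below (`title:`, `step:`, `note:`) is invented purely for this sketch and is not Markdown or any real markup language.

```python
import html

# Hypothetical three-structure language: every non-blank line is "kind: text".
FORMATTERS = {
    "title": lambda text: f"<h1>{text}</h1>",
    "step": lambda text: f"<li>{text}</li>",
    "note": lambda text: f'<p class="note">{text}</p>',
}


def render(source):
    """Convert the toy markup to HTML, rejecting any structure the language lacks."""
    out = []
    in_list = False
    for number, line in enumerate(source.splitlines(), start=1):
        if not line.strip():
            continue
        kind, sep, text = line.partition(":")
        kind = kind.strip()
        if not sep or kind not in FORMATTERS:
            raise ValueError(f"line {number}: unknown structure {kind!r}")
        # Consecutive steps are wrapped in a single ordered list.
        if kind == "step" and not in_list:
            out.append("<ol>")
            in_list = True
        elif kind != "step" and in_list:
            out.append("</ol>")
            in_list = False
        out.append(FORMATTERS[kind](html.escape(text.strip())))
    if in_list:
        out.append("</ol>")
    return "\n".join(out)


page = render("title: Reset the device\nstep: Unplug it\nstep: Wait ten seconds\nnote: Done")
```

Because the processor knows the complete set of allowed structures, an unrecognized line such as `warning: hot` fails loudly instead of producing unpredictable output, which is the style-guide effect Baker describes.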

Which brings us back to the main point: a relatively unsophisticated language with known constraints leaves authors free to create more compelling and dynamic content.  It is informal proof of the mantra that "simple is better."

Who ‘owns’ content design features?

Content is ever-changing. This is both its greatest virtue and the most significant challenge for designers. In the pursuit of even more intelligent and efficient user interfaces, CMS vendors are tasked with constantly redesigning their software to accommodate innovations in content format and design.

In the pursuit of a CMS that will successfully manage new forms of optimized content, there is one major obstacle that stands in the way of innovation: the provenance of content throughout its lifecycle. With the rise of cross-platform giants like Amazon and Google, content is now being repurposed, reinterpreted or filtered through any number of proprietary formats, any one of which allows a company to stake an ownership claim. But where is the line between content that can exist safely and comfortably within a multi-platform ecosystem and content that can be designated "property"?

The changing definition of content
The challenge of defining what counts as "proprietary" content lies in defining content in our modern data economy. If you are a user investigating a certain product on an ecommerce platform like Amazon, you will encounter a product description possibly submitted by the manufacturer or drafted by an author at Amazon itself. It is nearly impossible to trace authorship and ownership of the content since it will have been repurposed many times across a variety of platforms whenever you search for the product. This content may also be repurposed to appear in different formats: Written word turns into spoken audio which can in turn be captured on film. If all this different content is connected to the product and is the same copy, can it truly be considered different – or the same – content?

"Experts suggest that the definition of content should be expanded."

This has led experts within the content and CMS design community to suggest we move away from the traditional definition of content as "copy produced by a single author," embracing instead a broader definition that is independent of where content occurs and of its format.

"We need to shift our definition of content to be what the user needs right now," says Jared Spool, founder of User Interface Engineering. "It has nothing to do with how it's produced or where it lives on the server. If the user needs it, it's content."

Can you 'own' a need?
In this regard, Spool identifies content as the solution to an operational problem. Creating it comes down to identifying a need and producing something that satiates the need. This, however, becomes complicated once you introduce the idea of commercial platforms producing and managing content to meet the demands of their customers.

"If we want content seen as a business solution to a problem, we need to change expectations around what it is and what it is supposed to do," wrote AHA Media Group's Ahava Leibtag. Leibtag points to the obligations that organizations have, not only to produce and disseminate content, but to protect branding and control what it considers "proprietary."

Companies cannot patent an identified consumer "need", and the infrastructure relating to the pursuit of original content authorship privileges above all else simply doesn't exist in a robust form. Yet what organizations can do is develop proprietary design features. These features essentially act as a lens through which content is viewed: The basic content would exist outside the reach of patent, but the design features that can be woven into the overall platform interface could be copyrighted. Much like the way Microsoft and Apple of a generation ago sought to protect the look and feel of their respective products, modern companies can use formatting and user behavior as a mechanism for protecting their proprietary interests over data that they did not create.

Most frequently used content creation and editing tools

In the world of content creation and editing, tracking the tools used across the entire industry can be tricky. For example, tools at one enterprise that facilitate collaboration while preserving authorship may be less valuable to another enterprise that needs integrated formatting and the ability to embed rich media.

In its exploration of content creation and management trends in 2016, the Center for Information-Development Management issued a survey to 328 individuals across the entire content creation spectrum. Writers, managers, information architects, content strategists, editors and a small contingent of IT support, customer services and publishers were represented – with the overwhelming majority of respondents representing computer software companies.

The survey sought to answer a few basic questions: What tools do you use to create and manage content? What kind of content do you most frequently develop? How will this content be published in years to come?

Tools of the trade 
As one might imagine, DITA played a significant role in content creation across all respondents. Roughly 74 percent of those surveyed reported using some kind of DITA-capable XML editor as their primary content creation tool, far exceeding other tools. Following that, 66 percent reportedly used MadCap Flare, 53 percent unstructured Adobe FrameMaker, 43 percent Adobe InDesign and finally, at 38 percent, Microsoft Word.

Microsoft Word's fall from preeminence for content creators is somewhat predictable. Content experts across the industry have been predicting the end of generic simple document creators, with many saying that basic text-to-HTML conversion tools like Markdown will render Word virtually obsolete among professional content creators.

One of the more fascinating insights in this data is the role native HTML authorship plays in content creation: While few survey respondents (25 percent) claimed an HTML editor as a primary tool, it was overwhelmingly the favorite secondary tool across all categories – coming in at a total of 52 percent. From this, we can extrapolate that content creators:

  • Are shifting away from creating HTML first/only content.
  • Still require HTML editing tools to fully leverage content production and publishing.

Where is content being published?
This seems to follow data insights related to falling use of HTML-based delivery: While still the preferred means of publishing for almost 75 percent of survey respondents, mobile is coming up rapidly – albeit with content creators seemingly confused as to how to fully leverage it.

"We were interested to learn how organizations are approaching publishing to mobile devices, since we advocate designing content differently for mobile devices," the authors of the CIDM survey stated. "Fully 38 percent report that their content is the same on all devices. Some publish more content on mobile devices (only 4 percent); more publishing less content (24 percent)."

The one not mentioned: Localization
This points to the fact that mobile content creation tools are still not being considered separately from traditional content creation. One facet not mentioned in the CIDM survey is localization. Yet this seems to ignore one of the fundamental tenets of the mobile experience: that localized UX is a crucial element for consumer engagement and must be taken into account in the creation of specific content. Tech.Co emphasizes that, for mobile experiences related to e-commerce, localization tools beyond simple translation are key as well.

"If you're doing this, be sure to use widely accepted localization packages or hire an expert to work on the content for you as there will be nuances across languages that even Google Translate doesn't quite get yet," wrote Tech.Co's Joe Liebkind. While mainstream content creators may be focused on issues related to format conversions, the greater topic of authoring content for diverse audiences seems to be underrepresented.