Segmentation as a key to personalized content delivery, part 1

When it comes to delivering personalized, localized and – most importantly – relevant content to audiences, segmentation is a key concept. Segmentation, in this context, does not refer to the division of text into translatable chunks. Instead, it refers to the classification of content consumers according to specific parameters.

There is a compelling business need for incorporating segmentation into an information architecture: helping customers make purchase decisions about the products and services described in your content. The underlying problem is the evolutionary expansion of technology; namely, as storage capacity expands and content management capabilities grow more sophisticated, the volume of data under active management also expands. Yet brands, companies and other content providers must continue to deliver only the most relevant content to their customers while screening out the irrelevant data that can otherwise derail crucial purchase decisions.

The solution to this problem has two parts. The first part, discussed in the following paragraphs, is to develop effective, defensible rules for segmenting the people who read your content. The second part, which will be discussed separately, is to incorporate those rules into the way you classify your content.

"The first step is to building segmentation is to develop audience personas."

So how do you develop appropriate segmentation rules? The first step is to build audience personas, defining the qualities and interests that will route your audience to different pieces of content. Building these solely from end-user IP addresses, however, is tilting at windmills. A better segmentation strategy focuses on a few primary areas where audiences exhibit distinguishing characteristics and then sorts those characteristics into their respective content channels. Here are candidate primary identifiers for segmenting your audience.

While it's asking the impossible to build audience personas solely from IP addresses, it is nonetheless true that IP addresses offer some meaningful guidance to the task. For example, an IP address can indicate whether to deliver a translated or localized version of your content. It can point to cultural signifiers and traditions that may shape how your audience interacts with content. IP addresses also help you identify key geographic zones of influence, be they broad measures such as a continent or a country, or more precise identifiers such as a region, a county or a ZIP code. While our increasingly interconnected world has broken down the barriers that once defined a geography, different areas can still drive audience behavior and thought in ways that call for segmentation.
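
By way of illustration only, here is a minimal sketch of routing a visitor to a content locale from an IP-derived country code. The geolocate() stub and the country-to-locale table are hypothetical stand-ins for whatever geo-IP service and routing rules your own architecture uses.

    # Hypothetical sketch: route a visitor to a content locale based on
    # an IP-derived country code. The geolocation lookup is stubbed out;
    # a real implementation would query a geo-IP database or service.

    DEFAULT_LOCALE = "en-US"

    # Hypothetical routing table: ISO country code -> content locale
    COUNTRY_TO_LOCALE = {
        "US": "en-US",
        "DE": "de-DE",
        "FR": "fr-FR",
        "JP": "ja-JP",
    }

    def geolocate(ip_address: str) -> str:
        """Stand-in for a real geo-IP lookup; returns an ISO country code."""
        return "US"  # stubbed result for the sketch

    def locale_for_visitor(ip_address: str) -> str:
        """Pick the content locale to serve, falling back to a default."""
        country = geolocate(ip_address)
        return COUNTRY_TO_LOCALE.get(country, DEFAULT_LOCALE)

    print(locale_for_visitor("203.0.113.7"))  # -> "en-US" with the stub above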

From the "where?" of geographic identifiers, demography identifies the "who?" that makes up your audience. Demographics can take on a nearly infinite array of attributes, from gender to age to national origin to economic status and income. Probably the most immediately relevant is occupation, since the services being offered can be specific to a certain audience within a single company (for example: HR supervisors versus regional managers).

"Demography identifies the 'who?' of your audience."

Behavioral attributes are simply the answer to the question, "What actions does this audience segment regularly perform?" This method examines patterns of behavior: where people shop, what they buy, what kind of web pages they look at and for how long.

Psychographic segmentation is one of the more nuanced approaches to audience segmenting, since it combines geographic, demographic and behavioral data to synthesize a psychological profile. This profile, though, is less about identifying patterns in what people do and more about identifying what and how they think. Hence, a profile built through psychographic segmentation focuses on the following attributes (a brief illustrative sketch follows the list):

  • Lifestyle and personality. Beyond their behavior, audiences identify themselves in accordance with the aspects that have the most meaning in their lives. This can be a self-identified interest derived from behavior but distinct from it. A person may identify with and fit the profile of a "Harley-Davidson bike owner" even without purchasing a motorcycle. A sustained interest in the culture and ephemera related to Harley-Davidson ownership is enough to accurately capture this audience segment and deliver customized content. 
  • Values, attitudes and opinions. Values, attitudes and opinions provide the framework of thought and perspective, which drives an emotional response to stimuli.
  • Social class. Class consciousness plays a big role in psychographic identification. Class differs from lifestyle in that "class" describes an inherited set of rules (acquired through family or peer-group interactions) governing where people exist and how they relate to others in various classes, whereas "lifestyle" describes a set of chosen interests. A prime example is the upper-class young lady looking for information on handbags who is driven by the luxury standards of her social circle, as compared to a middle-class young lady looking for information on inexpensive, rugged alternatives. 
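
To make the idea concrete, here is a small, hypothetical sketch that gathers geographic, demographic, behavioral and psychographic attributes into a single persona record and applies a toy routing rule. The field names, channel names and rules are illustrative assumptions, not a prescribed schema.

    # Hypothetical persona record combining the segmentation attributes
    # discussed above, plus a toy rule for routing it to a content channel.
    from dataclasses import dataclass, field

    @dataclass
    class Persona:
        country: str                                            # geographic
        occupation: str                                         # demographic
        recent_behaviors: list = field(default_factory=list)    # behavioral
        interests: list = field(default_factory=list)           # psychographic: lifestyle
        values: list = field(default_factory=list)              # psychographic: values/attitudes

    def content_channel(persona: Persona) -> str:
        """Toy routing rule mapping a persona to a named content channel."""
        if "luxury" in persona.values:
            return "premium-lifestyle"
        if persona.occupation == "HR supervisor":
            return "hr-solutions"
        if "Harley-Davidson culture" in persona.interests:
            return "rider-lifestyle"
        return "general"

    rider = Persona(
        country="US",
        occupation="accountant",
        recent_behaviors=["read motorcycle-gear reviews"],
        interests=["Harley-Davidson culture"],  # identifies with the lifestyle, owns no bike
    )
    print(content_channel(rider))  # -> "rider-lifestyle" under these toy rules

In practice such rules would live in your content management or delivery layer rather than in application code, but the shape of the decision is the same.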

Next time, we will explore the ways to most effectively and accurately identify audience segments and funnel them into your content.

Translation versus localization: creating globalized content

In an interconnected world, your content has value far beyond your backyard – so long as people can understand it. As part of the globalized content market, design teams are faced with a fundamental choice: should content strategy prioritize translation, localization or some combination of the two?

Reading versus comprehending
The first obstacle in developing the most effective strategy is that content marketers often don't recognize the distinctions between the two approaches. The terms "translation" and "localization" are often used interchangeably. 

"Your content has value far beyond your backyard."

The simplest way to understand the difference between translation and localization is to think of the underlying values they serve. Translation is a very literal, data-driven process: it takes data from one locale and substitutes its equivalent value in another locale. This means that on a basic level, the document is being reformatted to be read by a foreign audience.

For enterprises operating outside of their domestic market, bringing native content to foreign readers represents a challenge – one often met with a ham-fisted "throw it in a translator" approach. But as anyone who has used an online translation service may have noticed, a substitution-style translation of text or other content doesn't always result in something that makes sense. Even technical documents, driven and quantified by empirical data, may end up virtually incomprehensible – although technically readable – after such a translation.

Crossing the cultural barrier
Karl Montevirgen, writing in The Content Wrangler, explains that translation is concerned with bridging the language barrier, while localization is about crossing the cultural barrier. As with prose, a vernacular expression or piece of linguistic shorthand in technical material written for one culture may not carry over into another. Even in the hands of a trained bilingual translator working well beyond simplistic word-by-word substitution, content that speaks to readers in different cultures with equal fidelity can be elusive.

This is where the world of dynamic content writing has a unique edge when it comes to translation. Depending on the sophistication of the content management and editing system, the ability to divide content into hypertext components makes translation more than simply substituting words; it allows for complex reconstitution of content into another language.
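
As a rough illustration of that component-based approach (not the data model of any particular CMS), a component store might keep per-locale variants so a document can be reassembled in whichever language is requested. The component IDs and strings below are invented for the sketch.

    # Hypothetical component store: each component keeps per-locale variants,
    # so a document can be reassembled in whichever language is requested.
    components = {
        "intro":  {"en": "Welcome to the product guide.",
                   "de": "Willkommen zum Produkthandbuch."},
        "safety": {"en": "Read all safety warnings first.",
                   "de": "Lesen Sie zuerst alle Sicherheitshinweise."},
    }

    def assemble(doc_outline, locale, fallback="en"):
        """Reconstitute a document from components in the requested locale."""
        parts = []
        for component_id in doc_outline:
            variants = components[component_id]
            parts.append(variants.get(locale, variants[fallback]))
        return "\n".join(parts)

    print(assemble(["intro", "safety"], "de"))

The same component structure also gives translators well-scoped units to work with instead of whole documents.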

"With localization comes added costs – both monetary and time."

The value and cost of localization

This is where localization can edge out translation as a strategy to market content. By creating customized content in a culture's native tongue, you can ensure that the content speaks directly in the language of your audience while taking on the cadence and cultural mores of that demographic. This has significant value when it comes to avoiding the miscommunications or snafus that can arise from awkward translation.

However, with localization come added costs – in both money and time. Localization requires native speakers and contributors who can communicate fluently in the target language but may not be experts in, or otherwise familiar with, the subject matter of the content. Alternatively, locally created content may be functionally indecipherable to the home enterprise, making fact-checking and editing impossible.

In the end, an enterprise looking to spread its content globally will typically employ some mix of translation and localization to achieve the optimal combination of cost, schedule and meaningful outreach.

How will online archiving influence content management?

The advent of Internet archiving has changed the way we think about media and content.  The continuum of information has flattened, unchained as it is from physical form, so that as soon as content is published, it can be cataloged, reused and repurposed at any time.

"As soon as content is published, it can be cataloged, reused and repurposed."

At the forefront of this revolution are websites like the Internet Archive. The site has amassed approximately 25 petabytes of data — a repository of digitized media including books, films, TV clips, websites, software, music and audio files, photos, games, maps, and court and legal documents — all made freely available. As part of its "Wayback Machine" project, the Internet Archive offers the Archive-It tool, which has thus far saved historical copies of 484 billion retired and indexed web pages and which allows users to "[c]apture a web page as it appears now for use as a trusted citation in the future." The operators of the site liken the archive to the fabled Library of Alexandria – a repository of human knowledge and culture, supposedly lost in antiquity.
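
As a small, practical example of the "trusted citation" idea, the sketch below queries the Wayback Machine's public availability endpoint to see whether a page already has an archived snapshot. It assumes that endpoint and its JSON response shape remain as publicly documented.

    # Sketch: ask the Wayback Machine whether a URL has an archived snapshot.
    # Assumes the public availability endpoint and its documented JSON shape.
    import json
    import urllib.parse
    import urllib.request

    def closest_snapshot(url: str):
        query = urllib.parse.urlencode({"url": url})
        endpoint = "https://archive.org/wayback/available?" + query
        with urllib.request.urlopen(endpoint) as response:
            data = json.load(response)
        snapshot = data.get("archived_snapshots", {}).get("closest")
        return snapshot["url"] if snapshot and snapshot.get("available") else None

    print(closest_snapshot("example.com"))  # prints a snapshot URL, or None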

"We believe it's crucial to provide free access to information. Our society evolves because of information, and everything we learn or invent or create is built upon the work of others," Alexis Rossi, Internet Archive's director of Media and Access, told The Content Wrangler. "In a digital age, when everything is expected to be online, we need to make sure the best resources are available. The human race has centuries of valuable information stored in physical libraries and personal collections, but we need to ensure that all of it is online in some form."

The challenge of cataloging
While maintaining the massive archive — according to Rossi, the site tops 50 petabytes due to built-in replication and redundancy — is a feat of engineering in itself, the true challenge is in cataloging and maximizing accessibility. This is where the worlds of content management and archiving begin to intersect. Combing through the archives and breaking their content into hyperlinked components (similar to the way one constructs content in DITA) can render that content much more discoverable and thus easier to repurpose for new content.
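
Purely as an illustrative sketch (the splitting rule and component IDs are assumptions, not the Internet Archive's actual practice), even a crude pass that breaks an archived page into addressable components can make its pieces individually discoverable and reusable:

    # Toy sketch: break an archived document into addressable components
    # by splitting on headings, so each chunk can be linked and reused.
    import re

    archived_page = """\
    # History
    The project began in 1996.
    # Collections
    Books, films, software and web pages.
    """

    def componentize(text: str, base_id: str):
        """Return {component_id: body} keyed by a stable, linkable ID."""
        parts = re.split(r"^\s*# (.+)$", text, flags=re.MULTILINE)
        # re.split yields [preamble, title1, body1, title2, body2, ...]
        components = {}
        for title, body in zip(parts[1::2], parts[2::2]):
            slug = title.strip().lower().replace(" ", "-")
            components[f"{base_id}/{slug}"] = body.strip()
        return components

    for cid, body in componentize(archived_page, "archive-item-42").items():
        print(cid, "->", body)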

This represents a distinct evolutionary – and revolutionary – shift in the way that we approach content, primarily in terms of scope. Whereas traditional authors were limited by the historical materials they could physically access, modern authors are now theoretically unbound because all recorded and cataloged media and content are available at their fingertips. The fundamental question for authors and curators changes from "Does this content exist for citation?" to "Can I find the content?"

"Authors are now unbound by the traditional constraints of archiving."

The great equalizer: Free
Tied into this increased availability is the fact that the content made available by the Internet Archive and similar sites is totally free. This access model, in its own way, has become an equalizer of sorts, turning the "Can I find it?" question into an egalitarian take on human knowledge. The free component is an ideological cornerstone of the Internet Archive and its contributors, who include Jessamyn West, a library consultant and community liaison for the Open Library project.

"We make it available for free, and that's especially important to the underprivileged and to people in other countries who may not have free access to information," West said to The Content Wrangler. "This kind of access has great value, because knowledge is power."

This, however, may be a simplified take on the true value of archiving. While not every site may provide meaningful — or accurate — information, experts like John Wiggins, director of Library Services and Quality Improvement at Drexel University, argue that content creators can still benefit from the historical aspect of an archive, which gives them a glimpse into the way cultural forces have shaped and guided content over time.

A return to physical content archiving?

In a unique and unexpected twist on the traditional push to digitization, Hi.co has announced it is shuttering its site, freezing services as of September 2016. What makes the story surprising is that the site's operators, Craig Mod and Chris Palmieri, have written in a Medium post that they will archive the site's nearly 2,000,000 words and 14,000 photos onto a microprinted two-by-two-inch nickel plate, copies of which will be sent to locations all over the world, including one destined for the Library of Congress.

'Medium, not media' 
The plates can be viewed only with a 1,000-power optical microscope, have a lifespan of roughly 10,000 years, and are resistant to fire, water and salt damage. Mod and Palmieri pointed out that, while they will pay to maintain a digital, hosted version of the site and a historical copy will be entered into the Internet Archive, the process is designed to embrace a physical footprint over a digital one.

"The process does not produce "data." It is not like a CD," write Mod and Palmieri. "It is not a composition of 0's and 1's representing the information. It is the information itself. The nickel plate is a medium, not media.""

"The nickel plates have a lifespan of roughly 10,000 years."

Repository or crypt?
This take on "time capsule" archiving is nothing new: In their coverage of the project, The Atlantic talks about the Crypt of Civilization, a 2,000-square-foot sealed vault initiated by President of Oglethorpe University in Atlanta, Thornwell Jacobs, in 1940. The vault contains about 640,000 pages of text reproduced on microfilm and is designated to be reopened in 8113 C.E.

"Today we can place articles in the crypt and nothing can keep them from being readable a million years from now," remarked Jacobs while planning the Crypt in 1938. This, in a sense, mirrors the optimism of Mod and Palmieri and even alludes to the coming modern era of increasingly inexpensive and simple content archiving. However, only two years later, Jacobs seemed to have taken on a more somber tone in the wake of global war breaking out.

"The world is now engaged in burying our civilization forever," he recorded as part of speech included in the Crypt, "and here in this crypt we leave it to you."

Time capsules 
While these words may seem melodramatic, they retain a certain ring of truth even now: While our archiving and content management capabilities have grown more sophisticated, archives will always remain vulnerable to acts of malice, negligence or simple indifference. The vast digital repositories of information we have aggregated and cataloged could vanish into the ether with the failure of a single server, or be rendered unreadable to future generations whose technology can no longer access them. In essence, while the idea of a physical archive for media may seem antiquated, it may in fact be a worthwhile investment in preserving content well into the foreseeable future.

New White House initiative aims to bring government into tech age

Marking a distinct transition from the technological missteps of previous years, the White House has announced the formation of the United States Digital Service (USDS). Described as a "startup" founded by President Obama, the USDS aims to partner government with technology providers to create a more intuitive, modern approach to addressing national priorities.

"What if interacting with government services were as easy as ordering a book online?" writes the Executive Office of the President. "The challenges behind brought this question to the forefront, changing our government's approach to technology."

"The White House has announced the formation of the United States Digital Service."

Learning from past mistakes
Indeed, this reference to the issues that occurred during the launch of HealthCare.gov is telling. As the first major push to implement public policy via a large-scale technological initiative, the site was hampered by seemingly unending issues that undercut its overall efficacy and put a serious damper on the optimistic tone of the Obama administration. Rather than attribute the failure to negligence or mismanagement, industry experts like Aziz Gilani saw the site as an example of "too much, too fast."

"The federal government is just like every other enterprise out there," Gilani told CMS Wire. "It's facing a lot of pressure to join the world in Software-as-a-Service (SaaS) transformation, but then be able to release and maintain applications with the shortened sprint times required to support those types of applications."

Enlisting help from the world of tech
Through the USDS, the government aims to avoid the errors of the past by bringing technological infrastructure design in-house. To do this, the startup employs a team of engineers led by Mikey Dickerson, a former Google engineer who played a key role in salvaging HealthCare.gov. Dickerson has been an outspoken critic of the government's previous efforts – or lack thereof – to embrace technology, saying that the private sector has long since blown past federal agencies in the way they interface with cutting-edge technology.

"First of all, government still calls it 'IT' and 'cyber' which the tech industry does not, and that's a clue right there," Dickerson told Gov Insider. "This issue has become particularly acute and visible to the public in a really painful way. Ten years ago the iPhone didn't exist and now innovations like smart phones, GPS and Uber are deeply intertwined across people's everyday lives – with government looking flat-footed by comparison."

"Different systems designed separately breed translation and communication issues."

Streamlining content design and dissemination
One of the biggest hurdles to government-sponsored technology is creating consistency across every agency. In previous years, the government eschewed a top-down approach to content design and consistency, opting instead to have each agency individually contract out for its technology needs. This led to a fundamental operational roadblock: design would vary dramatically from agency to agency.

Beyond the aesthetic mismatch and user-experience incongruities stemming from this lack of consistency, disparate systems designed separately bred translation and communication issues. Cross-agency data exchange became difficult, contributing to the infamous delays associated with government agency communications. It also made generating data-rich hypertext effectively impossible.

These are essentially the same problems facing large groups of content creators who operate within the same company but otherwise work in isolation from one another. Agreeing to a common set of information architecture rules brings these groups into alignment. Establishing a common data-encoding standard lets them share their information more easily. Adopting a common set of tooling streamlines collaboration and improves the overall productivity of each team.

The USDS is following a similar playbook. Its U.S. Web Design Standards program provides a style guide, mobile-responsive design constraints, and recommended open-source coding tools to create a seamless web standard across all agencies subject to federal oversight. Whether or not the USDS will broaden its scope to include content creation and management systems remains to be seen, but the push towards commonality inspires some optimism within the world of content strategy.

A browserless world via chat apps?

Just when the content market gains parity with full web 2.0 integration, new apps and functionality are shaping the way we manage and distribute content to consumers. An emerging trend: a shift away from openly accessible browsers toward proprietary chat apps.

In Facebook's spring 2016 announcement of expanded functionality in its Messenger app, one of the benefits emphasized by the social media giant was new content publishing capabilities. Facebook invites developers to build bots that use the Messenger Send/Receive API, which now supports "…not only sending and receiving text, but also images and interactive rich bubbles containing multiple calls-to-action." The company also announced integration of Wit.ai's Bot Engine, which allows developers to build more "complex" bots around machine-learning algorithms that can process natural language patterns.
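
To give a flavor of what developers build against, here is a rough sketch of a minimal text message sent in the style of the v2.6-era Send API request shape. The access token and recipient ID are placeholders, and the endpoint and payload should be verified against Facebook's current documentation.

    # Rough sketch of a minimal text send via the Messenger Send API
    # (v2.6-era request shape; token and recipient ID are placeholders).
    import json
    import urllib.request

    PAGE_ACCESS_TOKEN = "YOUR_PAGE_ACCESS_TOKEN"   # placeholder credential
    RECIPIENT_ID = "USER_PSID"                     # placeholder recipient

    def send_text(text: str):
        endpoint = ("https://graph.facebook.com/v2.6/me/messages"
                    f"?access_token={PAGE_ACCESS_TOKEN}")
        payload = {"recipient": {"id": RECIPIENT_ID}, "message": {"text": text}}
        request = urllib.request.Request(
            endpoint,
            data=json.dumps(payload).encode("utf-8"),
            headers={"Content-Type": "application/json"},
            method="POST",
        )
        with urllib.request.urlopen(request) as response:
            return json.load(response)

    # send_text("Here is the content you asked for.")  # requires real credentials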

"80 percent of user time is spent on just five applications."

Too much to surf
While it's difficult to speculate as to its full impact, the expansion of chat app functionality may in fact signal that consumers are looking for a more direct way to receive and transmit data by way of content.

"Today, we no longer surf the web because there's too much to surf," writes Chris Moore is Chief Revenue Officer of Nexmo for The Content Wrangler. "Now, the bottomless ocean of information and data at our disposal has arguably become the Internet's biggest weakness. That reality, combined with the rise of smartphones as mainstream consumer devices, has ushered in an app economy where chat apps are fast becoming the new default destination for web-bound consumers."

Moore goes on to say that, according to a recent Forrester report, 80 percent of user time is spent on just five applications, with closed-system social media and messaging apps accounting for most of that time. This could effectively mean that chat isn't just the web of tomorrow – it's essentially the web of today.

An evolutionary essential 
So what does this mean for content managers? Simply, it means that chat- and social-integrated content management tools are a growing necessity. The ability to aggregate content and data and to apply learning algorithms in the forums where the most communication occurs is the next evolutionary step for the content management and hypertext market. It also means that XML content management systems cannot hope to stay relevant in a market where content consumers are looking to bypass browser-oriented portals in favor of direct-from-the-vendor apps driven by user-configured bots.

According to Daniel Nations, a trend expert for About Tech, the "browsers of tomorrow" will likely take the form of each website offering its own unique, proprietary app, creating a seamlessly integrated browsing experience.

"I imagine it would be like merging our current browsers, ActiveX, and Java to create something that can be both a mini-operating system and a development platform," speculates Nations.

Mr. Nations may be showing his age, since ActiveX and Java in the browser are falling out of favor for their nearly innumerable security weaknesses. Nevertheless, his core point is worth noting: "For you and me, it would be like loading up our office application, seamlessly switching between a word processor and a spreadsheet, and just as seamlessly switching to a massively multiplayer online roleplaying game."

House Speaker Paul Ryan proposes future legislation converted to XML

A recent measure by House Speaker Paul Ryan will see "all legislative measures" converted to XML. Ryan announced the proposal at the 2016 Legislative Data and Transparency Conference, saying it would continue work started with the creation of the Bulk Data Task Force, in hopes of making "the House more open and transparent" – giving developers the opportunity to scrape data and make it more publicly searchable.

"All legislative measures will now be converted to XML."

"Now we're working to go further, and publish even more current and past documents in XML," Ryan told the assembled conference-goers. "I've asked our team to keep moving ahead by publishing all legislative measures in a standard format. That means enrolled measures, public laws, and statutes at large."

The Bulk Data Task Force was the work of Ryan's predecessor, John Boehner, and was designed to convert documents in bulk to digital markup languages. This led to the creation of the United States Legislative Markup Language, an XML vocabulary used to encode all versions of the United States Code created on or after July 30, 2013.
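
To illustrate what "searchable" can mean in practice, here is a hedged sketch that lists section numbers and headings from a USLM XML file. The namespace URI and element names are assumptions drawn from the published USLM schema and should be verified against the current schema documentation.

    # Hedged sketch: pull section numbers and headings out of a USLM XML file.
    # The namespace URI and element names below are assumptions based on the
    # published USLM schema; verify them against the current documentation.
    import xml.etree.ElementTree as ET

    USLM_NS = {"uslm": "http://xml.house.gov/schemas/uslm/1.0"}  # assumed URI

    def list_sections(path: str):
        tree = ET.parse(path)
        for section in tree.getroot().iter(
                "{http://xml.house.gov/schemas/uslm/1.0}section"):
            num = section.find("uslm:num", USLM_NS)
            heading = section.find("uslm:heading", USLM_NS)
            yield (
                num.text.strip() if num is not None and num.text else "",
                heading.text.strip() if heading is not None and heading.text else "",
            )

    # for num, heading in list_sections("usc26.xml"):  # hypothetical file name
    #     print(num, heading)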

"We want to have a project we can start and complete in a fairly short time frame," Lisa LaPlant, Federal Digital System program manager for the Government Publishing Office, the government group initially funding the project, told FedScoop. 

In addition to improving transparency, Ryan stated that the conversion would help lawmakers make more informed decisions when proposing or arguing legislation. With laws encoded in XML, the ease of searching and keywording historical legislation will, according to Ryan, guide lawmakers away from "making or repeating the mistakes of the past."