
The advent of Internet archiving has changed the way we think about media and content. The continuum of information has flattened, unchained as it is from physical form, so that as soon as content is published, it can be cataloged, reused and repurposed at any time.
"As soon as content is published, it can be cataloged, reused and repurposed."
At the forefront of this revolution are websites like the Internet Archive. The site has amassed approximately 25 petabytes of data: a repository of digitized media including books, films, TV clips, websites, software, music and audio files, photos, games, maps and court/legal documents, all made freely available. As part of its "Wayback Machine" project, the Internet Archive offers the Archive-It tool, which has thus far saved historical copies of 484 billion retired and indexed web pages and which allows users to "[c]apture a web page as it appears now for use as a trusted citation in the future." The operators of the site liken the archive to the fabled Library of Alexandria, a repository of human knowledge and culture, supposedly lost in antiquity.
"We believe it's crucial to provide free access to information. Our society evolves because of information, and everything we learn or invent or create is built upon the work of others," Alexis Rossi, Internet Archive's director of Media and Access, told The Content Wrangler. "In a digital age, when everything is expected to be online, we need to make sure the best resources are available. The human race has centuries of valuable information stored in physical libraries and personal collections, but we need to ensure that all of it is online in some form."
The challenge of cataloging
While maintaining the massive archive (according to Rossi, the site tops 50 petabytes due to built-in replication and redundancy) is a feat of engineering in itself, the true challenge is in cataloging and maximizing accessibility. This is where the worlds of content management and archiving begin to intersect. Combing through the archive and breaking its content into hyperlinked components (similar to the way one constructs content in DITA) can render that content much more discoverable and thus easier to repurpose for new content.
This represents a distinct evolutionary – and revolutionary – shift in the way that we approach content, primarily in terms of scope. Whereas traditional authors might have been limited to the historical materials they could physically access, modern authors are now theoretically unbound because all recorded and cataloged media and content are available at their fingertips. The fundamental question for authors and curators changes from "Does this content exist for citation?" to "Can I find the content?"
"Authors are now unbound by the traditional constraints of archiving."
The great equalizer: Free
Tied into this increased availability is the fact that the content made available by the Internet Archive and similar sites is totally free. This access model has, in its own way, become an equalizer of sorts, turning the "Can I find it?" question into an egalitarian take on human knowledge. Free access is an ideological cornerstone for the Internet Archive and its contributors, who include Jessamyn West, a library consultant and community liaison for the Open Library project.
"We make it available for free, and that's especially important to the underprivileged and to people in other countries who may not have free access to information," West said to The Content Wrangler. "This kind of access has great value, because knowledge is power."
This, however, may be a simplified take on the true value of archiving. While not every site may provide meaningful, or accurate, information, experts like John Wiggins, director of Library Services and Quality Improvement at Drexel University, claim that content creators can still benefit from the historical aspect of an archive, which allows them a glimpse into the way cultural forces have shaped and guided content over time.