Is there a way to permanently archive information?


Digital Obsolescence: Is there a way to permanently archive information?

In the birthplaces of writing, ancient Mesopotamia, Sumeria and Egypt, archaeologists find documents both divine and banal. From the ritual spells inscribed on the tombs of the wealthy to help them in their journey through the afterlife, to the accounts of livestock transactions preserved on clay tablets, these analog forms of media have proven far more durable than their creators might ever have imagined.

Yet many of these discoveries remain illegible despite the best efforts of linguists and historians. This is because language itself is a form of technology. While the media have survived, the means of reading them have long since disappeared. Were it not for the chance discovery of priceless keys like the Rosetta Stone, we would be more illiterate still.

The fate of humanity’s ancient texts serves as an apt lesson to those who seek to create permanent digital archives today.

A Digital Dark Age?

The periods in history known as the Dark Ages have corresponded with a loss of access to archives of past learning. Disconnected from physical forms of cultural memory, civilization must undergo the hard and redundant work of rediscovery, often at great cost.

At the outset of the digital age, there was a great deal of boasting about how new forms of media such as optical discs had finally improved upon the perceived vulnerability of paper. Untold volumes of human knowledge have, after all, been lost throughout history to war (such as the burning of the great Library of Alexandria), accident and the hunger of mould. By contrast, plastics are so resistant to biodegrading that there are decent odds our shopping bags will outlast our civilization. Not only that, but the data contained on various disk formats is endlessly (and instantly) transferable and reproducible.

But while these formats are arguably less physically vulnerable than books, the technology they rely upon for use is more fragile by far. All that is needed for a sighted person to absorb the contents of a book is to be taught how to read the language. There is no equivalent process for reading a 9-track tape reel without its corresponding specialized tape player, though it might contain the equivalent of tens of thousands of books along its length.

Cornell University’s digital preservation tutorial maintains a timeline of various digital formats such as punch media, disks, tape and solid state media. It’s also a rare chance for those of us in the file management space, like our team at ShareArchiver, to gain a little perspective on how rapidly our industry is evolving—and how much is being left behind.

“The Chamber of Horrors”

Cornell’s timeline is cheekily entitled “The Chamber of Horrors,” in reference to the difficulties these obsolescent formats present to archivists. Take, for example, its section on disk media, which includes 12 distinct formats in use from 1971 through the early 2000s. Some of these, like the Sparq Disk Cartridge, were in use for less than a year before their manufacturer folded. Others, like the familiar compact disc, have changed their standards and file-naming conventions over the years, to the point that many early discs are now completely unusable.

Competition and technological advances mean high turnover rates for formats, especially for niche products that never really caught on with consumers or institutions. And, in real terms, it’s usually no great mischief if someone is no longer able to access their old vacation photos because their camera used an unusual kind of SD card. But what if some piece of information essential to our understanding of history, of culture, of business is locked away in one of these mute hunks of plastic?

“Domesday” Predictions

Nearly two decades after William the Conqueror and his Norman armies invaded England, he ordered a full accounting of the holdings of his realm, from land distribution and population down to the last suckling lamb. The result is known as the Domesday Book (Middle English for “Doomsday”), which has become one of the most important texts for historians studying the era. It is a snapshot of England circa 1086 CE that is almost unparalleled for its time.

To mark the 900th anniversary of the celebrated compendium, in 1986 the BBC and a number of partners launched a modern multimedia Domesday Project. Over 1 million people were ultimately involved in producing the project, which included maps, video tours of various landmarks, census data and personal writings by a cross-section of the UK’s population. It was hoped that the project might be as accessible in the year 2886 as the original Domesday Book is to us today.

But there was a problem. The BBC Domesday Project did indeed achieve historical significance, but not in the way its authors intended. It is commonly cited as the poster child for digital obsolescence:

“The project was stored on adapted LaserDiscs in the LaserVision Read Only Memory (LV-ROM) format, which contained not only analogue video and still pictures, but also digital data, with 300 MB of storage space on each side of the disc. Data and images were selected and collated by the BBC Domesday project based in Bilton House in West Ealing. Pre-mastering of data was carried out on a VAX-11/750 mini-computer, assisted by a network of BBC micros.

[…] In 2002, there were great fears that the discs would become unreadable as computers capable of reading the format had become rare and drives capable of accessing the discs even rarer.”

For those keeping score at home, the Domesday Book has made it 932 years and counting without need of a format change, whereas the multimedia Domesday Project would have been lost in less than three decades were it not for a heroic technical effort. Archivists were able to rescue the information, even creating a version which runs on a Windows PC. But, as the motto of Cornell’s digital preservation program underscores, this is merely “implementing short-term strategies for long-term problems.”

As of 2018, the Domesday Reloaded project, which made the raw information from 1986 accessible online, has been shut down, with no news as to whether or how the public will be able to access the information in future.

Physical vs. Digital Storage Media

This isn’t to say that time doesn’t also exact a heavy price from physical media. Consider this account from NPR of the difficulties archivists at the Library of Congress face as they attempt to preserve their vast audio holdings:

“When you’re working with old formats, you are often racing against time. With wax cylinders from the 1890s — one of the oldest recording formats — the heat from your hands can cause them to crack. They require highly specialized, expensive equipment to digitize, as well as people who know how to use it.

Records made during World War II, constructed out of glass because other materials were going toward the war effort, are so fragile that they can break even when they’re handled properly.”

Digitizing this information requires that the media be played in its entirety so that it can be recorded, which limits how quickly archiving can be achieved. The Library’s most fragile holdings are literally turning to dust faster than they can be saved to tape or hard disk. It’s easy to see why the question of whether said tapes or hard disks will be usable in future is of less pressing concern.

Why these digital archiving questions impact modern business

While most of the information we at ShareArchiver help our clients preserve is of lesser historical significance than, say, Martin Luther King’s “I Have a Dream” speech, we still must help them tackle this question of preservation. Between external regulations like GDPR and a company’s own need for internal information continuity, companies are collecting and archiving more data than ever before. In fact, your average medium-sized business holds a greater volume of information than any of the great lost libraries of antiquity.

And they can be just as vulnerable, having exchanged the security of armies for modern backups, data replication and digital security. We’ve touched on this a number of times in previous posts, such as our examination of the limitations of backup tape storage, and the obsolescence of on-site file storage. Our application is one of many that help companies stay safe and compliant with their file archiving responsibilities in the short and medium term, but in our next post we’ll be trying our hand at examining the big question:

Can anything be saved forever?

Stay tuned.