On the recommendation of a friend I recently watched a documentary called Side by Side, which details the history of the primary technology behind cinema: the cameras. It starts off by giving you an introduction to the traditional photographic methods that were used to create films in the past and then goes on to detail the rise of digital in the same space. Being something of a photography buff myself, as well as a geek who can’t get enough of technology, the topic wasn’t something I was unfamiliar with, but it was highly interesting to see what people in the industry were thinking about the biggest change to happen to it in almost a century.
Like much of my generation I grew up digitally, with the vast majority of my life spent alongside computers and other non-analogue equipment. I was familiar with film as my father was something of a photographer (I believe his camera of choice was a Pentax K1000, which he still has, along with his Canon 60D) and my parents gave me my own little camera to experiment with. It wasn’t until a good decade and a half later that I’d find myself in possession of my first DSLR, and it was another few years after that before I found some actual passion for it. What I’m getting at here is that I’m inherently biased towards digital since it’s where I found my feet and it’s my preferred tool for capturing images.
One of the arguments that I’ve often heard levelled at digital formats, both images and general everyday data, is that there’s no good way to archive them so that future generations will be able to view them. Film and paper, the traditional means with which we’ve stored information for centuries, would appear to archive quite well given the amount of knowledge contained in those formats that has stood the test of time. Ignoring for the moment that digital representations of data are still something of a nascent technology by comparison, the question of how we archive them has come up time and time again, and everyone seems to be under the impression that it simply can’t be done.
This just isn’t the case.
Just before I was set to graduate from university I had been snooping around for a better job after my jump to being a developer hadn’t worked out as I had planned. As luck would have it I managed to land a job at the National Archives of Australia, a relatively small organisation tasked with the monumental job of cataloguing all records of note that were produced in Australia. This encompassed everything from regular documents used in the course of government to things of cultural value like the airline tickets from when the Beatles visited Australia. Whilst they were primarily concerned with physical records (as shown by their tremendous halls filled with boxes) there was a small project within the organisation that was dedicated to the preservation of records that were born digital and were never to see the physical world.
I can’t take much credit for the work that they did there; I was merely a caretaker of the infrastructure that was installed long before I arrived, but I can tell you about what they were doing. The project team, consisting mostly of developers with just two IT admins (including myself), was dedicated to preserving digital files in the same way you would a paper record. At the time a lot of people were still printing digital records off and then archiving the paper copies; however, it became clear that this process wasn’t going to be sustainable, especially considering that the NAA had only catalogued about 10% of their entire collection when I was there (that’s right, they didn’t know what 90% of the stuff they had contained). Thankfully many of the ideas used in the physical realm translated well to the digital one, and thus XENA was born.
XENA is an open source project headed by the team at the NAA that can take everyday files and convert them into an archival format. This format contains not only the content but also the “essence” of the document, i.e. its presentation, layout and any quirks that make that document, that document. The included viewer is then able to reconstruct the original document using the data contained within the file and, since the project is open source, should the NAA cease development the data will still be accessible to everyone who used XENA. The released version does not currently support video; they were working on it while I was there, but the need to archive digital documents was the more pressing requirement at the time.
Ah ha, I’ll hear some film advocates say, but what about the medium you store them on? Surely there’s no platform that can guarantee that the data will still be readable in 20 years, heck, even 10 I’ll bet! You might think this, and should you have bought any of the first generation of CD-Rs I wouldn’t fault you for it, but we have many ways of storing data for long term archival purposes. Tapes are by far the most popular (and stand the test of time quite well) but for truly archival quality data storage that exists today nothing beats magneto-optical discs, which can have lives measured in centuries. Of course we could always dive into the world of cutting edge science for the likes of a sapphire etched platinum disc that might be capable of storing data for up to 10 million years, but I think I’ve already hammered the point home enough.
There’s no denying that there are challenges to be overcome with the archival of digital data, as the methods we developed for traditional media only serve as a pointer in the right direction. Indeed, attempting to apply them to the digital world has often had disastrous results, like the first reel of magnetic tape brought to the NAA which was inadvertently baked in an oven (something done with paper to kill microbes before archival), destroying the data forever. This isn’t to say that we have nothing or that we aren’t working on it, however, and as technology improves so will the methods available for archiving digital data. It’s simply a matter of time until digital becomes as durable as its analogue counterpart and, dare I say it, not long before it surpasses it.