How do we make our data accessible to future generations?
I meticulously label and organize into folders the digital pictures, videos and documents that I create or take with my camera. The files get synced to a cloud backup service, and are copied to removable media once every few months. Added to these are:
- MP4 files of the movies that I had bought on discs,
- MP3 files of the songs and plays from the tape cassettes I owned,
- Scanned pictures from the family photo albums,
- Saved copies of email attachments and downloaded files such as insurance receipts, bank statements and tax challans,
- Scanned copies of old invitations of family events,
- Scanned copies of a handful of old books that I cherish and that are not available elsewhere, especially those from the publishing firm (LIFCO) that my grandfather founded in 1929, and,
- Lastly, copies of the stuff that I bought online and that is copyright-free, like audio from iTunes, IEEE periodicals that I subscribe to, books that I really like from the huge collections of Internet Archive Books, Open Library, Project Gutenberg and Project Madurai, and out-of-print magazines like Chandamama.
The files are stored on my local server (check out the configuration and details in the blog post), with media files residing in the Plex Server software and books in Calibre (the open-source ebook management software). Family members, mainly my sisters and my niece, routinely send me their pictures and videos, saying “You will keep it safe and give it to me when I want it!”.
Yes, that makes me a borderline digital hoarder. Why borderline? Because on visiting the /r/DataHoarder subreddit, I realized there are thousands of people out there who are much worse than me!
Over the years, I have gone through the pain of moving the files from one format to another (Wordstar to Word, WMA to MP3, AVI to MP4, and so on), from one medium to another (floppies to Iomega zip drives to CDs to DVDs to the cloud), and from one cloud backup service to another (Sugarsync to Dropbox and OneDrive).
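Before the next round of migration, it helps to know which formats are actually sitting in the archive. The following is only a minimal sketch in Python, not a description of my actual setup: the function names and the `LEGACY_EXTENSIONS` set are made up for illustration, and the list of "at risk" extensions is an assumption you would replace with your own.

```python
from collections import Counter
from pathlib import Path

# Hypothetical set of extensions treated as "at risk"; adjust for your own archive.
LEGACY_EXTENSIONS = {".ws", ".wma", ".avi"}


def count_formats(root):
    """Count the files under root, grouped by lowercase file extension."""
    counts = Counter()
    for path in Path(root).rglob("*"):
        if path.is_file():
            counts[path.suffix.lower()] += 1
    return counts


def legacy_files(root, legacy=LEGACY_EXTENSIONS):
    """Return the sorted paths under root whose extension is in the legacy set."""
    return sorted(
        path
        for path in Path(root).rglob("*")
        if path.is_file() and path.suffix.lower() in legacy
    )
```

Calling `count_formats` on the top-level archive folder gives a quick picture of what is stored there, and `legacy_files` gives a to-do list of files to convert. An extension is only a rough hint of the real format, so treat the result as a starting point rather than a verdict.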
I keep wondering what will happen in the next 100 years to these digital files, data and the memories stored in them? When my (future) grandchildren go through them, will they be able to see them in a comprehensible manner?
Am I the only one with these thoughts and fears? Fortunately, it seems I am not. Mr Vint Cerf, one of the fathers of the Internet, thinks about this problem too. In an interview in 2015, he said:
I worry a great deal about that. You and I are experiencing things like this. Old formats of documents that we’ve created or presentations may not be readable by the latest version of the software because backwards compatibility is not always guaranteed. And so what can happen over time is that even if we accumulate vast archives of digital content, we may not actually know what it is. – Vint Cerf
He fears that future generations will have little or no record of the 21st century as we enter what he describes as a “digital Dark Age”. The solution Mr Cerf proposes, which he calls “digital vellum”, takes an X-ray snapshot of the content, the application and the operating system together, with a description of the machine that it runs on, and preserves that for long periods of time. That digital snapshot will recreate the past in the future. In an article titled “We’re Going Backward!”, written by Mr Cerf in 2016 in the ACM magazine, he explains his worry as follows:
It seems inescapable that our society will need to find its own formula for underwriting the cost of preserving knowledge in media that will have some permanence. That many of the digital objects to be preserved will require executable software for their rendering is also inescapable. Unless we face this challenge in a direct way, the truly impressive knowledge we have collectively produced in the past 100 years or so may simply evaporate with time. – Vint Cerf
In this regard, I was happy to note that GitHub is taking some action. Late last year (2019), they revealed their plan, the GitHub Archive Program, to store all of its open-source software in an Arctic vault. Now they have completed this work, so that future generations can access the code over the next 1,000 years. The technology from their partner that they have deployed for this is fascinating: it is called piqlFilm, a digital photosensitive archival film that can be read by a computer, or by a human with a magnifying glass.
Note: When I posted the above on my Facebook page, there were comments that “my” personal photos and memories won’t be of interest to anyone other than me. It may appear to be true, but I disagree. Today, we don’t value any of this, as we are in a deluge of information and, as a result, the value of these collections diminishes because of the limited time we have at hand. But in my digital collections, there may be event videos, books, movies, shows and news that may be of wider interest to a future generation. Today, there are historians who go through the everyday letters of, say, the Victorian era to garner valuable lessons and insights. The same problem applies at a larger scale to libraries, museums, governments, courts and temples, whose documents and pictures must be preserved for the long-term good of humanity. In the future, when computers have more intelligence and can sift through the data seamlessly, people will value data from the past.