Cheapskate's Guide

The Problem of Long-Term Offline Data Storage for Consumers

12-12-19



Governments and large corporations recognize that long-term data storage is a problem without an easy solution. While they can afford to spend millions of dollars for data repositories buried inside mountains in the Arctic or for writing data onto quartz glass platters with femtosecond lasers, the average consumer is forced to take a cheaper approach.

Data can be stored online for increasingly reasonable prices, though still higher than do-it-yourself solutions. But what about those of us who don't want to trust our data to organizations that provide no guarantees that it will be there when we need it, or that it won't be sifted through and sold to advertisers or shared with the NSA? How can we best preserve our data offline?



Data Storage Media

Unfortunately, the typical, inexpensive data storage methods used by consumers have lifespans significantly shorter than the lifespans of most consumers. Data storage media used by most consumers suffer from high rates of data degradation, sometimes referred to as "bit rot" or "disc rot". Hard drives and magnetic tapes lose their magnetization. CD's and DVD's degrade physically via oxidation, delamination, and chemical reaction due to impurities, dirt, and fingerprints. And flash memory cards, USB flash drives, and solid state drives lose their electric charge over time. Flash storage media can also be corrupted by cosmic rays. Reliable data about how long these degradation processes take to erase or corrupt data is hard to come by, and data retention times tend to be over-estimated by manufacturers. The table below gives some educated guesses, based on laboratory "accelerated life testing", about the maximum number of years that data can be reliably stored on various media. Remember that the accuracy of predictions made by accelerated life testing is not above questioning.

A new type of write-once optical disc called an M-Disc has been widely available for a few years. M-Discs come in DVD and Blu-ray (BD-R) formats, and the manufacturers claim that they store data safely for as long as 1000 years. But wait. There may be insufficient evidence on which to base that claim. Carefully read through some M-Disc test results before you run out and spend your money. Should you decide to use this solution for your long-term data storage, Amazon sells 4.7 GB DVD-R M-Discs for $35 for 25 discs and 25 GB BD-R M-Discs for $50 for 25 discs. DVD-R drives with M-Disc support can be had for as little as $20, and BD-R drives with M-Disc support can be purchased for $70. Here is some anecdotal evidence regarding the reliability of M-Discs compared to that of DVD's.

Another new type of storage medium is 3D XPoint memory (pronounced "cross point"). This medium uses a phase change to store data, so power is not required to retain it, and data should last much longer than on media that must hold an electric charge or a magnetization. Unfortunately, 3D XPoint memory is still expensive and, for consumers, is essentially available only from Intel under the Optane brand. Amazon currently sells a 118 GB Optane SSD for $199. I was not able to find a prediction about the lifetime of data stored on an unpowered Optane SSD, but it should be much more than 10 years, assuming the firmware on the Optane SSD holds up. Due to the high cost of Optane SSD's, they are not currently an optimal long-term data storage solution, but in a few years when the prices come down, who knows?
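To put the prices quoted above on a common footing, here is a minimal sketch that works out the cost per gigabyte of each medium. The prices and capacities are the ones given in this article; the dictionary layout and the function name are my own, and the one-time cost of a drive capable of writing M-Discs is deliberately left out.

```python
# Prices and capacities are the ones quoted in the article (December 2019).
# The names MEDIA and cost_per_gb are illustrative, not from any library.
MEDIA = {
    "DVD-R M-Disc (25 x 4.7 GB)": (35.00, 25 * 4.7),
    "BD-R M-Disc (25 x 25 GB)": (50.00, 25 * 25.0),
    "Optane SSD (118 GB)": (199.00, 118.0),
}


def cost_per_gb(price_dollars, capacity_gb):
    """Return dollars per gigabyte of raw capacity."""
    return price_dollars / capacity_gb


if __name__ == "__main__":
    for name, (price, capacity) in MEDIA.items():
        print(f"{name}: ${cost_per_gb(price, capacity):.2f} per GB")
```

By this rough measure, the BD-R M-Discs come out at about 8 cents per GB, the DVD-R M-Discs at about 30 cents per GB, and the Optane SSD at about $1.69 per GB.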



Maximum Lifetimes of Data Storage Media


Storage Method               Expected Maximum Reliable Storage Life (Years)
CD-RW's and DVD-RW's         2-5 under typical conditions, 30 under ideal conditions
gold-coated Blu-ray discs    100 under ideal conditions
hard drives                  3-5
flash drives                 10
solid state drives           10
3D XPoint SSD's              much more than 10???
M-Discs                      500-1000???




Backup and Archiving Software

I avoid software that is designed to lessen the work associated with backing up and archiving data. I do this for three reasons. First, software adds another mode of failure on top of all the other failure modes you are already dealing with. For example, I have used Microsoft's built-in Windows software for backing up a hard drive. Later, when I needed my backup, I discovered that it didn't work. Microsoft not only uses a proprietary compression format that prevents you from knowing whether specific data has actually been stored in an archive, it also creates hidden files that may not be copied if you decide to move your backup to another storage location. These are two additional modes of failure. Second, backup or archival software adds another layer of complexity. If you don't use the software correctly, your data can be incomplete, corrupted, or lost. Third, as in the case of Microsoft's backup software, if archiving software uses a proprietary format, then in 10, 15, or 20 years you may no longer have access to the software that created the archive or to hardware capable of running it. Who, for example, still has a computer with a working floppy disk drive that can run Windows 3.1?



File and File System Formats

The best approach to long-term data storage is to store data in universal file formats, or as close to them as you can get. This will increase the likelihood that software can be found decades from now to read your data. For example, the JPEG format for picture files has been around for decades and is unlikely to disappear any time soon. That format is better for archiving pictures than a format that most people have never heard of, like ECW or DRW. Similarly, given that most computers do not use Apple's file system, you may want to consider using a Windows computer, or even better, a Linux computer for formatting media on which you store your files. I say that Linux file system formats are better than Windows file system formats because, although few consumers use Linux, corporate servers almost universally use it. So, Linux will not be disappearing any time soon. However, be aware that anything could happen as far as file formats and file system formats are concerned, so there are no guarantees that any data that you store for a long period of time will be retrievable by your children or grandchildren.
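One way to see how close your archive is to "universal" formats is to count its files by extension. The sketch below does that with Python's standard library. The function names and the short list of common extensions are my own assumptions for illustration, not an authoritative list of safe formats, so adjust it to your own judgment.

```python
import os
from collections import Counter

# Hypothetical short list of widely supported formats; edit to taste.
COMMON = {".jpg", ".jpeg", ".png", ".txt", ".pdf", ".csv", ".html"}


def tally_extensions(root):
    """Count files under root by lowercase extension ('' means none)."""
    counts = Counter()
    for _dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            ext = os.path.splitext(name)[1].lower()
            counts[ext] += 1
    return counts


def uncommon_extensions(counts, common=COMMON):
    """Return only the extensions that are not in the common set."""
    return {ext: n for ext, n in counts.items() if ext not in common}
```

Calling uncommon_extensions(tally_extensions("/path/to/archive")), with your own directory in place of that placeholder path, gives a list of file types worth converting or at least double-checking before you archive them.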



Data Storage Procedures

Data can be corrupted over time on hard drives due to the gradual fading of the magnetic domains on the drives' platters. This problem can be mitigated by rewriting hard drive data every few years. To do this, simply transfer the drive's data onto another storage device, then repartition and reformat the hard drive and write the data back onto it. Repartitioning and reformatting rewrite the drive's file system structures, renewing the magnetic domains that hold them. If those structures are corrupted, even uncorrupted data will not be easily retrievable.
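Moving data off a drive and back again is only worthwhile if nothing is lost or altered along the way. Here is a minimal sketch, assuming Python is available, of recording a SHA-256 checksum for every file before the transfer and comparing afterward. The function names are my own, and this is a hand-rolled check using only the standard library, not a replacement for careful copying.

```python
import hashlib
import os


def sha256_of(path, chunk_size=1024 * 1024):
    """Hash a file in chunks so large files do not have to fit in memory."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()


def build_manifest(root):
    """Map each file's path relative to root to its SHA-256 digest."""
    manifest = {}
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            full = os.path.join(dirpath, name)
            manifest[os.path.relpath(full, root)] = sha256_of(full)
    return manifest


def compare_manifests(before, after):
    """Return (missing, changed): paths lost and paths whose contents differ."""
    missing = sorted(p for p in before if p not in after)
    changed = sorted(p for p in before if p in after and before[p] != after[p])
    return missing, changed
```

Build a manifest of the original drive, copy the data, build a manifest of the copy, and compare the two. If both returned lists are empty, every file arrived intact. Repeat the check after writing the data back to the reformatted drive.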

The same procedure should be followed for memory cards, flash drives, and SSD's. Remember that low-level and high-level formatting are different. High-level formatting is the process by which a file system is written to a drive. This is done by a computer operating system. Low-level formatting of a flash memory device requires a special software tool. Low-level formatting (other than with the manufacturer's low-level formatting software) may damage the device. You should consider performing both types of formatting to maximize the life of the flash memory device, but a high-level format is better than nothing. Given that flash memory devices have become cheap and that the firmware on flash devices also has a limited life, it may be preferable to periodically buy new flash drives, rather than worry about reformatting.

In order to maximize the life of your storage media, keep them in as close to ideal environments as possible. Don't subject storage media to temperatures significantly higher than room temperature. Don't subject them to direct sunlight. Don't subject hard drives to magnetic fields (e.g. near speakers, motors, or power supplies). You may also want to keep your storage devices in water-proof containers, just in case your home is flooded.

Make multiple copies of your data and store at least one copy at a different location than the others. This way, if your house burns down, for example, you will have one copy of your data left. A safe deposit box in a bank is a good location for storing your most important data.
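Multiple copies also let you spot which one has gone bad. As a rough sketch, assuming you can mount all the copies at once, the function below hashes the same file on each copy and reports the ones that disagree with the majority. The function names are hypothetical, and with only two copies that differ there is no majority, so both are reported as suspect.

```python
import hashlib
from collections import Counter


def file_digest(path, chunk_size=1024 * 1024):
    """Return the SHA-256 hex digest of a file, read in chunks."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()


def odd_copies(paths):
    """Return the paths whose contents disagree with the majority of copies."""
    digests = {p: file_digest(p) for p in paths}
    digest, count = Counter(digests.values()).most_common(1)[0]
    if count * 2 <= len(paths):
        return list(paths)  # no strict majority: every copy is suspect
    return [p for p in paths if digests[p] != digest]
```

An empty result means all copies agree. A non-empty result tells you which copies to replace from one of the good ones.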

Just as with any short-term backup, if possible, use more than one type of storage medium for your long-term data storage. This way, accidents and disasters are less likely to destroy every copy of your data. For example, a flood can destroy hard drives, but not DVD's. Your teenager's boom box speakers can wipe data from a hard drive, but not a DVD. Leaving DVD's next to a sunny window for months may destroy them, but perhaps not a USB stick or hard drive.



Final Words

Long-term data storage and the procedures required to make it effective require effort and cost money. You are the one who must decide if your data is worth the effort and money required to keep it safe. Before you implement your long-term data storage plan, it may be worthwhile to take an inventory to decide what data is most important to you and then put most of your effort and money into preserving that data. For example, family photos may be more important to you than a copy of a Tetris game with your high score that you've been saving since high school. Remember, as technologies change, whatever long-term data storage solution you choose today may not be the best 20 years from now. So, you are likely to have to re-evaluate your chosen solution as the years go by.



Related Articles:

The Importance of making Regular Backups

Understanding Computer File Transfer Rates

Why You should Absolutely be using Linux!

When Buying a Computer, More Knowledge Equals Lower Cost

Comments


Alice
said on Dec 14th 2019 @ 10:23:05am,

I think saying that HDDs last "3-5 years" is inaccurate when it comes to long term archival. That figure refers to the lifespan of a drive that's left powered on *all the time*. Presumably in a data archival situation you'd store it powered off. After some time (https://superuser.com/a/312764) the magnetic field starts breaking down so for maximum lifespan you'd like to reconnect the drive and rewrite it every 2-5 years.


Cheapskate
said on Dec 15th 2019 @ 05:41:01am,

Alice, thanks for the comment and the informative link. I'm sure you are probably correct that hard drives used mostly for storage do last longer. However, I was unable to find any hard numbers on that, so being conservative, I used the numbers for hard drives that are powered much or all of the time. One thing I should add about the 1%/yr loss in hard drive magnetic strength mentioned in the linked article is that it is under ideal conditions. Homes have all sorts of stray magnetic fields that make them less than ideal. I suspect from personal experience that waiting longer than 5 years to rewrite data archived on a hard drive would be a mistake.


Copyright © 2019 The Cheapskate's Guide to Computers and the Internet. All rights reserved.