Scrubbing Disks, Part 1



ITworld.com, Unix in the Enterprise 11/30/2005



Sandra Henry-Stocker, ITworld.com

Retiring a disk that can't keep up with your growing mass of data might seem a routine task. After all, ten years ago, a big disk might have held 1 GB of data. Today, a big disk is 100 GB or larger. But while retiring small disks may be routine, what you do with them before you send them on to the computer junkyard is not. Depending not only on whether you scrub your disks but also on how you scrub them, data may still be readable. In fact, numerous companies offering data recovery services routinely read data from disks that have been erased or damaged. And they work with nearly every type of disk and every popular operating system -- even Unix. How, then, should you erase a disk that you are retiring from service? What tools and techniques are available, and at what cost and level of difficulty? Who can read data that might remain on a presumably "erased" disk, and at what cost to them?

Why is Sanitization Required?



The underlying reason that disks need to be thoroughly scrubbed before they are recycled is that the data tracks on magnetic disks are somewhat wider than the heads that write on them. This means that traces of earlier data can remain readable away from the centerline of a track, toward its edges, even after new data has been written along that track.

The difficulty of reading data from a disk that has been reformatted, erased or sanitized depends on the sophistication of the person trying to read it and on the tools and equipment at that person's disposal. A typical home computer user may be able to find and read erased files using undelete on a Windows system or R-Linux on a Linux system, but won't be able to recover files from a disk that has been overwritten. This doesn't mean that nobody can, however. Given adequate resources and motivation, an expert might be able to recover data from a disk that has been overwritten with zeroes or with random data values.

Data Types



Part of the answer to the question about what you should do before retiring a disk lies in the nature of the data that resides on the disk. Except for the most benign systems, any system that has been used long enough to become antiquated should be assumed to hold some level of proprietary or personal data. This means that there is a certain risk in allowing these disks to get into the hands of anyone you don't know and trust.

In a very coarse evaluation, we can look at disks as containing one of several levels of data. Some disks contain high-risk data -- meaning that penalties can ensue if that data is compromised (e.g., client medical records). Others contain confidential data -- meaning that you or your organization can suffer a loss if the data is retrieved by a competitor or by a hostile individual. Still others contain data that can be viewed as "public" -- meaning that exposure of the data represents no significant loss to anyone.

Your home computer, for example, might contain nothing more sensitive than personal photos and saved email. On the other hand, it might also contain cached files that reveal your web browsing activity, usernames and passwords for your online accounts, and details of your personal finances.

Your work computer might contain company source code, outlines and timelines depicting development plans, and names and phone numbers for your customer base.

Scrubbing Methods



Formatting is a process meant to prepare a disk to hold files. When a disk is formatted, the formatting software discards the old file system structures (if any) and prepares for the disk's new contents by creating empty ones. Some operating systems (e.g., Windows) refer to the process of creating the physical structure on disks as "low-level" formatting and that of creating file systems as "high-level" formatting. Formatting a drive does not remove all of its contents. Even if you mount a file system on a newly formatted disk and notice that it is empty, this does not imply that previous contents cannot still be read by a determined individual with the right tools.
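To make the point concrete, here is a minimal sketch in Python of why an "empty" disk is not necessarily a blank one. It is a toy model only: the ToyDisk class, its directory table and the quick_format method are hypothetical names invented for this illustration and do not correspond to any real file system. The assumption is simply that formatting resets the bookkeeping and leaves the data area alone.

```python
class ToyDisk:
    """Toy model (hypothetical): a flat data area plus a directory table."""

    def __init__(self, size):
        self.blocks = bytearray(size)   # the raw data area
        self.directory = {}             # file name -> (offset, length)
        self.next_free = 0

    def write_file(self, name, data):
        start = self.next_free
        if start + len(data) > len(self.blocks):
            raise ValueError("toy disk is full")
        self.blocks[start:start + len(data)] = data
        self.directory[name] = (start, len(data))
        self.next_free = start + len(data)

    def quick_format(self):
        # Assumption of this sketch: only the bookkeeping is reset.
        # The data area itself is never touched.
        self.directory = {}
        self.next_free = 0

    def list_files(self):
        # What an ordinary user sees after mounting the disk.
        return sorted(self.directory)

    def raw_scan(self, pattern):
        # What a recovery tool does in spirit: ignore the directory
        # and search the raw bytes directly.
        return pattern in self.blocks


disk = ToyDisk(1024)
disk.write_file("payroll.txt", b"salary=90000")
disk.quick_format()
print(disk.list_files())                 # []
print(disk.raw_scan(b"salary=90000"))    # True
```

The directory listing comes back empty, yet a scan of the raw bytes still finds the old contents. Real file systems are far more elaborate, but the gap between what the directory reports and what the platters hold is the same idea.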

Degaussing involves exposing disks to strong magnets that "coerce" the magnetization back to a state in which no data remains. Not all degaussers are created equal, however. Some may leave residual magnetization while others are so powerful that the disks are unusable when the process is complete.

Sanitization is the process of overwriting the data on a disk in such a way that the prior contents cannot be read. This involves writing zeroes, random characters or a combination of both over the entire disk, usually in many passes, in the hope that the original data cannot then be read by any practical method.
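The overwriting idea can be sketched in a few lines of Python. This is an illustration under stated assumptions, not a recommended procedure: the function name overwrite_file, the default of three passes and the alternation of zero and random passes are all choices made for this example, and it operates on an ordinary file standing in for a disk. On a real system, overwriting a file in place does not guarantee that the same physical blocks are rewritten, so treat this as a picture of the technique rather than a scrubbing tool.

```python
import os


def overwrite_file(path, passes=3, chunk_size=1024 * 1024):
    """Overwrite every byte of `path` in place, `passes` times.

    Assumed pattern for this sketch: passes 0, 2, 4, ... write zeroes,
    passes 1, 3, 5, ... write random bytes.
    """
    size = os.path.getsize(path)
    for pass_no in range(passes):
        with open(path, "r+b") as f:
            remaining = size
            while remaining > 0:
                n = min(chunk_size, remaining)
                if pass_no % 2 == 0:
                    block = bytes(n)        # n zero bytes
                else:
                    block = os.urandom(n)   # n random bytes
                f.write(block)
                remaining -= n
            # Push this pass out of the buffers before starting the next one.
            f.flush()
            os.fsync(f.fileno())
```

Calling overwrite_file("old_image.bin") on a hypothetical file of that name would leave it the same size, with its last pass (the third, by default) consisting entirely of zeroes.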

Destruction is, as I'm sure you understand, the process of intentionally damaging the disk so that its contents can no longer be read. Running over a hard disk with your backhoe might do the trick. Shoving an ice pick through the platters or melting them in your neighbor's kiln might be even better. The point is that the disk must be damaged beyond repair -- to the point at which it cannot be rebuilt. Unless your disks contain Top Secret information or evidence of criminal activity, you probably won't want to go to this extreme.

Next week's column will examine federal and other recommendations for disk sanitization along with some inexpensive products, free tools and home-brew sanitization methods.

Sandra Henry-Stocker has been administering Unix systems for nearly 18 years. She describes herself as "USL" (Unix as a second language) but remembers enough English to write books and buy groceries. She currently works for TeleCommunication Systems, a wireless communications company, in Annapolis, Maryland, where no one else necessarily shares any of her opinions. She lives with her second family on a small farm on Maryland's Eastern Shore. Send comments and suggestions to sandra@toadmail.com.