Sunday, September 27, 2015

Recovering Deleted Files

There have been almost daily articles in the news about Hillary Clinton’s private server. Most recently there are articles about the efforts by the FBI to recover the deleted emails on that server. This article is not about any of the political ramifications or legal consequences that she may be facing. Rather, it is about my own story and how I performed this same sort of recovery process thirty years ago. Admittedly, the technology has changed greatly over the last thirty years and the FBI has tools for recovery that were not even dreamt about thirty years ago, but some of the principles are the same. So if you’re interested in this topic, please read on.

In order to keep this posting to a reasonable length, I’m going to refer to a number of topics that you can do further research on using Wikipedia. Specifically, feel free to read articles on these topics. I have marked each of them with an “*” the first time that they appear.
·         Savannah River Site
·         Turbo Pascal
·         Bernoulli Box
·         Norton Utilities

The Background

In the early 1980s, Air Products had begun a new venture that involved labeling their cylinders with bar codes so they could track individual cylinders instead of just counting all the cylinders of a particular type. In addition to the bar code label, we had to develop software (a) that ran on a handheld scanner that could read the bar codes, and (b) that ran on a local standalone PC that could take the results of the scanning, keep it all in a rudimentary database and produce meaningful reports. The initial version of this was written by someone else, but when I took over the project I rewrote the PC software in Turbo Pascal*. It was a pretty robust system for the PC of the day (approximately 1984-1985). The database was stored on a Bernoulli Box* for ease of backup, essentially using its removable disks as removable hard drives (which did not yet exist).

Because of the success of this system, some of our customers asked if they could purchase/license the software for tracking the cylinders within their own facility. Accordingly, I made some customizable variations to the software so it could be used that way and we licensed it to a handful of customers (I believe perhaps the only software that our IT department ever licensed to someone else).

The Savannah River Site* (SRS) was one of these customers. Air Products delivered the cylinders to a central loading dock and the customer then delivered the cylinders to other locations within SRS (they had over 300 square miles of facility).

The Problem

One afternoon our local South Carolina office received a frantic phone call from SRS. They had, through a series of missteps, deleted the primary cylinder master file from their system. I won’t go through all the steps, but all they had remaining was an empty file. Since they relied on this master file to keep track of the several thousand cylinders in the facility, this was tragic to them. They asked if we could help them recover it. Working with our sales manager and with my own manager in IT, I thought that I could – even though I had never attempted anything like that before. But I trusted my instincts.

In record time, especially for a government facility, they prepared and got approval for a purchase order that would pay for my expenses – the flight to/from South Carolina and my housing for two nights. I grabbed my trusty copy of Norton Utilities* and flew down. After staying overnight in a motel, the sales manager met me early in the morning and we drove to SRS.

It was a two-hour process to check in and gain admittance to SRS – including fingerprinting, background checks, etc. On the other side of the admissions building, the SRS manager met us and drove us to the building next to the central dock where the PC running our Cylinder Tracking System was located. I remember that he had a lanyard around his neck with over a dozen security badges hanging from it, as each area of the site required separate access badges and codes. We were only allowed access to the single building where we needed to do the work.

The Undelete Process

(Note that the following is a VERY simplified version – please don’t criticize the description.)

Files are stored on a disk by having a directory entry with a name (and several other attributes) and a pointer to the first sector on the disk that the file occupies. If a file is more than one sector long, there is a pointer at the end of the first sector to the next, and so on, with the final sector marked with a “last sector” marker. In a process that has been basically unchanged for the last thirty years, when you delete a file the PC does not physically erase that portion of the disk. Rather, it just marks it as deleted and removes access to it. The directory entry is modified (by changing the first character to a “?”), and the sectors which the file occupied are returned to the “available sector pool”.
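For readers who think better in code, here is a minimal Python sketch of that simplified picture. It is a toy model and not how any real file system is implemented; the class ToyDisk, the END marker, and the file name MASTER.DAT are all names I made up for illustration.

```python
END = -1  # hypothetical "last sector" marker


class ToyDisk:
    """A toy disk that follows the simplified description, not a real file system."""

    def __init__(self, n_sectors):
        self.data = [b""] * n_sectors        # contents of each sector
        self.next = [END] * n_sectors        # pointer to the next sector of a file
        self.free = set(range(n_sectors))    # the "available sector pool"
        self.directory = {}                  # file name -> first sector

    def write_file(self, name, chunks):
        """Store one chunk per sector and link the sectors into a chain."""
        sectors = sorted(self.free)[:len(chunks)]
        if not chunks or len(sectors) < len(chunks):
            raise ValueError("need at least one chunk and enough free sectors")
        for i, sector in enumerate(sectors):
            self.data[sector] = chunks[i]
            self.next[sector] = sectors[i + 1] if i + 1 < len(sectors) else END
        self.free.difference_update(sectors)
        self.directory[name] = sectors[0]

    def chain(self, first):
        """Follow the next-sector pointers from the first sector to the end."""
        sectors = []
        while first != END:
            sectors.append(first)
            first = self.next[first]
        return sectors

    def delete(self, name):
        """Mark the entry as deleted and free its sectors; the data stays put."""
        first = self.directory.pop(name)
        self.directory["?" + name[1:]] = first
        self.free.update(self.chain(first))


disk = ToyDisk(8)
disk.write_file("MASTER.DAT", [b"rec 1", b"rec 2", b"rec 3"])
disk.delete("MASTER.DAT")
```

After the delete, the three sectors still hold their old contents. Only the directory entry and the free pool have changed, which is what makes recovery possible at all.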

If the file had simply been deleted, it would have been a fairly easy process to locate the marked directory entry, restore the first character of the name, and follow the list of linked sectors, removing each one from the “available sector pool” as you go. But this was not the case for SRS. Instead, they had done further damage by creating a “new” file with the same name, but with no content. So I had to use the more advanced options.
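The easy case can be sketched in a few lines of Python. This is only an illustration of the idea under the simplified model above, not what Norton Utilities actually did internally; the function name undelete, the END marker, and the dictionaries standing in for the disk are my own assumptions.

```python
END = -1  # hypothetical "last sector" marker


def undelete(directory, next_sector, free, deleted_name, first_char):
    """Restore a deleted file whose sector chain is still intact.

    directory maps file names to first sectors, next_sector maps each sector
    to the one that follows it, and free is the available sector pool.
    """
    first = directory[deleted_name]
    chain = []
    sector = first
    while sector != END:
        if sector not in free:
            # Something else has claimed this sector since the delete.
            raise ValueError(f"sector {sector} has been reused")
        chain.append(sector)
        sector = next_sector[sector]

    # Only change anything once the whole chain has been validated.
    free.difference_update(chain)
    del directory[deleted_name]
    directory[first_char + deleted_name[1:]] = first
    return chain


directory = {"?ASTER.DAT": 2}
next_sector = {2: 5, 5: 3, 3: END}
free = {2, 3, 5, 7}
recovered = undelete(directory, next_sector, free, "?ASTER.DAT", "M")
```

Note that the caller has to supply the missing first character, since the delete threw it away.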

This involved scanning the “deleted” sectors in the available sector pool, looking for any which contained data that looked like what I knew the database records were like, and noting the sector numbers and what records each contained (since the file needed to be in ascending key order when done). This took a few hours and when I finished I had a couple of sheets of paper with sector numbers and contents written down. I then went back through these sectors, ensuring that I had them all in the right order and that I had not skipped anything.
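What I did by eye and on paper could be sketched like this in Python. The record key format (the letters CYL followed by five digits) is purely hypothetical, as are the names KEY_PATTERN and scan_free_sectors; the real records looked different, and I recognized them by sight rather than by pattern matching.

```python
import re

# Hypothetical key format, assumed only for this illustration.
KEY_PATTERN = re.compile(rb"CYL(\d{5})")


def scan_free_sectors(data, free):
    """Return the free sectors that contain record keys, in ascending key order.

    data maps sector numbers to raw bytes; free is the available sector pool.
    """
    hits = []
    for sector in sorted(free):
        keys = [int(m.group(1)) for m in KEY_PATTERN.finditer(data[sector])]
        if keys:
            # Note the lowest key in the sector, like a line on my sheet of paper.
            hits.append((min(keys), sector))
    return [sector for _, sector in sorted(hits)]


data = {
    1: b"CYL00010 ...",
    2: b"CYL00001 ...",          # belongs to a live file, not in the free pool
    4: b"CYL00030 CYL00031",
    6: b"unrelated leftovers",
    9: b"CYL00020 ...",
}
free = {1, 4, 6, 9}
ordered = scan_free_sectors(data, free)
```

Sorting by the lowest key in each sector matters because the rebuilt file had to be in ascending key order.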

Now for the crucial “undelete” part. I first created a new directory entry just one sector long, then deleted it. This was going to be the root for my work. I then used the advanced options of Norton Utilities (which I had never done before). I first “undeleted” the new file, and then sector by sector using my notes, added these sectors to the end of the first, recreating the chain of sectors that I had marked on paper. With each addition I received a warning from Norton Utilities noting that I was adding more sectors than were in the original file – I just kept ignoring the warning. After nearly another hour, I had successfully recreated the overwritten file. As I then scanned the newly created file, I could see that it wasn’t totally perfect – there were a few cases where a sector or two had been reused by something else during the original SRS procedure fiasco, but I figured I was 98% successful.
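In the toy model, rebuilding the chain from my notes would look roughly like the sketch below. It links all the sectors in one pass instead of starting from a one-sector stub file and appending to it, so it shows the idea and not the exact Norton Utilities procedure. The function name rebuild_file and the sector numbers are made up.

```python
END = -1  # hypothetical "last sector" marker


def rebuild_file(directory, next_sector, free, name, ordered_sectors):
    """Chain the given free sectors together, in order, as a new file."""
    if not ordered_sectors:
        raise ValueError("no sectors to link")
    if len(set(ordered_sectors)) != len(ordered_sectors):
        raise ValueError("a sector appears twice in the notes")
    in_use = [s for s in ordered_sectors if s not in free]
    if in_use:
        raise ValueError(f"sectors already in use: {in_use}")

    # Point each sector at the one that follows it in the notes.
    for current, following in zip(ordered_sectors, ordered_sectors[1:]):
        next_sector[current] = following
    next_sector[ordered_sectors[-1]] = END

    free.difference_update(ordered_sectors)
    directory[name] = ordered_sectors[0]


directory = {}
next_sector = {}
free = {1, 4, 6, 9}
rebuild_file(directory, next_sector, free, "MASTER.DAT", [1, 9, 4])
```

The checks at the top are the code equivalent of going back through my sheets of paper to make sure nothing was listed twice or skipped.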

Consistency Checking

I was almost done. First I used some other Norton Utilities options to check the integrity of the disk, verifying that there were no file problems, that the list of sectors in all the files was the exact complement of the list of unused sectors, etc.
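The “exact complement” check is easy to state in code. Here is a minimal sketch in the same toy model; the function check_disk and its messages are my own invention, and a real disk checker looks at far more than this.

```python
END = -1  # hypothetical "last sector" marker


def check_disk(directory, next_sector, free, n_sectors):
    """Return a list of problems; an empty list means the disk is consistent."""
    problems = []
    owner = {}  # sector -> name of the file that uses it
    for name, first in directory.items():
        if name.startswith("?"):
            continue  # deleted entries own no sectors
        seen = set()
        sector = first
        while sector != END:
            if sector in seen:
                problems.append(f"{name}: chain loops at sector {sector}")
                break
            seen.add(sector)
            if sector in owner:
                problems.append(
                    f"sector {sector} is shared by {owner[sector]} and {name}")
            else:
                owner[sector] = name
            sector = next_sector[sector]

    used = set(owner)
    for sector in sorted(used & free):
        problems.append(f"sector {sector} is in a file and in the free pool")
    for sector in sorted(set(range(n_sectors)) - used - free):
        problems.append(f"sector {sector} is neither in a file nor free")
    return problems


directory = {"A.DAT": 0, "B.DAT": 2}
next_sector = {0: 1, 1: END, 2: END, 3: END}
clean = check_disk(directory, next_sector, {3}, 4)
overlapping = check_disk(directory, next_sector, {2, 3}, 4)
lost = check_disk(directory, next_sector, set(), 4)
```

Every sector should be accounted for exactly once: either in exactly one file or in the free pool.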

Then, I ran a series of integrity checks that I had built into the Cylinder Tracking System. Among other things, this identified the few missing records that I had not been able to recover. The records could be added based on other duplicate information elsewhere in the system, but the product in the cylinder and the actual cylinder serial number would be missing (something that would correct itself over time as that cylinder was refilled and scanned again).

I backed up the entire system onto a spare Bernoulli Box cartridge and turned the system back over to the SRS manager – to his absolute delight!

Lessons Learned

·         You can never protect yourself from user error, because you never know what a stupid user might do.
·         Having the right tools is essential, but they are only useful if you know how to use them.
·         Government security procedures are ridiculous (two hours to get in?)

<rant on> I said in the beginning that I was not going to comment on Hillary Clinton’s private server fiasco. But after spending the time to write all the above, I feel that I need to say something. I don’t know whether Hillary was really setting up her own server because she really felt that it was easier than dealing with government security procedures, or whether she was being malicious. But either way, she has made herself look like a “stupid user” and has jeopardized classified government information in the process. Her comments about “wiping” the server using a cloth just make her look even more stupid. I’m fairly certain that the FBI will find even more incriminating evidence on her server before they are done. So whether from maliciousness, stupidity, or just feeling that she doesn’t have to play by the rules, she continues to show herself unqualified to be the President of the US. I’m sure that she’s a smart person, but this time she’s out of her league. She may be able to avoid jail time, but she’ll likely end up as a convicted felon before this is all done. <end of rant>


1 comment:

  1. I was having way too many occasions to use Norton Utilities for disk repair and file recovery on a "portable" PC (20 pounds) I was using in the late 80s. My 5-1/4 inch floppies kept getting corrupted. Solved by installing a new drive, but thankful I had N.U. until I could get that done.

    Now you need to do a post on the transition from sequential access (mag tapes) to FAT access (or whatever they called the earliest version of non-sequential access)
