Friday, March 23, 2018

Jaeger - RAID

I've read somewhere that hard drives have an average life span of three to five years. Regardless of how careful you are with your computer or how frequently you access your data, hard drive failure is simply inevitable.

Symptoms


I was downloading a game on Steam when it failed with an error message. My first thought was that the D: drive, where I store Steam games, was full. I checked My Computer: the D: drive was still there, and it still had lots of free space. I tried the download again. Error. My suspicion that something was wrong with the drive was confirmed when I tried to open the D: drive and Windows Explorer crashed. I tried a reboot and things got worse. Booting took significantly longer. Errors popped up left and right once I got into Windows. That was it - likely a hard drive failure.


Fortunately, I keep my boot drive and my data drive separate, so I still had a functioning computer with an OS. Furthermore, I only store application data on my HDD, while my important files like documents and pictures are stored on an external drive that is backed up to the cloud. Not much damage done.

Troubleshooting


I wanted to make sure that the hard drive was actually dead and that the problem wasn't just a faulty data or power cable. I checked this by connecting the problematic hard drive over USB with a SATA-to-USB adapter. The drive was still not recognized, which put the final nail in the coffin.

Solution


I could have just gotten a replacement drive and been done with it. But there's a lesson to be learned from this. Replacing the drive would just give me an extra three to five years before it happens again. We can do better.


Enter RAID (Redundant Array of Independent Disks). Specifically, RAID 1. This is a storage virtualization technology that enables data redundancy. In RAID 1, data is written to two or more drives, each drive having a full copy of all the data. If one drive fails, the other drives still have the data. No data loss while you replace the dead drive and rebuild the RAID array.
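To make the mirroring idea concrete, here is a minimal toy sketch in Python. It is not how a real RAID controller or driver works; the class name `Raid1Mirror` and its methods are made up for illustration, and each "drive" is just an in-memory dictionary. It only shows the behavior described above: every write lands on all drives, a read succeeds as long as one drive survives, and a replaced drive is rebuilt from a healthy copy.

```python
class Raid1Mirror:
    """Toy model of RAID 1 (hypothetical, for illustration only)."""

    def __init__(self, drive_count=2):
        if drive_count < 2:
            raise ValueError("RAID 1 needs at least two drives")
        # Each drive maps block number -> data; None marks a failed drive.
        self.drives = [{} for _ in range(drive_count)]

    def write(self, block, data):
        # Mirror the write to every healthy drive.
        for drive in self.drives:
            if drive is not None:
                drive[block] = data

    def read(self, block):
        # Any healthy drive holds a full copy, so use the first one found.
        for drive in self.drives:
            if drive is not None:
                return drive[block]
        raise IOError("all drives have failed")

    def fail(self, index):
        # Simulate a dead drive.
        self.drives[index] = None

    def rebuild(self, index):
        # Replace the dead drive and copy everything from a healthy one.
        healthy = [d for d in self.drives if d is not None]
        if not healthy:
            raise IOError("no healthy drive left to rebuild from")
        self.drives[index] = dict(healthy[0])


array = Raid1Mirror(drive_count=2)
array.write(0, b"steam library")
array.fail(0)                      # one drive dies
recovered = array.read(0)          # data is still readable from the mirror
array.rebuild(0)                   # new drive gets a full copy again
```

In this sketch the data survives the failure of drive 0 because drive 1 already holds an identical copy, which is the whole point of RAID 1.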

At a minimum, I needed two new drives for RAID 1 to work. The disadvantage of RAID 1 is that you pay for two drives for the storage capacity of one. But I think the redundancy this enables makes it worth it.
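The capacity trade-off is simple arithmetic, sketched below in Python. The function names and the 1000 GB drive sizes are assumptions chosen for illustration, not the drives I actually bought. In a mirror, the usable capacity is limited by the smallest member drive, no matter how many drives you add.

```python
def raid1_usable_capacity(drive_sizes_gb):
    """Usable capacity of a RAID 1 mirror: the size of its smallest drive."""
    if len(drive_sizes_gb) < 2:
        raise ValueError("RAID 1 needs at least two drives")
    return min(drive_sizes_gb)


def raid1_efficiency(drive_sizes_gb):
    """Fraction of the purchased raw capacity that is actually usable."""
    return raid1_usable_capacity(drive_sizes_gb) / sum(drive_sizes_gb)


# Hypothetical example: two 1000 GB drives in a mirror.
sizes = [1000, 1000]
usable = raid1_usable_capacity(sizes)   # 1000
efficiency = raid1_efficiency(sizes)    # 0.5
```

With two equal drives you get half of what you paid for, and a third mirrored drive would add redundancy but no extra space.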

Catch-22


After hooking up the new hard drives, I was ready to configure RAID. In my case, this involved going to the BIOS and changing the SATA Mode to use RAID Mode, then selecting both hard drives to form the new RAID volume.



So far so good. Reboot... BOOM! Black Screen of Death.


After some Googling, it turned out that in my case, Windows could not boot in RAID Mode because the necessary drivers were not installed. But in order to install the drivers, RAID Mode needed to be selected in the BIOS. A paradox! A deadlock! Apparently, RAID Mode is usually enabled before Windows is installed on the system. I had no choice. I had to reformat and do a fresh install of Windows.

Fast Forward


It worked without a hitch after a fresh install of Windows. I can monitor the RAID volume's health using built-in applications. With this, a single hard drive failure should no longer mean losing the data on that volume.