I spent two days trying to teach someone just part of this once.
Now, you may think my failure in this regard was due to me being a bad teacher. Sadly, it was not. Two other people, one of whom I had already taught about RAID, and more specifically SCSI RAID configurations, couldn’t teach this to my failed student either. Shockingly, when I was “encouraged to find other opportunities to excel”, outside that company, naturally, that student took over my job. Oddly enough, a few years later, I heard the person who had made that organizational choice had also been encouraged to find other opportunities to excel. Funny how that works.
So, now, in part to make up for not being able to educate that person, and also to spare someone the same teaching fate I faced, here are two articles about RAID.
First, from ExtremeTech, RAID 101, Understanding Multiple Drive Storage.
And, secondly, from TechRepublic, Choose A RAID Level that works for you!
You can go to those articles and get lots of detail, but I’ll break it down for you in brief here.
Something that people tend to forget, for some reason, is that RAID stands for Redundant Array of Inexpensive Disks. The “inexpensive” part isn’t as true as it used to be, thanks to server pricing and how cheap SATA drives have become compared to SCSI drives. Back in the day, we always used SCSI, and I still do for server systems, mostly, because it tends to be faster and more reliable than anything else. SATA has closed that gap considerably, but if you want to build a BIG array of disks, SCSI is pretty much still the only real option.
There are a bunch of RAID “levels”, but, realistically, you’re mostly going to deal with three or four: RAID 0, RAID 1, RAID 5 and, maybe, RAID 10.
RAID 0 is generally referred to as “disk striping”.
In a nutshell, what this configuration does is stripe data across multiple drives. Generally, this is done to increase available disk space and improve performance. The down-side is that there is no redundancy at all. In other words, with RAID 0, you can take several disks and make them perform like one larger, faster drive, but if one disk crashes, you lose everything on all of them.
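To make that concrete, here’s a toy Python sketch of striping. This is just an illustration of the idea, not how a real controller works; the chunk size and the in-memory “drives” are made up for the example.

```python
# Toy RAID 0 striping: deal fixed-size chunks out round-robin across drives.
# Illustrative only; real arrays stripe at the block level in the controller.

CHUNK = 4  # bytes per stripe chunk (real arrays use much larger chunks)

def stripe_write(data: bytes, drives: list[bytearray]) -> None:
    """Split data into chunks and distribute them round-robin."""
    for i in range(0, len(data), CHUNK):
        drives[(i // CHUNK) % len(drives)].extend(data[i:i + CHUNK])

drive_a, drive_b = bytearray(), bytearray()
stripe_write(b"ABCDEFGHIJKLMNOP", [drive_a, drive_b])
print(drive_a)  # bytearray(b'ABCDIJKL')
print(drive_b)  # bytearray(b'EFGHMNOP')
# Each drive holds only half of any file; lose one and the rest is garbage.
```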
RAID 1 is generally referred to as “disk mirroring.”
And, that’s essentially what it is: a system which saves everything to a duplicate drive or drives. Most often in server configurations, you’ll find the operating system on two drives that are mirrored. That means that if one drive goes bad, the admin can reconfigure the other drive to take over running the server. In theory, this works pretty well. In practice, it sometimes takes a little finagling to get that mirror drive reconfigured as the primary. The other thing to remember is that the second drive is essentially lost storage. In other words, if you put two 1 terabyte drives in a RAID 1 array, you only have 1 terabyte of available storage.
This is pretty much bare-bones, bottom-of-the-barrel redundancy.
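Bare-bones or not, the idea is dead simple. Here’s a toy Python sketch, again just illustrating the concept with made-up in-memory “drives”:

```python
# Toy RAID 1 mirroring: every write lands on every drive in the mirror set.

def mirror_write(data: bytes, drives: list[bytearray]) -> None:
    """Write the same data to each drive."""
    for drive in drives:
        drive.extend(data)

primary, mirror = bytearray(), bytearray()
mirror_write(b"the operating system", [primary, mirror])

primary = None      # simulate the primary drive dying
print(mirror)       # bytearray(b'the operating system'), a full copy
# Note the cost: two drives' worth of hardware, one drive's worth of space.
```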
RAID 5 is what most people think of when you talk about RAID arrays.
In RAID 5, data bits and “parity” bits are striped across three or more drives. Basically, data is broken up and written across multiple drives, and then another, sort of “safety net” bit of data is written, too, so that the RAID 5 system can rebuild any single missing piece. Now, that’s a bit of an oversimplification, but what it means is that if one of the drives in a RAID 5 array fails, the array keeps running and no data is lost. Also, when a replacement drive is put into place, the RAID 5 array automatically rebuilds the missing drive onto the replacement! This, my friends, is like system administration magic! Somehow, with a lot of really big math, the array can tell what the missing bits were based on the stuff it does have and fill them in. This is the best invention since sliced bread!
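As it turns out, the heart of that “really big math” is the XOR operation: the parity chunk is the XOR of the data chunks, so XOR-ing the survivors regenerates whatever went missing. Here’s a hedged, single-stripe Python sketch of the idea (simplified; real RAID 5 rotates which drive holds parity on each stripe):

```python
# Toy RAID 5 parity: parity = XOR of the data chunks, so any one lost chunk
# can be rebuilt by XOR-ing everything that survives.

def xor_blocks(*blocks: bytes) -> bytes:
    """XOR equal-length byte blocks together."""
    result = bytearray(len(blocks[0]))
    for block in blocks:
        for i, b in enumerate(block):
            result[i] ^= b
    return bytes(result)

data1, data2 = b"RAID", b"DEMO"          # chunks on drives 1 and 2
parity = xor_blocks(data1, data2)        # parity chunk on drive 3

# Drive 2 fails. XOR the survivors and the lost chunk comes right back:
rebuilt = xor_blocks(data1, parity)
print(rebuilt)  # b'DEMO'
```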
Also, an option on many RAID 5 systems is something called the “hot spare”. The hot spare is a drive that is part of the array but not active until one of the other drives fails. Then, the hot spare becomes active and will automatically start to rebuild the missing data on that new drive. That means that the system admin can order a replacement drive at their leisure and actually schedule down-time to swap it in. What a concept! Not always doing things at the last minute or under fire, but planning ahead and taking your time. It’s unheard of!
Finally, the best option available on many RAID arrays is the “hot swappable” drive. In that case, you don’t need to schedule downtime at all; you just pull the damaged drive out of the array and pop the replacement right in, all without shutting the production system down for even a minute! Again, this is like magic!
The last “common” RAID level is RAID 10.
Basically, this is a combination of RAID 1 and RAID 0. In other words, it’s a stripe laid across mirrored pairs of drives. This setup requires at least four drives and is fairly pricey. It’s mainly used where you need both redundancy and speed, which, realistically, almost always means database servers. In fact, I can’t think of any instance I’ve heard of this being used outside of database servers.
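Since that rounds out the four common levels, here’s a quick back-of-the-envelope comparison of usable capacity. The drive counts below are just example values, and this assumes all drives are the same size:

```python
# Rough usable capacity for the common RAID levels, assuming equal-size drives.

def usable_tb(level: str, drives: int, size_tb: float) -> float:
    if level == "0":
        return drives * size_tb         # all the space, zero redundancy
    if level == "1":
        return size_tb                  # one copy's worth; the rest mirrors it
    if level == "5":
        return (drives - 1) * size_tb   # one drive's worth goes to parity
    if level == "10":
        return (drives // 2) * size_tb  # half the drives are mirrors
    raise ValueError(f"unsupported level: {level}")

for level, drives in [("0", 4), ("1", 2), ("5", 4), ("10", 4)]:
    print(f"RAID {level}: {drives} x 1 TB -> {usable_tb(level, drives, 1.0)} TB usable")
```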
There are other levels, too, of course, but you can hit the articles for more info about them. They’re pretty uncommon outside of really high-end or experimental configurations of one kind or another.
Oh, one last thing… RAID can be implemented either via hardware or software. In general, software RAID, such as you might find in Linux, is cheaper, but it’s slower and more prone to problems if something goes wrong. Hardware RAID is faster and a little more expensive, but it’s a far more robust solution.
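If you’re curious whether a Linux box is running software RAID, the kernel’s md driver reports array status in /proc/mdstat. Here’s a minimal sketch that just prints it (the path is standard on Linux; everything else is plain standard-library Python):

```python
# Print the Linux software RAID status, if any.
from pathlib import Path

mdstat = Path("/proc/mdstat")
if mdstat.exists():
    print(mdstat.read_text())
else:
    print("No md driver loaded; no Linux software RAID here.")
```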
So, there you have it, RAID in a nutshell.
And, yes, for those of you who have noticed, articles like this are me turning this blog back toward its roots as a technical blog. I hope to have more basic info like this as well as some new projects over the next 18 months or so. Certainly, more than there have been in the past two or three years.
I hope you’ll keep coming back for more!