A Beginner's Guide to RAID
Part 1: What is RAID?
RAID stands for Redundant Array of Inexpensive Disks. It is a system that lets multiple hard drives be recognized as one, allowing for greater storage capacity, increased performance, data redundancy, or a mix of the three.
RAID used to be found only in servers and high-end workstations, but today almost all southbridges have some sort of integrated RAID.
Part 2: RAID types or levels
Most common RAID levels
These are the most common RAID levels:
RAID 0
RAID 1
RAID 5
RAID 0
RAID 0 isn't really a level of RAID at all: it is not redundant, hence it has level 0 redundancy, i.e. none. RAID 0 stripes the data across the disks, writing consecutive blocks to each drive in turn.
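As a rough illustration of striping, the sketch below maps a logical block number to a disk and a row on that disk. The function name and the simple round-robin rule are my own assumptions for illustration; real controllers stripe in chunks of a configurable size rather than single blocks.

```python
def locate_block(logical_block: int, num_disks: int) -> tuple[int, int]:
    """Return (disk index, row on that disk) for a logical block in a RAID 0 set.

    Hypothetical helper: assumes one block per stripe unit, dealt out round-robin.
    """
    disk = logical_block % num_disks   # blocks are handed to each disk in turn
    row = logical_block // num_disks   # how far down each disk the block sits
    return disk, row


# Six consecutive blocks on a two-disk array alternate between the disks.
for block in range(6):
    print(block, locate_block(block, 2))
```

Because neighbouring blocks land on different disks, a large sequential read or write keeps every drive busy at once, which is where the speed gain comes from.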
If one drive fails: array and data destroyed
If two drives fail: array and data destroyed
Read performance:
Theoretical: number of drives x speed of the slowest disk
Reality: about 75% of theoretical at best
Write performance:
Theoretical: number of drives x speed of the slowest disk
Reality: about 75% of theoretical at best
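The performance figures above can be turned into a quick estimate. This is only a sketch of the arithmetic: the drive speeds are made-up example values, and the 75% figure is the rule of thumb from this guide rather than a measured constant.

```python
def raid0_throughput(disk_speeds_mb_s, efficiency=0.75):
    """Return (theoretical, realistic) RAID 0 throughput in MB/s.

    Theoretical = number of drives x speed of the slowest drive.
    efficiency is the rough 75% rule of thumb used in this guide.
    """
    theoretical = len(disk_speeds_mb_s) * min(disk_speeds_mb_s)
    return theoretical, theoretical * efficiency


# Example values (assumed): one 80 MB/s drive paired with one 100 MB/s drive.
theoretical, realistic = raid0_throughput([80, 100])
print(theoretical, realistic)
```

Note that the faster drive is held back to the pace of the slower one, so matching drives is worth the effort.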
Advantages:
Faster than one drive
Usually cheaper than one large drive, e.g. 2 x 500GB drives are cheaper than one 1TB drive
Can reach larger capacities than one drive can, e.g. two 1TB drives can create a 2TB array
Disadvantages:
No redundancy
Higher chance of array failure than one disk
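To put a number on the higher chance of failure: a RAID 0 array dies if any one of its drives dies. The sketch below assumes each drive fails independently with the same probability over some period; the 3% figure is purely an example value, not a real failure rate.

```python
def raid0_failure_probability(p_disk: float, num_disks: int) -> float:
    """Chance that at least one drive fails, which destroys a RAID 0 array.

    Assumes independent failures with equal probability p_disk per drive.
    """
    return 1 - (1 - p_disk) ** num_disks


# Example only: an assumed 3% chance of failure per drive.
for n in (1, 2, 4):
    print(n, "drive(s):", round(raid0_failure_probability(0.03, n), 4))
```

The risk grows with every drive added, which is why RAID 0 is best kept for data you can afford to lose.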
Good usage scenario: web servers, read-only file servers, video editing, page files and temp storage.
RAID 1
RAID 1 is the first of the real RAID levels. It has level 1 redundancy. RAID 1 works by writing the same data to two drives, which is called mirroring.
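Mirroring can be sketched with a toy model in which each disk is just a dictionary. The class and its method names are invented for illustration and are not how any real controller is written.

```python
class Mirror:
    """Toy RAID 1 set: writes go to every healthy disk, reads use the first healthy disk."""

    def __init__(self):
        self.disks = [{}, {}]          # two "drives", each mapping block number -> data
        self.failed = [False, False]   # set an entry to True to simulate a dead drive

    def write(self, block, data):
        for index, disk in enumerate(self.disks):
            if not self.failed[index]:
                disk[block] = data     # same data lands on both drives

    def read(self, block):
        for index, disk in enumerate(self.disks):
            if not self.failed[index]:
                return disk[block]     # any surviving copy will do
        raise OSError("both disks failed: data is lost")


mirror = Mirror()
mirror.write(0, b"important data")
mirror.failed[0] = True                # first drive dies
print(mirror.read(0))                  # still readable from the second drive
```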
If one disk fails: data is still safe but the array runs degraded
If two disks fail: data is lost; fatal array failure occurs
Read performance:
Theoretical: around 150% of single-disk performance if a good controller is used
Reality: just over 100% of the slowest disk
Write performance:
Theoretical: around the same as a single disk
Reality: just under if a software controller is used
Advantages:
All data is kept intact after a single-disk failure
No real performance hit, as there is no parity data to be worked out
Disadvantages:
Inefficient use of space
Still runs the risk of failure, as the chance of a dual disk failure is quite high
Good usage scenario: general file servers
RAID 5
I must admit that I have a soft spot for RAID 5. It is my favorite RAID level. It has saved my butt on a number of occasions involving many gigs of data, including my AS ICT project the day before it had to be in, so I must say that I am slightly biased towards it.
RAID 5 uses striping similar to RAID 0, but differs in that it uses distributed parity, spreading the parity blocks across all the drives.
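The parity block for each stripe is the XOR of the data blocks in that stripe, so if any one drive is lost its contents can be recalculated from the drives that remain.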
RAID 5 needs at least 3 drives to work. (A two-disk RAID 5 set is possible on some controllers but is not often implemented, as it negates a lot of the benefits of RAID 5.)
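Parity in RAID 5 is normally a byte-by-byte XOR of the data blocks in a stripe. The sketch below shows that idea on a three-drive set with two data blocks and one parity block; the helper name and the sample data are made up for illustration.

```python
from functools import reduce


def xor_blocks(blocks):
    """XOR a list of equal-length byte blocks together, byte by byte."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


# One stripe on a three-drive set: two data blocks plus their parity block.
data = [b"AAAA", b"BBBB"]
parity = xor_blocks(data)

# Simulate losing the drive that held data[0]: rebuild it from what is left.
rebuilt = xor_blocks([data[1], parity])
print(rebuilt == data[0])
```

The same trick works whichever single block goes missing, which is why the array survives one drive failure but not two.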
If one drive fails: array runs in degraded mode. All disk space is still available.
If two drives fail: array fails
Read performance:
Theoretical: roughly the same speed as RAID 0, unless a block cannot be read (for example, it fails the drive's error check). The data is then rebuilt from parity, causing a slight dip in performance.
Reality: if a good hardware controller is used, performance is much the same as theoretical
Write performance:
Theoretical: if a large number of small changes are made, writes can become backed up and performance can take a hit. If large files are written, performance can be very good.
Reality: very similar to the theoretical case, but very dependent on the controller.
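Small writes hurt because changing one data block means the parity block has to change too. A common way to do this is read-modify-write: read the old data and old parity, then write the new data and new parity, so one small write costs four disk operations. The sketch below shows the parity update under that assumption; the function names are my own, and real controllers may use caching to soften the hit.

```python
def xor(a: bytes, b: bytes) -> bytes:
    """Byte-by-byte XOR of two equal-length blocks."""
    return bytes(x ^ y for x, y in zip(a, b))


def updated_parity(old_data: bytes, old_parity: bytes, new_data: bytes) -> bytes:
    """New parity after one data block changes, without reading the other data blocks."""
    return xor(xor(old_parity, old_data), new_data)


# A stripe with two data blocks (example bytes) and its parity.
d0, d1 = b"\x0f\x0f", b"\xf0\x01"
parity = xor(d0, d1)

# Overwrite d0 and update the parity the read-modify-write way.
new_d0 = b"\xaa\x55"
new_parity = updated_parity(d0, parity, new_d0)
print(new_parity == xor(new_d0, d1))   # matches parity recomputed from scratch
```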
Good usage scenario: database servers, gaming desktops, critical file servers
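For comparing the three levels, usable space can be worked out with a few lines. This is a sketch assuming the usual rule that every drive in an array only contributes as much as the smallest drive; the function name is hypothetical.

```python
def usable_capacity(level: int, disk_sizes_gb: list) -> int:
    """Usable space in GB for RAID 0, 1 or 5, assuming each drive counts as the smallest one."""
    count = len(disk_sizes_gb)
    smallest = min(disk_sizes_gb)
    if level == 0:
        return count * smallest            # all space is used for data
    if level == 1:
        return smallest                    # every drive holds the same copy
    if level == 5:
        if count < 3:
            raise ValueError("RAID 5 needs at least 3 drives")
        return (count - 1) * smallest      # one drive's worth goes to parity
    raise ValueError("only levels 0, 1 and 5 are covered here")


print(usable_capacity(0, [1000, 1000]))
print(usable_capacity(1, [500, 500]))
print(usable_capacity(5, [500, 500, 500]))
```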
Nested RAID levels
With nested RAID levels there can be many different combinations, almost too many, and I'm not going to cover them in much detail.
The most common nested RAID levels are:
RAID 10
RAID 0+1
RAID 100
RAID 50
RAID 60
For more information go to:
http://en.wikipedia.org/wiki/Nested_RAID_levels
Non-standard RAID levels
There are a few non-standard RAID levels that you might come across. These are detailed below. They are not usually given a number like RAID 5 or RAID 1; instead they have names given to them by marketing types.
Intel Matrix RAID
Intel Matrix RAID is a system that combines different RAID levels on two or more disks without needing a separate set of disks for each RAID level.
A typical two-disk setup puts a RAID 1 set and a RAID 0 set on the same pair of disks.
Additionally, you can use a RAID 5 set over 3 disks, as below:
                    Array
|-------------------------------||---------------------------|
Disk 1      Disk 2      Disk 3
A1          A2          A3          RAID 0
A4          A5          A6          part
A7          A8          A9
B1          B2          P1          RAID 5
B3          P2          B4          part
P3          B5          B6
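The RAID 5 part of the layout above shows the parity block moving one disk to the left on each row. A small sketch that reproduces that pattern is below; the function name is made up, and other controllers may rotate parity in a different order.

```python
def raid5_layout(num_disks: int, num_rows: int) -> list:
    """Build rows of block labels with the parity block rotating from the last disk backwards."""
    rows = []
    block = 1
    for row in range(num_rows):
        parity_disk = (num_disks - 1 - row) % num_disks   # moves left each row, then wraps
        cells = []
        for disk in range(num_disks):
            if disk == parity_disk:
                cells.append(f"P{row + 1}")
            else:
                cells.append(f"B{block}")
                block += 1
        rows.append(cells)
    return rows


for cells in raid5_layout(3, 3):
    print("  ".join(cells))
```

Spreading the parity around like this stops any single drive from becoming the bottleneck for every write.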
Part 3: Controllers
There are three distinct types of RAID controller available today:
1. Hardware
2. Software
3. Hybrid
Hardware controllers
Hardware controllers are expensive but they give you the best performance available. These controllers have a dedicated processor that usually has accompanying RAM and flash storage. They can be thought of as a whole extra computer inside your computer.
Hardware controllers are usually powered by I/O processors made by Intel or Freescale, although some, like 3ware and Areca cards, have proprietary RAID engines. Hardware controllers are made to be completely independent of and invisible to the system, and visible to the end user only through the RAID BIOS and monitoring utilities, although most operating systems need drivers to see the card.
Luckily, if you are after a cheap fully hardware RAID card you do not have to look far. The Revo card made by XFX is a fully hardware card and supports RAID 3. At the time of writing the card is available for around £30 - £40 and has the bonus of 64MB of on-board cache. There are also cheap hardware RAID cards on eBay almost constantly.
Characteristics of a hardware RAID card:
Expensive
Big (some cards, such as IBM's ServeRAID cards, can be 14 inches long)
Dedicated on-board I/O processor
Usually has on-board cache (either as a replaceable and upgradeable DIMM or soldered onto the board)
Software controllers
Almost every motherboard shipped today has some form of software RAID controller on board. These software RAID controllers are usually integrated into the chipset, such as Intel southbridges ending in R (e.g. ICH5R, ICH6R, ICH7R, ICH8R and ICH9R) and almost all Nvidia nForce chipsets.
Software controllers also come in the form of cheap controller cards that offer RAID functionality. The chip on the card offers very little in the way of RAID itself. That is usually provided by the driver, which offloads all of the processing to the CPU. This is not necessarily a bad thing, but large writes can tax the processor, and if you use your computer for gaming a large write can slow your game to a slide show in the worst cases.
Software chipsets depend on a driver being written for the intended operating system. This has in the past hindered their introduction onto new platforms. Newer chipsets from manufacturers such as Adaptec and HighPoint usually have native drivers for the platform. It is best to check before you buy that a card has RAID drivers for your chosen operating system.
One of the better software chipsets that I have come across, and which is still available, is the Silicon Image SiI3114. This was one of the first native SATA chipsets, and it has very good read and write performance and a very mature driver. It is only rated to SATA 150, but you are usually limited by the PCI bus before you reach the limit of this card. Cards based on this chipset can be bought for around £20 on eBay and are a recommended buy.
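The point about the PCI bus can be checked with some quick arithmetic. This sketch assumes the card sits in a standard 32-bit, 33 MHz PCI slot, and the 70 MB/s drive speed is an example value only. The PCI figure is a theoretical peak shared by everything on the bus, so real numbers will be lower.

```python
# Assumed slot type: 32-bit (4 bytes wide) PCI running at 33.33 MHz.
PCI_WIDTH_BYTES = 4
PCI_CLOCK_MHZ = 33.33
pci_peak_mb_s = PCI_WIDTH_BYTES * PCI_CLOCK_MHZ          # about 133 MB/s for the whole bus

# SATA 150: 1500 Mbit/s on the wire, 8 data bits per 10 bits sent, 8 bits per byte.
SATA150_LINE_RATE_MBIT = 1500
sata150_mb_s = SATA150_LINE_RATE_MBIT * 8 / 10 / 8       # 150 MB/s per port


def array_ceiling(num_drives: int, drive_speed_mb_s: float) -> float:
    """Best-case array throughput once the shared PCI bus is taken into account."""
    return min(num_drives * drive_speed_mb_s, pci_peak_mb_s)


print(round(pci_peak_mb_s, 1), sata150_mb_s)
print(array_ceiling(1, 70), round(array_ceiling(2, 70), 1))   # example 70 MB/s drives
```

A single SATA 150 port can already outrun the whole PCI bus on paper, so with two or more reasonably quick drives the bus is the limit rather than the card.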
Hybrid controllers
Hybrid controllers are a rarity these days and generally hark back to the days of the Pentium II. These cards offer most of the advantages of hardware RAID cards, such as cache and speed, but use the system's processor to do the RAID calculations.
These controllers are not available today for a good reason: they are very poor performers, because the processor usually has to access the card's cache over the PCI bus, something the PCI bus was not designed for. Another reason hybrid controllers died out is that they were only marginally cheaper than full-blown hardware cards.
The bulk of a RAID card's cost used to be licensing for patents, so the saving from losing the on-board processor was marginal but it had a big effect on performance. If you really want a hybrid card you could always look on eBay. Compaq Smart 431s go for as little as a fiver; I got one for free once.
Part 4: Stripe sizes
On most controllers you have the ability to choose your stripe size. This may look confusing but it really is not. All you really need to know is what the different sizes are good for. There is a general rule of thumb that the stripe size should be twice the file system's block size. This is not necessarily true: NTFS typically uses a 4k block size, so the stripe size would be 8k, and this would mean poor performance with large files. The best point for general usage is around 64k. Having a large stripe size does not adversely affect storage space.
These are the typical stripe sizes found today:
4k
8k
16k
32k
64k
128k
256k
512k
1024k
The extremes of this range are only useful in very specialist cases. For most people 64k will be the most useful. This assumes that you have chosen to use RAID 5; for RAID 0 you would do just fine at around the 8k mark.
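To get a feel for what the stripe size changes, the sketch below counts how many stripe-sized chunks a file is cut into and how many disks end up holding part of it. It assumes the file starts exactly on a stripe boundary and that "stripe size" means the chunk written to each disk; the 100k file and three data disks are example values.

```python
import math


def chunks_for_file(file_size_kb: int, stripe_size_kb: int) -> int:
    """Number of stripe-sized chunks needed to hold the file (file assumed stripe-aligned)."""
    return math.ceil(file_size_kb / stripe_size_kb)


def disks_touched(file_size_kb: int, stripe_size_kb: int, num_data_disks: int) -> int:
    """How many disks hold at least one chunk of the file."""
    return min(chunks_for_file(file_size_kb, stripe_size_kb), num_data_disks)


# Example: a 100k file on an array with three data disks.
for stripe in (8, 64, 1024):
    print(stripe, chunks_for_file(100, stripe), disks_touched(100, stripe, 3))
```

Small stripes spread even modest files over every disk at the cost of many more chunks to juggle, while very large stripes leave most files sitting on a single disk.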
Produced by:
Alistair Senior
With thanks to:
Supershanks: For pointing out another way of using Matrix RAID.
TiG: For giving me the idea to put in usage scenarios.
Streetster: Spelling, punctuation and layout fixing.
and me mam: For not getting annoyed at finding hard drives scattered around the house.
SVG images are taken from the Wikipedia pages.
Questions, comments, suggestions and hate mail to:
al[at]alsenior.co.uk or
PM 'alsenior' on Hexus community forums. http://forums.hexus.net
All trademarks are properties of the respective owners.
This work is licensed under the Creative Commons Attribution-Noncommercial-Share Alike 2.0 UK: England & Wales License. To view a copy of this license, visit http://creativecommons.org/licenses/by-nc-sa/2.0/uk/ or send a letter to Creative Commons, 171 Second Street, Suite 300, San Francisco, California, 94105, USA.
Still available from : http://www.alsenior.me.uk/files/raid...idguide1.2.zip