Sunday, August 1, 2010

RAID



RAID stands for Redundant Array of Inexpensive/Independent Disks.
RAID lets a user store data across multiple hard disks, either spreading it out in a balanced way to improve performance or keeping it in more than one place at a time (redundancy) to protect it. The disks are combined into a single logical unit, and because the drives in that unit work together, Input/Output performance improves. To put it in simple words, data is spread over different disks, but to the operating system and the user it appears as if the data is on one disk only.
An example of improved performance: one set of data is being written to one disk while, at the same time, another set of data is being retrieved from another disk, thus speeding up the input/output process.

Thus with RAID we can have:

1) Higher data transfer rate on large data accesses
2) Higher Input/Output rates on small data accesses


Now the question arises "How is the data written to the disks?"

There are several ways of doing this but I shall discuss only 2 of them: Block-level Striping and Mirroring.

Block-level Striping (RAID 0)

In this case, chunks of data are written to different disks. Each disk's storage space is partitioned into "units ranging from a sector(512 bytes) up to several megabytes" (taken from http://searchstorage.techtarget.com/sDefinition/0,,sid5_gci214332,00.html).
Suppose we have data consisting of 6 parts (D1, D2, D3, D4, D5 and D6). Parts D1, D3 and D5 will be stored on disk 1, whereas D2, D4 and D6 will be stored on disk 2 (they could be spread over more than 2 disks as well; I'm just giving an example here).
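
The D1 to D6 example above can be sketched in a few lines of Python. This is only an illustration of round-robin placement, not how a real RAID controller works internally; the function name `stripe` and the use of plain lists as "disks" are assumptions made for the example.

```python
def stripe(parts, num_disks):
    """Distribute parts across num_disks in round-robin order (RAID 0 style)."""
    disks = [[] for _ in range(num_disks)]  # each list stands in for one disk
    for index, part in enumerate(parts):
        # Part 0 goes to disk 0, part 1 to disk 1, part 2 back to disk 0, ...
        disks[index % num_disks].append(part)
    return disks


data = ["D1", "D2", "D3", "D4", "D5", "D6"]
disks = stripe(data, 2)
print(disks[0])  # ['D1', 'D3', 'D5']
print(disks[1])  # ['D2', 'D4', 'D6']
```

Calling `stripe(data, 3)` instead would spread the same parts over three disks, two parts per disk.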











Advantage of this method: the chunks can be read from (or written to) several drives at the same time, which increases bandwidth because more data is transferred at once.
Disadvantage: there is no redundancy, so if one disk fails, the entire data set becomes inaccessible.
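
To see why a single failure is fatal, here is a rough sketch that reassembles striped data and then simulates losing a disk. The helper names (`stripe`, `read_all`) and the idea of marking a failed disk as `None` are assumptions for illustration only.

```python
def stripe(parts, num_disks):
    """Distribute parts across num_disks in round-robin order."""
    disks = [[] for _ in range(num_disks)]
    for index, part in enumerate(parts):
        disks[index % num_disks].append(part)
    return disks


def read_all(disks, total_parts):
    """Reassemble the parts in order; raise if a needed disk has failed."""
    result = []
    for index in range(total_parts):
        disk_number = index % len(disks)
        disk = disks[disk_number]
        if disk is None:  # None marks a failed disk in this sketch
            raise IOError(f"part {index + 1} is lost: disk {disk_number + 1} failed")
        result.append(disk[index // len(disks)])
    return result


data = ["D1", "D2", "D3", "D4", "D5", "D6"]
disks = stripe(data, 2)
print(read_all(disks, len(data)))  # all six parts, in the original order

disks[1] = None  # simulate disk 2 failing
try:
    read_all(disks, len(data))
except IOError as error:
    print(error)  # part 2 is lost: disk 2 failed
```

Disk 1 is still healthy here, but it only holds D1, D3 and D5, so the data as a whole cannot be put back together.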

Mirroring (RAID 1)

In this case, the same data is duplicated in full on 2 or more disks, hence the name "mirroring".
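
Because every disk in a mirror holds a full copy, the usable space is that of a single disk, whereas striping adds the disks together. A small sketch of that arithmetic follows; it assumes all disks are the same size, and the function name `usable_capacity_gb` and the 500 GB figure are made up for the example.

```python
def usable_capacity_gb(level, num_disks, disk_size_gb):
    """Usable space for an array of equal-sized disks (RAID 0 or RAID 1 only)."""
    if num_disks < 2:
        raise ValueError("this sketch assumes at least 2 disks")
    if level == 0:
        # Striping: every disk contributes its full size.
        return num_disks * disk_size_gb
    if level == 1:
        # Mirroring: every disk holds the same copy, so only one disk's worth is usable.
        return disk_size_gb
    raise ValueError("only RAID 0 and RAID 1 are covered here")


print(usable_capacity_gb(0, 2, 500))  # 1000
print(usable_capacity_gb(1, 2, 500))  # 500
```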










Advantage of this method: Even if one disk fails, we can still read the whole data from the other disk where the data was stored, provided it is working fine. Thus our system can still stay up and running while the affected disk is being replaced!
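
The behaviour described above can be imitated with a toy model. This is a simplified sketch, not a real driver: the class name `Mirror`, its methods, and the use of dictionaries as disks are all assumptions made for the example.

```python
class Mirror:
    """Toy RAID 1: every write goes to all healthy disks, reads use any healthy disk."""

    def __init__(self, num_disks):
        self.disks = [{} for _ in range(num_disks)]  # block number -> data
        self.failed = set()                          # numbers of failed disks

    def write(self, block, data):
        for number, disk in enumerate(self.disks):
            if number not in self.failed:
                disk[block] = data

    def fail(self, number):
        """Simulate a disk failure: its contents are gone."""
        self.failed.add(number)
        self.disks[number] = {}

    def read(self, block):
        for number, disk in enumerate(self.disks):
            if number not in self.failed:
                return disk[block]
        raise IOError("all disks in the mirror have failed")


mirror = Mirror(2)
mirror.write(0, "D1")
mirror.write(1, "D2")
mirror.fail(0)          # the first disk dies
print(mirror.read(0))   # D1, served from the surviving disk
print(mirror.read(1))   # D2
```

Only when every disk in the mirror has failed does `read` give up.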


Those who want to know more about the other ways of doing this can check out this link:

http://en.wikipedia.org/wiki/RAID#Standard_levels


Now we come to the advantages and disadvantages of using RAID.

Advantages:

1) RAID arrays can make excellent backup storage, especially when they back up the main drives and are located away from the main system.

2) RAID increases the performance and reliability of a system: striping speeds up data access, while mirroring keeps the system running if a disk fails.

Disadvantages:

1) RAID is essentially designed for servers. Where there is a server there is usually a network as well, and since each network is different, this can cause configuration issues.

2) High cost of purchase (it doesn't cost less than a few thousand dollars).


This post would be incomplete without a mention of WHEN to use RAID. It is all very well knowing what RAID does, but not knowing when to apply it can be a problem in view of the high cost of purchase mentioned above.

There are mainly 3 settings to consider:

1) Business Servers: Large companies require a secure way of storing large amounts of data without losing any of it. RAID can be used for this purpose, especially if the array is kept in a remote location where accidental erasure of data is much less likely.

2) Workstations: "Individuals who are doing intensive work such as video file editing, graphical design,etc. should use a RAID" (taken from http://www.pcguide.com/ref/hdd/perf/raid/why.htm). RAID 0 will provide the improved performance needed in many of these applications (remember what I wrote earlier about improved bandwidth?).

3) Home PC: Most of us use this, and we don't need RAID since the high cost of purchase is not justified by the activities we perform on the PC. A single ordinary SATA/IDE hard disk is enough for us (for those who don't know what SATA/IDE is, go here: http://smblog.iiitd.com/2010/07/hard-disks-advantage-of-sata-over-ide.html).


Thus we can say that RAID is best used by big companies that can afford it, as it gives them their money's worth in the security of their data, which is very important to them.

If anyone wishes to add something else to the above, go ahead and do so =D


SOURCES:

http://en.wikipedia.org/wiki/RAID

http://www.webopedia.com/TERM/R/raid.html

http://www.ecs.umass.edu/ece/koren/architecture/Raid/raidhome.html

http://www.raid-data-recovery.net/advantages-raid.html


IMAGES:

http://en.wikipedia.org/wiki/RAID#Standard_levels

7 comments:

  1. very well explained that too in such simple language..really liked this post..thanku..

  2. A really informative one. You are a very good writer Shayan.

  3. Thanks Udit and Soumyavardhan :D

  4. very well written shayan. just a little question though. How the data is actually written on the magnetic disks?(Like in the case of cds and dvds its in the form of pits and lands)

  5. Tanmay-This is the best answer that i could find for your question without going into too much details

    http://wiki.answers.com/Q/How_is_data_initially_written_on_a_magnetic_disk

    Unfortunately the link isn't working directly, so you'll have to copy and paste it to the address bar :\

  6. Nicely written post!
    You explained it very well without using technical terms that we wouldn't understand :P

  7. Are you going to inform us about the remaining RAID configurations anytime soon (RAID 2, RAID 3, RAID 4, RAID 5, RAID 6, Hybrid RAID, etc.)???
    I'm looking forward to a follow up blog post.
