With Amazon EBS, you can use any of the standard RAID configurations that you can use with a traditional bare-metal server, as long as that particular RAID configuration is supported by the operating system for your instance. This is because all RAID on EBS is accomplished at the software level, by the operating system.
RAID Overview
RAID stands for Redundant Array of Independent Disks. RAID is a data storage virtualization technology that combines multiple physical disk drive components into a single logical unit for the purposes of data redundancy, performance improvement, or both.
RAID Configurations Overview
Data is distributed across the drives in one of several ways, referred to as RAID levels:
- RAID 0
  - RAID 0 consists of striping, without mirroring or parity.
- RAID 1
  - RAID 1 consists of data mirroring, without parity or striping.
- RAID 2
  - RAID 2 consists of bit-level striping with dedicated Hamming-code parity.
- RAID 3
  - RAID 3 consists of byte-level striping with dedicated parity.
- RAID 4
  - RAID 4 consists of block-level striping with dedicated parity.
- RAID 5
  - RAID 5 consists of block-level striping with distributed parity.
  - Unlike RAID 4, parity information is distributed among the drives, requiring all drives but one to be present to operate. Upon failure of a single drive, subsequent reads can be calculated from the distributed parity such that no data is lost. RAID 5 requires at least three disks.
  - Good for reads, bad for writes, because every write must also update parity.
- RAID 6
  - RAID 6 consists of block-level striping with double distributed parity. Double parity provides fault tolerance up to two failed drives.
- RAID 10
  - Striping with mirroring.
  - A combination of RAID 1 and RAID 0.
  - Good redundancy, good performance.
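To make the RAID 5 point about reads being "calculated from the distributed parity" concrete, here is a minimal sketch of single-parity recovery using XOR. It models one stripe as a list of byte blocks. The helper names `xor_blocks` and `rebuild_missing` are made up for illustration, and a real implementation works on disk blocks and rotates the parity block across drives.

```python
from functools import reduce
from typing import List, Optional


def xor_blocks(blocks: List[bytes]) -> bytes:
    """XOR equally sized blocks together, byte by byte."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


def rebuild_missing(stripe: List[Optional[bytes]]) -> bytes:
    """Recover the single missing block (marked None) of one stripe."""
    survivors = [block for block in stripe if block is not None]
    if len(survivors) != len(stripe) - 1:
        raise ValueError("single parity can recover exactly one missing block")
    return xor_blocks(survivors)


# One stripe across three disks: two data blocks plus their parity block.
d0, d1 = b"EBS-", b"RAID"
parity = xor_blocks([d0, d1])

# Simulate losing the disk that held d1 and rebuild it from the survivors.
recovered = rebuild_missing([d0, None, parity])
print(recovered)  # b'RAID'
```

The same XOR that produces the parity block also rebuilds any one lost block, which is why a RAID 5 array keeps working with one drive missing but not with two.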
Taking a Snapshot of an EC2 Instance with a RAID Array
When you take a snapshot of an attached Amazon EBS volume that is in use, the snapshot excludes data cached by applications or the operating system. For a single EBS volume, this is often not a problem. However, when cached data is excluded from snapshots of multiple EBS volumes in a RAID array, restoring the volumes from the snapshots can degrade the integrity of the array.
When creating snapshots of EBS volumes that are configured in a RAID array, it is critical that there is no data I/O to or from the volumes when the snapshots are created. RAID arrays introduce data interdependencies and a level of complexity not present in a single EBS volume configuration.
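The interdependency problem can be illustrated with a toy model. The sketch below stripes data across two in-memory "disks" and snapshots each disk at a different moment while a write lands in between. It is a simplified stand-in for the real issue (writes still in flight or sitting in caches), and `CHUNK`, `stripe_write`, and `stripe_read` are hypothetical names, not part of any AWS or RAID tooling.

```python
from typing import List, Sequence

CHUNK = 2  # hypothetical stripe unit, in bytes


def stripe_write(disks: List[bytearray], data: bytes) -> None:
    """Write data from offset 0, alternating CHUNK-sized pieces across disks."""
    for i in range(0, len(data), CHUNK):
        chunk_no = i // CHUNK
        disk = disks[chunk_no % len(disks)]
        offset = (chunk_no // len(disks)) * CHUNK
        disk[offset:offset + CHUNK] = data[i:i + CHUNK]


def stripe_read(disks: Sequence[bytes], length: int) -> bytes:
    """Reassemble the logical data by reading chunks back in stripe order."""
    out = bytearray()
    for chunk_no in range(length // CHUNK):
        disk = disks[chunk_no % len(disks)]
        offset = (chunk_no // len(disks)) * CHUNK
        out += disk[offset:offset + CHUNK]
    return bytes(out)


disks = [bytearray(4), bytearray(4)]
stripe_write(disks, b"AAAAAAAA")

snap0 = bytes(disks[0])           # snapshot of the first member volume
stripe_write(disks, b"BBBBBBBB")  # the application keeps writing
snap1 = bytes(disks[1])           # snapshot of the second member, taken later

restored = stripe_read([snap0, snap1], 8)
print(restored)  # b'AABBAABB'
```

The restored array contains neither the old data nor the new data, only a mix of both. Quiescing I/O before snapshotting guarantees that every member volume is captured in the same state.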
The solution is to take an application-consistent snapshot: stop applications from writing to the disks and flush all caches to disk.
A few ways of taking an application-consistent snapshot are:
- Shut down the EC2 instance
- Freeze the file system
- Unmount the RAID array
The easiest way is to shut down the EC2 instance, take the snapshots, and start the instance again.
Note: Taking a snapshot of an EC2 instance effectively means taking a snapshot of each attached volume. Because you cannot take a snapshot of an instance store volume, this effectively means taking snapshots of the attached EBS volumes.
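The stop, snapshot, start sequence could be scripted roughly as follows. This is a sketch, not a definitive implementation: the function name `snapshot_raid_volumes` is made up, and the `ec2` argument is assumed to behave like a boto3 EC2 client (the calls used are `stop_instances`, `get_waiter("instance_stopped")`, `create_snapshot`, and `start_instances`). The caller is assumed to know which volume IDs make up the array.

```python
from typing import Any, List


def snapshot_raid_volumes(ec2: Any, instance_id: str, volume_ids: List[str]) -> List[str]:
    """Stop the instance, snapshot each RAID member volume, then restart it.

    `ec2` is assumed to be a boto3 EC2 client or an object with the same methods.
    """
    # Stopping the instance ends all I/O and flushes OS and application caches.
    ec2.stop_instances(InstanceIds=[instance_id])
    ec2.get_waiter("instance_stopped").wait(InstanceIds=[instance_id])

    snapshot_ids: List[str] = []
    try:
        for volume_id in volume_ids:
            response = ec2.create_snapshot(
                VolumeId=volume_id,
                Description=f"RAID member {volume_id} of {instance_id}",
            )
            snapshot_ids.append(response["SnapshotId"])
    finally:
        # Restart the instance even if a snapshot request failed.
        ec2.start_instances(InstanceIds=[instance_id])
    return snapshot_ids


# Example wiring (needs boto3 and AWS credentials; the IDs are placeholders):
#   import boto3
#   snapshot_raid_volumes(boto3.client("ec2"), "i-0123456789abcdef0",
#                         ["vol-0aaaaaaaaaaaaaaaa", "vol-0bbbbbbbbbbbbbbbb"])
```

Passing the client in as a parameter keeps the function easy to exercise with a stub before pointing it at a real account. The instance is restarted as soon as the snapshot requests are accepted, since a snapshot captures the volume as of the moment it is requested.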
Important Notes (Exam Tips)
- Amazon discourages using RAID 5 (and RAID 6) with EBS, because the parity writes consume some of the IOPS available to the volumes. Use RAID 0 or RAID 10.
- You should avoid booting from a RAID volume. GRUB is typically installed on only one device in a RAID array, and if one of the mirrored devices fails, you may be unable to boot the operating system.
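As a quick way to compare the levels named in the tips above, the following sketch computes usable capacity and the number of disk failures an array is guaranteed to survive. It assumes equally sized volumes and, for RAID 10, two-way mirrored pairs. The function name `raid_summary` is invented for this example.

```python
from typing import Tuple


def raid_summary(level: str, disks: int, size_gib: int) -> Tuple[int, int]:
    """Return (usable capacity in GiB, disk failures always survived).

    Assumes equally sized disks and, for RAID 10, two-way mirrored pairs.
    """
    minimum = {"0": 2, "1": 2, "5": 3, "6": 4, "10": 4}
    if level not in minimum:
        raise ValueError(f"unsupported RAID level: {level}")
    if disks < minimum[level]:
        raise ValueError(f"RAID {level} needs at least {minimum[level]} disks")
    if level == "10" and disks % 2:
        raise ValueError("RAID 10 needs an even number of disks")

    if level == "0":
        return disks * size_gib, 0        # all space usable, no redundancy
    if level == "1":
        return size_gib, disks - 1        # every disk holds a full copy
    if level == "5":
        return (disks - 1) * size_gib, 1  # one disk's worth of parity
    if level == "6":
        return (disks - 2) * size_gib, 2  # two disks' worth of parity
    return (disks // 2) * size_gib, 1     # RAID 10: half the space is mirrors


for level in ("0", "10", "5", "6"):
    usable, failures = raid_summary(level, disks=4, size_gib=100)
    print(f"RAID {level}: {usable} GiB usable, survives {failures} failure(s)")
```

With four 100 GiB volumes, RAID 0 gives the most space with no redundancy, while RAID 10 gives half the space and survives any single volume failure without the parity write overhead of RAID 5 or RAID 6.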
- heartin's blog