dm-raid
-------

The device-mapper RAID (dm-raid) target provides a bridge from DM to MD.
It allows the MD RAID drivers to be accessed using a device-mapper
interface.

The target is named "raid" and it accepts the following parameters:

  <raid_type> <#raid_params> <raid_params> \
    <#raid_devs> <metadata_dev0> <dev0> [.. <metadata_devN> <devN>]

<raid_type>:
  raid1     RAID1 mirroring
  raid4     RAID4 dedicated parity disk
  raid5_la  RAID5 left asymmetric
            - rotating parity 0 with data continuation
  raid5_ra  RAID5 right asymmetric
            - rotating parity N with data continuation
  raid5_ls  RAID5 left symmetric
            - rotating parity 0 with data restart
  raid5_rs  RAID5 right symmetric
            - rotating parity N with data restart
  raid6_zr  RAID6 zero restart
            - rotating parity zero (left-to-right) with data restart
  raid6_nr  RAID6 N restart
            - rotating parity N (right-to-left) with data restart
  raid6_nc  RAID6 N continue
            - rotating parity N (right-to-left) with data continuation
  raid10    Various RAID10 inspired algorithms chosen by additional params
            - RAID10: Striped Mirrors (aka 'Striping on top of mirrors')
            - RAID1E: Integrated Adjacent Stripe Mirroring
            - and other similar RAID10 variants

  Reference: Chapter 4 of
  http://www.snia.org/sites/default/files/SNIA_DDF_Technical_Position_v2.0.pdf

<#raid_params>: The number of parameters that follow.

<raid_params> consists of
    Mandatory parameters:
        <chunk_size>: Chunk size in sectors.  This parameter is often known as
                      "stripe size".  It is the only mandatory parameter and
                      is placed first.

    followed by optional parameters (in any order):
        [sync|nosync]   Force or prevent RAID initialization.

        [rebuild <idx>] Rebuild drive number idx (first drive is 0).

        [daemon_sleep <ms>]
                Interval between runs of the bitmap daemon that
                clears bits.  A longer interval means less bitmap I/O but
                resyncing after a failure is likely to take longer.

        [min_recovery_rate <kB/sec/disk>]  Throttle RAID initialization
        [max_recovery_rate <kB/sec/disk>]  Throttle RAID initialization
        [write_mostly <idx>]               Drive index is write-mostly
        [max_write_behind <sectors>]       See '--write-behind=' (man mdadm)
        [stripe_cache <sectors>]           Stripe cache size (higher RAIDs only)
        [region_size <sectors>]
                The region_size multiplied by the number of regions is the
                logical size of the array.  The bitmap records the device
                synchronisation state for each region.

        [raid10_copies   <# copies>]
        [raid10_format   near]
                These two options are used to alter the default layout of
                a RAID10 configuration.  The number of copies can be
                specified, but the default is 2.  There are other variations
                to how the copies are laid down - the default and only current
                option is "near".  Near copies are what most people think of
                with respect to mirroring.  If these options are left
                unspecified, or 'raid10_copies 2' and/or 'raid10_format near'
                are given, then the layouts for 2, 3 and 4 devices are:
                2 drives         3 drives          4 drives
                --------         ----------        --------------
                A1  A1           A1  A1  A2        A1  A1  A2  A2
                A2  A2           A2  A3  A3        A3  A3  A4  A4
                A3  A3           A4  A4  A5        A5  A5  A6  A6
                A4  A4           A5  A6  A6        A7  A7  A8  A8
                ..  ..           ..  ..  ..        ..  ..  ..  ..
                The 2-device layout is equivalent to 2-way RAID1.  The 4-device
                layout is what a traditional RAID10 would look like.  The
                3-device layout is what might be called a 'RAID1E - Integrated
                Adjacent Stripe Mirroring'.
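
The "near" layouts in the diagram above can be reproduced with a short
sketch.  It only illustrates the pattern shown in the diagram: chunk
copies are placed in consecutive slots, filling each row across the
devices.  It is not the kernel's implementation, and the function name
near_layout is hypothetical.

```python
def near_layout(num_devices, copies=2, rows=4):
    """Return `rows` rows of chunk labels for a 'near' RAID10 layout.

    Illustrative only: slots are numbered row by row across the devices,
    and each chunk occupies `copies` consecutive slots.
    """
    layout = []
    for row in range(rows):
        cells = []
        for dev in range(num_devices):
            slot = row * num_devices + dev      # row-major slot number
            cells.append(f"A{slot // copies + 1}")
        layout.append(cells)
    return layout


if __name__ == "__main__":
    for n in (2, 3, 4):
        print(f"{n} drives")
        for cells in near_layout(n):
            print("  " + "  ".join(cells))
```

With 3 devices the first two rows come out as A1 A1 A2 and A2 A3 A3,
matching the RAID1E column of the diagram.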

<#raid_devs>: The number of devices composing the array.
        Each device consists of two entries.  The first is the device
        containing the metadata (if any); the second is the one containing the
        data.

        If a drive has failed or is missing at creation time, a '-' can be
        given for both the metadata and data drives for a given position.
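
As a rough illustration of how the two counts are derived, the sketch
below assembles the argument string that follows the target name "raid".
The helper raid_target_args and its argument shapes are hypothetical and
not part of any tool; it assumes each optional parameter is passed as a
tuple of tokens and that a missing device is passed as None.

```python
def raid_target_args(raid_type, chunk_size, devices, options=()):
    """Build '<raid_type> <#raid_params> <raid_params> <#raid_devs> <devs>'.

    devices: list of (metadata_dev, data_dev) pairs; None becomes '-'.
    options: iterable of token tuples, e.g. ("sync",) or
             ("min_recovery_rate", 20).
    """
    # chunk_size is the only mandatory parameter and is placed first.
    raid_params = [str(chunk_size)]
    for option in options:
        raid_params.extend(str(token) for token in option)

    dev_fields = []
    for metadata_dev, data_dev in devices:
        dev_fields.extend([metadata_dev or "-", data_dev or "-"])

    # <#raid_params> counts every token, including option values;
    # <#raid_devs> counts device pairs, not individual entries.
    fields = [raid_type, str(len(raid_params)), *raid_params,
              str(len(devices)), *dev_fields]
    return " ".join(fields)


if __name__ == "__main__":
    no_metadata = [(None, f"8:{minor}") for minor in (17, 33, 49, 65, 81)]
    print(raid_target_args("raid4", 2048, no_metadata))
    # raid4 1 2048 5 - 8:17 - 8:33 - 8:49 - 8:65 - 8:81
```

Note that "min_recovery_rate 20" contributes two to <#raid_params>,
while each metadata/data pair contributes one to <#raid_devs>.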


Example tables
--------------
# RAID4 - 4 data drives, 1 parity (no metadata devices)
# No metadata devices specified to hold superblock/bitmap info
# Chunk size of 1MiB
# (Lines separated for easy reading)

0 1960893648 raid \
        raid4 1 2048 \
        5 - 8:17 - 8:33 - 8:49 - 8:65 - 8:81

# RAID4 - 4 data drives, 1 parity (with metadata devices)
# Chunk size of 1MiB, force RAID initialization,
#       min recovery rate at 20 kiB/sec/disk

0 1960893648 raid \
        raid4 4 2048 sync min_recovery_rate 20 \
        5 8:17 8:18 8:33 8:34 8:49 8:50 8:65 8:66 8:81 8:82
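
To make the role of the two counts in these tables concrete, here is a
minimal sketch that splits a table line into its fields and checks that
the counts match what follows them.  The function parse_raid_table is
hypothetical, and the conversion to bytes assumes device-mapper's
512-byte sectors (consistent with 2048 sectors being 1MiB above).

```python
SECTOR_SIZE = 512  # device-mapper sector size in bytes


def parse_raid_table(line):
    """Split a 'raid' table line into named fields (illustrative only)."""
    tokens = line.replace("\\", " ").split()   # drop line continuations
    start, length, target = int(tokens[0]), int(tokens[1]), tokens[2]
    if target != "raid":
        raise ValueError(f"not a raid target: {target}")

    raid_type = tokens[3]
    n_params = int(tokens[4])
    params = tokens[5:5 + n_params]
    rest = tokens[5 + n_params:]
    if len(params) != n_params or not rest:
        raise ValueError("<#raid_params> does not match the parameters given")

    n_devs = int(rest[0])
    dev_tokens = rest[1:]
    if len(dev_tokens) != 2 * n_devs:
        raise ValueError("<#raid_devs> does not match the devices given")

    return {
        "start": start,
        "length": length,
        "raid_type": raid_type,
        "params": params,
        "chunk_bytes": int(params[0]) * SECTOR_SIZE,
        # Pair up (metadata_dev, data_dev) entries.
        "devices": list(zip(dev_tokens[0::2], dev_tokens[1::2])),
    }


if __name__ == "__main__":
    table = ("0 1960893648 raid raid4 4 2048 sync min_recovery_rate 20 "
             "5 8:17 8:18 8:33 8:34 8:49 8:50 8:65 8:66 8:81 8:82")
    print(parse_raid_table(table))
```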

'dmsetup table' displays the table used to construct the mapping.
The optional parameters are always printed in the order listed
above with "sync" or "nosync" always output ahead of the other
arguments, regardless of the order used when originally loading the table.
Arguments that can be repeated are ordered by value.

'dmsetup status' yields information on the state and health of the
array.
The output is as follows:
1: <s> <l> raid \
2:      <raid_type> <#devices> <1 health char for each dev> <resync_ratio>

Line 1 is the standard output produced by device-mapper.
Line 2 is produced by the raid target, and is best explained by example:
        0 1960893648 raid raid4 5 AAAAA 2/490221568
Here we can see the RAID type is raid4, there are 5 devices - all of
which are 'A'live, and the array is 2/490221568 complete with recovery.
Faulty or missing devices are marked 'D'.  Devices that are out-of-sync
are marked 'a'.
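
A status line in this format can be decoded with a few lines of Python.
This is a sketch only: parse_raid_status is a hypothetical helper, and it
assumes exactly the seven whitespace-separated fields shown above.

```python
def parse_raid_status(line):
    """Decode one 'dmsetup status' line for a raid target (illustrative)."""
    start, length, target, raid_type, n_devs, health, ratio = line.split()
    if target != "raid":
        raise ValueError(f"not a raid target: {target}")
    if len(health) != int(n_devs):
        raise ValueError("expected one health character per device")

    done, total = (int(part) for part in ratio.split("/"))
    return {
        "raid_type": raid_type,
        "devices": int(n_devs),
        # Device indexes grouped by health character.
        "alive": [i for i, c in enumerate(health) if c == "A"],
        "out_of_sync": [i for i, c in enumerate(health) if c == "a"],
        "failed": [i for i, c in enumerate(health) if c == "D"],
        # Fraction of the resync ratio; None if the denominator is zero.
        "sync_fraction": done / total if total else None,
    }


if __name__ == "__main__":
    status = parse_raid_status("0 1960893648 raid raid4 5 AAAAA 2/490221568")
    print(status["raid_type"], status["alive"], status["sync_fraction"])
```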


Version History
---------------
1.0.0   Initial version.  Support for RAID 4/5/6
1.1.0   Added support for RAID 1
1.2.0   Handle creation of arrays that contain failed devices.
1.3.0   Added support for RAID 10
1.3.1   Allow device replacement/rebuild for RAID 10