dm-raid
-------

The device-mapper RAID (dm-raid) target provides a bridge from DM to MD.
It allows the MD RAID drivers to be accessed using a device-mapper
interface.

The target is named "raid" and it accepts the following parameters:

  <raid_type> <#raid_params> <raid_params> \
    <#raid_devs> <metadata_dev0> <dev0> [.. <metadata_devN> <devN>]

<raid_type>:
  raid1     RAID1 mirroring
  raid4     RAID4 dedicated parity disk
  raid5_la  RAID5 left asymmetric
            - rotating parity 0 with data continuation
  raid5_ra  RAID5 right asymmetric
            - rotating parity N with data continuation
  raid5_ls  RAID5 left symmetric
            - rotating parity 0 with data restart
  raid5_rs  RAID5 right symmetric
            - rotating parity N with data restart
  raid6_zr  RAID6 zero restart
            - rotating parity zero (left-to-right) with data restart
  raid6_nr  RAID6 N restart
            - rotating parity N (right-to-left) with data restart
  raid6_nc  RAID6 N continue
            - rotating parity N (right-to-left) with data continuation
  raid10    Various RAID10-inspired algorithms chosen by additional params
            - RAID10: Striped Mirrors (aka 'Striping on top of mirrors')
            - RAID1E: Integrated Adjacent Stripe Mirroring
            - RAID1E: Integrated Offset Stripe Mirroring
            - and other similar RAID10 variants

  Reference: Chapter 4 of
  http://www.snia.org/sites/default/files/SNIA_DDF_Technical_Position_v2.0.pdf

<#raid_params>: The number of parameters that follow.

<raid_params> consists of
    Mandatory parameters:
        <chunk_size>: Chunk size in sectors.  This parameter is often known as
                      "stripe size".  It is the only mandatory parameter and
                      is placed first.

    followed by optional parameters (in any order):
        [sync|nosync]   Force or prevent RAID initialization.

        [rebuild <idx>] Rebuild drive number idx (first drive is 0).

        [daemon_sleep <ms>]
                Interval between runs of the bitmap daemon that
                clears bits.  A longer interval means less bitmap I/O but
                resyncing after a failure is likely to take longer.

        [min_recovery_rate <kB/sec/disk>]  Throttle RAID initialization
        [max_recovery_rate <kB/sec/disk>]  Throttle RAID initialization
        [write_mostly <idx>]               Drive index is write-mostly
        [max_write_behind <sectors>]       See '--write-behind=' (man mdadm)
        [stripe_cache <sectors>]           Stripe cache size (higher RAIDs only)
        [region_size <sectors>]
                The region_size multiplied by the number of regions is the
                logical size of the array.  The bitmap records the device
                synchronisation state for each region.

        [raid10_copies   <# copies>]
        [raid10_format   <near|far|offset>]
                These two options are used to alter the default layout of
                a RAID10 configuration.  The number of copies can be
                specified, but the default is 2.  There are also three
                variations in how the copies are laid down - the default
                is "near".  Near copies are what most people think of with
                respect to mirroring.  If these options are left unspecified,
                or 'raid10_copies 2' and/or 'raid10_format near' are given,
                then the layouts for 2, 3 and 4 devices are:
                2 drives         3 drives          4 drives
                --------         ----------        --------------
                A1  A1           A1  A1  A2        A1  A1  A2  A2
                A2  A2           A2  A3  A3        A3  A3  A4  A4
                A3  A3           A4  A4  A5        A5  A5  A6  A6
                A4  A4           A5  A6  A6        A7  A7  A8  A8
                ..  ..           ..  ..  ..        ..  ..  ..  ..
                The 2-device layout is equivalent to 2-way RAID1.  The 4-device
                layout is what a traditional RAID10 would look like.  The
                3-device layout is what might be called a 'RAID1E - Integrated
                Adjacent Stripe Mirroring'.

                If 'raid10_copies 2' and 'raid10_format far' are given, then
                the layouts for 2, 3 and 4 devices are:
                2 drives             3 drives             4 drives
                --------             --------------       --------------------
                A1  A2               A1   A2   A3         A1   A2   A3   A4
                A3  A4               A4   A5   A6         A5   A6   A7   A8
                A5  A6               A7   A8   A9         A9   A10  A11  A12
                ..  ..               ..   ..   ..         ..   ..   ..   ..
                A2  A1               A3   A1   A2         A2   A1   A4   A3
                A4  A3               A6   A4   A5         A6   A5   A8   A7
                A6  A5               A9   A7   A8         A10  A9   A12  A11
                ..  ..               ..   ..   ..         ..   ..   ..   ..

                If 'raid10_copies 2' and 'raid10_format offset' are given,
                then the layouts for 2, 3 and 4 devices are:
                2 drives       3 drives           4 drives
                --------       ------------       -----------------
                A1  A2         A1  A2  A3         A1  A2  A3  A4
                A2  A1         A3  A1  A2         A2  A1  A4  A3
                A3  A4         A4  A5  A6         A5  A6  A7  A8
                A4  A3         A6  A4  A5         A6  A5  A8  A7
                A5  A6         A7  A8  A9         A9  A10 A11 A12
                A6  A5         A9  A7  A8         A10 A9  A12 A11
                ..  ..         ..  ..  ..         ..  ..  ..  ..
                Here we see layouts closely akin to 'RAID1E - Integrated
                Offset Stripe Mirroring'.

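The three RAID10 layout tables can be reproduced with a short sketch.
This is illustrative only and is not taken from the MD driver: the
function names are hypothetical, only two copies are handled for "far"
and "offset", and the rule for where the second copy lands (devices are
paired up, with an odd device joining the last group) is inferred from
the tables in this document.

```python
def near_layout(devices, rows, copies=2):
    """Chunk numbers per row for "near": each chunk fills `copies`
    adjacent slots, walking the devices left to right."""
    return [[(r * devices + c) // copies + 1 for c in range(devices)]
            for r in range(rows)]


def device_groups(devices):
    """Pair devices as (0,1), (2,3), ...; an odd device joins the last
    group.  Assumption inferred from the tables, not from the driver."""
    if devices < 2:
        raise ValueError("two copies need at least two devices")
    groups = [[i, i + 1] for i in range(0, devices - 1, 2)]
    if devices % 2:
        groups[-1].append(devices - 1)
    return groups


def second_copy(row):
    """Place the second copy of each chunk on the next device of its group."""
    out = [None] * len(row)
    for group in device_groups(len(row)):
        for i, dev in enumerate(group):
            out[group[(i + 1) % len(group)]] = row[dev]
    return out


def stripes(devices, count):
    """Plain striping: chunks 1..devices*count, one row per stripe."""
    return [[s * devices + c + 1 for c in range(devices)]
            for s in range(count)]


def far_layout(devices, count):
    """All first copies, followed by all second copies."""
    first = stripes(devices, count)
    return first + [second_copy(row) for row in first]


def offset_layout(devices, count):
    """Each stripe is followed immediately by its second copy."""
    rows = []
    for row in stripes(devices, count):
        rows += [row, second_copy(row)]
    return rows


def show(rows):
    """Format rows in the A1 A2 ... style used by the tables."""
    return "\n".join("  ".join(f"A{n}" for n in row) for row in rows)


print(show(offset_layout(3, 2)))
```

Calling near_layout(3, 4), far_layout(4, 3) or offset_layout(3, 2)
gives the same rows as the corresponding columns of the tables.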
<#raid_devs>: The number of devices composing the array.
        Each device consists of two entries.  The first is the device
        containing the metadata (if any); the second is the one containing the
        data.

        If a drive has failed or is missing at creation time, a '-' can be
        given for both the metadata and data drives for a given position.


Example tables
--------------
# RAID4 - 4 data drives, 1 parity (no metadata devices)
# No metadata devices specified to hold superblock/bitmap info
# Chunk size of 1MiB
# (Lines separated for easy reading)

0 1960893648 raid \
        raid4 1 2048 \
        5 - 8:17 - 8:33 - 8:49 - 8:65 - 8:81

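The chunk size in the table above is given in sectors.  As a quick check
of the arithmetic, assuming the 512-byte sectors used in device-mapper
tables, a 1MiB chunk works out to the 2048 shown.  The helper name below
is hypothetical.

```python
SECTOR_BYTES = 512  # device-mapper tables count in 512-byte sectors


def chunk_sectors(chunk_bytes):
    """Convert a chunk size in bytes to the sector count used in the table."""
    if chunk_bytes <= 0 or chunk_bytes % SECTOR_BYTES:
        raise ValueError("chunk size must be a positive whole number of sectors")
    return chunk_bytes // SECTOR_BYTES


print(chunk_sectors(1024 * 1024))  # 1MiB chunk, as in the example table
```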
# RAID4 - 4 data drives, 1 parity (with metadata devices)
# Chunk size of 1MiB, force RAID initialization,
#       min recovery rate at 20 kB/sec/disk

0 1960893648 raid \
        raid4 4 2048 sync min_recovery_rate 20 \
        5 8:17 8:18 8:33 8:34 8:49 8:50 8:65 8:66 8:81 8:82

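As a sketch of how the pieces of a table line fit together, the
hypothetical helper below assembles one from its parts.  It counts
<#raid_params> as the number of words in <raid_params> (chunk size
included), which is what the example tables show, and it writes '-' for
any metadata or data device that is absent.

```python
def raid_table(start, length, raid_type, chunk_size, devices, options=()):
    """Build a single-line dm-raid table.

    devices: list of (metadata_dev, data_dev) pairs; None means '-'.
    options: flat sequence of optional raid_params words,
             e.g. ("sync", "min_recovery_rate", 20).
    """
    raid_params = [str(chunk_size)] + [str(word) for word in options]
    dev_words = []
    for metadata_dev, data_dev in devices:
        dev_words += [metadata_dev or "-", data_dev or "-"]
    words = [str(start), str(length), "raid", raid_type,
             str(len(raid_params)), *raid_params,
             str(len(devices)), *dev_words]
    return " ".join(words)


data_devs = ["8:17", "8:33", "8:49", "8:65", "8:81"]
print(raid_table(0, 1960893648, "raid4", 2048,
                 [(None, dev) for dev in data_devs]))
```

The resulting string is the same table as the first example, joined
onto one line.  It could then be loaded with something like
'dmsetup create <name> --table "<line>"'.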
'dmsetup table' displays the table used to construct the mapping.
The optional parameters are always printed in the order listed
above with "sync" or "nosync" always output ahead of the other
arguments, regardless of the order used when originally loading the table.
Arguments that can be repeated are ordered by value.

'dmsetup status' yields information on the state and health of the
array.
The output is as follows:
1: <s> <l> raid \
2:      <raid_type> <#devices> <1 health char for each dev> <resync_ratio>

Line 1 is the standard output produced by device-mapper.
Line 2 is produced by the raid target, and best explained by example:
        0 1960893648 raid raid4 5 AAAAA 2/490221568
Here we can see the RAID type is raid4, there are 5 devices - all of
which are 'A'live, and the array is 2/490221568 complete with recovery.
Faulty or missing devices are marked 'D'.  Devices that are out-of-sync
are marked 'a'.

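A minimal sketch of reading such a status line from a script is shown
below.  It assumes exactly the field order described above; the function
name and the keys of the returned dictionary are hypothetical, and any
fields after the resync ratio are ignored.

```python
def parse_raid_status(line):
    """Split a dm-raid 'dmsetup status' line into its documented fields."""
    fields = line.split()
    if len(fields) < 7 or fields[2] != "raid":
        raise ValueError("not a dm-raid status line")
    start, length = int(fields[0]), int(fields[1])
    raid_type, ndevs, health = fields[3], int(fields[4]), fields[5]
    if len(health) != ndevs:
        raise ValueError("expected one health character per device")
    done, total = (int(n) for n in fields[6].split("/"))
    return {
        "start": start,
        "length": length,
        "raid_type": raid_type,
        # Device indices grouped by health character.
        "alive": [i for i, h in enumerate(health) if h == "A"],
        "out_of_sync": [i for i, h in enumerate(health) if h == "a"],
        "failed": [i for i, h in enumerate(health) if h == "D"],
        "resync": (done, total),
    }


status = parse_raid_status("0 1960893648 raid raid4 5 AAAAA 2/490221568")
print(status["raid_type"], len(status["alive"]), status["resync"])
```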

Version History
---------------
1.0.0   Initial version.  Support for RAID 4/5/6
1.1.0   Added support for RAID 1
1.2.0   Handle creation of arrays that contain failed devices.
1.3.0   Added support for RAID 10
1.3.1   Allow device replacement/rebuild for RAID 10
1.3.2   Fix/improve redundancy checking for RAID10
1.4.0   Non-functional change.  Removes arg from mapping function.
1.4.1   Add RAID10 "far" and "offset" algorithm support.