Linus Torvalds | 1da177e | 2005-04-16 15:20:36 -0700 | 1 | |
| 2 | Ext3 Filesystem |
| 3 | =============== |
| 4 | |
| 5 | ext3 was originally released in September 1999. It was written by Stephen |
| 6 | Tweedie for the 2.2 branch, and ported to 2.4 kernels by Peter Braam, Andreas |
| 7 | Dilger, Andrew Morton, Alexander Viro, Ted Ts'o and Stephen Tweedie. |
| 8 | |
| 9 | ext3 is the ext2 filesystem enhanced with journalling capabilities. |
| 10 | |
| 11 | Options |
| 12 | ======= |
| 13 | |
| 14 | When mounting an ext3 filesystem, the following options are accepted: |
| 15 | (*) == default |
| 16 | |
| 17 | journal=update Update the ext3 file system's journal to the |
| 18 | current format. |
| 19 | |
| 20 | journal=inum When a journal already exists, this option is |
| 21 | ignored. Otherwise, it specifies the number of |
| 22 | the inode which will represent the ext3 file |
| 23 | system's journal file. |
| 24 | |
| 25 | noload Don't load the journal on mounting. |
| 26 | |
| 27 | data=journal All data are committed into the journal prior |
| 28 | to being written into the main file system. |
| 29 | |
| 30 | data=ordered (*) All data are forced directly out to the main file |
| 31 | system prior to its metadata being committed to |
| 32 | the journal. |
| 33 | |
| 34 | data=writeback Data ordering is not preserved; data may be |
| 35 | written into the main file system after its |
| 36 | metadata has been committed to the journal. |
| 37 | |
| 38 | commit=nrsec (*) Ext3 can be told to sync all its data and metadata |
| 39 | every 'nrsec' seconds. The default value is 5 seconds. |
| 40 | This means that if you lose power, you will lose |
| 41 | at most the latest 5 seconds of work (your filesystem |
| 42 | will not be damaged though, thanks to journaling). This |
| 43 | default value (or any low value) will hurt performance, |
| 44 | but it's good for data safety. Setting it to 0 has |
| 45 | the same effect as leaving the default of 5 seconds. |
| 46 | Setting it to very large values will improve |
| 47 | performance, at the cost of more work lost on a crash. |
| 48 | |
| 49 | barrier=1 This enables/disables barriers. barrier=0 disables them, |
| 50 | barrier=1 enables them. |
| 51 | |
| 52 | orlov (*) This enables the new Orlov block allocator. It's enabled |
| 53 | by default. |
| 54 | |
| 55 | oldalloc This disables the Orlov block allocator and enables the |
| 56 | old block allocator. Orlov should have better performance; |
| 57 | we'd like to get some feedback if the contrary is true for |
| 58 | you. |
| 59 | |
| 60 | user_xattr (*) Enables POSIX Extended Attributes. It's enabled by |
| 61 | default, however you need to configure its support |
| 62 | (CONFIG_EXT3_FS_XATTR). This is necessary if you want |
| 63 | to use POSIX Access Control Lists support. You can visit |
| 64 | http://acl.bestbits.at to know more about POSIX Extended |
| 65 | Attributes. |
| 66 | |
| 67 | nouser_xattr Disables POSIX Extended Attributes. |
| 68 | |
| 69 | acl (*) Enables POSIX Access Control Lists support. This is |
| 70 | enabled by default, however you need to configure |
| 71 | its support (CONFIG_EXT3_FS_POSIX_ACL). If you want |
| 72 | to know more about ACLs, visit http://acl.bestbits.at |
| 73 | |
| 74 | noacl This option disables POSIX Access Control List support. |
| 75 | |
| 76 | reservation |
| 77 | |
| 78 | noreservation |
| 79 | |
| 80 | resize= |
| 81 | |
| 82 | bsddf (*) Make 'df' act like BSD. |
| 83 | minixdf Make 'df' act like Minix. |
| 84 | |
| 85 | check=none Don't do extra checking of bitmaps on mount. |
| 86 | nocheck |
| 87 | |
| 88 | debug Extra debugging information is sent to syslog. |
| 89 | |
| 90 | errors=remount-ro(*) Remount the filesystem read-only on an error. |
| 91 | errors=continue Keep going on a filesystem error. |
| 92 | errors=panic Panic and halt the machine if an error occurs. |
| 93 | |
| 94 | grpid Give objects the same group ID as their parent directory. |
| 95 | bsdgroups |
| 96 | |
| 97 | nogrpid (*) New objects have the group ID of their creator. |
| 98 | sysvgroups |
| 99 | |
| 100 | resgid=n The group ID which may use the reserved blocks. |
| 101 | |
| 102 | resuid=n The user ID which may use the reserved blocks. |
| 103 | |
| 104 | sb=n Use alternate superblock at this location. |
| 105 | |
| 106 | quota Quota options are currently silently ignored. |
| 107 | noquota (see fs/ext3/super.c, line 594) |
| 108 | grpquota |
| 109 | usrquota |
| 110 | |
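As a rough illustration of how the options above combine, the following Python sketch assembles the comma-separated string that would be passed to `mount -o`. The helper names (`effective_commit`, `build_ext3_options`) are hypothetical and exist only for this example; the sketch checks only the data mode and the commit interval, and assumes, as stated above, that commit=0 behaves like the default of 5 seconds.

```python
# Illustrative helpers (hypothetical, not part of any ext3 tooling): build the
# comma-separated option string passed to `mount -o` for an ext3 filesystem.

VALID_DATA_MODES = ("journal", "ordered", "writeback")
DEFAULT_COMMIT_SECONDS = 5  # default commit interval stated in the option list


def effective_commit(nrsec):
    """Return the commit interval actually used; 0 means 'use the default'."""
    if nrsec < 0:
        raise ValueError("commit interval must not be negative")
    return DEFAULT_COMMIT_SECONDS if nrsec == 0 else nrsec


def build_ext3_options(data="ordered", commit=0, barrier=None, flags=()):
    """Assemble an option string such as 'data=journal,commit=30,noacl'."""
    if data not in VALID_DATA_MODES:
        raise ValueError("unknown data mode: %r" % (data,))
    options = ["data=%s" % data, "commit=%d" % effective_commit(commit)]
    if barrier is not None:
        # barrier=1 enables barriers, barrier=0 disables them
        options.append("barrier=%d" % (1 if barrier else 0))
    options.extend(flags)  # plain flags, e.g. "noacl", "nouser_xattr", "oldalloc"
    return ",".join(options)


print(build_ext3_options(data="journal", commit=30, flags=("noacl",)))
```

The resulting string is what would follow `-o` on a mount command line; flags passed through `flags` are not validated in this sketch.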
| 111 | |
| 112 | Specification |
| 113 | ============= |
| 114 | ext3 shares its entire on-disk format with the ext2 filesystem, and adds |
| 115 | transaction capabilities to ext2. Journaling is done by the |
| 116 | Journaling Block Device layer. |
| 117 | |
| 118 | Journaling Block Device layer |
| 119 | ----------------------------- |
| 120 | The Journaling Block Device layer (JBD) isn't ext3 specific. It was |
| 121 | designed to add journaling capabilities to a block device. The ext3 |
| 122 | filesystem code informs the JBD of the modifications it is performing |
| 123 | (called a transaction). The journal supports transaction start and |
| 124 | stop, and in case of a crash, the journal can replay the transactions |
| 125 | to quickly put the partition back into a consistent state. |
| 126 | |
| 127 | A handle represents a single atomic update to a filesystem. JBD can |
| 128 | also use an external journal on a separate block device. |
| 129 | |
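The start / stop / replay cycle described above can be pictured with a toy model. This Python sketch is only an in-memory analogy: the class `ToyJournal` and its methods are hypothetical names invented for this example and are not the kernel's JBD interface. It assumes that a transaction which was never stopped (committed) is simply discarded after a crash.

```python
# Toy model only: an in-memory journal that mimics the start / stop / replay
# cycle. Names are hypothetical; this is not the kernel's JBD interface.

class ToyJournal:
    def __init__(self):
        self.committed = []   # transactions fully recorded in the journal
        self.running = None   # transaction currently being built, if any

    def start(self):
        """Begin a transaction."""
        if self.running is not None:
            raise RuntimeError("a transaction is already running")
        self.running = []

    def log(self, block, value):
        """Record one block modification inside the running transaction."""
        if self.running is None:
            raise RuntimeError("no running transaction")
        self.running.append((block, value))

    def stop(self):
        """Commit the running transaction to the journal."""
        if self.running is None:
            raise RuntimeError("no running transaction")
        self.committed.append(self.running)
        self.running = None

    def crash(self):
        """Simulate a crash: the uncommitted transaction is lost."""
        self.running = None

    def replay(self, disk):
        """Apply every committed transaction, in order, to `disk` (a dict)."""
        for transaction in self.committed:
            for block, value in transaction:
                disk[block] = value
        return disk


journal = ToyJournal()
journal.start()
journal.log("inode 12", "size=4096")
journal.log("bitmap 3", "block 77 in use")
journal.stop()                      # first transaction is committed

journal.start()
journal.log("inode 13", "size=1")   # second transaction never commits
journal.crash()

recovered = journal.replay({})
print(recovered)
```

After replay, only the two updates of the committed transaction are present, so the recovered state is consistent even though the second update was interrupted.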
| 130 | Data Mode |
| 131 | --------- |
| 132 | There are three different data modes: |
| 133 | |
| 134 | * writeback mode |
| 135 | In data=writeback mode, ext3 does not journal data at all. This mode |
| 136 | provides a similar level of journaling as XFS, JFS, and ReiserFS in its |
| 137 | default mode - metadata journaling. A crash+recovery can cause |
| 138 | incorrect data to appear in files which were written shortly before the |
| 139 | crash. This mode will typically provide the best ext3 performance. |
| 140 | |
| 141 | * ordered mode |
| 142 | In data=ordered mode, ext3 only officially journals metadata, but it |
| 143 | logically groups metadata and data blocks into a single unit called a |
| 144 | transaction. When it's time to write the new metadata out to disk, the |
| 145 | associated data blocks are written first. In general, this mode |
| 146 | performs slightly slower than writeback but significantly faster than |
| 147 | journal mode. |
| 148 | |
| 149 | * journal mode |
| 150 | data=journal mode provides full data and metadata journaling. All new |
| 151 | data is written to the journal first, and then to its final location. |
| 152 | In the event of a crash, the journal can be replayed, bringing both |
| 153 | data and metadata into a consistent state. This mode is the slowest |
| 154 | except when data needs to be read from and written to disk at the same |
| 155 | time, where it outperforms all other modes. |
| 156 | |
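To check which of the three modes a mounted filesystem uses, one option is to inspect its mount options. The Python sketch below parses text in the /proc/mounts layout (device, mount point, type, options, dump, pass) and reports the data mode of each ext3 entry. The function name `ext3_data_modes` and the sample entries are made up for illustration, and the sketch assumes that an entry without an explicit data= option uses the default, data=ordered.

```python
def ext3_data_modes(mounts_text):
    """Map each ext3 mount point to its data mode.

    `mounts_text` uses the /proc/mounts layout:
    device mountpoint fstype options dump pass
    """
    modes = {}
    for line in mounts_text.splitlines():
        fields = line.split()
        if len(fields) < 4 or fields[2] != "ext3":
            continue  # skip malformed lines and other filesystem types
        mode = "ordered"  # assumed default when no data= option is listed
        for option in fields[3].split(","):
            if option.startswith("data="):
                mode = option[len("data="):]
        modes[fields[1]] = mode
    return modes


# Invented sample data, not taken from a real system.
SAMPLE = """\
/dev/sda1 / ext3 rw,errors=remount-ro 0 0
/dev/sda2 /var ext3 rw,data=journal 0 0
/dev/sda3 /scratch ext3 rw,data=writeback,commit=60 0 0
/dev/sda4 /boot ext2 rw 0 0
"""

for mountpoint, mode in sorted(ext3_data_modes(SAMPLE).items()):
    print(mountpoint, mode)
```

On a live system the same function could be fed the contents of /proc/mounts instead of `SAMPLE`.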
| 157 | Compatibility |
| 158 | ------------- |
| 159 | |
| 160 | Ext2 partitions can be easily converted to ext3, with `tune2fs -j <dev>`. |
| 161 | Ext3 is fully compatible with Ext2. Ext3 partitions can easily be |
| 162 | mounted as Ext2. |
| 163 | |
| 164 | External Tools |
| 165 | ============== |
| 166 | See the manual pages to know more. |
| 167 | |
| 168 | tune2fs: create an ext3 journal on an ext2 partition with the -j flag |
| 169 | mke2fs: create an ext3 partition with the -j flag |
| 170 | debugfs: ext2 and ext3 file system debugger |
| 171 | |
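For scripted use, the tool invocations listed above can be assembled as argument lists. The Python sketch below only builds the command lines and does not run them; the function names are hypothetical, the device and mount point are placeholders, and the only tool flag used is `-j`, as documented above, together with the standard `-t` and `-o` options of mount.

```python
def convert_to_ext3(device):
    """Command that adds a journal to an existing ext2 partition."""
    return ["tune2fs", "-j", device]


def make_ext3(device):
    """Command that creates a new ext3 partition."""
    return ["mke2fs", "-j", device]


def mount_command(device, mountpoint, fstype="ext3", options=None):
    """Command that mounts `device` as ext3, or as ext2 for compatibility."""
    if fstype not in ("ext2", "ext3"):
        raise ValueError("expected 'ext2' or 'ext3', got %r" % (fstype,))
    argv = ["mount", "-t", fstype]
    if options:
        argv += ["-o", options]
    return argv + [device, mountpoint]


# Placeholder device and mount point, for illustration only.
print(" ".join(convert_to_ext3("/dev/sdb1")))
print(" ".join(mount_command("/dev/sdb1", "/mnt", options="data=journal")))
```

Each list can be handed to `subprocess.run()` by a caller with sufficient privileges; building the list separately keeps the commands easy to inspect before anything touches a disk.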
| 172 | References |
| 173 | ========== |
| 174 | |
| 175 | kernel source: file:/usr/src/linux/fs/ext3 |
| 176 | file:/usr/src/linux/fs/jbd |
| 177 | |
| 178 | programs: http://e2fsprogs.sourceforge.net |
| 179 | |
| 180 | useful links: |
| 181 | http://www.zip.com.au/~akpm/linux/ext3/ext3-usage.html |
| 182 | http://www-106.ibm.com/developerworks/linux/library/l-fs7/ |
| 183 | http://www-106.ibm.com/developerworks/linux/library/l-fs8/ |