Rob Landley | b90926a | 2007-03-12 11:02:04 -0400 | [diff] [blame^] | 1 | <title>Rob's ext2 documentation</title> |
| 2 | |
| 3 | <p>This page focuses on the ext2 on-disk format. The Linux kernel's filesystem |
| 4 | implementation (the code to read and write it) is documented in the kernel |
| 5 | source, Documentation/filesystems/ext2.txt.</p> |
| 6 | |
| 7 | <p>Note: for our purposes, ext3 and ext4 are just ext2 with some extra data |
| 8 | fields.</p> |
| 9 | |
| 10 | <h2>Overview</h2> |
| 11 | |
| 12 | <h2>Blocks and Block Groups</h2> |
| 13 | |
| 14 | <p>Every ext2 filesystem consists of blocks, which are divided into block |
| 15 | groups. Blocks can be 1k, 2k, or 4k in length.<sup><a href="#1">[1]</a></sup> |
| 16 | All ext2 disk layout is done in terms of these logical blocks, never in |
| 17 | terms of 512-byte disk sectors.</p> |
| 18 | |
| 19 | <p>Each block group contains as many blocks as one block can hold a |
| 20 | bitmap for, so at a 1k block size a block group contains 8192 blocks (1024 |
| 21 | bytes * 8 bits), and at 4k block size a block group contains 32768 blocks. |
| 22 | Groups are numbered starting at 0, and occur one after another on disk, |
| 23 | in order, with no gaps between them.</p> |
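<p>The group size arithmetic above can be sketched in a few lines of Python.
This is only an illustration: the function names are made up for this example
and do not come from toybox or the kernel.</p>

```python
def blocks_per_group(block_size):
    """Blocks in one full block group for a given block size in bytes."""
    # The block bitmap occupies exactly one block, and each bit tracks
    # one block, so a group spans block_size * 8 blocks.
    return block_size * 8


def group_size_bytes(block_size):
    """Bytes covered by one full block group."""
    return blocks_per_group(block_size) * block_size


if __name__ == "__main__":
    for size in (1024, 2048, 4096):
        print(size, blocks_per_group(size), group_size_bytes(size))
```

<p>So a full group covers 8 megabytes at a 1k block size and 128 megabytes at
4k.</p>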
| 24 | |
| 25 | <p>Block groups contain the following structures, in order:</p> |
| 26 | |
| 27 | <ul> |
| 28 | <li>Superblock (sometimes)</li> |
| 29 | <li>Group table (sometimes)</li> |
| 30 | <li>Block bitmap</li> |
| 31 | <li>Inode bitmap</li> |
| 32 | <li>Inode table</li> |
| 33 | <li>Data blocks</li> |
| 34 | </ul> |
| 35 | |
| 36 | <p>Not all block groups contain all structures. Specifically, the first two |
| 37 | (superblock and group table) only occur in some groups, and other block |
| 38 | groups start with the block bitmap and go from there. This frees up more |
| 39 | data blocks to hold actual file and directory data; see the superblock |
| 40 | description for details.</p> |
| 41 | |
| 42 | <p>Each structure in this list is stored in its own block (or blocks in the |
| 43 | case of the group and inode tables), and doesn't share blocks with any other |
| 44 | structure. This can involve padding the end of the block with zeroes, or |
| 45 | extending tables with extra entries to fill up the rest of the block.</p> |
| 46 | |
| 47 | <p>The linux/ext2_fs.h #include file defines struct ext2_super_block, |
| 48 | struct ext2_group_desc, struct ext2_inode, struct ext2_dir_entry_2, and a lot |
| 49 | of constants. Toybox doesn't use this file directly; instead it has an e2fs.h |
| 50 | include of its own containing cleaned-up versions of the data it needs.</p> |
| 51 | |
| 52 | <h2>Superblock</h2> |
| 53 | |
| 54 | <p>The superblock contains a 1024 byte structure, which toybox calls |
| 55 | "struct ext2_superblock". Where exactly this structure is to be found is |
| 56 | a bit complicated for historical reasons.</p> |
| 57 | |
| 58 | <p>For copies of the superblock stored in block groups after the first, |
| 59 | the superblock structure starts at the beginning of the first block of the |
| 60 | group, with zero padding afterwards if necessary (i.e. if the block size is |
| 61 | larger than 1k). In modern "sparse superblock" filesystems (everything |
| 62 | anyone still cares about), the superblock occurs in group 0 and in later groups |
| 63 | that are powers of 3, 5, and 7. (So groups 0, 1, 3, 5, 7, 9, 25, 27, 49, 81, |
| 64 | 125, 243, 343...) Any block group starting with a superblock will also |
| 65 | have a group descriptor table, and ones that don't won't.</p> |
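<p>The "powers of 3, 5, and 7" rule can be checked with a short Python sketch.
The helper names here are hypothetical (chosen for this example only), and the
code assumes a sparse superblock filesystem as described above.</p>

```python
def is_power_of(n, base):
    """True if n == base**k for some k >= 0. Requires n >= 1."""
    while n % base == 0:
        n //= base
    return n == 1


def has_superblock(group):
    """True if this block group starts with a superblock copy."""
    if group == 0:
        return True  # group 0 always holds the primary superblock
    # Group 1 qualifies because it is the zeroth power of each base.
    return any(is_power_of(group, base) for base in (3, 5, 7))


if __name__ == "__main__":
    print([g for g in range(344) if has_superblock(g)])
```

<p>Run over groups 0 through 343, this yields the same list given in the
paragraph above.</p>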
| 66 | |
| 67 | <p>The very first superblock is weird. This is because if you format an entire |
| 68 | block device (rather than a partition), you stomp the very start of the disk |
| 69 | which contains the boot sector and the partition table. Back when ext2 on |
| 70 | floppies was common, this was a big deal.</p> |
| 71 | |
| 72 | <p>So the very first 1024 bytes of the very first block are always left alone. |
| 73 | When the block size is 1024 bytes, then that block is left alone and the |
| 74 | superblock is stored in the second block instead<sup><a href="#2">[2]</a></sup>. |
| 75 | When the block size is larger than 1024 bytes, the first superblock starts |
| 76 | 1024 bytes into the block, with the original data preserved by mke2fs and |
| 77 | appropriate zero padding added to the end of the block (if necessary).</p> |
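<p>As a rough illustration of where the first superblock lands, the following
Python sketch converts the fixed 1024 byte offset into a block number and an
offset within that block. The constant and function names are invented for
this example.</p>

```python
SUPERBLOCK_OFFSET = 1024  # the first 1024 bytes of the device are left alone
SUPERBLOCK_SIZE = 1024    # size of the superblock structure


def primary_superblock_location(block_size):
    """Return (block number, byte offset within that block)."""
    return divmod(SUPERBLOCK_OFFSET, block_size)


def trailing_padding(block_size):
    """Zero bytes needed after the superblock to fill out its block."""
    _, offset = primary_superblock_location(block_size)
    return block_size - offset - SUPERBLOCK_SIZE


if __name__ == "__main__":
    for size in (1024, 2048, 4096):
        print(size, primary_superblock_location(size), trailing_padding(size))
```

<p>With 1k blocks the superblock fills all of block 1. With 2k blocks it fills
the second half of block 0, and with 4k blocks it is followed by 2048 bytes of
padding.</p>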
| 78 | |
| 79 | <h2>Group descriptor table</h2> |
| 80 | <h2>Block bitmap</h2> |
| 81 | <h2>Inode bitmap</h2> |
| 82 | <h2>Inode table</h2> |
| 83 | <h2>Data blocks</h2> |
| 84 | |
| 85 | <h2>Directories</h2> |
| 86 | |
| 87 | <p>For performance reasons, directory entries are 4-byte aligned (rec_len is |
| 88 | a multiple of 4), so up to 3 bytes of padding (zeroes) can be added at the end |
| 89 | of each name. (This affects rec_len but not the name_len.)</p> |
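<p>A minimal sketch of the alignment rule in Python. It assumes the usual
8 byte fixed header in front of each name (4 byte inode number, 2 byte rec_len,
1 byte name_len, 1 byte file type), which this page does not spell out, and the
function names are made up for the example.</p>

```python
DIRENT_HEADER = 8  # assumption: inode(4) + rec_len(2) + name_len(1) + type(1)


def min_rec_len(name_len):
    """Smallest 4-byte-aligned record length that fits the header and name."""
    return (DIRENT_HEADER + name_len + 3) & ~3


def name_padding(name_len):
    """Zero bytes added after the name to reach the aligned length."""
    return min_rec_len(name_len) - DIRENT_HEADER - name_len


if __name__ == "__main__":
    for n in (1, 2, 4, 5, 10):
        print(n, min_rec_len(n), name_padding(n))
```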
| 90 | |
| 91 | <p>The last directory entry in each block is padded up to block size. If there |
| 92 | isn't enough space for another struct ext2_dentry, the last entry's rec_len is extended to cover the rest of the block, since entries never span blocks.</p> |
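<p>To make the end-of-block padding concrete, here is a hedged Python sketch
that packs a few entries into one directory block and stretches the last
entry's rec_len to the end of the block. It assumes the little-endian 8 byte
header described for ext2 directory entries and writes 0 for the file type
byte. The function name and the inode numbers are illustrative only.</p>

```python
import struct


def pack_dir_block(entries, block_size=1024):
    """Pack (inode, name_bytes) pairs into one directory block."""
    out = bytearray()
    for i, (inode, name) in enumerate(entries):
        # Normal entries use the smallest 4-byte-aligned length.
        rec_len = (8 + len(name) + 3) & ~3
        if i == len(entries) - 1:
            # The last entry absorbs whatever is left in the block.
            rec_len = block_size - len(out)
        header = struct.pack("<IHBB", inode, rec_len, len(name), 0)
        out += (header + name).ljust(rec_len, b"\0")
    if len(out) != block_size:
        raise ValueError("entries do not fit in one block")
    return bytes(out)


if __name__ == "__main__":
    block = pack_dir_block([(2, b"."), (2, b".."), (11, b"lost+found")])
    print(len(block), struct.unpack_from("<IHBB", block, 24))
```

<p>Here the first two entries take 12 bytes each, so the third entry's rec_len
comes out as 1000 even though its name is only 10 bytes long.</p>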
| 93 | |
| 94 | <p>Question: is the length stored in the inode also padded up to block size?</p> |
| 95 | |
| 96 | <hr /> |
| 97 | <p><a name="1" />Footnote 1: On some systems blocks can be larger than 4k, but |
| 98 | for implementation reasons not larger than PAGE_SIZE. So the Alpha can have |
| 99 | 8k blocks but most other systems couldn't mount them, thus you don't see this |
| 100 | out in the wild much anymore.</p> |
| 101 | |
| 102 | <p><a name="2" />Footnote 2: In this case, the first_data_block field in the |
| 103 | superblock structure will be set to 1. Otherwise it's always 0. How this |
| 104 | could POSSIBLY be useful information is an open question, since A) you have to |
| 105 | read the superblock before you can get this information, so you know where |
| 106 | it came from, B) the first copy of the superblock always starts at offset 1024 |
| 107 | no matter what, and if your block size is 1024 you already know you skipped the |
| 108 | first block.</p> |