Linus Torvalds | 1da177e | 2005-04-16 15:20:36 -0700 | [diff] [blame] | 1 | Early userspace support |
| 2 | ======================= |
| 3 | |
| 4 | Last update: 2004-12-20 tlh |
| 5 | |
| 6 | |
| 7 | "Early userspace" is a set of libraries and programs that provide |
| 8 | various pieces of functionality that are important enough to be |
| 9 | available while a Linux kernel is coming up, but that don't need to be |
| 10 | run inside the kernel itself. |
| 11 | |
| 12 | It consists of several major infrastructure components: |
| 13 | |
| 14 | - gen_init_cpio, a program that builds a cpio-format archive |
| 15 | containing a root filesystem image. This archive is compressed, and |
| 16 | the compressed image is linked into the kernel image. |
| 17 | - initramfs, a chunk of code that unpacks the compressed cpio image |
| 18 | midway through the kernel boot process. |
| 19 | - klibc, a userspace C library, currently packaged separately, that is |
| 20 | optimized for correctness and small size. |
| 21 | |
| 22 | The cpio file format used by initramfs is the "newc" (aka "cpio -c") |
| 23 | format, and is documented in the file "buffer-format.txt". There are |
| 24 | two ways to add an early userspace image: specify an existing cpio |
| 25 | archive to be used as the image or have the kernel build process build |
| 26 | the image from specifications. |
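As a rough illustration of the "newc" layout, the sketch below serializes
archive members in Python. It assumes the commonly documented layout: the
magic "070701", thirteen 8-digit hexadecimal fields, a NUL-terminated name,
4-byte padding, and a final "TRAILER!!!" member. The function names are
hypothetical and are not part of the kernel tree; buffer-format.txt remains
the authoritative description.

```python
def _pad4(blob: bytes) -> bytes:
    """Pad with NUL bytes up to the next multiple of four."""
    return blob + b"\0" * (-len(blob) % 4)


def newc_entry(name: str, data: bytes = b"", mode: int = 0o100644,
               uid: int = 0, gid: int = 0, ino: int = 0,
               nlink: int = 1, mtime: int = 0) -> bytes:
    """Serialize one member: 110-byte header, name, then file data."""
    name_bytes = name.encode() + b"\0"
    fields = (
        ino, mode, uid, gid, nlink, mtime,
        len(data),        # file size
        0, 0,             # device major/minor of the containing filesystem
        0, 0,             # rdev major/minor (only meaningful for device nodes)
        len(name_bytes),  # name size, including the trailing NUL
        0,                # checksum field, left at zero in this sketch
    )
    header = b"070701" + b"".join(b"%08X" % f for f in fields)
    # Header plus name is padded as one unit; the data is padded separately.
    return _pad4(header + name_bytes) + _pad4(data)


def newc_archive(members) -> bytes:
    """Concatenate (name, data) members and append the end-of-archive marker."""
    body = b"".join(newc_entry(name, data) for name, data in members)
    return body + newc_entry("TRAILER!!!", mode=0)


archive = newc_archive([("init", b"#!/bin/sh\n")])
```

The result is an uncompressed archive; compression, as described above, is a
separate step.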
| 27 | |
| 28 | CPIO ARCHIVE method |
| 29 | |
| 30 | You can create a cpio archive that contains the early userspace image. |
| 31 | Your cpio archive should be specified in CONFIG_INITRAMFS_SOURCE and it |
| 32 | will be used directly. Only a single cpio file may be specified in |
| 33 | CONFIG_INITRAMFS_SOURCE and directory and file names are not allowed in |
| 34 | combination with a cpio archive. |
| 35 | |
| 36 | IMAGE BUILDING method |
| 37 | |
| 38 | The kernel build process can also build an early userspace image from |
| 39 | source parts instead of using a prebuilt cpio archive. This method provides |
| 40 | a way to create images with root-owned files even though the image was |
| 41 | built by an unprivileged user. |
| 42 | |
| 43 | The image is specified as one or more sources in |
| 44 | CONFIG_INITRAMFS_SOURCE. Sources can be either directories or files - |
| 45 | cpio archives are *not* allowed when building from sources. |
| 46 | |
| 47 | A source directory is packaged together with all of its contents. The |
| 48 | specified directory name will be mapped to '/'. When packaging a |
| 49 | directory, limited user and group ID translation can be performed. |
| 50 | INITRAMFS_ROOT_UID can be set to a user ID that needs to be mapped to |
| 51 | user root (0). INITRAMFS_ROOT_GID can be set to a group ID that needs |
| 52 | to be mapped to group root (0). |
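The translation described above amounts to a simple substitution. The
Python sketch below is only a model of that rule: the function name and
parameters are hypothetical, and it assumes that IDs which do not match the
configured values are passed through unchanged.

```python
def translate_owner(uid, gid, root_uid=None, root_gid=None):
    """Map the configured build user/group to root (0); leave others alone.

    root_uid and root_gid stand in for INITRAMFS_ROOT_UID and
    INITRAMFS_ROOT_GID; None means no translation was requested.
    """
    new_uid = 0 if root_uid is not None and uid == root_uid else uid
    new_gid = 0 if root_gid is not None and gid == root_gid else gid
    return new_uid, new_gid


# Files owned by the build user 1000:1000 end up owned by root in the image.
owner_in_image = translate_owner(1000, 1000, root_uid=1000, root_gid=1000)
```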
| 53 | |
| 54 | A source file must contain directives in the format required by the |
| 55 | usr/gen_init_cpio utility (run 'usr/gen_init_cpio --help' to get the |
| 56 | file format). The directives in the file will be passed directly to |
| 57 | usr/gen_init_cpio. |
| 58 | |
| 59 | When a combination of directories and files is specified, the |
| 60 | initramfs image will be an aggregate of all of them. In this way a user |
| 61 | can create a 'root-image' directory and install all files into it. |
| 62 | Because device-special files cannot be created by an unprivileged user, |
| 63 | special files can be listed in a 'root-files' file. Both 'root-image' |
| 64 | and 'root-files' can be listed in CONFIG_INITRAMFS_SOURCE and a complete |
| 65 | early userspace image can be built by an unprivileged user. |
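For illustration, a 'root-files' listing could be generated with a few lines
of Python. The directive syntax used here ("dir <name> <mode> <uid> <gid>"
and "nod <name> <mode> <uid> <gid> <dev_type> <maj> <min>") is assumed from
the usr/gen_init_cpio usage text and should be checked against the utility
itself. The helper names are hypothetical, and the device numbers are the
conventional Linux ones for /dev/console and /dev/null.

```python
def dir_directive(name, mode, uid=0, gid=0):
    """Format a directory directive; the mode is written in octal."""
    return f"dir {name} {mode:04o} {uid} {gid}"


def nod_directive(name, mode, dev_type, major, minor, uid=0, gid=0):
    """Format a device-node directive ('c' = character, 'b' = block)."""
    return f"nod {name} {mode:04o} {uid} {gid} {dev_type} {major} {minor}"


root_files = "\n".join([
    dir_directive("/dev", 0o755),
    nod_directive("/dev/console", 0o600, "c", 5, 1),
    nod_directive("/dev/null", 0o666, "c", 1, 3),
]) + "\n"
```

Writing root_files to a file named 'root-files' and listing it next to
'root-image' in CONFIG_INITRAMFS_SOURCE corresponds to the setup described
above.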
| 66 | |
| 67 | As a technical note, when directories and files are specified, the |
| 68 | entire CONFIG_INITRAMFS_SOURCE is passed to |
| 69 | scripts/gen_initramfs_list.sh. This means that CONFIG_INITRAMFS_SOURCE |
| 70 | can really be interpreted as any legal argument to |
| 71 | gen_initramfs_list.sh. If a directory is specified as an argument then |
| 72 | the contents are scanned, uid/gid translation is performed, and |
| 73 | usr/gen_init_cpio file directives are output. If a file is |
| 74 | specified as an argument to scripts/gen_initramfs_list.sh then the |
| 75 | contents of the file are simply copied to the output. All of the output |
| 76 | directives from directory scanning and file contents copying are |
| 77 | processed by usr/gen_init_cpio. |
| 78 | |
| 79 | See also 'scripts/gen_initramfs_list.sh -h'. |
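The dispatch between directory and file arguments can be modelled roughly
as follows. This Python sketch is not a port of gen_initramfs_list.sh: the
function name is hypothetical, only directories, regular files and symlinks
are handled, and the directive formats are assumed from the
usr/gen_init_cpio usage text. It also assumes a POSIX system.

```python
import os
import stat


def directives_for_source(source, root_uid=None, root_gid=None):
    """Return gen_init_cpio-style directive lines for one source argument."""
    if os.path.isfile(source):
        # A file argument is assumed to already contain directives:
        # its lines are copied through unchanged.
        with open(source) as fh:
            return fh.read().splitlines()

    lines = []
    for dirpath, dirnames, filenames in os.walk(source):
        dirnames.sort()  # make the traversal order deterministic
        for entry in sorted(dirnames) + sorted(filenames):
            path = os.path.join(dirpath, entry)
            st = os.lstat(path)
            # uid/gid translation: the configured IDs become root (0).
            uid = 0 if st.st_uid == root_uid else st.st_uid
            gid = 0 if st.st_gid == root_gid else st.st_gid
            # The source directory itself is mapped to '/'.
            name = "/" + os.path.relpath(path, source)
            mode = stat.S_IMODE(st.st_mode)
            if stat.S_ISDIR(st.st_mode):
                lines.append(f"dir {name} {mode:04o} {uid} {gid}")
            elif stat.S_ISLNK(st.st_mode):
                target = os.readlink(path)
                lines.append(f"slink {name} {target} {mode:04o} {uid} {gid}")
            elif stat.S_ISREG(st.st_mode):
                lines.append(f"file {name} {path} {mode:04o} {uid} {gid}")
    return lines
```

Concatenating the results for every source named in CONFIG_INITRAMFS_SOURCE
gives the single directive stream that usr/gen_init_cpio would then process.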
| 80 | |
| 81 | Where's this all leading? |
| 82 | ========================= |
| 83 | |
| 84 | The klibc distribution contains some of the necessary software to make |
| 85 | early userspace useful. The klibc distribution is currently |
| 86 | maintained separately from the kernel, but this may change early in |
| 87 | the 2.7 era (it missed the boat for 2.5). |
| 88 | |
| 89 | You can obtain somewhat infrequent snapshots of klibc from |
| 90 | ftp://ftp.kernel.org/pub/linux/libs/klibc/ |
| 91 | |
| 92 | Active users are better off using the klibc BitKeeper |
| 93 | repositories, at http://klibc.bkbits.net/ |
| 94 | |
| 95 | The standalone klibc distribution currently provides three components, |
| 96 | in addition to the klibc library: |
| 97 | |
| 98 | - ipconfig, a program that configures network interfaces. It can |
| 99 | configure them statically, or use DHCP to obtain information |
| 100 | dynamically (aka "IP autoconfiguration"). |
| 101 | - nfsmount, a program that can mount an NFS filesystem. |
| 102 | - kinit, the "glue" that uses ipconfig and nfsmount to replace the old |
| 103 | support for IP autoconfig, mount a filesystem over NFS, and continue |
| 104 | system boot using that filesystem as root. |
| 105 | |
| 106 | kinit is built as a single statically linked binary to save space. |
| 107 | |
| 108 | Eventually, several more chunks of kernel functionality will hopefully |
| 109 | move to early userspace: |
| 110 | |
| 111 | - Almost all of init/do_mounts* (the beginning of this is already in |
| 112 | place) |
| 113 | - ACPI table parsing |
| 114 | - Insert unwieldy subsystem that doesn't really need to be in kernel |
| 115 | space here |
| 116 | |
| 117 | If kinit doesn't meet your current needs and you've got bytes to burn, |
| 118 | the klibc distribution includes a small Bourne-compatible shell (ash) |
| 119 | and a number of other utilities, so you can replace kinit and build |
| 120 | custom initramfs images that meet your needs exactly. |
| 121 | |
| 122 | For questions and help, you can sign up for the early userspace |
| 123 | mailing list at http://www.zytor.com/mailman/listinfo/klibc |
| 124 | |
| 125 | How does it work? |
| 126 | ================= |
| 127 | |
| 128 | The kernel currently has three ways to mount the root filesystem: |
| 129 | |
| 130 | a) all required device and filesystem drivers compiled into the kernel, no |
| 131 | initrd. init/main.c:init() will call prepare_namespace() to mount the |
| 132 | final root filesystem, based on the root= option. An optional init= option |
| 133 | runs an init binary other than those listed at the end of init/main.c:init(). |
| 134 | |
| 135 | b) some device and filesystem drivers built as modules and stored in an |
| 136 | initrd. The initrd must contain a binary '/linuxrc' which is supposed to |
| 137 | load these driver modules. It is also possible to mount the final root |
| 138 | filesystem via linuxrc and use the pivot_root syscall. The initrd is |
| 139 | mounted and executed via prepare_namespace(). |
| 140 | |
| 141 | c) using initramfs. The call to prepare_namespace() must be skipped. |
| 142 | This means that a binary must do all the work. Said binary can be stored |
| 143 | in the initramfs either by modifying usr/gen_init_cpio.c or via the new |
| 144 | initrd format, a cpio archive. It must be called "/init". This binary |
| 145 | is responsible for doing everything prepare_namespace() would do. |
| 146 | |
| 147 | To maintain backwards compatibility, the /init binary will only run if it |
| 148 | comes via an initramfs cpio archive. If this is not the case, |
| 149 | init/main.c:init() will run prepare_namespace() to mount the final root |
| 150 | and exec one of the predefined init binaries. |
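The choice between the three cases can be summarized as a small decision
function. The Python below is only a toy model of the behaviour described
above, not kernel code; the function name, its arguments and the returned
labels are hypothetical.

```python
def boot_path(rootfs_files, have_initrd):
    """Return which of the three root-mounting cases applies.

    rootfs_files: paths unpacked from the initramfs cpio archive.
    have_initrd:  True if an old-style initrd image was supplied.
    """
    if "/init" in rootfs_files:
        # Case (c): /init came from the initramfs archive, so
        # prepare_namespace() is skipped and /init does all the work.
        return "initramfs"
    if have_initrd:
        # Case (b): prepare_namespace() mounts the initrd and runs /linuxrc.
        return "initrd"
    # Case (a): prepare_namespace() mounts the root= device directly.
    return "root="
```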
| 151 | |
| 152 | Bryan O'Sullivan <bos@serpentine.com> |