Rob Landley | 349ff52 | 2014-01-04 13:09:42 -0600 | [diff] [blame] | 1 | <html><head><title>toybox source code walkthrough</title></head> |
Rob Landley | 4e68de1 | 2007-12-13 07:00:27 -0600 | [diff] [blame] | 2 | <!--#include file="header.html" --> |
| 3 | |
Rob Landley | 2704834 | 2013-08-18 14:24:59 -0500 | [diff] [blame] | 4 | <p><h1><a name="style" /><a href="#style">Code style</a></h1></p> |
Rob Landley | e7c9a6d | 2012-02-28 06:34:09 -0600 | [diff] [blame] | 5 | |
| 6 | <p>The primary goal of toybox is _simple_ code. Keeping the code small is |
Rob Landley | ed6ed62 | 2012-03-06 20:49:03 -0600 | [diff] [blame] | 7 | second, with speed and lots of features coming in somewhere after that. |
| 8 | (For more on that, see the <a href=design.html>design</a> page.)</p> |
Rob Landley | e7c9a6d | 2012-02-28 06:34:09 -0600 | [diff] [blame] | 9 | |
<p>A simple implementation usually takes up fewer lines of source code,
meaning more code can fit on the screen at once, meaning the programmer can
see more of it on the screen and thus keep more of it in their head at once.
This helps code auditing and thus reduces bugs. That said, sometimes being
more explicit is preferable to being clever enough to outsmart yourself:
don't be so terse your code is unreadable.</p>
Rob Landley | 5a0660f | 2007-12-27 21:36:44 -0600 | [diff] [blame] | 16 | |
Rob Landley | ca73392 | 2014-05-19 18:24:35 -0500 | [diff] [blame] | 17 | <p>Toybox has an actual coding style guide over on |
| 18 | <a href=design.html#codestyle>the design page</a>, but in general we just |
| 19 | want the code to be consistent.</p> |
Rob Landley | 5a0660f | 2007-12-27 21:36:44 -0600 | [diff] [blame] | 20 | |
Rob Landley | 2704834 | 2013-08-18 14:24:59 -0500 | [diff] [blame] | 21 | <p><h1><a name="building" /><a href="#building">Building Toybox</a></h1></p> |
Rob Landley | e7c9a6d | 2012-02-28 06:34:09 -0600 | [diff] [blame] | 22 | |
| 23 | <p>Toybox is configured using the Kconfig language pioneered by the Linux |
| 24 | kernel, and adopted by many other projects (uClibc, OpenEmbedded, etc). |
| 25 | This generates a ".config" file containing the selected options, which |
Rob Landley | 7aa651a | 2012-11-13 17:14:08 -0600 | [diff] [blame] | 26 | controls which features are included when compiling toybox.</p> |
Rob Landley | e7c9a6d | 2012-02-28 06:34:09 -0600 | [diff] [blame] | 27 | |
| 28 | <p>Each configuration option has a default value. The defaults indicate the |
| 29 | "maximum sane configuration", I.E. if the feature defaults to "n" then it |
| 30 | either isn't complete or is a special-purpose option (such as debugging |
| 31 | code) that isn't intended for general purpose use.</p> |
| 32 | |
| 33 | <p>The standard build invocation is:</p> |
| 34 | |
| 35 | <ul> |
| 36 | <li>make defconfig #(or menuconfig)</li> |
| 37 | <li>make</li> |
| 38 | <li>make install</li> |
| 39 | </ul> |
| 40 | |
| 41 | <p>Type "make help" to see all available build options.</p> |
| 42 | |
| 43 | <p>The file "configure" contains a number of environment variable definitions |
| 44 | which influence the build, such as specifying which compiler to use or where |
| 45 | to install the resulting binaries. This file is included by the build, but |
| 46 | accepts existing definitions of the environment variables, so it may be sourced |
| 47 | or modified by the developer before building and the definitions exported |
| 48 | to the environment will take precedence.</p> |
| 49 | |
| 50 | <p>(To clarify: "configure" describes the build and installation environment, |
| 51 | ".config" lists the features selected by defconfig/menuconfig.)</p> |
Rob Landley | 6882ee8 | 2008-02-12 18:41:34 -0600 | [diff] [blame] | 52 | |
<p><h1><a name="running" /><a href="#running">Running a command</a></h1></p>
| 54 | |
| 55 | <h2>main</h2> |
| 56 | |
| 57 | <p>The toybox main() function is at the end of main.c at the top level. It has |
| 58 | two possible codepaths, only one of which is configured into any given build |
| 59 | of toybox.</p> |
| 60 | |
<p>If CONFIG_SINGLE is selected, toybox is configured to contain only a single
command, so most of the normal setup can be skipped. In this case the
multiplexer isn't used. Instead, main() calls toy_singleinit() (also in main.c)
to set up global state and parse command line arguments, then calls the
command's main function out of toy_list (in the CONFIG_SINGLE case the array
has a single entry, so there is no need to search). If the function returns
instead of exiting, main() flushes stdout (detecting error) and returns
toys.exitval.</p>
| 67 | |
| 68 | <p>When CONFIG_SINGLE is not selected, main() uses basename() to find the |
| 69 | name it was run as, shifts its argument list one to the right so it lines up |
| 70 | with where the multiplexer function expects it, and calls toybox_main(). This |
| 71 | leverages the multiplexer command's infrastructure to find and run the |
| 72 | appropriate command. (A command name starting with "toybox" will |
| 73 | recursively call toybox_main(); you can go "./toybox toybox toybox toybox ls" |
| 74 | if you want to...)</p> |
| 75 | |
| 76 | <h2>toybox_main</h2> |
| 77 | |
<p>The toybox_main() function is also in main.c. It handles a possible
--help option ("toybox --help ls"), and prints the list of available commands
if no arguments were provided to the multiplexer (or the list with full path
names if any other option is provided before a command name, ala
"toybox --list"). Otherwise it calls toy_exec() on its argument list.</p>
| 83 | |
<p>Note that the multiplexer is the first entry in toy_list (the rest of the
list is sorted alphabetically to allow binary search), so toybox_main can
cheat and just grab the first entry to quickly set up its context without
searching. Since all command names go through the multiplexer at least once
in the non-CONFIG_SINGLE case, this avoids a redundant search of
the list.</p>
| 90 | |
<p>The toy_exec() function is also in main.c. It calls toy_find() to
perform a binary search on the toy_list array, looks up the command's
entry by name, and saves it in the global variable toys.which. It then calls
toy_init() to parse command line arguments and set up global state (using
which->options), and calls the appropriate command's main() function
(which->toy_main). On return it flushes all pending ANSI FILE * I/O, detects
if stdout had an error, and then calls xexit() (which uses toys.exitval).</p>
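<p>The lookup described above can be sketched in standalone C. This is only an
illustration under stated assumptions, not the code from main.c: the names
demo_entry, demo_list, demo_find, demo_last_run and the stub functions are
hypothetical, and the structure is a cut-down stand-in for struct toy_list.
It shows the two properties the text relies on: entry 0 is the multiplexer and
is checked without searching, and the remaining entries are sorted so a binary
search can find them.</p>

```c
#include <stddef.h>
#include <string.h>

/* Cut-down, hypothetical stand-in for struct toy_list (see toys.h). */
struct demo_entry {
  const char *name;
  void (*toy_main)(void);
};

/* Records which stub ran, so the dispatch can be observed. */
int demo_last_run;

static void toybox_stub(void) { demo_last_run = 0; }
static void cat_stub(void) { demo_last_run = 1; }
static void echo_stub(void) { demo_last_run = 2; }
static void ls_stub(void) { demo_last_run = 3; }

/* Entry 0 is the multiplexer; entries 1 and up are sorted by name. */
struct demo_entry demo_list[] = {
  {"toybox", toybox_stub},
  {"cat", cat_stub},
  {"echo", echo_stub},
  {"ls", ls_stub},
};

#define DEMO_COUNT (sizeof(demo_list)/sizeof(*demo_list))

/* Return the entry for name, or NULL if no such command is configured. */
struct demo_entry *demo_find(const char *name)
{
  size_t bottom = 1, top = DEMO_COUNT;  /* half-open range [bottom, top) */

  /* Names starting with "toybox" go to the multiplexer without a search. */
  if (!strncmp(name, "toybox", 6)) return demo_list;

  while (bottom < top) {
    size_t middle = bottom + (top - bottom)/2;
    int result = strcmp(name, demo_list[middle].name);

    if (!result) return demo_list + middle;
    if (result < 0) top = middle;
    else bottom = middle+1;
  }

  return NULL;
}
```

<p>A caller would check the result for NULL and then run the command through
the function pointer, for example demo_find("ls")->toy_main().</p>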
| 98 | |
| 99 | <p><h1><a name="infrastructure" /><a href="#infrastructure">Infrastructure</a></h1></p> |
Rob Landley | 4e68de1 | 2007-12-13 07:00:27 -0600 | [diff] [blame] | 100 | |
<p>The toybox source code is in the following directories:</p>
| 102 | <ul> |
<li>The <a href="#top">top level directory</a> contains the file main.c (where
execution starts), the header file toys.h (included by every command), and
other global infrastructure.</li>
| 106 | <li>The <a href="#lib">lib directory</a> contains common functions shared by |
Rob Landley | c8d0da5 | 2012-07-15 17:47:08 -0500 | [diff] [blame] | 107 | multiple commands:</li> |
| 108 | <ul> |
| 109 | <li><a href="#lib_lib">lib/lib.c</a></li> |
| 110 | <li><a href="#lib_llist">lib/llist.c</a></li> |
| 111 | <li><a href="#lib_args">lib/args.c</a></li> |
| 112 | <li><a href="#lib_dirtree">lib/dirtree.c</a></li> |
| 113 | </ul> |
<li>The <a href="#toys">toys directory</a> contains the C files implementing
each command. Currently it contains five subdirectories categorizing the
commands: posix, lsb, other, example, and pending.</li>
Rob Landley | 7c04f01 | 2008-01-20 19:00:16 -0600 | [diff] [blame] | 117 | <li>The <a href="#scripts">scripts directory</a> contains the build and |
| 118 | test infrastructure.</li> |
| 119 | <li>The <a href="#kconfig">kconfig directory</a> contains the configuration |
| 120 | infrastructure implementing menuconfig (copied from the Linux kernel).</li> |
| 121 | <li>The <a href="#generated">generated directory</a> contains intermediate |
| 122 | files generated from other parts of the source code.</li> |
| 123 | </ul> |
Rob Landley | 4e68de1 | 2007-12-13 07:00:27 -0600 | [diff] [blame] | 124 | |
Rob Landley | bbe500e | 2012-02-26 21:53:15 -0600 | [diff] [blame] | 125 | <a name="adding" /> |
Rob Landley | 7eaf4f5 | 2014-04-09 08:30:09 -0500 | [diff] [blame] | 126 | <p><h1><a href="#adding">Adding a new command</a></h1></p> |
Rob Landley | c0e56ed | 2012-10-08 00:02:30 -0500 | [diff] [blame] | 127 | <p>To add a new command to toybox, add a C file implementing that command under |
Rob Landley | 7c04f01 | 2008-01-20 19:00:16 -0600 | [diff] [blame] | 128 | the toys directory. No other files need to be modified; the build extracts |
Rob Landley | 6882ee8 | 2008-02-12 18:41:34 -0600 | [diff] [blame] | 129 | all the information it needs (such as command line arguments) from specially |
Rob Landley | 7c04f01 | 2008-01-20 19:00:16 -0600 | [diff] [blame] | 130 | formatted comments and macros in the C file. (See the description of the |
Rob Landley | e7c9a6d | 2012-02-28 06:34:09 -0600 | [diff] [blame] | 131 | <a href="#generated">"generated" directory</a> for details.)</p> |
Rob Landley | 4e68de1 | 2007-12-13 07:00:27 -0600 | [diff] [blame] | 132 | |
<p>Currently there are five subdirectories under "toys": one for commands
defined by the POSIX standard, one for commands defined by the Linux Standard
Base, an "other" directory for commands not covered by an obvious standard,
a directory of example commands (templates to use when starting new commands),
and a "pending" directory of commands that need further review/cleanup
before moving to one of the other directories (run these at your own risk,
cleanup patches welcome).
These directories exist only for developer convenience in sorting the commands;
they are otherwise functionally identical. To add a new category,
create the appropriate directory with a README file in it whose first line
is the description menuconfig should use for the directory.</p>
Rob Landley | c0e56ed | 2012-10-08 00:02:30 -0500 | [diff] [blame] | 144 | |
<p>An easy way to start a new command is to copy the file "toys/example/hello.c"
to the name of the new command, and modify this copy to implement the new
command (more or less by turning every instance of "hello" into the
name of your command, updating the command line arguments, globals, and
help data, and then filling out its "main" function with code that does
something interesting).</p>
| 151 | |
| 152 | <p>You could also start with "toys/example/skeleton.c", which provides a lot |
| 153 | more example code (showing several variants of command line option |
| 154 | parsing, how to implement multiple commands in the same file, and so on). |
| 155 | But usually it's just more stuff to delete.</p> |
Rob Landley | 7c04f01 | 2008-01-20 19:00:16 -0600 | [diff] [blame] | 156 | |
| 157 | <p>Here's a checklist of steps to turn hello.c into another command:</p> |
| 158 | |
| 159 | <ul> |
<li><p>First "cp toys/example/hello.c toys/other/yourcommand.c" and open
the new file in your preferred text editor.</p>
<ul><li><p>Note that the
name of the new file is significant: it's the name of the new command you're
adding to toybox. The build includes all *.c files under toys/*/ whose
names are a case insensitive match for an enabled config symbol. So
toys/posix/cat.c only gets included if you have "CAT=y" in ".config".</p></li>
</ul></li>
Rob Landley | 7c04f01 | 2008-01-20 19:00:16 -0600 | [diff] [blame] | 168 | |
| 169 | <li><p>Change the one line comment at the top of the file (currently |
| 170 | "hello.c - A hello world program") to describe your new file.</p></li> |
| 171 | |
| 172 | <li><p>Change the copyright notice to your name, email, and the current |
| 173 | year.</p></li> |
| 174 | |
Rob Landley | c0e56ed | 2012-10-08 00:02:30 -0500 | [diff] [blame] | 175 | <li><p>Give a URL to the relevant standards document, where applicable. |
| 176 | (Sample links to SUSv4 and LSB are provided, feel free to link to other |
| 177 | documentation or standards as appropriate.)</p></li> |
Rob Landley | 7c04f01 | 2008-01-20 19:00:16 -0600 | [diff] [blame] | 178 | |
Rob Landley | 66a69d9 | 2012-01-16 01:44:17 -0600 | [diff] [blame] | 179 | <li><p>Update the USE_YOURCOMMAND(NEWTOY(yourcommand,"blah",0)) line. |
| 180 | The NEWTOY macro fills out this command's <a href="#toy_list">toy_list</a> |
| 181 | structure. The arguments to the NEWTOY macro are:</p> |
| 182 | |
| 183 | <ol> |
| 184 | <li><p>the name used to run your command</p></li> |
Rob Landley | 002a11e | 2014-05-22 08:16:55 -0500 | [diff] [blame^] | 185 | <li><p>the command line argument <a href="#lib_args">option parsing string</a> (0 if none)</p></li> |
Rob Landley | 66a69d9 | 2012-01-16 01:44:17 -0600 | [diff] [blame] | 186 | <li><p>a bitfield of TOYFLAG values |
| 187 | (defined in toys.h) providing additional information such as where your |
| 188 | command should be installed on a running system, whether to blank umask |
| 189 | before running, whether or not the command must run as root (and thus should |
| 190 | retain root access if installed SUID), and so on.</p></li> |
| 191 | </ol> |
| 192 | </li> |
Rob Landley | 7c04f01 | 2008-01-20 19:00:16 -0600 | [diff] [blame] | 193 | |
<li><p>Change the kconfig data (from "config YOURCOMMAND" to the end of the
comment block) to supply your command's configuration and help
information. The upper case config symbols are used by menuconfig, and are
also what the CFG_ and USE_() macros are generated from (see [TODO]). The
help information here is used by menuconfig, and also by the "help" command to
describe your new command. (See [TODO] for details.) By convention,
unfinished commands default to "n" and finished commands default to "y",
so "make defconfig" selects all finished commands. (Note, "finished" means
"ready to be used", not that it'll never change again.)</p>
| 203 | |
| 204 | <p>Each help block should start with a "usage: yourcommand" line explaining |
| 205 | any command line arguments added by this config option. The "help" command |
| 206 | outputs this text, and scripts/config2help.c in the build infrastructure |
| 207 | collates these usage lines for commands with multiple configuration |
| 208 | options when producing generated/help.h.</p> |
| 209 | </li> |
Rob Landley | 7c04f01 | 2008-01-20 19:00:16 -0600 | [diff] [blame] | 210 | |
<li><p>Change the "#define FOR_hello" line to "#define FOR_yourcommand" right
before the #include "toys.h" line. (This selects the appropriate FLAG_ macros
and does a "#define TT this.yourcommand" so you can access the global variables
out of the space-saving union of structures. If you aren't using any command
flag bits and aren't defining a GLOBALS() block, you can delete this line.)</p></li>
Rob Landley | 7c04f01 | 2008-01-20 19:00:16 -0600 | [diff] [blame] | 216 | |
Rob Landley | c0e56ed | 2012-10-08 00:02:30 -0500 | [diff] [blame] | 217 | <li><p>Update the GLOBALS() macro to contain your command's global |
| 218 | variables. If your command has no global variables, delete this macro.</p> |
| 219 | |
<p>Variables in the GLOBALS() block are stored in a space saving
<a href="#toy_union">union of structures</a> format, which may be accessed
using the TT macro as if TT were a global structure (so TT.membername).
If you specified two-character command line arguments in
NEWTOY(), the first few global variables will be initialized by the automatic
argument parsing logic, and the type and order of these variables must
correspond to the arguments specified in NEWTOY().
(See <a href="#lib_args">lib/args.c</a> for details.)</p></li>
Rob Landley | 7c04f01 | 2008-01-20 19:00:16 -0600 | [diff] [blame] | 228 | |
| 229 | <li><p>Rename hello_main() to yourcommand_main(). This is the main() function |
Rob Landley | c0e56ed | 2012-10-08 00:02:30 -0500 | [diff] [blame] | 230 | where execution of your command starts. Your command line options are |
| 231 | already sorted into this.optflags, this.optargs, this.optc, and the GLOBALS() |
| 232 | as appropriate by the time this function is called. (See |
Rob Landley | 002a11e | 2014-05-22 08:16:55 -0500 | [diff] [blame^] | 233 | <a href="#lib_args">get_optflags()</a> for details.)</p></li> |
Rob Landley | 7c04f01 | 2008-01-20 19:00:16 -0600 | [diff] [blame] | 234 | </ul> |
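<p>To make the USE_YOURCOMMAND(NEWTOY(...)) line less mysterious, here is a
small standalone sketch of how that kind of macro pair can work. Everything in
it is an assumption made for illustration: the commands "frob" and "twiddle",
the DEMO_FLAG_ values, struct demo_toy, and the macro definitions themselves
are hypothetical stand-ins rather than what the toybox build generates. The
idea it shows is that a USE_ macro for an enabled symbol passes its argument
through, a USE_ macro for a disabled symbol swallows it, and NEWTOY turns what
is left into one array entry.</p>

```c
#include <stddef.h>

/* Hypothetical configuration: FROB is enabled, TWIDDLE is disabled. */
#define USE_FROB(...) __VA_ARGS__
#define USE_TWIDDLE(...)

/* Hypothetical stand-ins for TOYFLAG values, combined with | (or). */
#define DEMO_FLAG_USR 1
#define DEMO_FLAG_BIN 2

/* Cut-down stand-in for the structure NEWTOY fills out. */
struct demo_toy {
  const char *name;     /* name used to run the command */
  const char *options;  /* option parsing string, 0 if none */
  int flags;            /* bitfield of flag values */
};

/* In this sketch NEWTOY expands to one initializer plus a trailing comma. */
#define NEWTOY(name, opts, flags) {#name, opts, flags},

/* Only enabled commands contribute an entry to the array. */
struct demo_toy demo_toys[] = {
  USE_FROB(NEWTOY(frob, "ab:", DEMO_FLAG_USR|DEMO_FLAG_BIN))
  USE_TWIDDLE(NEWTOY(twiddle, 0, DEMO_FLAG_BIN))
};

const size_t demo_toy_count = sizeof(demo_toys)/sizeof(*demo_toys);
```

<p>With the configuration above the array holds a single entry for "frob".
Swapping the two USE_ definitions would leave only "twiddle", without touching
the lines that describe the commands.</p>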
| 235 | |
Rob Landley | 7eaf4f5 | 2014-04-09 08:30:09 -0500 | [diff] [blame] | 236 | <a name="headers" /><h2><a href="#headers">Headers.</a></h2> |
Rob Landley | 85a3241 | 2013-12-27 06:53:15 -0600 | [diff] [blame] | 237 | |
<p>Commands generally don't have their own headers. If it's common code
it can live in lib/; if it isn't, put it in the command's .c file. (The line
between implementing multiple commands in a C file via OLDTOY() to share
infrastructure and moving that shared infrastructure to lib/ is a judgement
call. Try to figure out which is simplest.)</p>
| 243 | |
| 244 | <p>The top level toys.h should #include all the standard (posix) headers |
| 245 | that any command uses. (Partly this is friendly to ccache and partly this |
| 246 | makes the command implementations shorter.) Individual commands should only |
| 247 | need to include nonstandard headers that might prevent that command from |
| 248 | building in some context we'd care about (and thus requiring that command to |
| 249 | be disabled to avoid a build break).</p> |
| 250 | |
| 251 | <p>Target-specific stuff (differences between compiler versions, libc versions, |
| 252 | or operating systems) should be confined to lib/portability.h and |
| 253 | lib/portability.c. (There's even some minimal compile-time environment probing |
| 254 | that writes data to generated/portability.h, see scripts/genconfig.sh.)</p> |
| 255 | |
| 256 | <p>Only include linux/*.h headers from individual commands (not from other |
| 257 | headers), and only if you really need to. Data that varies per architecture |
| 258 | is a good reason to include a header. If you just need a couple constants |
| 259 | that haven't changed since the 1990's, it's ok to #define them yourself or |
| 260 | just use the constant inline with a comment explaining what it is. (A |
| 261 | #define that's only used once isn't really helping.)</p> |
| 262 | |
Rob Landley | 7eaf4f5 | 2014-04-09 08:30:09 -0500 | [diff] [blame] | 263 | <p><a name="top" /><h1><a href="#top">Top level directory.</a></h1></p> |
Rob Landley | 7c04f01 | 2008-01-20 19:00:16 -0600 | [diff] [blame] | 264 | |
Rob Landley | 66a69d9 | 2012-01-16 01:44:17 -0600 | [diff] [blame] | 265 | <p>This directory contains global infrastructure.</p> |
| 266 | |
| 267 | <h3>toys.h</h3> |
Rob Landley | c0e56ed | 2012-10-08 00:02:30 -0500 | [diff] [blame] | 268 | <p>Each command #includes "toys.h" as part of its standard prolog. It |
| 269 | may "#define FOR_commandname" before doing so to get some extra entries |
| 270 | specific to this command.</p> |
Rob Landley | 66a69d9 | 2012-01-16 01:44:17 -0600 | [diff] [blame] | 271 | |
| 272 | <p>This file sucks in most of the commonly used standard #includes, so |
| 273 | individual files can just #include "toys.h" and not have to worry about |
| 274 | stdargs.h and so on. Individual commands still need to #include |
| 275 | special-purpose headers that may not be present on all systems (and thus would |
| 276 | prevent toybox from building that command on such a system with that command |
| 277 | enabled). Examples include regex support, any "linux/" or "asm/" headers, mtab |
| 278 | support (mntent.h and sys/mount.h), and so on.</p> |
| 279 | |
| 280 | <p>The toys.h header also defines structures for most of the global variables |
| 281 | provided to each command by toybox_main(). These are described in |
| 282 | detail in the description for main.c, where they are initialized.</p> |
| 283 | |
| 284 | <p>The global variables are grouped into structures (and a union) for space |
| 285 | savings, to more easily track the amount of memory consumed by them, |
| 286 | so that they may be automatically cleared/initialized as needed, and so |
| 287 | that access to global variables is more easily distinguished from access to |
| 288 | local variables.</p> |
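<p>The space saving from a union of structures can be seen in a few lines of
standalone C. This is a sketch with hypothetical names (demo_small_data,
demo_large_data, demo_union, demo_this, DEMO_TT, demo_bump), not the
declarations from toys.h. It assumes two commands with different global
variables and shows that the single shared instance is only as large as the
biggest member, and that a macro in the style of TT makes access read like an
ordinary global structure.</p>

```c
#include <stddef.h>

/* Hypothetical globals for two different commands. */
struct demo_small_data {
  int count;
};

struct demo_large_data {
  char *name;
  long offsets[16];
};

/* One union holds whichever command is running; the members overlap. */
union demo_union {
  struct demo_small_data small;
  struct demo_large_data large;
};

/* Single global instance, zero initialized like any other C global. */
union demo_union demo_this;

/* Stand-in for what defining FOR_small before the #include would select. */
#define DEMO_TT demo_this.small

/* Increment and return the "small" command's counter via the macro. */
int demo_bump(void)
{
  return ++DEMO_TT.count;
}
```

<p>Because only one command runs per process, the overlap is harmless, and
sizeof(demo_this) stays well below the sum of the two structure sizes.</p>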
Rob Landley | 4e68de1 | 2007-12-13 07:00:27 -0600 | [diff] [blame] | 289 | |
| 290 | <h3>main.c</h3> |
| 291 | <p>Contains the main() function where execution starts, plus |
| 292 | common infrastructure to initialize global variables and select which command |
Rob Landley | 6882ee8 | 2008-02-12 18:41:34 -0600 | [diff] [blame] | 293 | to run. The "toybox" multiplexer command also lives here. (This is the |
Rob Landley | 7c04f01 | 2008-01-20 19:00:16 -0600 | [diff] [blame] | 294 | only command defined outside of the toys directory.)</p> |
Rob Landley | 4e68de1 | 2007-12-13 07:00:27 -0600 | [diff] [blame] | 295 | |
Rob Landley | 6882ee8 | 2008-02-12 18:41:34 -0600 | [diff] [blame] | 296 | <p>Execution starts in main() which trims any path off of the first command |
| 297 | name and calls toybox_main(), which calls toy_exec(), which calls toy_find() |
Rob Landley | 66a69d9 | 2012-01-16 01:44:17 -0600 | [diff] [blame] | 298 | and toy_init() before calling the appropriate command's function from |
| 299 | toy_list[] (via toys.which->toy_main()). |
Rob Landley | 6882ee8 | 2008-02-12 18:41:34 -0600 | [diff] [blame] | 300 | If the command is "toybox", execution recurses into toybox_main(), otherwise |
| 301 | the call goes to the appropriate commandname_main() from a C file in the toys |
| 302 | directory.</p> |
Rob Landley | 4e68de1 | 2007-12-13 07:00:27 -0600 | [diff] [blame] | 303 | |
Rob Landley | 6882ee8 | 2008-02-12 18:41:34 -0600 | [diff] [blame] | 304 | <p>The following global variables are defined in main.c:</p> |
Rob Landley | 4e68de1 | 2007-12-13 07:00:27 -0600 | [diff] [blame] | 305 | <ul> |
Rob Landley | 66a69d9 | 2012-01-16 01:44:17 -0600 | [diff] [blame] | 306 | <a name="toy_list" /> |
| 307 | <li><p><b>struct toy_list toy_list[]</b> - array describing all the |
Rob Landley | 4e68de1 | 2007-12-13 07:00:27 -0600 | [diff] [blame] | 308 | commands currently configured into toybox. The first entry (toy_list[0]) is |
| 309 | for the "toybox" multiplexer command, which runs all the other built-in commands |
| 310 | without symlinks by using its first argument as the name of the command to |
| 311 | run and the rest as that command's argument list (ala "./toybox echo hello"). |
| 312 | The remaining entries are the commands in alphabetical order (for efficient |
| 313 | binary search).</p> |
| 314 | |
| 315 | <p>This is a read-only array initialized at compile time by |
Rob Landley | 6882ee8 | 2008-02-12 18:41:34 -0600 | [diff] [blame] | 316 | defining macros and #including generated/newtoys.h.</p> |
Rob Landley | 4e68de1 | 2007-12-13 07:00:27 -0600 | [diff] [blame] | 317 | |
Rob Landley | 66a69d9 | 2012-01-16 01:44:17 -0600 | [diff] [blame] | 318 | <p>Members of struct toy_list (defined in "toys.h") include:</p> |
Rob Landley | 4e68de1 | 2007-12-13 07:00:27 -0600 | [diff] [blame] | 319 | <ul> |
| 320 | <li><p>char *<b>name</b> - the name of this command.</p></li> |
| 321 | <li><p>void (*<b>toy_main</b>)(void) - function pointer to run this |
| 322 | command.</p></li> |
<li><p>char *<b>options</b> - command line option string (used by
get_optflags() in lib/args.c to initialize toys.optflags, toys.optargs, and
entries in the toy's GLOBALS struct). When this is NULL, no option
parsing is done before calling toy_main().</p></li>
Rob Landley | 6882ee8 | 2008-02-12 18:41:34 -0600 | [diff] [blame] | 327 | <li><p>int <b>flags</b> - Behavior flags for this command. The following flags are currently understood:</p> |
| 328 | |
| 329 | <ul> |
| 330 | <li><b>TOYFLAG_USR</b> - Install this command under /usr</li> |
| 331 | <li><b>TOYFLAG_BIN</b> - Install this command under /bin</li> |
| 332 | <li><b>TOYFLAG_SBIN</b> - Install this command under /sbin</li> |
| 333 | <li><b>TOYFLAG_NOFORK</b> - This command can be used as a shell builtin.</li> |
| 334 | <li><b>TOYFLAG_UMASK</b> - Call umask(0) before running this command.</li> |
Rob Landley | 66a69d9 | 2012-01-16 01:44:17 -0600 | [diff] [blame] | 335 | <li><b>TOYFLAG_STAYROOT</b> - Don't drop permissions for this command if toybox is installed SUID root.</li> |
| 336 | <li><b>TOYFLAG_NEEDROOT</b> - This command cannot function unless run with root access.</li> |
Rob Landley | 6882ee8 | 2008-02-12 18:41:34 -0600 | [diff] [blame] | 337 | </ul> |
| 338 | <br> |
| 339 | |
| 340 | <p>These flags are combined with | (or). For example, to install a command |
| 341 | in /usr/bin, or together TOYFLAG_USR|TOYFLAG_BIN.</p> |
| 342 | </ul> |
Rob Landley | 4e68de1 | 2007-12-13 07:00:27 -0600 | [diff] [blame] | 343 | </li> |
| 344 | |
<li><p><b>struct toy_context toys</b> - global structure containing information
common to all commands, initialized by toy_init() and defined in "toys.h".
Members of this structure include:</p>
Rob Landley | 4e68de1 | 2007-12-13 07:00:27 -0600 | [diff] [blame] | 348 | <ul> |
<li><p>struct toy_list *<b>which</b> - a pointer to this command's toy_list
structure. Mostly used to grab the name of the running command
(toys.which->name).</p>
</li>
| 353 | <li><p>int <b>exitval</b> - Exit value of this command. Defaults to zero. The |
| 354 | error_exit() functions will return 1 if this is zero, otherwise they'll |
| 355 | return this value.</p></li> |
<li><p>char **<b>argv</b> - "raw" command line options, I.E. the original
unmodified string array passed in to main(). Note that modifying this changes
"ps" output, and is not recommended. This array is null terminated; a NULL
entry indicates the end of the array.</p>
<p>Most commands don't use this field; instead they use optargs, optflags,
and the fields in the GLOBALS struct initialized by get_optflags().</p>
</li>
<li><p>unsigned <b>optflags</b> - Command line option flags, set by
<a href="#lib_args">get_optflags()</a>. Indicates which of the command line options listed in
toys.which->options occurred this time.</p>

<p>The rightmost command line argument listed in toys.which->options sets the
lowest bit (value 1), the next one sets the next bit (value 2), and so on. This
means the bits are set in the same order the binary digits would be listed if
typed out as a string. For example, the option string "abcd" would parse the
command line "-c" to set optflags to 2, "-a" would set optflags to 8, and
"-bd" would set optflags to 5 (4|1).</p>
| 372 | |
| 373 | <p>Only letters are relevant to optflags. In the string "a*b:c#d", d=1, c=2, |
Rob Landley | c0e56ed | 2012-10-08 00:02:30 -0500 | [diff] [blame] | 374 | b=4, a=8. Punctuation after a letter initializes global variables at the |
| 375 | start of the GLOBALS() block (see <a href="#toy_union">union toy_union this</a> |
| 376 | for details).</p> |
| 377 | |
| 378 | <p>The build infrastructure creates FLAG_ macros for each option letter, |
| 379 | corresponding to the bit position, so you can check (toys.optflags & FLAG_x) |
| 380 | to see if a flag was specified. (The correct set of FLAG_ macros is selected |
| 381 | by defining FOR_mycommand before #including toys.h. The macros live in |
| 382 | toys/globals.h which is generated by scripts/make.sh.)</p> |
Rob Landley | 6882ee8 | 2008-02-12 18:41:34 -0600 | [diff] [blame] | 383 | |
Rob Landley | 66a69d9 | 2012-01-16 01:44:17 -0600 | [diff] [blame] | 384 | <p>For more information on option parsing, see <a href="#lib_args">get_optflags()</a>.</p> |
Rob Landley | 6882ee8 | 2008-02-12 18:41:34 -0600 | [diff] [blame] | 385 | |
| 386 | </li> |
<li><p>char **<b>optargs</b> - Null terminated array of arguments left over
after get_optflags() removed all the ones it understood. Note: optargs[0] is
the first argument, not the command name. Use toys.which->name for the command
name.</p></li>
<li><p>int <b>optc</b> - Optarg count, equivalent to argc but for
optargs[].</p></li>
Rob Landley | 4e68de1 | 2007-12-13 07:00:27 -0600 | [diff] [blame] | 393 | <li><p>int <b>exithelp</b> - Whether error_exit() should print a usage message |
| 394 | via help_main() before exiting. (True during option parsing, defaults to |
| 395 | false afterwards.)</p></li> |
Rob Landley | 66a69d9 | 2012-01-16 01:44:17 -0600 | [diff] [blame] | 396 | </ul> |
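<p>The bit numbering used by optflags can be checked with a short standalone
sketch. The functions demo_flag_bit() and demo_flags_for() are hypothetical
helpers written for this illustration; they are not part of lib/args.c, and
they model only the rule stated above (letters count from the right, anything
that is not a letter is skipped), not the rest of the option string syntax.</p>

```c
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* Return the optflags bit for one option letter, or 0 if it isn't listed. */
unsigned demo_flag_bit(const char *options, char letter)
{
  unsigned bit = 1;
  size_t i = strlen(options);

  /* Walk right to left so the rightmost letter gets the lowest bit. */
  while (i--) {
    /* Punctuation describes arguments, it doesn't use up a bit. */
    if (!isalpha((unsigned char)options[i])) continue;
    if (options[i] == letter) return bit;
    bit <<= 1;
  }

  return 0;
}

/* OR together the bits for every letter in a string such as "ac". */
unsigned demo_flags_for(const char *options, const char *letters)
{
  unsigned flags = 0;

  while (*letters) flags |= demo_flag_bit(options, *letters++);

  return flags;
}
```

<p>For the option string "a*b:c#d" this gives d=1, c=2, b=4 and a=8, matching
the values listed above. Inside a real command the generated FLAG_ macros
already carry these values, so nothing like this needs to run at runtime.</p>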
Rob Landley | 4e68de1 | 2007-12-13 07:00:27 -0600 | [diff] [blame] | 397 | |
Rob Landley | c0e56ed | 2012-10-08 00:02:30 -0500 | [diff] [blame] | 398 | <a name="toy_union" /> |
Rob Landley | 66a69d9 | 2012-01-16 01:44:17 -0600 | [diff] [blame] | 399 | <li><p><b>union toy_union this</b> - Union of structures containing each |
Rob Landley | 4e68de1 | 2007-12-13 07:00:27 -0600 | [diff] [blame] | 400 | command's global variables.</p> |
| 401 | |
Rob Landley | 6882ee8 | 2008-02-12 18:41:34 -0600 | [diff] [blame] | 402 | <p>Global variables are useful: they reduce the overhead of passing extra |
| 403 | command line arguments between functions, they conveniently start prezeroed to |
| 404 | save initialization costs, and the command line argument parsing infrastructure |
| 405 | can also initialize global variables with its results.</p> |
| 406 | |
| 407 | <p>But since each toybox process can only run one command at a time, allocating |
| 408 | space for global variables belonging to other commands you aren't currently |
| 409 | running would be wasteful.</p> |
| 410 | |
| 411 | <p>Toybox handles this by encapsulating each command's global variables in |
Rob Landley | 66a69d9 | 2012-01-16 01:44:17 -0600 | [diff] [blame] | 412 | a structure, and declaring a union of those structures with a single global |
Rob Landley | c0e56ed | 2012-10-08 00:02:30 -0500 | [diff] [blame] | 413 | instance (called "this"). The GLOBALS() macro contains the global |
Rob Landley | 66a69d9 | 2012-01-16 01:44:17 -0600 | [diff] [blame] | 414 | variables that should go in the current command's global structure. Each |
| 415 | variable can then be accessed as "this.commandname.varname". |
Rob Landley | c0e56ed | 2012-10-08 00:02:30 -0500 | [diff] [blame] | 416 | If you #defined FOR_commandname before including toys.h, the macro TT is |
| 417 | #defined to this.commandname so the variable can then be accessed as |
| 418 | "TT.variable". See toys/hello.c for an example.</p> |
Rob Landley | 6882ee8 | 2008-02-12 18:41:34 -0600 | [diff] [blame] | 419 | |
Rob Landley | 66a69d9 | 2012-01-16 01:44:17 -0600 | [diff] [blame] | 420 | <p>A command that needs global variables should declare a structure to |
Rob Landley | 4e68de1 | 2007-12-13 07:00:27 -0600 | [diff] [blame] | 421 | contain them all, and add that structure to this union. A command should never |
| 422 | declare global variables outside of this, because such global variables would |
| 423 | allocate memory when running other commands that don't use those global |
| 424 | variables.</p> |
| 425 | |
Rob Landley | 66a69d9 | 2012-01-16 01:44:17 -0600 | [diff] [blame] | 426 | <p>The first few fields of this structure can be initialized by <a href="#lib_args">get_optflags()</a>, |
Rob Landley | 4e68de1 | 2007-12-13 07:00:27 -0600 | [diff] [blame] | 427 | as specified by the options field of this command's toy_list entry. See |
| 428 | the get_optflags() description in lib/args.c for details.</p> |
| 429 | </li> |
| 430 | |
Rob Landley | 81b899d | 2007-12-18 02:02:47 -0600 | [diff] [blame] | 431 | <li><b>char toybuf[4096]</b> - a common scratch space buffer so |
Rob Landley | 4e68de1 | 2007-12-13 07:00:27 -0600 | [diff] [blame] | 432 | commands don't need to allocate their own. Any command is free to use this, |
| 433 | and it should never be directly referenced by functions in lib/ (although |
| 434 | commands are free to pass toybuf in to a library function as an argument).</li> |
| 435 | </ul> |
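<p>The union-of-structures arrangement described above can be sketched roughly as follows. This is an illustration only: the two command structures and the hello_bump() function are hypothetical, and the TT define is written by hand here rather than produced by #defining FOR_commandname before including toys.h.</p>

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical per-command globals (names invented for this sketch). */
struct hello_data { char *name; long count; };
struct cat_data { long unbuffered; char buf[64]; };

/* One union, one global instance: it is only as big as its largest member. */
union toy_union {
  struct hello_data hello;
  struct cat_data cat;
} this;

static_assert(sizeof(union toy_union) == sizeof(struct cat_data),
  "the union costs no more than the biggest command");

/* Written by hand here; the headers arrange this for the current command. */
#define TT this.hello

/* Globals start zeroed, so the first call returns 1. */
long hello_bump(void)
{
  return ++TT.count;
}
```

<p>Because every command shares the same storage, adding a command with small globals costs nothing unless it becomes the largest member.</p>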
| 436 | |
Rob Landley | 6882ee8 | 2008-02-12 18:41:34 -0600 | [diff] [blame] | 437 | <p>The following functions are defined in main.c:</p> |
Rob Landley | 4e68de1 | 2007-12-13 07:00:27 -0600 | [diff] [blame] | 438 | <ul> |
| 439 | <li><p>struct toy_list *<b>toy_find</b>(char *name) - Return the toy_list |
| 440 | structure for this command name, or NULL if not found.</p></li> |
Rob Landley | 81b899d | 2007-12-18 02:02:47 -0600 | [diff] [blame] | 441 | <li><p>void <b>toy_init</b>(struct toy_list *which, char *argv[]) - fill out |
| 442 | the global toys structure, calling get_optflags() if necessary.</p></li> |
Rob Landley | 6882ee8 | 2008-02-12 18:41:34 -0600 | [diff] [blame] | 443 | <li><p>void <b>toy_exec</b>(char *argv[]) - Run a built-in command with |
| 444 | arguments.</p> |
| 445 | <p>Calls toy_find() on argv[0] (which must be just a command name |
Rob Landley | 4e68de1 | 2007-12-13 07:00:27 -0600 | [diff] [blame] | 446 | without path). Returns if it can't find this command, otherwise calls |
Rob Landley | 6882ee8 | 2008-02-12 18:41:34 -0600 | [diff] [blame] | 447 | toy_init(), toys.which->toy_main(), and exit() instead of returning.</p> |
Rob Landley | 4e68de1 | 2007-12-13 07:00:27 -0600 | [diff] [blame] | 448 | |
Rob Landley | 6882ee8 | 2008-02-12 18:41:34 -0600 | [diff] [blame] | 449 | <p>Use the library function xexec() to fall back to external executables |
| 450 | in $PATH if toy_exec() can't find a built-in command. Note that toy_exec() |
| 451 | does not strip paths before searching for a command, so "./command" will |
| 452 | never match an internal command.</p></li> |
| 453 | |
| 454 | <li><p>void <b>toybox_main</b>(void) - the main function for the multiplexer |
| 455 | command (I.E. "toybox"). Given a command name as its first argument, calls |
| 456 | toy_exec() on its arguments. With no arguments, it lists available commands. |
| 457 | If the first argument starts with "-" it lists each command with its default |
| 458 | install path prepended.</p></li> |
Rob Landley | 4e68de1 | 2007-12-13 07:00:27 -0600 | [diff] [blame] | 459 | |
| 460 | </ul> |
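<p>A minimal sketch of how a lookup like toy_find() could work, assuming the command table is sorted by name with the "toybox" multiplexer pinned at index zero. The table contents and the cut-down struct toy_list (just a name and an id) are hypothetical stand-ins, not the real definitions from main.c.</p>

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Cut-down stand-in for the real structure. */
struct toy_list { char *name; int id; };

/* Entry 0 is the multiplexer, the rest are in alphabetical order. */
static struct toy_list toy_list[] = {
  {"toybox", 0},
  {"cat", 1}, {"echo", 2}, {"ls", 3}, {"true", 4},
};
#define TOY_COUNT (sizeof(toy_list)/sizeof(*toy_list))

struct toy_list *toy_find(char *name)
{
  size_t bottom = 1, top = TOY_COUNT;  /* search the half-open range [bottom, top) */

  assert(name);
  if (!strcmp(name, toy_list[0].name)) return toy_list;

  while (bottom < top) {
    size_t middle = bottom + (top - bottom)/2;
    int result = strcmp(name, toy_list[middle].name);

    if (!result) return toy_list + middle;
    if (result < 0) top = middle;
    else bottom = middle + 1;
  }

  return NULL;  /* no path stripping, so "./ls" is not found */
}
```

<p>A caller in the style of toy_exec() would return to its own caller when this yields NULL, leaving a fallback such as xexec() to search $PATH.</p>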
| 461 | |
| 462 | <h3>Config.in</h3> |
| 463 | |
| 464 | <p>Top level configuration file in a stylized variant of |
Rob Landley | 6882ee8 | 2008-02-12 18:41:34 -0600 | [diff] [blame] | 465 | <a href=http://kernel.org/doc/Documentation/kbuild/kconfig-language.txt>kconfig</a> format. Includes generated/Config.in.</p> |
Rob Landley | 4e68de1 | 2007-12-13 07:00:27 -0600 | [diff] [blame] | 466 | |
| 467 | <p>These files are directly used by "make menuconfig" to select which commands |
| 468 | to build into toybox (thus generating a .config file), and by |
Rob Landley | 6882ee8 | 2008-02-12 18:41:34 -0600 | [diff] [blame] | 469 | scripts/config2help.py to create generated/help.h.</p> |
Rob Landley | 4e68de1 | 2007-12-13 07:00:27 -0600 | [diff] [blame] | 470 | |
Rob Landley | 7eaf4f5 | 2014-04-09 08:30:09 -0500 | [diff] [blame] | 471 | <a name="generated" /> |
| 472 | <h1><a href="#generated">Temporary files:</a></h1> |
Rob Landley | 4e68de1 | 2007-12-13 07:00:27 -0600 | [diff] [blame] | 473 | |
Rob Landley | 6882ee8 | 2008-02-12 18:41:34 -0600 | [diff] [blame] | 474 | <p>There is one temporary file in the top level source directory:</p> |
Rob Landley | 4e68de1 | 2007-12-13 07:00:27 -0600 | [diff] [blame] | 475 | <ul> |
| 476 | <li><p><b>.config</b> - Configuration file generated by kconfig, indicating |
| 477 | which commands (and options to commands) are currently enabled. Used |
Rob Landley | 7eaf4f5 | 2014-04-09 08:30:09 -0500 | [diff] [blame] | 478 | to make generated/config.h and determine which toys/*/*.c files to build.</p> |
Rob Landley | 4e68de1 | 2007-12-13 07:00:27 -0600 | [diff] [blame] | 479 | |
Rob Landley | 6882ee8 | 2008-02-12 18:41:34 -0600 | [diff] [blame] | 480 | <p>You can create a human readable "miniconfig" version of this file using |
Rob Landley | 66a69d9 | 2012-01-16 01:44:17 -0600 | [diff] [blame] | 481 | <a href=http://landley.net/aboriginal/new_platform.html#miniconfig>these |
Rob Landley | 6882ee8 | 2008-02-12 18:41:34 -0600 | [diff] [blame] | 482 | instructions</a>.</p> |
| 483 | </li> |
| 484 | </ul> |
| 485 | |
Rob Landley | 7eaf4f5 | 2014-04-09 08:30:09 -0500 | [diff] [blame] | 486 | <p><h2>Directory generated/</h2></p> |
| 487 | |
| 488 | <p>The remaining temporary files live in the "generated/" directory, |
| 489 | which is for files generated at build time from other source files.</p> |
Rob Landley | 6882ee8 | 2008-02-12 18:41:34 -0600 | [diff] [blame] | 490 | |
| 491 | <ul> |
Rob Landley | 7eaf4f5 | 2014-04-09 08:30:09 -0500 | [diff] [blame] | 492 | <li><p><b>generated/Config.in</b> - Included from the top level Config.in, |
| 493 | contains one or more configuration entries for each command.</p> |
| 494 | |
| 495 | <p>Each command has a configuration entry with an upper case version of |
| 496 | the command name. Options to commands start with the command |
| 497 | name followed by an underscore and the option name. Global options are attached |
| 498 | to the "toybox" command, and thus use the prefix "TOYBOX_". This organization |
| 499 | is used by scripts/cfg2files to select which toys/*/*.c files to compile for a |
| 500 | given .config.</p> |
| 501 | |
| 502 | <p>A command with multiple names (or multiple similar commands implemented in |
| 503 | the same .c file) should have config symbols prefixed with the name of their |
| 504 | C file. I.E. config symbol prefixes are NEWTOY() names. If OLDTOY() names |
| 505 | have config symbols they must be options (symbols with an underscore and |
| 506 | suffix) to the NEWTOY() name. (See generated/toylist.h)</p> |
| 507 | </li> |
| 508 | |
Rob Landley | 6882ee8 | 2008-02-12 18:41:34 -0600 | [diff] [blame] | 509 | <li><p><b>generated/config.h</b> - list of CFG_SYMBOL and USE_SYMBOL() macros, |
Rob Landley | 4e68de1 | 2007-12-13 07:00:27 -0600 | [diff] [blame] | 510 | generated from .config by a sed invocation in the top level Makefile.</p> |
| 511 | |
| 512 | <p>CFG_SYMBOL is a compile time constant set to 1 for enabled symbols and 0 for |
Rob Landley | 7eaf4f5 | 2014-04-09 08:30:09 -0500 | [diff] [blame] | 513 | disabled symbols. This allows the use of normal if() statements to remove |
Rob Landley | 6882ee8 | 2008-02-12 18:41:34 -0600 | [diff] [blame] | 514 | code at compile time via the optimizer's dead code elimination (which removes |
Rob Landley | 7eaf4f5 | 2014-04-09 08:30:09 -0500 | [diff] [blame] | 515 | from the binary any code that cannot be reached). This saves space without |
Rob Landley | 4e68de1 | 2007-12-13 07:00:27 -0600 | [diff] [blame] | 516 | cluttering the code with #ifdefs or leading to configuration dependent build |
Rob Landley | 7eaf4f5 | 2014-04-09 08:30:09 -0500 | [diff] [blame] | 517 | breaks. (See the 1992 Usenix paper |
Rob Landley | b6063de | 2012-01-29 13:54:13 -0600 | [diff] [blame] | 518 | <a href=http://doc.cat-v.org/henry_spencer/ifdef_considered_harmful.pdf>#ifdef |
Rob Landley | 4e68de1 | 2007-12-13 07:00:27 -0600 | [diff] [blame] | 519 | Considered Harmful</a> for more information.)</p> |
| 520 | |
| 521 | <p>USE_SYMBOL(code) evaluates to the code in parentheses when the symbol |
Rob Landley | 7eaf4f5 | 2014-04-09 08:30:09 -0500 | [diff] [blame] | 522 | is enabled, and nothing when the symbol is disabled. This can be used |
Rob Landley | 4e68de1 | 2007-12-13 07:00:27 -0600 | [diff] [blame] | 523 | for things like varargs or variable declarations which can't always be |
Rob Landley | 7eaf4f5 | 2014-04-09 08:30:09 -0500 | [diff] [blame] | 524 | eliminated by a simple test on CFG_SYMBOL. Note that |
Rob Landley | 4e68de1 | 2007-12-13 07:00:27 -0600 | [diff] [blame] | 525 | (unlike CFG_SYMBOL) this is really just a variant of #ifdef, and can |
Rob Landley | 7eaf4f5 | 2014-04-09 08:30:09 -0500 | [diff] [blame] | 526 | still result in configuration dependent build breaks. Use with caution.</p> |
Rob Landley | 4e68de1 | 2007-12-13 07:00:27 -0600 | [diff] [blame] | 527 | </li> |
Rob Landley | 4e68de1 | 2007-12-13 07:00:27 -0600 | [diff] [blame] | 528 | |
Rob Landley | 7eaf4f5 | 2014-04-09 08:30:09 -0500 | [diff] [blame] | 529 | <li><p><b>generated/flags.h</b> - FLAG_? macros indicating which command |
| 530 | line options were seen. The option parsing in lib/args.c sets bits in |
| 531 | toys.optflags, which can be tested by anding with the appropriate FLAG_ |
| 532 | macro. (Bare longopts, which have no corresponding short option, will |
| 533 | have the longopt name after FLAG_. All others use the single letter short |
| 534 | option.)</p> |
Rob Landley | 4e68de1 | 2007-12-13 07:00:27 -0600 | [diff] [blame] | 535 | |
Rob Landley | 7eaf4f5 | 2014-04-09 08:30:09 -0500 | [diff] [blame] | 536 | <p>To get the appropriate macros for your command, #define FOR_commandname |
| 537 | before #including toys.h. To switch macro sets (because you have an OLDTOY() |
| 538 | with different options in the same .c file), #define CLEANUP_oldcommand |
| 539 | and also #define FOR_newcommand, then #include "generated/flags.h" to switch. |
| 540 | </p> |
| 541 | </li> |
Rob Landley | 4e68de1 | 2007-12-13 07:00:27 -0600 | [diff] [blame] | 542 | |
Rob Landley | 7eaf4f5 | 2014-04-09 08:30:09 -0500 | [diff] [blame] | 543 | <li><p><b>generated/globals.h</b> - |
| 544 | Declares structures to hold the contents of each command's GLOBALS(), |
| 545 | and combines them into "global_union this". (Yes, the name was |
| 546 | chosen to piss off C++ developers who think that C |
| 547 | is merely a subset of C++, not a language in its own right.)</p> |
Rob Landley | 4e68de1 | 2007-12-13 07:00:27 -0600 | [diff] [blame] | 548 | |
Rob Landley | 7eaf4f5 | 2014-04-09 08:30:09 -0500 | [diff] [blame] | 549 | <p>The union reuses the same memory for each command's global struct: |
| 550 | since only one command's globals are in use at any given time, collapsing |
| 551 | them together saves space. The headers #define TT to the appropriate |
| 552 | "this.commandname", so you can refer to the current command's global |
| 553 | variables out of "this" as TT.variablename.</p> |
Rob Landley | 4e68de1 | 2007-12-13 07:00:27 -0600 | [diff] [blame] | 554 | |
Rob Landley | 7eaf4f5 | 2014-04-09 08:30:09 -0500 | [diff] [blame] | 555 | <p>The globals start zeroed, and the first few are filled out by the |
| 556 | lib/args.c argument parsing code called from main.c.</p> |
| 557 | </li> |
Rob Landley | 4e68de1 | 2007-12-13 07:00:27 -0600 | [diff] [blame] | 558 | |
Rob Landley | 7eaf4f5 | 2014-04-09 08:30:09 -0500 | [diff] [blame] | 559 | <li><p><b>generated/help.h</b> - |
| 560 | #defines two help text strings for each command: a single line |
Rob Landley | 4e68de1 | 2007-12-13 07:00:27 -0600 | [diff] [blame] | 561 | command_help and an additional command_help_long. This is used by help_main() |
| 562 | in toys/help.c to display help for commands.</p> |
| 563 | |
Rob Landley | 7eaf4f5 | 2014-04-09 08:30:09 -0500 | [diff] [blame] | 564 | <p>This file is created by scripts/make.sh, which compiles scripts/config2help.c |
| 565 | into the binary generated/config2help, and then runs it against the top |
| 566 | level .config and Config.in files to extract the help text from each config |
| 567 | entry and collate together dependent options.</p> |
Rob Landley | 4e68de1 | 2007-12-13 07:00:27 -0600 | [diff] [blame] | 568 | |
Rob Landley | 7eaf4f5 | 2014-04-09 08:30:09 -0500 | [diff] [blame] | 569 | <p>This file contains help text for all commands, regardless of current |
| 570 | configuration, but only the ones currently enabled in the .config file |
| 571 | wind up in the help_data[] array, and only the enabled dependent options |
| 572 | have their help text added to the command they depend on.</p> |
| 573 | </li> |
| 574 | |
| 575 | <li><p><b>generated/newtoys.h</b> - |
| 576 | All the NEWTOY() and OLDTOY() macros in alphabetical order, |
| 577 | each of which should be inside the appropriate USE_ macro. (Ok, not _quite_ |
| 578 | alphabetical order: the "toybox" multiplexer is always the first entry.)</p> |
| 579 | |
| 580 | <p>By #defining NEWTOY() to various things before #including this file, |
| 581 | it may be used to create function prototypes (in toys.h), initialize the |
| 582 | toy_list array (in main.c, the alphabetical order lets toy_find() do a |
| 583 | binary search), initialize the help_data array (in lib/help.c), and so on. |
| 584 | (It's even used to initialize the NEED_OPTIONS macro, which has a 1 or 0 |
| 585 | for each command using command line option parsing, ORed together. |
| 586 | This allows compile-time dead code elimination to remove the whole of |
| 587 | lib/args.c if nothing currently enabled is using it.)</p> |
| 588 | |
| 589 | <p>Each NEWTOY and OLDTOY macro contains the command name, command line |
| 590 | option string (telling lib/args.c how to parse command line options for |
| 591 | this command), recommended install location, and miscellaneous data such |
| 592 | as whether this command should retain root permissions if installed suid.</p> |
| 593 | </li> |
| 594 | |
| 595 | <li><p><b>generated/oldtoys.h</b> - Macros with the command line option parsing |
| 596 | string for each NEWTOY. This allows an OLDTOY that's just an alias for an |
| 597 | existing command to refer to the existing option string instead of |
| 598 | having to repeat it.</p> |
| 599 | </li> |
| 600 | </ul> |
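<p>To make the CFG_SYMBOL and USE_SYMBOL() distinction concrete, here is a small sketch. The FROB and VERBOSE symbols and the describe() function are made up for illustration, and the four #defines stand in for what a generated config.h might contain with FROB enabled and VERBOSE disabled (the variadic form of the USE_ macros is an assumption).</p>

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical excerpt of a generated config.h. */
#define CFG_FROB 1
#define USE_FROB(...) __VA_ARGS__
#define CFG_VERBOSE 0
#define USE_VERBOSE(...)

int describe(char *buf, size_t len)
{
  int features = 0;
  USE_VERBOSE(int verbosity = 2;)  /* this declaration only exists when enabled */

  assert(buf && len);

  /* Plain if(): both branches are parsed and type checked, then the
     optimizer drops whichever one can never run. */
  if (CFG_FROB) features++;
  if (CFG_VERBOSE) features++;
  USE_VERBOSE(features += verbosity;)

  /* String concatenation keeps only the enabled pieces. */
  snprintf(buf, len, "opts:" USE_FROB("f") USE_VERBOSE("v"));

  return features;
}
```

<p>The if (CFG_VERBOSE) line would still catch a typo in its body while VERBOSE is switched off, whereas anything wrapped in USE_VERBOSE() is invisible to the compiler until the symbol is enabled.</p>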
Rob Landley | 4e68de1 | 2007-12-13 07:00:27 -0600 | [diff] [blame] | 601 | |
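<p>The trick of redefining NEWTOY() before each use of the same list can be sketched like this. The real build #includes generated/newtoys.h repeatedly; to stay in one file this sketch substitutes a list macro (ALL_TOYS, a hypothetical name), uses a simplified three argument NEWTOY(), and invents the command entries and their trivial main functions.</p>

```c
#include <assert.h>
#include <stddef.h>

/* Stand-in for generated/newtoys.h (entries are hypothetical). */
#define ALL_TOYS \
  NEWTOY(cat, "u", 0) \
  NEWTOY(echo, "ne", 0) \
  NEWTOY(ls, "la", 0)

/* Pass 1: turn each entry into a function prototype. */
#define NEWTOY(name, opts, flags) int name##_main(void);
ALL_TOYS
#undef NEWTOY

/* Pass 2: turn the same entries into table initializers. */
struct toy_list {
  char *name;
  int (*toy_main)(void);
  char *options;
  int flags;
};

#define NEWTOY(name, opts, flags) {#name, name##_main, opts, flags},
struct toy_list toy_list[] = { ALL_TOYS };
#undef NEWTOY

static_assert(sizeof(toy_list)/sizeof(*toy_list) == 3, "one entry per NEWTOY");

/* Trivial bodies so the sketch links. */
int cat_main(void) { return 1; }
int echo_main(void) { return 2; }
int ls_main(void) { return 3; }
```

<p>Since every pass expands the same list, the prototypes and the table cannot drift out of sync with each other.</p>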
Rob Landley | 137bf34 | 2012-03-09 08:33:57 -0600 | [diff] [blame] | 602 | <a name="lib"> |
Rob Landley | 81b899d | 2007-12-18 02:02:47 -0600 | [diff] [blame] | 603 | <h2>Directory lib/</h2> |
Rob Landley | 4e68de1 | 2007-12-13 07:00:27 -0600 | [diff] [blame] | 604 | |
Rob Landley | 137bf34 | 2012-03-09 08:33:57 -0600 | [diff] [blame] | 605 | <p>TODO: document lots more here.</p> |
| 606 | |
Rob Landley | c8d0da5 | 2012-07-15 17:47:08 -0500 | [diff] [blame] | 607 | <p>lib: getmountlist(), error_msg/error_exit, xmalloc(), |
Rob Landley | 7c04f01 | 2008-01-20 19:00:16 -0600 | [diff] [blame] | 608 | strlcpy(), xexec(), xopen()/xread(), xgetcwd(), xabspath(), find_in_path(), |
| 609 | itoa().</p> |
| 610 | |
Rob Landley | 137bf34 | 2012-03-09 08:33:57 -0600 | [diff] [blame] | 611 | <h3>lib/portability.h</h3> |
| 612 | |
| 613 | <p>This file is automatically included from the top of toys.h, and smooths |
| 614 | over differences between platforms (hardware targets, compilers, C libraries, |
| 615 | operating systems, etc).</p> |
| 616 | |
| 617 | <p>This file provides SWAP macros (SWAP_BE16(x) and SWAP_LE32(x) and so on).</p> |
| 618 | |
| 619 | <p>A macro like SWAP_LE32(x) means "The value in x is stored as a little |
| 620 | endian 32 bit value, so perform the translation to/from whatever the native |
| 621 | 32-bit format is". You do the swap once on the way in, and once on the way |
| 622 | out. If your target is already little endian, the macro is a NOP.</p> |
| 623 | |
| 624 | <p>The SWAP macros come in BE and LE each with 16, 32, and 64 bit versions. |
| 625 | In each case, the name of the macro refers to the _external_ representation, |
| 626 | and converts to/from whatever your native representation happens to be (which |
| 627 | can vary depending on what you're currently compiling for).</p> |
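<p>A rough sketch of how macros in this style could be built and used. The endianness test relies on the __BYTE_ORDER__ macros predefined by GCC and Clang, which is an assumption of this example rather than a description of lib/portability.h; bswap32() and read_le32() are hypothetical helpers.</p>

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Assumption: GCC/Clang style predefined byte order macros. */
#if defined(__BYTE_ORDER__) && __BYTE_ORDER__ == __ORDER_BIG_ENDIAN__
#define IS_BIG_ENDIAN 1
#else
#define IS_BIG_ENDIAN 0
#endif

static uint32_t bswap32(uint32_t x)
{
  return (x>>24) | ((x>>8)&0xff00) | ((x<<8)&0xff0000) | (x<<24);
}

/* The macro name describes the external format, not the host. */
#define SWAP_LE32(x) (IS_BIG_ENDIAN ? bswap32(x) : (uint32_t)(x))
#define SWAP_BE32(x) (IS_BIG_ENDIAN ? (uint32_t)(x) : bswap32(x))

/* Read a little endian 32 bit field from a file header or network buffer. */
uint32_t read_le32(const unsigned char *buf)
{
  uint32_t raw;

  assert(buf);
  memcpy(&raw, buf, sizeof(raw));  /* avoids unaligned access */

  return SWAP_LE32(raw);
}
```

<p>The same macro converts in both directions, so writing the field back out is just SWAP_LE32() applied to the native value.</p>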
| 628 | |
Rob Landley | c8d0da5 | 2012-07-15 17:47:08 -0500 | [diff] [blame] | 629 | <a name="lib_llist"><h3>lib/llist.c</h3> |
| 630 | |
| 631 | <p>Some generic singly and doubly linked list functions, which take |
| 632 | advantage of a couple properties of C:</p> |
| 633 | |
| 634 | <ul> |
| 635 | <li><p>Structure elements are laid out in memory in the order listed, and |
| 636 | the first element has no padding. This means you can always treat (typecast) |
| 637 | a pointer to a structure as a pointer to the first element of the structure, |
| 638 | even if you don't know anything about the data following it.</p></li> |
| 639 | |
| 640 | <li><p>An array of length zero at the end of a structure adds no space |
| 641 | to the sizeof() the structure, but if you calculate how much extra space |
| 642 | you want when you malloc() the structure it will be available at the end. |
| 643 | Since C has no bounds checking, this means each struct can have one variable |
| 644 | length array.</p></li> |
| 645 | </ul> |
| 646 | |
| 647 | <p>Toybox's list structures always have their <b>next</b> pointer as |
| 648 | the first entry of each struct, and singly linked lists end with a NULL pointer. |
| 649 | This allows generic code to traverse such lists without knowing anything |
| 650 | else about the specific structs composing them: if your pointer isn't NULL |
| 651 | typecast it to void ** and dereference once to get the next entry.</p> |
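<p>The "typecast to void ** and dereference" traversal can be sketched as below. list_length() is a hypothetical helper written for this example, not a function from lib/llist.c, and the approach leans on the same layout assumption the text describes (strict aliasing purists may object, so treat it as an illustration).</p>

```c
#include <assert.h>
#include <stddef.h>

struct arg_list {
  struct arg_list *next;
  char *arg;
};

static_assert(offsetof(struct arg_list, next) == 0, "next must come first");

/* Count the nodes of any NULL terminated list whose first member is
   its next pointer, without knowing the node type. */
size_t list_length(void *list)
{
  size_t count = 0;

  while (list) {
    list = *(void **)list;  /* the first field of every node is "next" */
    count++;
  }

  return count;
}
```

<p>The same function works unchanged on a string_list or on a double_list whose circle has been broken, since only the first pointer of each node is ever read.</p>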
| 652 | |
| 653 | <p><b>lib/lib.h</b> defines three structure types:</p> |
| 654 | <ul> |
| 655 | <li><p><b>struct string_list</b> - stores a single string (<b>char str[0]</b>), |
| 656 | memory for which is allocated as part of the node. (I.E. llist_traverse(list, |
| 657 | free); can clean up after this type of list.)</p></li> |
| 658 | |
| 659 | <li><p><b>struct arg_list</b> - stores a pointer to a single string |
| 660 | (<b>char *arg</b>) which is stored in a separate chunk of memory.</p></li> |
| 661 | |
| 662 | <li><p><b>struct double_list</b> - has a second pointer (<b>struct double_list |
| 663 | *prev</b>) along with a <b>char *data</b> for payload.</p></li> |
| 664 | </ul> |
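<p>As an illustration of the variable length trailing array, here is one way a string_list node might be allocated. string_node() is a hypothetical helper, and the sketch spells the array as a C99 flexible member (char str[]) where the text above shows the older zero length form.</p>

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

struct string_list {
  struct string_list *next;
  char str[];  /* adds nothing to sizeof(struct string_list) */
};

/* Allocate the node and its string in one chunk, so a single free()
   releases both. Returns NULL if malloc() fails. */
struct string_list *string_node(const char *text)
{
  size_t len;
  struct string_list *node;

  assert(text);
  len = strlen(text) + 1;
  node = malloc(sizeof(*node) + len);
  if (!node) return NULL;

  node->next = NULL;
  memcpy(node->str, text, len);

  return node;
}
```

<p>This is why llist_traverse(list, free) is enough to clean up a string_list, while an arg_list needs its separately allocated strings handled by the caller.</p>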
| 665 | |
| 666 | <b>List Functions</b> |
| 667 | |
| 668 | <ul> |
| 669 | <li><p>void *<b>llist_pop</b>(void *list) - advances through a list ala |
| 670 | <b>node = llist_pop(&list);</b> This doesn't modify the list contents, |
| 671 | but does advance the pointer you feed it (which is why you pass the _address_ |
| 672 | of that pointer, not the pointer itself).</p></li> |
| 673 | |
| 674 | <li><p>void <b>llist_traverse</b>(void *list, void (*using)(void *data)) - |
| 675 | iterate through a list calling a function on each node.</p></li> |
| 676 | |
| 677 | <li><p>struct double_list *<b>dlist_add</b>(struct double_list **llist, char *data) |
| 678 | - append an entry to a circular linked list. |
| 679 | This function allocates a new struct double_list wrapper and returns the |
| 680 | pointer to the new entry (which you can usually ignore since it's llist->prev, |
| 681 | but if llist was NULL you need it). The argument is the ->data field for the |
| 682 | new node.</p></li> |
| 683 | <ul><li><p>void <b>dlist_add_nomalloc</b>(struct double_list **llist, |
| 684 | struct double_list *new) - append existing struct double_list to |
| 685 | list, does not allocate anything.</p></li></ul> |
| 686 | </ul> |
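<p>A simplified sketch of the pop and traverse pair, assuming singly linked NULL terminated lists only (no handling of circular lists). These bodies are written for illustration and may differ from the real lib/llist.c; count_node() is a hypothetical callback.</p>

```c
#include <assert.h>
#include <stddef.h>

struct arg_list {
  struct arg_list *next;
  char *arg;
};

/* Takes the ADDRESS of the caller's list pointer, passed as plain void *.
   Returns the old head and advances the caller's pointer past it. */
void *llist_pop(void *list)
{
  void **llist = list, **next;

  assert(llist);
  next = *llist;             /* current head node, or NULL */
  if (next) *llist = *next;  /* head->next lives at the start of the node */

  return next;
}

/* Pop first, then call: the callback is free to free() the node. */
void llist_traverse(void *list, void (*using)(void *data))
{
  void *node;

  while ((node = llist_pop(&list))) using(node);
}

static int visited;

void count_node(void *node)
{
  (void)node;
  visited++;
}
```

<p>Calling llist_pop(&amp;list) leaves the nodes themselves untouched, so the caller still owns whatever it popped.</p>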
| 687 | |
Rob Landley | 7eaf4f5 | 2014-04-09 08:30:09 -0500 | [diff] [blame] | 688 | <b>List code trivia questions:</b> |
Rob Landley | c8d0da5 | 2012-07-15 17:47:08 -0500 | [diff] [blame] | 689 | |
| 690 | <ul> |
| 691 | <li><p><b>Why do arg_list and double_list contain a char * payload instead of |
| 692 | a void *?</b> - Because you always have to typecast a void * to use it, and |
Rob Landley | 7eaf4f5 | 2014-04-09 08:30:09 -0500 | [diff] [blame] | 693 | typecasting a char * does no harm. Since strings are the most common |
| 694 | payload, and doing math on the pointer ala |
| 695 | "(type *)(ptr+sizeof(thing)+sizeof(otherthing))" requires ptr to be char * |
| 696 | anyway (at least according to the C standard), defaulting to char * saves |
| 697 | a typecast.</p> |
Rob Landley | c8d0da5 | 2012-07-15 17:47:08 -0500 | [diff] [blame] | 698 | </li> |
| 699 | |
| 700 | <li><p><b>Why do the names ->str, ->arg, and ->data differ?</b> - To force |
| 701 | you to keep track of which one you're using: calling free(node->str) would |
| 702 | be bad, and _failing_ to free(node->arg) leaks memory.</p></li> |
| 703 | |
| 704 | <li><p><b>Why does llist_pop() take a void * instead of void **?</b> - |
| 705 | because the stupid compiler complains about "type punned pointers" when |
Rob Landley | 7eaf4f5 | 2014-04-09 08:30:09 -0500 | [diff] [blame] | 706 | you typecast and dereference on the same line, |
Rob Landley | c8d0da5 | 2012-07-15 17:47:08 -0500 | [diff] [blame] | 707 | due to insane FSF developers hardwiring limitations of their optimizer |
| 708 | into gcc's warning system. Since C automatically typecasts any other |
Rob Landley | 7eaf4f5 | 2014-04-09 08:30:09 -0500 | [diff] [blame] | 709 | pointer type to and from void *, the current code works fine. It's sad that it |
Rob Landley | c8d0da5 | 2012-07-15 17:47:08 -0500 | [diff] [blame] | 710 | won't warn you if you forget the &, but the code crashes pretty quickly in |
| 711 | that case.</p></li> |
| 712 | |
| 713 | <li><p><b>How do I assemble a singly-linked-list in order?</b> - use |
| 714 | a double_list, dlist_add() your entries, and then break the circle with |
| 715 | <b>list->prev->next = NULL;</b> when done.</p></li> |
| 716 | </ul> |
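<p>The last answer can be sketched as follows. The dlist_add() body here is a guess at the behavior described earlier (append to a circular list, return the new node), not the library's actual code, and dlist_finish() is a hypothetical name for the "break the circle" step.</p>

```c
#include <assert.h>
#include <stdlib.h>

struct double_list {
  struct double_list *next, *prev;
  char *data;
};

/* Append to a circular list. *list keeps pointing at the first entry. */
struct double_list *dlist_add(struct double_list **list, char *data)
{
  struct double_list *new = malloc(sizeof(*new));

  assert(new);  /* sketch only: a real build needs a proper allocation check */
  new->data = data;
  if (*list) {
    new->next = *list;
    new->prev = (*list)->prev;
    (*list)->prev->next = new;
    (*list)->prev = new;
  } else *list = new->next = new->prev = new;

  return new;
}

/* Break the circle: the old tail now ends an ordinary NULL terminated list. */
struct double_list *dlist_finish(struct double_list *list)
{
  if (list) list->prev->next = NULL;

  return list;
}
```

<p>Appending is constant time because the tail is always reachable as list->prev, which is what makes this cheaper than walking a singly linked list to find its end.</p>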
| 717 | |
Rob Landley | 66a69d9 | 2012-01-16 01:44:17 -0600 | [diff] [blame] | 718 | <a name="lib_args"><h3>lib/args.c</h3> |
Rob Landley | 7c04f01 | 2008-01-20 19:00:16 -0600 | [diff] [blame] | 719 | |
Rob Landley | 66a69d9 | 2012-01-16 01:44:17 -0600 | [diff] [blame] | 720 | <p>Toybox's main.c automatically parses command line options before calling the |
| 721 | command's main function. Option parsing starts in get_optflags(), which stores |
| 722 | results in the global structures "toys" (optflags and optargs) and "this".</p> |
| 723 | |
| 724 | <p>The option parsing infrastructure stores a bitfield in toys.optflags to |
| 725 | indicate which options the current command line contained. Arguments |
| 726 | attached to those options are saved into the command's global structure |
| 727 | ("this"). Any remaining command line arguments are collected together into |
| 728 | the null-terminated array toys.optargs, with the length in toys.optc. (Note |
| 729 | that toys.optargs does not contain the current command name at position zero, |
| 730 | use "toys.which->name" for that.) The raw command line arguments get_optflags() |
| 731 | parsed are retained unmodified in toys.argv[].</p> |
| 732 | |
| 733 | <p>Toybox's option parsing logic is controlled by an "optflags" string, using |
| 734 | a format reminiscent of getopt's option string but with several important |
| 735 | differences. Toybox does not use the getopt() |
| 736 | function out of the C library: get_optflags() is an independent implementation |
| 737 | which doesn't permute the original arguments (and thus doesn't change how the |
| 738 | command is displayed in ps and top), and has many features not present in |
| 739 | libc getopt() (such as the ability to describe long options in the same string |
| 740 | as normal options).</p> |
| 741 | |
| 742 | <p>Each command's NEWTOY() macro has an optflags string as its middle argument, |
| 743 | which sets toy_list.options for that command to tell get_optflags() what |
| 744 | command line arguments to look for, and what to do with them. |
| 745 | If a command has no option |
| 746 | definition string (I.E. the argument is NULL), option parsing is skipped |
| 747 | for that command, which must look at the raw data in toys.argv to parse its |
| 748 | own arguments. (If no currently enabled command uses option parsing, |
| 749 | get_optflags() is optimized out of the resulting binary by the linker's |
| 750 | --gc-sections option.)</p> |
| 751 | |
| 752 | <p>You don't have to free the option strings, which point into the environment |
| 753 | space (I.E. the string data is not copied). A TOYFLAG_NOFORK command |
| 754 | that uses the linked list type "*" should free the list objects but not |
| 755 | the data they point to, via "llist_free(TT.mylist, NULL);". (If it's not |
| 756 | NOFORK, exit() will free all the malloced data anyway unless you want |
| 757 | to implement a CONFIG_TOYBOX_FREE cleanup for it.)</p> |
| 758 | |
| 759 | <h4>Optflags format string</h4> |
| 760 | |
| 761 | <p>Note: the optflags option description string format is much more |
| 762 | concisely described by a large comment at the top of lib/args.c.</p> |
| 763 | |
| 764 | <p>The general theory is that letters set optflags, and punctuation describes |
| 765 | other actions the option parsing logic should take.</p> |
| 766 | |
| 767 | <p>For example, suppose the command line <b>command -b fruit -d walrus -a 42</b> |
| 768 | is parsed using the optflags string "<b>a#b:c:d</b>". (I.E. |
| 769 | toys.which->options="a#b:c:d" and argv = ["command", "-b", "fruit", "-d", |
| 770 | "walrus", "-a", "42"]). When get_optflags() returns, the following data is |
| 771 | available to command_main(): |
| 772 | |
| 773 | <ul> |
| 774 | <li><p>In <b>struct toys</b>: |
| 775 | <ul> |
| 776 | <li>toys.optflags = 13; // -a = 8 | -b = 4 | -d = 1</li> |
| 777 | <li>toys.optargs[0] = "walrus"; // leftover argument</li> |
| 778 | <li>toys.optargs[1] = NULL; // end of list</li> |
Rob Landley | b911d4d | 2013-09-21 14:27:26 -0500 | [diff] [blame] | 779 | <li>toys.optc = 1; // there was 1 leftover argument</li> |
Rob Landley | 66a69d9 | 2012-01-16 01:44:17 -0600 | [diff] [blame] | 780 | <li>toys.argv[] = {"command", "-b", "fruit", "-d", "walrus", "-a", "42"}; // The original command line arguments</li> |
| 781 | </ul> |
| 782 | <p></li> |
| 783 | |
| 784 | <li><p>In <b>union this</b> (treated as <b>long this[]</b>): |
| 785 | <ul> |
| 786 | <li>this[0] = NULL; // -c didn't get an argument this time, so get_optflags() didn't change it (toy_init() zeroed "this" during setup).</li> |
| 787 | <li>this[1] = (long)"fruit"; // argument to -b</li> |
| 788 | <li>this[2] = 42; // argument to -a</li> |
| 789 | </ul> |
| 790 | </p></li> |
| 791 | </ul> |
| 792 | |
| 793 | <p>If the command's globals are:</p> |
| 794 | |
| 795 | <blockquote><pre> |
Rob Landley | c0e56ed | 2012-10-08 00:02:30 -0500 | [diff] [blame] | 796 | GLOBALS( |
Rob Landley | 66a69d9 | 2012-01-16 01:44:17 -0600 | [diff] [blame] | 797 | char *c; |
| 798 | char *b; |
| 799 | long a; |
| 800 | ) |
Rob Landley | 66a69d9 | 2012-01-16 01:44:17 -0600 | [diff] [blame] | 801 | </pre></blockquote> |
| 802 | <p>That would mean TT.c == NULL, TT.b == "fruit", and TT.a == 42. (Remember, |
| 803 | each entry that receives an argument must be a long or pointer, to line up |
| 804 | with the array position. Right to left in the optflags string corresponds to |
Rob Landley | c0e56ed | 2012-10-08 00:02:30 -0500 | [diff] [blame] | 805 | top to bottom in GLOBALS().)</p> |
Rob Landley | 66a69d9 | 2012-01-16 01:44:17 -0600 | [diff] [blame] | 806 | |
Rob Landley | b911d4d | 2013-09-21 14:27:26 -0500 | [diff] [blame] | 807 | <p>Put globals not filled out by the option parsing logic at the end of the |
| 808 | GLOBALS block. Common practice is to list the options one per line (to |
| 809 | make the ordering explicit, first to last in globals corresponds to right |
| 810 | to left in the option string), then leave a blank line before any non-option |
| 811 | globals.</p> |
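<p>The way option arguments line up with GLOBALS() can be sketched by overlaying the named fields with an array of longs. This is a toy model: struct demo_globals, the demo union, and fake_parse() are hypothetical, and fake_parse() merely writes the slots that parsing "-b fruit -a 42" against "a#b:c:d" would be expected to fill.</p>

```c
#include <assert.h>
#include <stddef.h>

/* Option slots first, in right to left optflags order, then other globals. */
struct demo_globals {
  char *c;
  char *b;
  long a;

  long scratch;  /* not touched by option parsing */
};

static_assert(sizeof(long) == sizeof(char *), "long and pointer must match");

/* The same memory seen as named fields and as an array of longs. */
static union {
  struct demo_globals named;
  long slot[sizeof(struct demo_globals)/sizeof(long)];
} demo;

/* Stand-in for what the parser stores for "-b fruit -a 42". */
void fake_parse(void)
{
  demo.slot[1] = (long)"fruit";  /* -b, second option from the right with an argument */
  demo.slot[2] = 42;             /* -a, third from the right */
}
```

<p>Reordering the fields without reordering the optflags string would silently hand each option's argument to the wrong variable, which is why listing them one per line in matching order is worth the vertical space.</p>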
| 812 | |
Rob Landley | 66a69d9 | 2012-01-16 01:44:17 -0600 | [diff] [blame] | 813 | <p><b>unsigned toys.optflags</b></p> |
| 814 | |
| 815 | <p>Each option in the optflags string corresponds to a bit position in |
| 816 | toys.optflags, with the same value as a corresponding binary digit. The |
| 817 | rightmost argument is (1<<0), the next to last is (1<<1) and so on. If |
Rob Landley | b4a0efa | 2012-02-06 21:15:19 -0600 | [diff] [blame] | 818 | the option isn't encountered while parsing argv[], its bit remains 0.</p> |
| 819 | |
| 820 | <p>For example, |
Rob Landley | 66a69d9 | 2012-01-16 01:44:17 -0600 | [diff] [blame] | 821 | the optflags string "abcd" would parse the command line argument "-c" to set |
| 822 | optflags to 2, "-a" would set optflags to 8, "-bd" would set optflags to |
| 823 | 5 (I.E. 4|1), and "-a -c" would set optflags to 10 (2|8).</p> |
| 824 | |
| 825 | <p>Only letters are relevant to optflags, punctuation is skipped: in the |
| 826 | string "a*b:c#d", d=1, c=2, b=4, a=8. The punctuation after a letter |
| 827 | usually indicates that the option takes an argument.</p> |
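<p>The bit assignment rule can be checked with a small helper. flag_bit() is a hypothetical function written for this example: it handles only single letter options and treats every non-letter as punctuation to skip, so it ignores longopts and the other features of the real format.</p>

```c
#include <assert.h>
#include <ctype.h>
#include <string.h>

/* Return the optflags bit for one option letter, or 0 if the letter
   isn't in the string. The rightmost letter is bit 0. */
unsigned flag_bit(const char *optflags, char opt)
{
  unsigned bit = 1;
  size_t i;

  assert(optflags);
  i = strlen(optflags);
  while (i--) {
    if (!isalpha((unsigned char)optflags[i])) continue;  /* skip punctuation */
    if (optflags[i] == opt) return bit;
    bit <<= 1;
  }

  return 0;
}
```

<p>With the string "abcd", combining the results for b and d gives 4|1, and with "a*b:c#d" the letter b still maps to 4 because the punctuation does not consume bit positions.</p>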
| 828 | |
Rob Landley | b4a0efa | 2012-02-06 21:15:19 -0600 | [diff] [blame] | 829 | <p>Since toys.optflags is an unsigned int, it only stores 32 bits. (Which is |
| 830 | the amount a long would have on 32-bit platforms anyway; 64 bit code on |
| 831 | 32 bit platforms is too expensive to require in common code used by almost |
| 832 | all commands.) Bit positions beyond the 1<<31 aren't recorded, but |
| 833 | parsing higher options can still set global variables.</p> |
| 834 | |
Rob Landley | 66a69d9 | 2012-01-16 01:44:17 -0600 | [diff] [blame] | 835 | <p><b>Automatically setting global variables from arguments (union this)</b></p> |
| 836 | |
| 837 | <p>The following punctuation characters may be appended to an optflags |
| 838 | argument letter, indicating the option takes an additional argument:</p> |
| 839 | |
| 840 | <ul> |
| 841 | <li><b>:</b> - plus a string argument, keep most recent if more than one.</li> |
| 842 | <li><b>*</b> - plus a string argument, appended to a linked list.</li> |
Rob Landley | 66a69d9 | 2012-01-16 01:44:17 -0600 | [diff] [blame] | 843 | <li><b>@</b> - plus an occurrence counter (stored in a long)</li> |
Rob Landley | b6063de | 2012-01-29 13:54:13 -0600 | [diff] [blame] | 844 | <li><b>#</b> - plus a signed long argument.</li> |
Rob Landley | b911d4d | 2013-09-21 14:27:26 -0500 | [diff] [blame] | 845 | <li><b>-</b> - plus a signed long argument defaulting to negative (start argument with + to force a positive value).</li> |
Rob Landley | b6063de | 2012-01-29 13:54:13 -0600 | [diff] [blame] | 846 | <li><b>.</b> - plus a floating point argument (if CFG_TOYBOX_FLOAT).</li> |
| 847 | <ul>The following can be appended to a float or double: |
| 848 | <li><b><123</b> - error if argument is less than this</li> |
| 849 | <li><b>>123</b> - error if argument is greater than this</li> |
| 850 | <li><b>=123</b> - default value if argument not supplied</li> |
| 851 | </ul> |
Rob Landley | 66a69d9 | 2012-01-16 01:44:17 -0600 | [diff] [blame] | 852 | </ul> |
| 853 | |
<p>A note about "." and CFG_TOYBOX_FLOAT: option parsing only understands
&lt;&gt;= after . when CFG_TOYBOX_FLOAT is enabled. (Otherwise the code to
determine where floating point constants end drops out, because it requires
floating point.) When disabled, it can still reserve a global data slot for
the argument (so offsets won't change in your GLOBALS[] block), but will
never fill it out. You can handle this by using the USE_BLAH() macros with
C string concatenation, ala:
"abc." USE_TOYBOX_FLOAT("&lt;1.23&gt;4.56=7.89") "def"</p>
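<p>To see how the string concatenation trick works, here is a small standalone
sketch. The macro definitions below are stand-ins written for this example
(assumed to resemble what generated/config.h provides), and optstr_sketch() is
a made-up name.</p>

```c
#include <string.h>

/* Stand-ins for what generated/config.h would provide. The exact
 * definitions are an assumption for this sketch; flip the #if to see
 * the disabled case. */
#if 1
#define CFG_TOYBOX_FLOAT 1
#define USE_TOYBOX_FLOAT(...) __VA_ARGS__
#else
#define CFG_TOYBOX_FLOAT 0
#define USE_TOYBOX_FLOAT(...)
#endif

/* Adjacent string literals concatenate at compile time, so the range
 * suffix only appears in the option string when floats are enabled. */
const char *optstr_sketch(void)
{
  return "abc." USE_TOYBOX_FLOAT("<1.23>4.56=7.89") "def";
}
```

<p>With the macro enabled the compiler sees one literal,
"abc.&lt;1.23&gt;4.56=7.89def". With it disabled the middle piece vanishes,
leaving "abc.def".</p>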
| 862 | |
| 863 | <p><b>GLOBALS</b></p> |
Rob Landley | 66a69d9 | 2012-01-16 01:44:17 -0600 | [diff] [blame] | 864 | |
| 865 | <p>Options which have an argument fill in the corresponding slot in the global |
| 866 | union "this" (see generated/globals.h), treating it as an array of longs |
Rob Landley | b911d4d | 2013-09-21 14:27:26 -0500 | [diff] [blame] | 867 | with the rightmost saved in this[0]. As described above, using "a*b:c#d", |
| 868 | "-c 42" would set this[0] = 42; and "-b 42" would set this[1] = "42"; each |
| 869 | slot is left NULL if the corresponding argument is not encountered.</p> |
Rob Landley | 66a69d9 | 2012-01-16 01:44:17 -0600 | [diff] [blame] | 870 | |
| 871 | <p>This behavior is useful because the LP64 standard ensures long and pointer |
Rob Landley | b4a0efa | 2012-02-06 21:15:19 -0600 | [diff] [blame] | 872 | are the same size. C99 guarantees structure members will occur in memory |
| 873 | in the same order they're declared, and that padding won't be inserted between |
Rob Landley | 66a69d9 | 2012-01-16 01:44:17 -0600 | [diff] [blame] | 874 | consecutive variables of register size. Thus the first few entries can |
| 875 | be longs or pointers corresponding to the saved arguments.</p> |
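<p>As a rough illustration of that layout, here is what the slots for
"a*b:c#d" might look like. The type names (arg_list_sketch, demo_globals,
demo_this) are hypothetical and exist only in this sketch. The real
declarations live in each command's GLOBALS() block and generated/globals.h.
The compile-time checks assume an LP64 platform.</p>

```c
#include <stddef.h>

/* Hypothetical linked list node type for '*' arguments in this sketch. */
struct arg_list_sketch {
  struct arg_list_sketch *next;
  char *arg;
};

/* Slots for the option string "a*b:c#d", rightmost argument first.
 * -d takes no argument, so it gets no slot. */
struct demo_globals {
  long c;                      /* this[0]: "c#" signed long */
  char *b;                     /* this[1]: "b:" most recent string */
  struct arg_list_sketch *a;   /* this[2]: "a*" list of strings */
};

/* Overlay the named slots on an array of longs, the way the text
 * describes the parser treating the global union. */
union demo_this {
  struct demo_globals demo;
  long slot[3];
};

/* Compile-time checks of the layout assumptions (LP64, no padding). */
_Static_assert(sizeof(long) == sizeof(void *), "needs long == pointer size");
_Static_assert(offsetof(struct demo_globals, b) == sizeof(long), "slot 1");
_Static_assert(offsetof(struct demo_globals, a) == 2*sizeof(long), "slot 2");
```

<p>Writing 42 into slot[0] and reading it back as demo.c gives the same
value, which is all the option parser relies on.</p>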
| 876 | |
Rob Landley | b911d4d | 2013-09-21 14:27:26 -0500 | [diff] [blame] | 877 | <p>See toys/other/hello.c for a longer example of parsing options into the |
| 878 | GLOBALS block.</p> |
| 879 | |
Rob Landley | 66a69d9 | 2012-01-16 01:44:17 -0600 | [diff] [blame] | 880 | <p><b>char *toys.optargs[]</b></p> |
| 881 | |
| 882 | <p>Command line arguments in argv[] which are not consumed by option parsing |
| 883 | (I.E. not recognized either as -flags or arguments to -flags) will be copied |
| 884 | to toys.optargs[], with the length of that array in toys.optc. |
| 885 | (When toys.optc is 0, no unrecognized command line arguments remain.) |
| 886 | The order of entries is preserved, and as with argv[] this new array is also |
| 887 | terminated by a NULL entry.</p> |
| 888 | |
<p>Option parsing can require a minimum or maximum number of optargs left
over, by adding "&lt;1" (read "at least one") or "&gt;9" ("at most nine") to the
start of the optflags string.</p>
| 892 | |
| 893 | <p>The special argument "--" terminates option parsing, storing all remaining |
| 894 | arguments in optargs. The "--" itself is consumed.</p> |
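<p>The leftover-argument behavior can be modeled with a short function. This is
a simplified sketch, not the lib/args.c implementation: collect_optargs_sketch()
is a made-up name, it assumes no flag takes an argument, and it keeps a lone
"-" as an ordinary argument.</p>

```c
#include <string.h>

/* Simplified model, not the lib/args.c implementation: copy every
 * argument that is not a -flag from argv[] into optargs[], preserving
 * order. Assumes no flag takes an argument, and that argv[] starts
 * after the command name. optargs[] must have room for as many entries
 * as argv[] plus the NULL terminator.
 * Returns the number of entries copied (the equivalent of toys.optc). */
int collect_optargs_sketch(char **argv, char **optargs)
{
  int optc = 0, flags_done = 0;

  for (; *argv; argv++) {
    if (!flags_done && !strcmp(*argv, "--")) {
      flags_done = 1;            /* "--" is consumed, rest are optargs */
      continue;
    }
    if (!flags_done && **argv == '-' && (*argv)[1]) continue;  /* a -flag */
    optargs[optc++] = *argv;
  }
  optargs[optc] = NULL;          /* NULL terminated, like argv[] */

  return optc;
}
```

<p>Given the arguments "-a one -- -x two", this model leaves "one", "-x",
and "two" in optargs with a count of 3.</p>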
| 895 | |
| 896 | <p><b>Other optflags control characters</b></p> |
| 897 | |
| 898 | <p>The following characters may occur at the start of each command's |
| 899 | optflags string, before any options that would set a bit in toys.optflags:</p> |
| 900 | |
| 901 | <ul> |
| 902 | <li><b>^</b> - stop at first nonoption argument (for nice, xargs...)</li> |
| 903 | <li><b>?</b> - allow unknown arguments (pass non-option arguments starting |
| 904 | with - through to optargs instead of erroring out).</li> |
<li><b>&amp;</b> - the first argument has an imaginary dash (ala tar/ps). If given twice, all arguments have an imaginary dash.</li>
<li><b>&lt;</b> - must be followed by a decimal digit indicating at least this many leftover arguments are needed in optargs (default 0)</li>
<li><b>&gt;</b> - must be followed by a decimal digit indicating at most this many leftover arguments allowed (default INT_MAX)</li>
| 908 | </ul> |
| 909 | |
| 910 | <p>The following characters may be appended to an option character, but do |
| 911 | not by themselves indicate an extra argument should be saved in this[]. |
| 912 | (Technically any character not recognized as a control character sets an |
| 913 | optflag, but letters are never control characters.)</p> |
| 914 | |
| 915 | <ul> |
| 916 | <li><b>^</b> - stop parsing options after encountering this option, everything else goes into optargs.</li> |
| 917 | <li><b>|</b> - this option is required. If more than one marked, only one is required.</li> |
Rob Landley | 66a69d9 | 2012-01-16 01:44:17 -0600 | [diff] [blame] | 918 | </ul> |
| 919 | |
Rob Landley | 66a69d9 | 2012-01-16 01:44:17 -0600 | [diff] [blame] | 936 | <p><b>--longopts</b></p> |
| 937 | |
<p>The optflags string can contain long options, which are enclosed in
parentheses. They may be appended to an existing option character, in
which case the --longopt is a synonym for that option, ala "a:(fred)"
which understands "-a blah" or "--fred blah" as synonyms.</p>
| 942 | |
<p>Longopts may also appear before any other options in the optflags string,
in which case they have no corresponding short argument, but instead set
their own bit based on position. So for "(walrus)#(blah)xy:z", "command
--walrus 42" would set toys.optflags = 16 (-z = 1, -y = 2, -x = 4, --blah = 8)
and would assign this[1] = 42.</p>
| 948 | |
| 949 | <p>A short option may have multiple longopt synonyms, "a(one)(two)", but |
| 950 | each "bare longopt" (ala "(one)(two)abc" before any option characters) |
| 951 | always sets its own bit (although you can group them with +X).</p> |
Rob Landley | 7c04f01 | 2008-01-20 19:00:16 -0600 | [diff] [blame] | 952 | |
Rob Landley | b911d4d | 2013-09-21 14:27:26 -0500 | [diff] [blame] | 953 | <p><b>[groups]</b></p> |
| 954 | |
<p>At the end of the option string, square bracket groups can define
relationships between existing options. (This only applies to short
options; bare --longopts can't participate.)</p>
| 958 | |
| 959 | <p>The first character of the group defines the type, the remaining |
| 960 | characters are options it applies to:</p> |
| 961 | |
| 962 | <ul> |
| 963 | <li><b>-</b> - Exclusive, switch off all others in this group.</li> |
| 964 | <li><b>+</b> - Inclusive, switch on all others in this group.</li> |
| 965 | <li><b>!</b> - Error, fail if more than one defined.</li> |
| 966 | </ul> |
| 967 | |
<p>So "abc[-abc]" means -ab = -b, -ba = -a, -abc = -c. "abc[+abc]"
means -ab=-abc, -c=-abc, and "abc[!abc]" means -ab calls error_exit("no -b
with -a"). Note that [-] groups clear the GLOBALS option slot of
options they're switching back off, but [+] won't set options it didn't see
(just the optflags).</p>
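<p>In terms of the optflags bits, the three group types boil down to simple
mask operations. The functions below are a sketch under that assumption, with
made-up names. They only model the bits and ignore the GLOBALS slots mentioned
above.</p>

```c
/* Illustrative only: apply a [-...] or [+...] group to a flags word
 * after option bit `seen` (a member of `group`) was just parsed.
 * `group` is the OR of the bits of every option named in the brackets. */
unsigned apply_group_sketch(unsigned flags, unsigned group, unsigned seen,
                            char type)
{
  if (!(group & seen)) return flags | seen;         /* not in this group */
  if (type == '-') return (flags & ~group) | seen;  /* switch others off */
  if (type == '+') return flags | group;            /* switch others on */

  return flags | seen;
}

/* For [!...]: nonzero if more than one option from the group is set. */
int group_conflict_sketch(unsigned flags, unsigned group)
{
  unsigned set = flags & group;

  return (set & (set - 1)) != 0;   /* true when more than one bit is set */
}
```

<p>For "abc" the bits are a=4, b=2, c=1, so the group mask is 7. Parsing -a
then -b with a [-abc] group leaves only bit 2 set, which is the "-ab = -b" case
described above.</p>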
| 973 | |
| 974 | <p><b>whitespace</b></p> |
| 975 | |
<p>Arguments may occur with or without a space (I.E. "-a 42" or "-a42").
The command line argument "-cba" may be interpreted many different ways:
the optflags string "cba" sets toys.optflags = 7, "c:ba" sets toys.optflags=4
and saves "ba" as the argument to -c, and "cb:a" sets optflags to 6 and saves
"a" as the argument to -b.</p>
| 981 | |
<p>Note that &amp; changes whitespace handling, so that the command line
"tar cvfCj outfile.tar.bz2 topdir filename" is parsed the same as
"tar filename -c -v -j -f outfile.tar.bz2 -C topdir". By contrast, "tar -cvfCj
one two three" would equal "tar -c -v -f Cj one two three". (This matches
historical usage.)</p>
| 987 | |
| 988 | <p>Appending a space to the option in the option string ("a: b") makes it |
| 989 | require a space, I.E. "-ab" is interpreted as "-a" "-b". That way "kill -stop" |
| 990 | differs from "kill -s top".</p> |
| 991 | |
| 992 | <p>Appending ; to a longopt in the option string makes its argument optional, |
| 993 | and only settable with =, so in ls "(color):;" can accept "ls --color" and |
| 994 | "ls --color=auto" without complaining that the first has no argument.</p> |
| 995 | |
<a name="lib_dirtree" /><h3>lib/dirtree.c</h3>
| 997 | |
| 998 | <p>The directory tree traversal code should be sufficiently generic |
| 999 | that commands never need to use readdir(), scandir(), or the fts.h family |
| 1000 | of functions.</p> |
| 1001 | |
<p>These functions do not call chdir() or rely on PATH_MAX. Instead they
use openat() and friends, using one filehandle per directory level to
recurse into subdirectories. (I.E. they can descend 1000 directories deep
if setrlimit(RLIMIT_NOFILE) allows enough open filehandles, and the default
in /proc/self/limits is generally 1024.)</p>
| 1007 | |
| 1008 | <p>The basic dirtree functions are:</p> |
| 1009 | |
| 1010 | <ul> |
<li><p><b>dirtree_read(char *path, int (*callback)(struct dirtree *node))</b> -
recursively read directories, either applying callback() or returning
a tree of struct dirtree if callback is NULL.</p></li>
| 1014 | |
| 1015 | <li><p><b>dirtree_path(struct dirtree *node, int *plen)</b> - malloc() a |
| 1016 | string containing the path from the root of this tree to this node. If |
| 1017 | plen isn't NULL then *plen is how many extra bytes to malloc at the end |
| 1018 | of string.</p></li> |
| 1019 | |
| 1020 | <li><p><b>dirtree_parentfd(struct dirtree *node)</b> - return fd of |
| 1021 | containing directory, for use with openat() and such.</p></li> |
| 1022 | </ul> |
| 1023 | |
| 1024 | <p>The <b>dirtree_read()</b> function takes two arguments, a starting path for |
| 1025 | the root of the tree, and a callback function. The callback takes a |
| 1026 | <b>struct dirtree *</b> (from lib/lib.h) as its argument. If the callback is |
| 1027 | NULL, the traversal uses a default callback (dirtree_notdotdot()) which |
| 1028 | recursively assembles a tree of struct dirtree nodes for all files under |
| 1029 | this directory and subdirectories (filtering out "." and ".." entries), |
| 1030 | after which dirtree_read() returns the pointer to the root node of this |
| 1031 | snapshot tree.</p> |
| 1032 | |
| 1033 | <p>Otherwise the callback() is called on each entry in the directory, |
| 1034 | with struct dirtree * as its argument. This includes the initial |
| 1035 | node created by dirtree_read() at the top of the tree.</p> |
| 1036 | |
| 1037 | <p><b>struct dirtree</b></p> |
| 1038 | |
| 1039 | <p>Each struct dirtree node contains <b>char name[]</b> and <b>struct stat |
| 1040 | st</b> entries describing a file, plus a <b>char *symlink</b> |
| 1041 | which is NULL for non-symlinks.</p> |
| 1042 | |
| 1043 | <p>During a callback function, the <b>int data</b> field of directory nodes |
| 1044 | contains a dirfd (for use with the openat() family of functions). This is |
| 1045 | generally used by calling dirtree_parentfd() on the callback's node argument. |
| 1046 | For symlinks, data contains the length of the symlink string. On the second |
| 1047 | callback from DIRTREE_COMEAGAIN (depth-first traversal) data = -1 for |
| 1048 | all nodes (that's how you can tell it's the second callback).</p> |
| 1049 | |
| 1050 | <p>Users of this code may put anything they like into the <b>long extra</b> |
| 1051 | field. For example, "cp" and "mv" use this to store a dirfd for the destination |
| 1052 | directory (and use DIRTREE_COMEAGAIN to get the second callback so they can |
| 1053 | close(node->extra) to avoid running out of filehandles). |
| 1054 | This field is not directly used by the dirtree code, and |
| 1055 | thanks to LP64 it's large enough to store a typecast pointer to an |
| 1056 | arbitrary struct.</p> |
| 1057 | |
<p>The return value of the callback combines flags (with bitwise or) to tell
the traversal infrastructure how to behave:</p>
| 1060 | |
| 1061 | <ul> |
| 1062 | <li><p><b>DIRTREE_SAVE</b> - Save this node, assembling a tree. (Without |
| 1063 | this the struct dirtree is freed after the callback returns. Filtering out |
| 1064 | siblings is fine, but discarding a parent while keeping its child leaks |
| 1065 | memory.)</p></li> |
| 1066 | <li><p><b>DIRTREE_ABORT</b> - Do not examine any more entries in this |
| 1067 | directory. (Does not propagate up tree: to abort entire traversal, |
| 1068 | return DIRTREE_ABORT from parent callbacks too.)</p></li> |
| 1069 | <li><p><b>DIRTREE_RECURSE</b> - Examine directory contents. Ignored for |
| 1070 | non-directory entries. The remaining flags only take effect when |
| 1071 | recursing into the children of a directory.</p></li> |
| 1072 | <li><p><b>DIRTREE_COMEAGAIN</b> - Call the callback a second time after |
| 1073 | examining all directory contents, allowing depth-first traversal. |
| 1074 | On the second call, dirtree->data = -1.</p></li> |
| 1075 | <li><p><b>DIRTREE_SYMFOLLOW</b> - follow symlinks when populating children's |
| 1076 | <b>struct stat st</b> (by feeding a nonzero value to the symfollow argument of |
| 1077 | dirtree_add_node()), which means DIRTREE_RECURSE treats symlinks to |
| 1078 | directories as directories. (Avoiding infinite recursion is the callback's |
| 1079 | problem: the non-NULL dirtree->symlink can still distinguish between |
| 1080 | them.)</p></li> |
| 1081 | </ul> |
| 1082 | |
| 1083 | <p>Each struct dirtree contains three pointers (next, parent, and child) |
| 1084 | to other struct dirtree.</p> |
| 1085 | |
| 1086 | <p>The <b>parent</b> pointer indicates the directory |
| 1087 | containing this entry; even when not assembling a persistent tree of |
| 1088 | nodes the parent entries remain live up to the root of the tree while |
| 1089 | child nodes are active. At the top of the tree the parent pointer is |
| 1090 | NULL, meaning the node's name[] is either an absolute path or relative |
| 1091 | to cwd. The function dirtree_parentfd() gets the directory file descriptor |
| 1092 | for use with openat() and friends, returning AT_FDCWD at the top of tree.</p> |
| 1093 | |
| 1094 | <p>The <b>child</b> pointer points to the first node of the list of contents of |
| 1095 | this directory. If the directory contains no files, or the entry isn't |
| 1096 | a directory, child is NULL.</p> |
| 1097 | |
| 1098 | <p>The <b>next</b> pointer indicates sibling nodes in the same directory as this |
| 1099 | node, and since it's the first entry in the struct the llist.c traversal |
| 1100 | mechanisms work to iterate over sibling nodes. Each dirtree node is a |
| 1101 | single malloc() (even char *symlink points to memory at the end of the node), |
| 1102 | so llist_free() works but its callback must descend into child nodes (freeing |
| 1103 | a tree, not just a linked list), plus whatever the user stored in extra.</p> |
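<p>Walking and freeing a saved tree through these pointers can be sketched as
follows. The struct here is a stand-in holding only the three link pointers,
and the function names are made up for the example. Anything stored in extra
would need its own cleanup.</p>

```c
#include <stdlib.h>

/* Cut-down stand-in for struct dirtree: only the three link pointers.
 * As in the text, next comes first. */
struct node_sketch {
  struct node_sketch *next, *parent, *child;
};

/* Count this node, its later siblings, and everything below them. */
int count_tree_sketch(struct node_sketch *node)
{
  int count = 0;

  for (; node; node = node->next)
    count += 1 + count_tree_sketch(node->child);

  return count;
}

/* Free a whole saved tree: siblings via next, contents via child.
 * Each node is assumed to be one malloc(), as the text describes. */
void free_tree_sketch(struct node_sketch *node)
{
  while (node) {
    struct node_sketch *next = node->next;

    free_tree_sketch(node->child);
    free(node);
    node = next;
  }
}
```

<p>Both functions follow next across a directory's entries and recurse
through child, which is the "tree, not just a linked list" distinction made
above.</p>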
| 1104 | |
<p>The <b>dirtree_read</b>() function is a simple wrapper, calling <b>dirtree_add_node</b>()
to create a root node relative to the current directory, then calling
<b>handle_callback</b>() on that node (which recurses as instructed by the callback
return flags). Some commands (such as chgrp) bypass this wrapper, for example
to control whether or not to follow symlinks to the root node (symlinks
listed on the command line are often treated differently than symlinks
encountered during recursive directory traversal).</p>
| 1112 | |
<p>The ls command not only bypasses the wrapper, but never returns
<b>DIRTREE_RECURSE</b> from the callback, instead calling <b>dirtree_recurse</b>() manually
from elsewhere in the program. This gives ls -lR manual control
of traversal order, which is neither depth first nor breadth first but
instead a sort of FIFO order required by the ls standard.</p>
| 1118 | |
Rob Landley | 7eaf4f5 | 2014-04-09 08:30:09 -0500 | [diff] [blame] | 1119 | <a name="toys"> |
| 1120 | <h1><a href="#toys">Directory toys/</a></h1> |
Rob Landley | c0e56ed | 2012-10-08 00:02:30 -0500 | [diff] [blame] | 1121 | |
| 1122 | <p>This directory contains command implementations. Each command is a single |
| 1123 | self-contained file. Adding a new command involves adding a single |
| 1124 | file, and removing a command involves removing that file. Commands use |
| 1125 | shared infrastructure from the lib/ and generated/ directories.</p> |
| 1126 | |
<p>Currently there are three subdirectories under "toys/" containing commands
described in POSIX-2008, the Linux Standard Base 4.1, or "other". The only
difference this makes is which menu the command shows up in during "make
menuconfig"; the directories are otherwise identical. Note that the commands
exist within a single namespace at runtime, so you can't have the same
command in multiple subdirectories.</p>
| 1133 | |
<p>(There are actually four sub-menus in "make menuconfig"; the fourth
contains global configuration options for toybox, and lives in Config.in at
the top level.)</p>
| 1137 | |
| 1138 | <p>See <a href="#adding">adding a new command</a> for details on the |
| 1139 | layout of a command file.</p> |
| 1140 | |
Rob Landley | 81b899d | 2007-12-18 02:02:47 -0600 | [diff] [blame] | 1141 | <h2>Directory scripts/</h2> |
Rob Landley | 4e68de1 | 2007-12-13 07:00:27 -0600 | [diff] [blame] | 1142 | |
Rob Landley | 1f4f41a | 2012-10-08 21:31:07 -0500 | [diff] [blame] | 1143 | <p>Build infrastructure. The makefile calls scripts/make.sh for "make" |
| 1144 | and scripts/install.sh for "make install".</p> |
| 1145 | |
<p>There's also a test suite: "make test" calls scripts/test.sh, which runs all
the tests in scripts/test/*. You can run individual tests via
"scripts/test.sh command", or "TEST_HOST=1 scripts/test.sh command" to run
that test against the host implementation instead of the toybox one.</p>
| 1150 | |
Rob Landley | 4e68de1 | 2007-12-13 07:00:27 -0600 | [diff] [blame] | 1151 | <h3>scripts/cfg2files.sh</h3> |
| 1152 | |
| 1153 | <p>Run .config through this filter to get a list of enabled commands, which |
| 1154 | is turned into a list of files in toys via a sed invocation in the top level |
| 1155 | Makefile. |
| 1156 | </p> |
| 1157 | |
Rob Landley | 81b899d | 2007-12-18 02:02:47 -0600 | [diff] [blame] | 1158 | <h2>Directory kconfig/</h2> |
Rob Landley | 4e68de1 | 2007-12-13 07:00:27 -0600 | [diff] [blame] | 1159 | |
<p>Menuconfig infrastructure copied from the Linux kernel. See the
Linux kernel's Documentation/kbuild/kconfig-language.txt.</p>
| 1162 | |
Rob Landley | 66a69d9 | 2012-01-16 01:44:17 -0600 | [diff] [blame] | 1163 | <a name="generated"> |
| 1164 | <h2>Directory generated/</h2> |
| 1165 | |
| 1166 | <p>All the files in this directory except the README are generated by the |
Rob Landley | ca73392 | 2014-05-19 18:24:35 -0500 | [diff] [blame] | 1167 | build. (See scripts/make.sh)</p> |
Rob Landley | 66a69d9 | 2012-01-16 01:44:17 -0600 | [diff] [blame] | 1168 | |
| 1169 | <ul> |
| 1170 | <li><p><b>config.h</b> - CFG_COMMAND and USE_COMMAND() macros set by menuconfig via .config.</p></li> |
| 1171 | |
<li><p><b>Config.in</b> - Kconfig entries for each command. Included by top level Config.in. The help text in here is used to generate help.h.</p></li>
| 1173 | |
| 1174 | <li><p><b>help.h</b> - Help text strings for use by "help" command. Building |
| 1175 | this file requires python on the host system, so the prebuilt file is shipped |
| 1176 | in the build tarball to avoid requiring python to build toybox.</p></li> |
| 1177 | |
| 1178 | <li><p><b>newtoys.h</b> - List of NEWTOY() or OLDTOY() macros for all available |
| 1179 | commands. Associates command_main() functions with command names, provides |
| 1180 | option string for command line parsing (<a href="#lib_args">see lib/args.c</a>), |
| 1181 | specifies where to install each command and whether toysh should fork before |
| 1182 | calling it.</p></li> |
| 1183 | </ul> |
| 1184 | |
| 1185 | <p>Everything in this directory is a derivative file produced from something |
| 1186 | else. The entire directory is deleted by "make distclean".</p> |
Rob Landley | ca73392 | 2014-05-19 18:24:35 -0500 | [diff] [blame] | 1187 | |
Rob Landley | 4e68de1 | 2007-12-13 07:00:27 -0600 | [diff] [blame] | 1188 | <!--#include file="footer.html" --> |