| Rob Landley | 349ff52 | 2014-01-04 13:09:42 -0600 | [diff] [blame] | 1 | <html><head><title>toybox source code walkthrough</title></head> | 
| Rob Landley | 4e68de1 | 2007-12-13 07:00:27 -0600 | [diff] [blame] | 2 | <!--#include file="header.html" --> | 
|  | 3 |  | 
| Rob Landley | 2704834 | 2013-08-18 14:24:59 -0500 | [diff] [blame] | 4 | <p><h1><a name="style" /><a href="#style">Code style</a></h1></p> | 
| Rob Landley | e7c9a6d | 2012-02-28 06:34:09 -0600 | [diff] [blame] | 5 |  | 
|  | 6 | <p>The primary goal of toybox is _simple_ code. Keeping the code small is | 
| Rob Landley | ed6ed62 | 2012-03-06 20:49:03 -0600 | [diff] [blame] | 7 | second, with speed and lots of features coming in somewhere after that. | 
|  | 8 | (For more on that, see the <a href=design.html>design</a> page.)</p> | 
| Rob Landley | e7c9a6d | 2012-02-28 06:34:09 -0600 | [diff] [blame] | 9 |  | 
|  | 10 | <p>A simple implementation usually takes up fewer lines of source code, | 
|  | 11 | meaning more code can fit on the screen at once, meaning the programmer can | 
|  | 12 | see more of it on the screen and thus keep more of it in their head at once. | 
| Rob Landley | ed6ed62 | 2012-03-06 20:49:03 -0600 | [diff] [blame] | 13 | This helps code auditing and thus reduces bugs. That said, sometimes being | 
|  | 14 | more explicit is preferable to being clever enough to outsmart yourself: | 
|  | 15 | don't be so terse your code is unreadable.</p> | 
| Rob Landley | 5a0660f | 2007-12-27 21:36:44 -0600 | [diff] [blame] | 16 |  | 
| Rob Landley | 7aa651a | 2012-11-13 17:14:08 -0600 | [diff] [blame] | 17 | <p>Toybox source uses two spaces per indentation level, and wraps at 80 | 
|  | 18 | columns.</p> | 
| Rob Landley | 5a0660f | 2007-12-27 21:36:44 -0600 | [diff] [blame] | 19 |  | 
|  | 20 | <p>Gotos are allowed for error handling, and for breaking out of | 
|  | 21 | nested loops.  In general, a goto should only jump forward (not back), and | 
|  | 22 | should either jump to the end of an outer loop, or to error handling code | 
|  | 23 | at the end of the function.  Goto labels are never indented: they override the | 
|  | 24 | block structure of the file.  Putting them at the left edge makes them easy | 
|  | 25 | to spot as overrides to the normal flow of control, which they are.</p> | 
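<p>As a rough illustration of the goto convention above, here is a minimal
sketch of forward-only gotos used for error handling. The function
count_lines() and its 4096 byte buffer are hypothetical, invented for this
example; they are not part of toybox.</p>

```c
#include <stdio.h>
#include <stdlib.h>

// Hypothetical helper (not from toybox): count the newlines in a file,
// returning -1 if the file can't be opened or read.
long count_lines(const char *path)
{
  long lines = -1;
  char *buf = 0;
  FILE *fp = fopen(path, "r");

  // Both failure cases jump forward to the cleanup code at the end.
  if (!fp) goto done;
  if (!(buf = malloc(4096))) goto done;

  lines = 0;
  while (fgets(buf, 4096, fp)) {
    char *s;

    for (s = buf; *s; s++) if (*s == '\n') lines++;
  }
  if (ferror(fp)) lines = -1;

  // The label sits at the left edge, outside the block indentation.
done:
  free(buf);
  if (fp) fclose(fp);

  return lines;
}
```

<p>Every exit path funnels through the same cleanup code, so there is only
one place to audit for leaks.</p>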
|  | 26 |  | 
| Rob Landley | 2704834 | 2013-08-18 14:24:59 -0500 | [diff] [blame] | 27 | <p><h1><a name="building" /><a href="#building">Building Toybox</a></h1></p> | 
| Rob Landley | e7c9a6d | 2012-02-28 06:34:09 -0600 | [diff] [blame] | 28 |  | 
|  | 29 | <p>Toybox is configured using the Kconfig language pioneered by the Linux | 
|  | 30 | kernel, and adopted by many other projects (uClibc, OpenEmbedded, etc). | 
|  | 31 | This generates a ".config" file containing the selected options, which | 
| Rob Landley | 7aa651a | 2012-11-13 17:14:08 -0600 | [diff] [blame] | 32 | controls which features are included when compiling toybox.</p> | 
| Rob Landley | e7c9a6d | 2012-02-28 06:34:09 -0600 | [diff] [blame] | 33 |  | 
|  | 34 | <p>Each configuration option has a default value. The defaults indicate the | 
|  | 35 | "maximum sane configuration", I.E. if the feature defaults to "n" then it | 
|  | 36 | either isn't complete or is a special-purpose option (such as debugging | 
|  | 37 | code) that isn't intended for general purpose use.</p> | 
|  | 38 |  | 
|  | 39 | <p>The standard build invocation is:</p> | 
|  | 40 |  | 
|  | 41 | <ul> | 
|  | 42 | <li>make defconfig #(or menuconfig)</li> | 
|  | 43 | <li>make</li> | 
|  | 44 | <li>make install</li> | 
|  | 45 | </ul> | 
|  | 46 |  | 
|  | 47 | <p>Type "make help" to see all available build options.</p> | 
|  | 48 |  | 
|  | 49 | <p>The file "configure" contains a number of environment variable definitions | 
|  | 50 | which influence the build, such as specifying which compiler to use or where | 
|  | 51 | to install the resulting binaries. This file is included by the build, but | 
|  | 52 | accepts existing definitions of the environment variables, so it may be sourced | 
|  | 53 | or modified by the developer before building and the definitions exported | 
|  | 54 | to the environment will take precedence.</p> | 
|  | 55 |  | 
|  | 56 | <p>(To clarify: "configure" describes the build and installation environment, | 
|  | 57 | ".config" lists the features selected by defconfig/menuconfig.)</p> | 
| Rob Landley | 6882ee8 | 2008-02-12 18:41:34 -0600 | [diff] [blame] | 58 |  | 
| Rob Landley | 2704834 | 2013-08-18 14:24:59 -0500 | [diff] [blame] | 59 | <p><h1><a name="running" /><a href="#running">Running a command</a></h1></p> | 
|  | 60 |  | 
|  | 61 | <h2>main</h2> | 
|  | 62 |  | 
|  | 63 | <p>The toybox main() function is at the end of main.c at the top level. It has | 
|  | 64 | two possible codepaths, only one of which is configured into any given build | 
|  | 65 | of toybox.</p> | 
|  | 66 |  | 
|  | 67 | <p>If CONFIG_SINGLE is selected, toybox is configured to contain only a single | 
|  | 68 | command, so most of the normal setup can be skipped. In this case the | 
|  | 69 | multiplexer isn't used, instead main() calls toy_singleinit() (also in main.c) | 
|  | 70 | to set up global state and parse command line arguments, calls the command's | 
|  | 71 | main function out of toy_list (in the CONFIG_SINGLE case the array has a single entry, no need to search), and if the function returns instead of exiting | 
|  | 72 | it flushes stdout (detecting error) and returns toys.exitval.</p> | 
|  | 73 |  | 
|  | 74 | <p>When CONFIG_SINGLE is not selected, main() uses basename() to find the | 
|  | 75 | name it was run as, shifts its argument list one to the right so it lines up | 
|  | 76 | with where the multiplexer function expects it, and calls toybox_main(). This | 
|  | 77 | leverages the multiplexer command's infrastructure to find and run the | 
|  | 78 | appropriate command. (A command name starting with "toybox" will | 
|  | 79 | recursively call toybox_main(); you can go "./toybox toybox toybox toybox ls" | 
|  | 80 | if you want to...)</p> | 
|  | 81 |  | 
|  | 82 | <h2>toybox_main</h2> | 
|  | 83 |  | 
|  | 84 | <p>The toybox_main() function is also in main.c. It handles a possible | 
|  | 85 | --help option ("toybox --help ls"), prints the list of available commands if no | 
|  | 86 | arguments were provided to the multiplexer (or with full path names if any | 
|  | 87 | other option is provided before a command name, ala "toybox --list"). | 
|  | 88 | Otherwise it calls toy_exec() on its argument list.</p> | 
|  | 89 |  | 
|  | 90 | <p>Note that the multiplexer is the first entry in toy_list (the rest of the | 
|  | 91 | list is sorted alphabetically to allow binary search), so toybox_main can | 
|  | 92 | cheat and just grab the first entry to quickly set up its context without | 
|  | 93 | searching. Since all command names go through the multiplexer at least once | 
|  | 94 | in the non-TOYBOX_SINGLE case, this avoids a redundant search of | 
|  | 95 | the list.</p> | 
|  | 96 |  | 
|  | 97 | <p>The toy_exec() function is also in main.c. It calls toy_find() to | 
|  | 98 | perform a binary search on the toy_list array to look up the command's | 
|  | 99 | entry by name and saves it in the global variable which, calls toy_init() | 
|  | 100 | to parse command line arguments and set up global state (using which->options), | 
|  | 101 | and calls the appropriate command's main() function (which->toy_main). On | 
|  | 102 | return it flushes all pending ANSI FILE * I/O, detects if stdout had an | 
|  | 103 | error, and then calls xexit() (which uses toys.exitval).</p> | 
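<p>The lookup described above (multiplexer in the first slot, remaining
entries sorted for binary search) might be sketched roughly as follows. This
is a simplified stand-in, not the actual main.c code: struct toy_entry,
toy_table, find_toy() and the stub functions are all hypothetical names made
up for the example.</p>

```c
#include <string.h>

// Simplified stand-in for struct toy_list (the real one has more members).
struct toy_entry {
  char *name;
  void (*toy_main)(void);
};

// Stub command functions, for illustration only.
static void toybox_stub(void) {}
static void cat_stub(void) {}
static void echo_stub(void) {}
static void ls_stub(void) {}

// Entry 0 is the multiplexer; the rest are in alphabetical order.
static struct toy_entry toy_table[] = {
  {"toybox", toybox_stub},
  {"cat", cat_stub},
  {"echo", echo_stub},
  {"ls", ls_stub},
};

// Look up a command by name, returning NULL if it isn't in the table.
struct toy_entry *find_toy(char *name)
{
  int bottom = 1, top = (int)(sizeof(toy_table)/sizeof(*toy_table))-1;

  // The multiplexer lives outside the sorted range, so check it first.
  if (!strncmp(name, "toybox", 6)) return toy_table;

  // Binary search over the sorted entries 1..top.
  while (bottom <= top) {
    int middle = (top+bottom)/2, result = strcmp(name, toy_table[middle].name);

    if (!result) return toy_table+middle;
    if (result < 0) top = middle-1;
    else bottom = middle+1;
  }

  return 0;
}
```

<p>A caller would then invoke the result's toy_main pointer, much as
toy_exec() is described as doing after its own setup.</p>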
|  | 104 |  | 
|  | 105 | <p><h1><a name="infrastructure" /><a href="#infrastructure">Infrastructure</a></h1></p> | 
| Rob Landley | 4e68de1 | 2007-12-13 07:00:27 -0600 | [diff] [blame] | 106 |  | 
| Rob Landley | 7c04f01 | 2008-01-20 19:00:16 -0600 | [diff] [blame] | 107 | <p>The toybox source code is in the following directories:</p> | 
|  | 108 | <ul> | 
|  | 109 | <li>The <a href="#top">top level directory</a> contains the file main.c (where | 
|  | 110 | execution starts), the header file toys.h (included by every command), and | 
|  | 111 | other global infrastructure.</li> | 
|  | 112 | <li>The <a href="#lib">lib directory</a> contains common functions shared by | 
| Rob Landley | c8d0da5 | 2012-07-15 17:47:08 -0500 | [diff] [blame] | 113 | multiple commands:</li> | 
|  | 114 | <ul> | 
|  | 115 | <li><a href="#lib_lib">lib/lib.c</a></li> | 
|  | 116 | <li><a href="#lib_llist">lib/llist.c</a></li> | 
|  | 117 | <li><a href="#lib_args">lib/args.c</a></li> | 
|  | 118 | <li><a href="#lib_dirtree">lib/dirtree.c</a></li> | 
|  | 119 | </ul> | 
| Rob Landley | 7c04f01 | 2008-01-20 19:00:16 -0600 | [diff] [blame] | 120 | <li>The <a href="#toys">toys directory</a> contains the C files implementing | 
| Rob Landley | c0e56ed | 2012-10-08 00:02:30 -0500 | [diff] [blame] | 121 | each command. Currently it contains three subdirectories: | 
|  | 122 | posix, lsb, and other.</li> | 
| Rob Landley | 7c04f01 | 2008-01-20 19:00:16 -0600 | [diff] [blame] | 123 | <li>The <a href="#scripts">scripts directory</a> contains the build and | 
|  | 124 | test infrastructure.</li> | 
|  | 125 | <li>The <a href="#kconfig">kconfig directory</a> contains the configuration | 
|  | 126 | infrastructure implementing menuconfig (copied from the Linux kernel).</li> | 
|  | 127 | <li>The <a href="#generated">generated directory</a> contains intermediate | 
|  | 128 | files generated from other parts of the source code.</li> | 
|  | 129 | </ul> | 
| Rob Landley | 4e68de1 | 2007-12-13 07:00:27 -0600 | [diff] [blame] | 130 |  | 
| Rob Landley | bbe500e | 2012-02-26 21:53:15 -0600 | [diff] [blame] | 131 | <a name="adding" /> | 
| Rob Landley | 7c04f01 | 2008-01-20 19:00:16 -0600 | [diff] [blame] | 132 | <p><h1>Adding a new command</h1></p> | 
| Rob Landley | c0e56ed | 2012-10-08 00:02:30 -0500 | [diff] [blame] | 133 | <p>To add a new command to toybox, add a C file implementing that command under | 
| Rob Landley | 7c04f01 | 2008-01-20 19:00:16 -0600 | [diff] [blame] | 134 | the toys directory.  No other files need to be modified; the build extracts | 
| Rob Landley | 6882ee8 | 2008-02-12 18:41:34 -0600 | [diff] [blame] | 135 | all the information it needs (such as command line arguments) from specially | 
| Rob Landley | 7c04f01 | 2008-01-20 19:00:16 -0600 | [diff] [blame] | 136 | formatted comments and macros in the C file.  (See the description of the | 
| Rob Landley | e7c9a6d | 2012-02-28 06:34:09 -0600 | [diff] [blame] | 137 | <a href="#generated">"generated" directory</a> for details.)</p> | 
| Rob Landley | 4e68de1 | 2007-12-13 07:00:27 -0600 | [diff] [blame] | 138 |  | 
| Rob Landley | c0e56ed | 2012-10-08 00:02:30 -0500 | [diff] [blame] | 139 | <p>Currently there are three subdirectories under "toys", one for commands | 
|  | 140 | defined by the POSIX standard, one for commands defined by the Linux Standard | 
|  | 141 | Base, and one for all other commands. (This is just for developer convenience | 
|  | 142 | sorting them, the directories are otherwise functionally identical.)</p> | 
|  | 143 |  | 
|  | 144 | <p>An easy way to start a new command is to copy the file "toys/other/hello.c" to | 
| Rob Landley | 7c04f01 | 2008-01-20 19:00:16 -0600 | [diff] [blame] | 145 | the name of the new command, and modify this copy to implement the new command. | 
| Rob Landley | 6882ee8 | 2008-02-12 18:41:34 -0600 | [diff] [blame] | 146 | This file is an example command meant to be used as a "skeleton" for | 
| Rob Landley | 7c04f01 | 2008-01-20 19:00:16 -0600 | [diff] [blame] | 147 | new commands (more or less by turning every instance of "hello" into the | 
|  | 148 | name of your command, updating the command line arguments, globals, and | 
|  | 149 | help data,  and then filling out its "main" function with code that does | 
| Rob Landley | 6882ee8 | 2008-02-12 18:41:34 -0600 | [diff] [blame] | 150 | something interesting).  It provides examples of all the build infrastructure | 
|  | 151 | (including optional elements like command line argument parsing and global | 
|  | 152 | variables that a "hello world" program doesn't strictly need).</p> | 
| Rob Landley | 7c04f01 | 2008-01-20 19:00:16 -0600 | [diff] [blame] | 153 |  | 
|  | 154 | <p>Here's a checklist of steps to turn hello.c into another command:</p> | 
|  | 155 |  | 
|  | 156 | <ul> | 
| Rob Landley | c0e56ed | 2012-10-08 00:02:30 -0500 | [diff] [blame] | 157 | <li><p>First "cd toys/other" and "cp hello.c yourcommand.c".  Note that the name | 
| Rob Landley | 7c04f01 | 2008-01-20 19:00:16 -0600 | [diff] [blame] | 158 | of this file is significant, it's the name of the new command you're adding | 
|  | 159 | to toybox.  Open your new file in your favorite editor.</p></li> | 
|  | 160 |  | 
|  | 161 | <li><p>Change the one line comment at the top of the file (currently | 
|  | 162 | "hello.c - A hello world program") to describe your new file.</p></li> | 
|  | 163 |  | 
|  | 164 | <li><p>Change the copyright notice to your name, email, and the current | 
|  | 165 | year.</p></li> | 
|  | 166 |  | 
| Rob Landley | c0e56ed | 2012-10-08 00:02:30 -0500 | [diff] [blame] | 167 | <li><p>Give a URL to the relevant standards document, where applicable. | 
|  | 168 | (Sample links to SUSv4 and LSB are provided, feel free to link to other | 
|  | 169 | documentation or standards as appropriate.)</p></li> | 
| Rob Landley | 7c04f01 | 2008-01-20 19:00:16 -0600 | [diff] [blame] | 170 |  | 
| Rob Landley | 66a69d9 | 2012-01-16 01:44:17 -0600 | [diff] [blame] | 171 | <li><p>Update the USE_YOURCOMMAND(NEWTOY(yourcommand,"blah",0)) line. | 
|  | 172 | The NEWTOY macro fills out this command's <a href="#toy_list">toy_list</a> | 
|  | 173 | structure.  The arguments to the NEWTOY macro are:</p> | 
|  | 174 |  | 
|  | 175 | <ol> | 
|  | 176 | <li><p>the name used to run your command</p></li> | 
|  | 177 | <li><p>the command line argument <a href="#lib_args">option parsing string</a> (NULL if none)</p></li> | 
|  | 178 | <li><p>a bitfield of TOYFLAG values | 
|  | 179 | (defined in toys.h) providing additional information such as where your | 
|  | 180 | command should be installed on a running system, whether to blank umask | 
|  | 181 | before running, whether or not the command must run as root (and thus should | 
|  | 182 | retain root access if installed SUID), and so on.</p></li> | 
|  | 183 | </ol> | 
|  | 184 | </li> | 
| Rob Landley | 7c04f01 | 2008-01-20 19:00:16 -0600 | [diff] [blame] | 185 |  | 
|  | 186 | <li><p>Change the kconfig data (from "config YOURCOMMAND" to the end of the | 
|  | 187 | comment block) to supply your command's configuration and help | 
|  | 188 | information.  The upper case config symbols are used by menuconfig, and are | 
|  | 189 | also what the CFG_ and USE_() macros are generated from (see [TODO]).  The | 
|  | 190 | help information here is used by menuconfig, and also by the "help" command to | 
|  | 191 | describe your new command.  (See [TODO] for details.)  By convention, | 
| Rob Landley | 66a69d9 | 2012-01-16 01:44:17 -0600 | [diff] [blame] | 192 | unfinished commands default to "n" and finished commands default to "y", | 
|  | 193 | so "make defconfig" selects all finished commands.  (Note, "finished" means | 
|  | 194 | "ready to be used", not that it'll never change again.)</p> | 
|  | 195 |  | 
|  | 196 | <p>Each help block should start with a "usage: yourcommand" line explaining | 
|  | 197 | any command line arguments added by this config option.  The "help" command | 
|  | 198 | outputs this text, and scripts/config2help.c in the build infrastructure | 
|  | 199 | collates these usage lines for commands with multiple configuration | 
|  | 200 | options when producing generated/help.h.</p> | 
|  | 201 | </li> | 
| Rob Landley | 7c04f01 | 2008-01-20 19:00:16 -0600 | [diff] [blame] | 202 |  | 
| Rob Landley | c0e56ed | 2012-10-08 00:02:30 -0500 | [diff] [blame] | 203 | <li><p>Change the "#define FOR_hello" line to "#define FOR_yourcommand" right | 
|  | 204 | before the "#include <toys.h>". (This selects the appropriate FLAG_ macros and | 
|  | 205 | does a "#define TT this.yourcommand" so you can access the global variables | 
|  | 206 | out of the space-saving union of structures. If you aren't using any command | 
|  | 207 | flag bits and aren't defining a GLOBALS block, you can delete this line.)</p></li> | 
| Rob Landley | 7c04f01 | 2008-01-20 19:00:16 -0600 | [diff] [blame] | 208 |  | 
| Rob Landley | c0e56ed | 2012-10-08 00:02:30 -0500 | [diff] [blame] | 209 | <li><p>Update the GLOBALS() macro to contain your command's global | 
|  | 210 | variables. If your command has no global variables, delete this macro.</p> | 
|  | 211 |  | 
|  | 212 | <p>Variables in the GLOBALS() block are stored in a space saving | 
|  | 213 | <a href="#toy_union">union of structures</a> format, which may be accessed | 
|  | 214 | using the TT macro as if TT were a global structure (so TT.membername). | 
|  | 215 | If you specified two-character command line arguments in | 
|  | 216 | NEWTOY(), the first few global variables will be initialized by the automatic | 
|  | 217 | argument parsing logic, and the type and order of these variables must | 
|  | 218 | correspond to the arguments specified in NEWTOY(). | 
|  | 219 | (See <a href="#lib_args">lib/args.c</a> for details.)</p></li> | 
| Rob Landley | 7c04f01 | 2008-01-20 19:00:16 -0600 | [diff] [blame] | 220 |  | 
|  | 221 | <li><p>Rename hello_main() to yourcommand_main().  This is the main() function | 
| Rob Landley | c0e56ed | 2012-10-08 00:02:30 -0500 | [diff] [blame] | 222 | where execution of your command starts. Your command line options are | 
|  | 223 | already sorted into toys.optflags, toys.optargs, toys.optc, and the GLOBALS() | 
|  | 224 | as appropriate by the time this function is called. (See | 
|  | 225 | <a href="#lib_args">get_optflags()</a> for details.)</p></li> | 
| Rob Landley | 7c04f01 | 2008-01-20 19:00:16 -0600 | [diff] [blame] | 226 | </ul> | 
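<p>The GLOBALS() and TT steps in the checklist rely on a union of per-command
structures. A minimal self-contained sketch of that idea, written without
the toybox macros, could look like this. The structure names, their members,
and hello_total() are assumptions made up for illustration; in toybox the
equivalent declarations are produced by the build from each command's
GLOBALS() block.</p>

```c
#include <string.h>

// Hypothetical global variables for two commands.
struct hello_data {
  char *b_string;
  long c_number;
};

struct other_data {
  char buf[64];
};

// A single shared instance: it is only as large as its largest member,
// and as a global it starts out zeroed.
union global_union {
  struct hello_data hello;
  struct other_data other;
} this;

// Roughly what defining FOR_hello before including toys.h arranges.
#define TT this.hello

// Hypothetical function using the current command's globals through TT.
long hello_total(char *s, long n)
{
  TT.b_string = s;
  TT.c_number = n;

  return (long)strlen(TT.b_string)+TT.c_number;
}
```

<p>Since only one command runs per process, the storage for every other
command's globals overlaps with the one actually in use.</p>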
|  | 227 |  | 
| Rob Landley | 85a3241 | 2013-12-27 06:53:15 -0600 | [diff] [blame] | 228 | <a name="headers" /><h2>Headers.</h2> | 
|  | 229 |  | 
|  | 230 | <p>Commands generally don't have their own headers. If it's common code | 
|  | 231 | it can live in lib/, if it isn't, put it in the command's .c file. (The line | 
|  | 232 | between implementing multiple commands in a C file via OLDTOY() to share | 
|  | 233 | infrastructure and moving that shared infrastructure to lib/ is a judgement | 
|  | 234 | call. Try to figure out which is simplest.)</p> | 
|  | 235 |  | 
|  | 236 | <p>The top level toys.h should #include all the standard (posix) headers | 
|  | 237 | that any command uses. (Partly this is friendly to ccache and partly this | 
|  | 238 | makes the command implementations shorter.) Individual commands should only | 
|  | 239 | need to include nonstandard headers that might prevent that command from | 
|  | 240 | building in some context we'd care about (and thus requiring that command to | 
|  | 241 | be disabled to avoid a build break).</p> | 
|  | 242 |  | 
|  | 243 | <p>Target-specific stuff (differences between compiler versions, libc versions, | 
|  | 244 | or operating systems) should be confined to lib/portability.h and | 
|  | 245 | lib/portability.c. (There's even some minimal compile-time environment probing | 
|  | 246 | that writes data to generated/portability.h, see scripts/genconfig.sh.)</p> | 
|  | 247 |  | 
|  | 248 | <p>Only include linux/*.h headers from individual commands (not from other | 
|  | 249 | headers), and only if you really need to. Data that varies per architecture | 
|  | 250 | is a good reason to include a header. If you just need a couple constants | 
|  | 251 | that haven't changed since the 1990's, it's ok to #define them yourself or | 
|  | 252 | just use the constant inline with a comment explaining what it is. (A | 
|  | 253 | #define that's only used once isn't really helping.)</p> | 
|  | 254 |  | 
| Rob Landley | 7c04f01 | 2008-01-20 19:00:16 -0600 | [diff] [blame] | 255 | <p><a name="top" /><h2>Top level directory.</h2></p> | 
|  | 256 |  | 
| Rob Landley | 66a69d9 | 2012-01-16 01:44:17 -0600 | [diff] [blame] | 257 | <p>This directory contains global infrastructure.</p> | 
|  | 258 |  | 
|  | 259 | <h3>toys.h</h3> | 
| Rob Landley | c0e56ed | 2012-10-08 00:02:30 -0500 | [diff] [blame] | 260 | <p>Each command #includes "toys.h" as part of its standard prolog. It | 
|  | 261 | may "#define FOR_commandname" before doing so to get some extra entries | 
|  | 262 | specific to this command.</p> | 
| Rob Landley | 66a69d9 | 2012-01-16 01:44:17 -0600 | [diff] [blame] | 263 |  | 
|  | 264 | <p>This file sucks in most of the commonly used standard #includes, so | 
|  | 265 | individual files can just #include "toys.h" and not have to worry about | 
|  | 266 | stdarg.h and so on.  Individual commands still need to #include | 
|  | 267 | special-purpose headers that may not be present on all systems (and thus would | 
|  | 268 | prevent toybox from building that command on such a system with that command | 
|  | 269 | enabled).  Examples include regex support, any "linux/" or "asm/" headers, mtab | 
|  | 270 | support (mntent.h and sys/mount.h), and so on.</p> | 
|  | 271 |  | 
|  | 272 | <p>The toys.h header also defines structures for most of the global variables | 
|  | 273 | provided to each command by toybox_main().  These are described in | 
|  | 274 | detail in the description for main.c, where they are initialized.</p> | 
|  | 275 |  | 
|  | 276 | <p>The global variables are grouped into structures (and a union) for space | 
|  | 277 | savings, to more easily track the amount of memory consumed by them, | 
|  | 278 | so that they may be automatically cleared/initialized as needed, and so | 
|  | 279 | that access to global variables is more easily distinguished from access to | 
|  | 280 | local variables.</p> | 
| Rob Landley | 4e68de1 | 2007-12-13 07:00:27 -0600 | [diff] [blame] | 281 |  | 
|  | 282 | <h3>main.c</h3> | 
|  | 283 | <p>Contains the main() function where execution starts, plus | 
|  | 284 | common infrastructure to initialize global variables and select which command | 
| Rob Landley | 6882ee8 | 2008-02-12 18:41:34 -0600 | [diff] [blame] | 285 | to run.  The "toybox" multiplexer command also lives here.  (This is the | 
| Rob Landley | 7c04f01 | 2008-01-20 19:00:16 -0600 | [diff] [blame] | 286 | only command defined outside of the toys directory.)</p> | 
| Rob Landley | 4e68de1 | 2007-12-13 07:00:27 -0600 | [diff] [blame] | 287 |  | 
| Rob Landley | 6882ee8 | 2008-02-12 18:41:34 -0600 | [diff] [blame] | 288 | <p>Execution starts in main() which trims any path off of the first command | 
|  | 289 | name and calls toybox_main(), which calls toy_exec(), which calls toy_find() | 
| Rob Landley | 66a69d9 | 2012-01-16 01:44:17 -0600 | [diff] [blame] | 290 | and toy_init() before calling the appropriate command's function from | 
|  | 291 | toy_list[] (via toys.which->toy_main()). | 
| Rob Landley | 6882ee8 | 2008-02-12 18:41:34 -0600 | [diff] [blame] | 292 | If the command is "toybox", execution recurses into toybox_main(), otherwise | 
|  | 293 | the call goes to the appropriate commandname_main() from a C file in the toys | 
|  | 294 | directory.</p> | 
| Rob Landley | 4e68de1 | 2007-12-13 07:00:27 -0600 | [diff] [blame] | 295 |  | 
| Rob Landley | 6882ee8 | 2008-02-12 18:41:34 -0600 | [diff] [blame] | 296 | <p>The following global variables are defined in main.c:</p> | 
| Rob Landley | 4e68de1 | 2007-12-13 07:00:27 -0600 | [diff] [blame] | 297 | <ul> | 
| Rob Landley | 66a69d9 | 2012-01-16 01:44:17 -0600 | [diff] [blame] | 298 | <a name="toy_list" /> | 
|  | 299 | <li><p><b>struct toy_list toy_list[]</b> - array describing all the | 
| Rob Landley | 4e68de1 | 2007-12-13 07:00:27 -0600 | [diff] [blame] | 300 | commands currently configured into toybox.  The first entry (toy_list[0]) is | 
|  | 301 | for the "toybox" multiplexer command, which runs all the other built-in commands | 
|  | 302 | without symlinks by using its first argument as the name of the command to | 
|  | 303 | run and the rest as that command's argument list (ala "./toybox echo hello"). | 
|  | 304 | The remaining entries are the commands in alphabetical order (for efficient | 
|  | 305 | binary search).</p> | 
|  | 306 |  | 
|  | 307 | <p>This is a read-only array initialized at compile time by | 
| Rob Landley | 6882ee8 | 2008-02-12 18:41:34 -0600 | [diff] [blame] | 308 | defining macros and #including generated/newtoys.h.</p> | 
| Rob Landley | 4e68de1 | 2007-12-13 07:00:27 -0600 | [diff] [blame] | 309 |  | 
| Rob Landley | 66a69d9 | 2012-01-16 01:44:17 -0600 | [diff] [blame] | 310 | <p>Members of struct toy_list (defined in "toys.h") include:</p> | 
| Rob Landley | 4e68de1 | 2007-12-13 07:00:27 -0600 | [diff] [blame] | 311 | <ul> | 
|  | 312 | <li><p>char *<b>name</b> - the name of this command.</p></li> | 
|  | 313 | <li><p>void (*<b>toy_main</b>)(void) - function pointer to run this | 
|  | 314 | command.</p></li> | 
|  | 315 | <li><p>char *<b>options</b> - command line option string (used by | 
|  | 316 | get_optflags() in lib/args.c to initialize toys.optflags, toys.optargs, and | 
| Rob Landley | c0e56ed | 2012-10-08 00:02:30 -0500 | [diff] [blame] | 317 | entries in the toy's GLOBALS struct).  When this is NULL, no option | 
| Rob Landley | 66a69d9 | 2012-01-16 01:44:17 -0600 | [diff] [blame] | 318 | parsing is done before calling toy_main().</p></li> | 
| Rob Landley | 6882ee8 | 2008-02-12 18:41:34 -0600 | [diff] [blame] | 319 | <li><p>int <b>flags</b> - Behavior flags for this command.  The following flags are currently understood:</p> | 
|  | 320 |  | 
|  | 321 | <ul> | 
|  | 322 | <li><b>TOYFLAG_USR</b> - Install this command under /usr</li> | 
|  | 323 | <li><b>TOYFLAG_BIN</b> - Install this command under /bin</li> | 
|  | 324 | <li><b>TOYFLAG_SBIN</b> - Install this command under /sbin</li> | 
|  | 325 | <li><b>TOYFLAG_NOFORK</b> - This command can be used as a shell builtin.</li> | 
|  | 326 | <li><b>TOYFLAG_UMASK</b> - Call umask(0) before running this command.</li> | 
| Rob Landley | 66a69d9 | 2012-01-16 01:44:17 -0600 | [diff] [blame] | 327 | <li><b>TOYFLAG_STAYROOT</b> - Don't drop permissions for this command if toybox is installed SUID root.</li> | 
|  | 328 | <li><b>TOYFLAG_NEEDROOT</b> - This command cannot function unless run with root access.</li> | 
| Rob Landley | 6882ee8 | 2008-02-12 18:41:34 -0600 | [diff] [blame] | 329 | </ul> | 
|  | 330 | <br> | 
|  | 331 |  | 
|  | 332 | <p>These flags are combined with | (or).  For example, to install a command | 
|  | 333 | in /usr/bin, or together TOYFLAG_USR|TOYFLAG_BIN.</p> | 
|  | 334 | </ul> | 
| Rob Landley | 4e68de1 | 2007-12-13 07:00:27 -0600 | [diff] [blame] | 335 | </li> | 
|  | 336 |  | 
| Rob Landley | 66a69d9 | 2012-01-16 01:44:17 -0600 | [diff] [blame] | 337 | <li><p><b>struct toy_context toys</b> - global structure containing information | 
|  | 338 | common to all commands, initialized by toy_init() and defined in "toys.h". | 
|  | 339 | Members of this structure include:</p> | 
| Rob Landley | 4e68de1 | 2007-12-13 07:00:27 -0600 | [diff] [blame] | 340 | <ul> | 
|  | 341 | <li><p>struct toy_list *<b>which</b> - a pointer to this command's toy_list | 
|  | 342 | structure.  Mostly used to grab the name of the running command | 
|  | 343 | (toys.which->name).</p> | 
|  | 344 | </li> | 
|  | 345 | <li><p>int <b>exitval</b> - Exit value of this command.  Defaults to zero.  The | 
|  | 346 | error_exit() functions will return 1 if this is zero, otherwise they'll | 
|  | 347 | return this value.</p></li> | 
|  | 348 | <li><p>char **<b>argv</b> - "raw" command line options, I.E. the original | 
|  | 349 | unmodified string array passed in to main().  Note that modifying this changes | 
| Rob Landley | 66a69d9 | 2012-01-16 01:44:17 -0600 | [diff] [blame] | 350 | "ps" output, and is not recommended.  This array is null terminated; a NULL | 
|  | 351 | entry indicates the end of the array.</p> | 
| Rob Landley | 4e68de1 | 2007-12-13 07:00:27 -0600 | [diff] [blame] | 352 | <p>Most commands don't use this field, instead they use optargs, optflags, | 
| Rob Landley | c0e56ed | 2012-10-08 00:02:30 -0500 | [diff] [blame] | 353 | and the fields in the GLOBALS struct initialized by get_optflags().</p> | 
| Rob Landley | 4e68de1 | 2007-12-13 07:00:27 -0600 | [diff] [blame] | 354 | </li> | 
|  | 355 | <li><p>unsigned <b>optflags</b> - Command line option flags, set by | 
| Rob Landley | 66a69d9 | 2012-01-16 01:44:17 -0600 | [diff] [blame] | 356 | <a href="#lib_args">get_optflags()</a>.  Indicates which of the command line options listed in | 
| Rob Landley | 6882ee8 | 2008-02-12 18:41:34 -0600 | [diff] [blame] | 357 | toys.which->options occurred this time.</p> | 
|  | 358 |  | 
|  | 359 | <p>The rightmost command line argument listed in toys.which->options sets the | 
|  | 360 | lowest bit (1), the next one sets the next bit (2), and so on.  This means the bits are set in the same | 
|  | 361 | order the binary digits would be listed if typed out as a string.  For example, | 
|  | 362 | the option string "abcd" would parse the command line "-c" to set optflags to 2, | 
|  | 363 | "-a" would set optflags to 8, and "-bd" would set optflags to 5 (4|1).</p> | 
|  | 364 |  | 
|  | 365 | <p>Only letters are relevant to optflags.  In the string "a*b:c#d", d=1, c=2, | 
| Rob Landley | c0e56ed | 2012-10-08 00:02:30 -0500 | [diff] [blame] | 366 | b=4, a=8.  Punctuation after a letter initializes global variables at the | 
|  | 367 | start of the GLOBALS() block (see <a href="#toy_union">union toy_union this</a> | 
|  | 368 | for details).</p> | 
|  | 369 |  | 
|  | 370 | <p>The build infrastructure creates FLAG_ macros for each option letter, | 
|  | 371 | corresponding to the bit position, so you can check (toys.optflags & FLAG_x) | 
|  | 372 | to see if a flag was specified. (The correct set of FLAG_ macros is selected | 
|  | 373 | by defining FOR_mycommand before #including toys.h. The macros live in | 
|  | 374 | toys/globals.h which is generated by scripts/make.sh.)</p> | 
| Rob Landley | 6882ee8 | 2008-02-12 18:41:34 -0600 | [diff] [blame] | 375 |  | 
| Rob Landley | 66a69d9 | 2012-01-16 01:44:17 -0600 | [diff] [blame] | 376 | <p>For more information on option parsing, see <a href="#lib_args">get_optflags()</a>.</p> | 
| Rob Landley | 6882ee8 | 2008-02-12 18:41:34 -0600 | [diff] [blame] | 377 |  | 
|  | 378 | </li> | 
| Rob Landley | 4e68de1 | 2007-12-13 07:00:27 -0600 | [diff] [blame] | 379 | <li><p>char **<b>optargs</b> - Null terminated array of arguments left over | 
|  | 380 | after get_optflags() removed all the ones it understood.  Note: optargs[0] is | 
|  | 381 | the first argument, not the command name.  Use toys.which->name for the command | 
|  | 382 | name.</p></li> | 
| Rob Landley | 6882ee8 | 2008-02-12 18:41:34 -0600 | [diff] [blame] | 383 | <li><p>int <b>optc</b> - Optarg count, equivalent to argc but for | 
|  | 384 | optargs[].</p></li> | 
| Rob Landley | 4e68de1 | 2007-12-13 07:00:27 -0600 | [diff] [blame] | 385 | <li><p>int <b>exithelp</b> - Whether error_exit() should print a usage message | 
|  | 386 | via help_main() before exiting.  (True during option parsing, defaults to | 
|  | 387 | false afterwards.)</p></li> | 
| Rob Landley | 66a69d9 | 2012-01-16 01:44:17 -0600 | [diff] [blame] | 388 | </ul> | 
| Rob Landley | 4e68de1 | 2007-12-13 07:00:27 -0600 | [diff] [blame] | 389 |  | 
| Rob Landley | c0e56ed | 2012-10-08 00:02:30 -0500 | [diff] [blame] | 390 | <a name="toy_union" /> | 
| Rob Landley | 66a69d9 | 2012-01-16 01:44:17 -0600 | [diff] [blame] | 391 | <li><p><b>union toy_union this</b> - Union of structures containing each | 
| Rob Landley | 4e68de1 | 2007-12-13 07:00:27 -0600 | [diff] [blame] | 392 | command's global variables.</p> | 
|  | 393 |  | 
| Rob Landley | 6882ee8 | 2008-02-12 18:41:34 -0600 | [diff] [blame] | 394 | <p>Global variables are useful: they reduce the overhead of passing extra | 
|  | 395 | command line arguments between functions, they conveniently start prezeroed to | 
|  | 396 | save initialization costs, and the command line argument parsing infrastructure | 
|  | 397 | can also initialize global variables with its results.</p> | 
|  | 398 |  | 
|  | 399 | <p>But since each toybox process can only run one command at a time, allocating | 
|  | 400 | space for global variables belonging to other commands you aren't currently | 
|  | 401 | running would be wasteful.</p> | 
|  | 402 |  | 
|  | 403 | <p>Toybox handles this by encapsulating each command's global variables in | 
| Rob Landley | 66a69d9 | 2012-01-16 01:44:17 -0600 | [diff] [blame] | 404 | a structure, and declaring a union of those structures with a single global | 
| Rob Landley | c0e56ed | 2012-10-08 00:02:30 -0500 | [diff] [blame] | 405 | instance (called "this").  The GLOBALS() macro contains the global | 
| Rob Landley | 66a69d9 | 2012-01-16 01:44:17 -0600 | [diff] [blame] | 406 | variables that should go in the current command's global structure.  Each | 
|  | 407 | variable can then be accessed as "this.commandname.varname". | 
| Rob Landley | c0e56ed | 2012-10-08 00:02:30 -0500 | [diff] [blame] | 408 | If you #defined FOR_commandname before including toys.h, the macro TT is | 
|  | 409 | #defined to this.commandname so the variable can then be accessed as | 
|  | 410 | "TT.variable".  See toys/other/hello.c for an example.</p> | 
| Rob Landley | 6882ee8 | 2008-02-12 18:41:34 -0600 | [diff] [blame] | 411 |  | 
| Rob Landley | 66a69d9 | 2012-01-16 01:44:17 -0600 | [diff] [blame] | 412 | <p>A command that needs global variables should declare a structure to | 
| Rob Landley | 4e68de1 | 2007-12-13 07:00:27 -0600 | [diff] [blame] | 413 | contain them all, and add that structure to this union.  A command should never | 
|  | 414 | declare global variables outside of this, because such global variables would | 
|  | 415 | allocate memory when running other commands that don't use those global | 
|  | 416 | variables.</p> | 
|  | 417 |  | 
| Rob Landley | 66a69d9 | 2012-01-16 01:44:17 -0600 | [diff] [blame] | 418 | <p>The first few fields of this structure can be initialized by <a href="#lib_args">get_optflags()</a>, | 
| Rob Landley | 4e68de1 | 2007-12-13 07:00:27 -0600 | [diff] [blame] | 419 | as specified by the options field of this command's toy_list entry.  See | 
|  | 420 | the get_optflags() description in lib/args.c for details.</p> | 
|  | 421 | </li> | 
|  | 422 |  | 
| Rob Landley | 81b899d | 2007-12-18 02:02:47 -0600 | [diff] [blame] | 423 | <li><b>char toybuf[4096]</b> - a common scratch space buffer so | 
| Rob Landley | 4e68de1 | 2007-12-13 07:00:27 -0600 | [diff] [blame] | 424 | commands don't need to allocate their own.  Any command is free to use this, | 
|  | 425 | and it should never be directly referenced by functions in lib/ (although | 
|  | 426 | commands are free to pass toybuf in to a library function as an argument).</li> | 
|  | 427 | </ul> | 
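<p>The union-of-structures idea described above can be sketched in a few
lines. This is only an illustration, not the code toys.h actually generates:
the struct names, the members, and the instance name this_sketch are all
made up, and TT is defined by hand rather than by FOR_commandname.</p>

```c
/* Hypothetical per-command global blocks (names invented for this sketch). */
struct hello_data {
  char *name;     /* the kind of field option parsing could fill in */
  long count;
};

struct cat_data {
  long lines_seen;
  char scratch[64];
};

/* One union with one instance: it is only as large as its biggest member,
 * so commands that aren't running cost no extra space. */
union toy_union_sketch {
  struct hello_data hello;
  struct cat_data cat;
} this_sketch;

/* A command's source file selects its own member. Here that is done by
 * hand; the text above says toys.h does it when FOR_commandname is set. */
#define TT this_sketch.hello

/* Globals start out zeroed, so the first call returns 1. */
long hello_bump(char *name)
{
  TT.name = name;
  return ++TT.count;
}
```

<p>Because every command shares the same storage, only one command's view
of the union is meaningful in a given process.</p>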
|  | 428 |  | 
| Rob Landley | 6882ee8 | 2008-02-12 18:41:34 -0600 | [diff] [blame] | 429 | <p>The following functions are defined in main.c:</p> | 
| Rob Landley | 4e68de1 | 2007-12-13 07:00:27 -0600 | [diff] [blame] | 430 | <ul> | 
|  | 431 | <li><p>struct toy_list *<b>toy_find</b>(char *name) - Return the toy_list | 
|  | 432 | structure for this command name, or NULL if not found.</p></li> | 
| Rob Landley | 81b899d | 2007-12-18 02:02:47 -0600 | [diff] [blame] | 433 | <li><p>void <b>toy_init</b>(struct toy_list *which, char *argv[]) - fill out | 
|  | 434 | the global toys structure, calling get_optflags() if necessary.</p></li> | 
| Rob Landley | 6882ee8 | 2008-02-12 18:41:34 -0600 | [diff] [blame] | 435 | <li><p>void <b>toy_exec</b>(char *argv[]) - Run a built-in command with | 
|  | 436 | arguments.</p> | 
|  | 437 | <p>Calls toy_find() on argv[0] (which must be just a command name | 
| Rob Landley | 4e68de1 | 2007-12-13 07:00:27 -0600 | [diff] [blame] | 438 | without path).  Returns if it can't find this command, otherwise calls | 
| Rob Landley | 6882ee8 | 2008-02-12 18:41:34 -0600 | [diff] [blame] | 439 | toy_init(), toys.which->toy_main(), and exit() instead of returning.</p> | 
| Rob Landley | 4e68de1 | 2007-12-13 07:00:27 -0600 | [diff] [blame] | 440 |  | 
| Rob Landley | 6882ee8 | 2008-02-12 18:41:34 -0600 | [diff] [blame] | 441 | <p>Use the library function xexec() to fall back to external executables | 
|  | 442 | in $PATH if toy_exec() can't find a built-in command.  Note that toy_exec() | 
|  | 443 | does not strip paths before searching for a command, so "./command" will | 
|  | 444 | never match an internal command.</p></li> | 
|  | 445 |  | 
|  | 446 | <li><p>void <b>toybox_main</b>(void) - the main function for the multiplexer | 
|  | 447 | command (I.E. "toybox").  Given a command name as its first argument, calls | 
|  | 448 | toy_exec() on its arguments.  With no arguments, it lists available commands. | 
|  | 449 | If the first argument starts with "-" it lists each command with its default | 
|  | 450 | install path prepended.</p></li> | 
| Rob Landley | 4e68de1 | 2007-12-13 07:00:27 -0600 | [diff] [blame] | 451 |  | 
|  | 452 | </ul> | 
|  | 453 |  | 
|  | 454 | <h3>Config.in</h3> | 
|  | 455 |  | 
|  | 456 | <p>Top level configuration file in a stylized variant of | 
| Rob Landley | 6882ee8 | 2008-02-12 18:41:34 -0600 | [diff] [blame] | 457 | <a href=http://kernel.org/doc/Documentation/kbuild/kconfig-language.txt>kconfig</a> format.  Includes generated/Config.in.</p> | 
| Rob Landley | 4e68de1 | 2007-12-13 07:00:27 -0600 | [diff] [blame] | 458 |  | 
|  | 459 | <p>These files are directly used by "make menuconfig" to select which commands | 
|  | 460 | to build into toybox (thus generating a .config file), and by | 
| Rob Landley | 6882ee8 | 2008-02-12 18:41:34 -0600 | [diff] [blame] | 461 | scripts/config2help.py to create generated/help.h.</p> | 
| Rob Landley | 4e68de1 | 2007-12-13 07:00:27 -0600 | [diff] [blame] | 462 |  | 
|  | 463 | <h3>Temporary files:</h3> | 
|  | 464 |  | 
| Rob Landley | 6882ee8 | 2008-02-12 18:41:34 -0600 | [diff] [blame] | 465 | <p>There is one temporary file in the top level source directory:</p> | 
| Rob Landley | 4e68de1 | 2007-12-13 07:00:27 -0600 | [diff] [blame] | 466 | <ul> | 
|  | 467 | <li><p><b>.config</b> - Configuration file generated by kconfig, indicating | 
|  | 468 | which commands (and options to commands) are currently enabled.  Used | 
| Rob Landley | 6882ee8 | 2008-02-12 18:41:34 -0600 | [diff] [blame] | 469 | to make generated/config.h and determine which toys/*.c files to build.</p> | 
| Rob Landley | 4e68de1 | 2007-12-13 07:00:27 -0600 | [diff] [blame] | 470 |  | 
| Rob Landley | 6882ee8 | 2008-02-12 18:41:34 -0600 | [diff] [blame] | 471 | <p>You can create a human readable "miniconfig" version of this file using | 
| Rob Landley | 66a69d9 | 2012-01-16 01:44:17 -0600 | [diff] [blame] | 472 | <a href=http://landley.net/aboriginal/new_platform.html#miniconfig>these | 
| Rob Landley | 6882ee8 | 2008-02-12 18:41:34 -0600 | [diff] [blame] | 473 | instructions</a>.</p> | 
|  | 474 | </li> | 
|  | 475 | </ul> | 
|  | 476 |  | 
| Rob Landley | e7c9a6d | 2012-02-28 06:34:09 -0600 | [diff] [blame] | 477 | <a name="generated" /> | 
| Rob Landley | 6882ee8 | 2008-02-12 18:41:34 -0600 | [diff] [blame] | 478 | <p>The "generated/" directory contains files generated from other source code | 
|  | 479 | in toybox.  All of these files can be recreated by the build system, although | 
|  | 480 | some (such as generated/help.h) are shipped in release versions to reduce | 
|  | 481 | environmental dependencies (I.E. so you don't need python on your build | 
|  | 482 | system).</p> | 
|  | 483 |  | 
|  | 484 | <ul> | 
|  | 485 | <li><p><b>generated/config.h</b> - list of CFG_SYMBOL and USE_SYMBOL() macros, | 
| Rob Landley | 4e68de1 | 2007-12-13 07:00:27 -0600 | [diff] [blame] | 486 | generated from .config by a sed invocation in the top level Makefile.</p> | 
|  | 487 |  | 
|  | 488 | <p>CFG_SYMBOL is a compile time constant set to 1 for enabled symbols and 0 for | 
| Rob Landley | 6882ee8 | 2008-02-12 18:41:34 -0600 | [diff] [blame] | 489 | disabled symbols.  This allows the use of normal if() statements to remove | 
|  | 490 | code at compile time via the optimizer's dead code elimination (which removes | 
|  | 491 | from the binary any code that cannot be reached).  This saves space without | 
| Rob Landley | 4e68de1 | 2007-12-13 07:00:27 -0600 | [diff] [blame] | 492 | cluttering the code with #ifdefs or leading to configuration dependent build | 
|  | 493 | breaks.  (See the 1992 Usenix paper | 
| Rob Landley | b6063de | 2012-01-29 13:54:13 -0600 | [diff] [blame] | 494 | <a href=http://doc.cat-v.org/henry_spencer/ifdef_considered_harmful.pdf>#ifdef | 
| Rob Landley | 4e68de1 | 2007-12-13 07:00:27 -0600 | [diff] [blame] | 495 | Considered Harmful</a> for more information.)</p> | 
|  | 496 |  | 
|  | 497 | <p>USE_SYMBOL(code) evaluates to the code in parentheses when the symbol | 
|  | 498 | is enabled, and nothing when the symbol is disabled.  This can be used | 
|  | 499 | for things like varargs or variable declarations which can't always be | 
| Rob Landley | 6882ee8 | 2008-02-12 18:41:34 -0600 | [diff] [blame] | 500 | eliminated by a simple test on CFG_SYMBOL.  Note that | 
| Rob Landley | 4e68de1 | 2007-12-13 07:00:27 -0600 | [diff] [blame] | 501 | (unlike CFG_SYMBOL) this is really just a variant of #ifdef, and can | 
|  | 502 | still result in configuration dependent build breaks.  Use with caution.</p> | 
|  | 503 | </li> | 
|  | 504 | </ul> | 
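<p>A minimal sketch of how the two macro families behave, assuming one
enabled symbol and one disabled symbol. FEATURE_A and FEATURE_B are
hypothetical names, and the macro definitions are written by hand here as a
guess at what a generated config.h could contain.</p>

```c
/* Assumed contents of a generated config.h: FEATURE_A is enabled and
 * FEATURE_B is disabled. Both symbol names are hypothetical. */
#define CFG_FEATURE_A 1
#define USE_FEATURE_A(...) __VA_ARGS__
#define CFG_FEATURE_B 0
#define USE_FEATURE_B(...)

int feature_score(void)
{
  int score = 0;

  /* Ordinary if(): both statements are compiled and type checked, then
   * the optimizer can discard the one that can never run. */
  if (CFG_FEATURE_A) score += 1;
  if (CFG_FEATURE_B) score += 10;

  /* USE_ macros: the text either appears or vanishes in the preprocessor,
   * so a disabled branch is never seen by the compiler proper. */
  USE_FEATURE_A(score += 100;)
  USE_FEATURE_B(score += 1000;)

  return score;
}

/* The same mechanism can splice optional pieces into a string literal. */
const char *optstring_sketch = "ab" USE_FEATURE_A("c:") USE_FEATURE_B("d#") "e";
```

<p>With these definitions feature_score() adds up only the FEATURE_A pieces,
and the string collapses to "abc:e". A typo inside USE_FEATURE_B() would go
unnoticed until that symbol is enabled, which is the build break risk
mentioned above.</p>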
|  | 505 |  | 
| Rob Landley | 81b899d | 2007-12-18 02:02:47 -0600 | [diff] [blame] | 506 | <p><h2>Directory toys/</h2></p> | 
| Rob Landley | 4e68de1 | 2007-12-13 07:00:27 -0600 | [diff] [blame] | 507 |  | 
|  | 508 | <h3>toys/Config.in</h3> | 
|  | 509 |  | 
|  | 510 | <p>Included from the top level Config.in, contains one or more | 
|  | 511 | configuration entries for each command.</p> | 
|  | 512 |  | 
| Rob Landley | 81b899d | 2007-12-18 02:02:47 -0600 | [diff] [blame] | 513 | <p>Each command has a configuration entry matching the command name (although | 
|  | 514 | configuration symbols are uppercase and command names are lower case). | 
| Rob Landley | 4e68de1 | 2007-12-13 07:00:27 -0600 | [diff] [blame] | 515 | Options to commands start with the command name followed by an underscore and | 
| Rob Landley | 66a69d9 | 2012-01-16 01:44:17 -0600 | [diff] [blame] | 516 | the option name.  Global options are attached to the "toybox" command, | 
| Rob Landley | 4e68de1 | 2007-12-13 07:00:27 -0600 | [diff] [blame] | 517 | and thus use the prefix "TOYBOX_".  This organization is used by | 
| Rob Landley | 81b899d | 2007-12-18 02:02:47 -0600 | [diff] [blame] | 518 | scripts/cfg2files to select which toys/*.c files to compile for a given | 
|  | 519 | .config.</p> | 
| Rob Landley | 4e68de1 | 2007-12-13 07:00:27 -0600 | [diff] [blame] | 520 |  | 
| Rob Landley | 66a69d9 | 2012-01-16 01:44:17 -0600 | [diff] [blame] | 521 | <p>A command with multiple names (or multiple similar commands implemented in | 
| Rob Landley | 4e68de1 | 2007-12-13 07:00:27 -0600 | [diff] [blame] | 522 | the same .c file) should have config symbols prefixed with the name of their | 
|  | 523 | C file.  I.E. config symbol prefixes are NEWTOY() names.  If OLDTOY() names | 
|  | 524 | have config symbols they're options (symbols with an underscore and suffix) | 
|  | 525 | to the NEWTOY() name.  (See toys/toylist.h)</p> | 
|  | 526 |  | 
|  | 527 | <h3>toys/toylist.h</h3> | 
| Rob Landley | 81b899d | 2007-12-18 02:02:47 -0600 | [diff] [blame] | 528 | <p>The first half of this file prototypes all the structures to hold | 
| Rob Landley | da09b7f | 2007-12-20 06:29:59 -0600 | [diff] [blame] | 529 | global variables for each command, and puts them in toy_union.  These | 
|  | 530 | prototypes are only included if the macro NEWTOY isn't defined (in which | 
|  | 531 | case NEWTOY is defined to a default value that produces function | 
|  | 532 | prototypes).</p> | 
| Rob Landley | 81b899d | 2007-12-18 02:02:47 -0600 | [diff] [blame] | 533 |  | 
| Rob Landley | da09b7f | 2007-12-20 06:29:59 -0600 | [diff] [blame] | 534 | <p>The second half of this file lists all the commands in alphabetical | 
|  | 535 | order, along with their command line arguments and install location. | 
|  | 536 | Each command has an appropriate configuration guard so only the commands that | 
|  | 537 | are enabled wind up in the list.</p> | 
|  | 538 |  | 
|  | 539 | <p>The first time this header is #included, it defines structures and | 
|  | 540 | produces function prototypes for the commands in the toys directory.</p> | 
|  | 541 |  | 
|  | 542 |  | 
|  | 543 | <p>This header | 
| Rob Landley | 81b899d | 2007-12-18 02:02:47 -0600 | [diff] [blame] | 546 | is also used to initialize toy_list in main.c, and later in that file to initialize | 
|  | 547 | NEED_OPTIONS (to figure out whether the command line parsing logic is needed), | 
|  | 548 | and to put the help entries in the right order in toys/help.c.</p> | 
| Rob Landley | 4e68de1 | 2007-12-13 07:00:27 -0600 | [diff] [blame] | 549 |  | 
|  | 550 | <h3>toys/help.h</h3> | 
|  | 551 |  | 
|  | 552 | <p>#defines two help text strings for each command: a single line | 
|  | 553 | command_help and an additional command_help_long.  This is used by help_main() | 
|  | 554 | in toys/help.c to display help for commands.</p> | 
|  | 555 |  | 
|  | 556 | <p>Although this file is generated from Config.in help entries by | 
|  | 557 | scripts/config2help.py, it's shipped in release tarballs so you don't need | 
|  | 558 | python on the build system.  (If you check code out of source control, or | 
|  | 559 | modify Config.in, then you'll need python installed to rebuild it.)</p> | 
|  | 560 |  | 
|  | 561 | <p>This file contains help for all commands, regardless of current | 
|  | 562 | configuration, but only the currently enabled ones are entered into help_data[] | 
|  | 563 | in toys/help.c.</p> | 
|  | 564 |  | 
| Rob Landley | 137bf34 | 2012-03-09 08:33:57 -0600 | [diff] [blame] | 565 | <a name="lib" /> | 
| Rob Landley | 81b899d | 2007-12-18 02:02:47 -0600 | [diff] [blame] | 566 | <h2>Directory lib/</h2> | 
| Rob Landley | 4e68de1 | 2007-12-13 07:00:27 -0600 | [diff] [blame] | 567 |  | 
| Rob Landley | 137bf34 | 2012-03-09 08:33:57 -0600 | [diff] [blame] | 568 | <p>TODO: document lots more here.</p> | 
|  | 569 |  | 
| Rob Landley | c8d0da5 | 2012-07-15 17:47:08 -0500 | [diff] [blame] | 570 | <p>lib: getmountlist(), error_msg/error_exit, xmalloc(), | 
| Rob Landley | 7c04f01 | 2008-01-20 19:00:16 -0600 | [diff] [blame] | 571 | strlcpy(), xexec(), xopen()/xread(), xgetcwd(), xabspath(), find_in_path(), | 
|  | 572 | itoa().</p> | 
|  | 573 |  | 
| Rob Landley | 137bf34 | 2012-03-09 08:33:57 -0600 | [diff] [blame] | 574 | <h3>lib/portability.h</h3> | 
|  | 575 |  | 
|  | 576 | <p>This file is automatically included from the top of toys.h, and smooths | 
|  | 577 | over differences between platforms (hardware targets, compilers, C libraries, | 
|  | 578 | operating systems, etc).</p> | 
|  | 579 |  | 
|  | 580 | <p>This file provides SWAP macros (SWAP_BE16(x) and SWAP_LE32(x) and so on).</p> | 
|  | 581 |  | 
|  | 582 | <p>A macro like SWAP_LE32(x) means "The value in x is stored as a little | 
|  | 583 | endian 32 bit value, so perform the translation to/from whatever the native | 
|  | 584 | 32-bit format is".  You do the swap once on the way in, and once on the way | 
|  | 585 | out. If your target is already little endian, the macro is a NOP.</p> | 
|  | 586 |  | 
|  | 587 | <p>The SWAP macros come in BE and LE each with 16, 32, and 64 bit versions. | 
|  | 588 | In each case, the name of the macro refers to the _external_ representation, | 
|  | 589 | and converts to/from whatever your native representation happens to be (which | 
|  | 590 | can vary depending on what you're currently compiling for).</p> | 
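<p>One way such macros could be built is sketched below. This is not the
contents of lib/portability.h: the SKETCH_ prefix, the flip helpers and
load_le32() are invented for illustration, and the endianness test assumes
the __BYTE_ORDER__ macros that GCC and Clang predefine.</p>

```c
#include <stdint.h>
#include <string.h>

/* Unconditional byte reversal helpers (hypothetical names). */
static inline uint16_t flip16(uint16_t x)
{
  return (uint16_t)((x >> 8) | (x << 8));
}

static inline uint32_t flip32(uint32_t x)
{
  return (x >> 24) | ((x >> 8) & 0xff00) | ((x << 8) & 0xff0000) | (x << 24);
}

/* The macro name describes the external data. Assumption: __BYTE_ORDER__
 * is predefined as in GCC and Clang; if it is missing this sketch falls
 * back to treating the host as little endian. */
#if defined(__BYTE_ORDER__) && __BYTE_ORDER__ == __ORDER_BIG_ENDIAN__
#define SKETCH_SWAP_BE16(x) (x)
#define SKETCH_SWAP_BE32(x) (x)
#define SKETCH_SWAP_LE16(x) flip16(x)
#define SKETCH_SWAP_LE32(x) flip32(x)
#else
#define SKETCH_SWAP_BE16(x) flip16(x)
#define SKETCH_SWAP_BE32(x) flip32(x)
#define SKETCH_SWAP_LE16(x) (x)
#define SKETCH_SWAP_LE32(x) (x)
#endif

/* Read a little endian 32 bit field out of a file or network buffer. */
uint32_t load_le32(const void *p)
{
  uint32_t v;

  memcpy(&v, p, sizeof(v));     /* raw bytes, still in external order */
  return SKETCH_SWAP_LE32(v);   /* now a native value */
}
```

<p>The same macro is applied again before writing the value back out, which
is why there is no separate "to" and "from" variant.</p>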
|  | 591 |  | 
| Rob Landley | c8d0da5 | 2012-07-15 17:47:08 -0500 | [diff] [blame] | 592 | <a name="lib_llist" /><h3>lib/llist.c</h3> | 
|  | 593 |  | 
|  | 594 | <p>Some generic singly and doubly linked list functions, which take | 
|  | 595 | advantage of a couple properties of C:</p> | 
|  | 596 |  | 
|  | 597 | <ul> | 
|  | 598 | <li><p>Structure elements are laid out in memory in the order listed, and | 
|  | 599 | the first element has no padding. This means you can always treat (typecast) | 
|  | 600 | a pointer to a structure as a pointer to the first element of the structure, | 
|  | 601 | even if you don't know anything about the data following it.</p></li> | 
|  | 602 |  | 
|  | 603 | <li><p>An array of length zero at the end of a structure adds no space | 
|  | 604 | to the sizeof() the structure, but if you calculate how much extra space | 
|  | 605 | you want when you malloc() the structure it will be available at the end. | 
|  | 606 | Since C has no bounds checking, this means each struct can have one variable | 
|  | 607 | length array.</p></li> | 
|  | 608 | </ul> | 
|  | 609 |  | 
|  | 610 | <p>Toybox's list structures always have their <b>next</b> pointer as | 
|  | 611 | the first entry of each struct, and singly linked lists end with a NULL pointer. | 
|  | 612 | This allows generic code to traverse such lists without knowing anything | 
|  | 613 | else about the specific structs composing them: if your pointer isn't NULL | 
|  | 614 | typecast it to void ** and dereference once to get the next entry.</p> | 
|  | 615 |  | 
|  | 616 | <p><b>lib/lib.h</b> defines three structure types:</p> | 
|  | 617 | <ul> | 
|  | 618 | <li><p><b>struct string_list</b> - stores a single string (<b>char str[0]</b>), | 
|  | 619 | memory for which is allocated as part of the node. (I.E. llist_traverse(list, | 
|  | 620 | free); can clean up after this type of list.)</p></li> | 
|  | 621 |  | 
|  | 622 | <li><p><b>struct arg_list</b> - stores a pointer to a single string | 
|  | 623 | (<b>char *arg</b>) which is stored in a separate chunk of memory.</p></li> | 
|  | 624 |  | 
|  | 625 | <li><p><b>struct double_list</b> - has a second pointer (<b>struct double_list | 
|  | 626 | *prev</b>) along with a <b>char *data</b> for payload.</p></li> | 
|  | 627 | </ul> | 
|  | 628 |  | 
|  | 629 | <b>List Functions</b> | 
|  | 630 |  | 
|  | 631 | <ul> | 
|  | 632 | <li><p>void *<b>llist_pop</b>(void *list) - advances through a list ala | 
|  | 633 | <b>node = llist_pop(&list);</b>  This doesn't modify the list contents, | 
|  | 634 | but does advance the pointer you feed it (which is why you pass the _address_ | 
|  | 635 | of that pointer, not the pointer itself).</p></li> | 
|  | 636 |  | 
|  | 637 | <li><p>void <b>llist_traverse</b>(void *list, void (*using)(void *data)) - | 
|  | 638 | iterate through a list calling a function on each node.</p></li> | 
|  | 639 |  | 
|  | 640 | <li><p>struct double_list *<b>dlist_add</b>(struct double_list **llist, char *data) | 
|  | 641 | - append an entry to a circular linked list. | 
|  | 642 | This function allocates a new struct double_list wrapper and returns the | 
|  | 643 | pointer to the new entry (which you can usually ignore since it's llist->prev, | 
|  | 644 | but if llist was NULL you need it). The argument is the ->data field for the | 
|  | 645 | new node.</p></li> | 
|  | 646 | <ul><li><p>void <b>dlist_add_nomalloc</b>(struct double_list **llist, | 
|  | 647 | struct double_list *new) - append existing struct double_list to | 
|  | 648 | list, does not allocate anything.</p></li></ul> | 
|  | 649 | </ul> | 
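<p>The generic traversal described above can be sketched as follows. These
are stand-ins written for this page, not the functions in lib/llist.c: every
name ending in _sketch is hypothetical, a C99 flexible array member replaces
the zero length array, and memcpy() is used to read the leading next pointer
so the sketch avoids pointer type punning (assuming all object pointers
share one representation).</p>

```c
#include <stdlib.h>
#include <string.h>

/* Node layout as described above: next pointer first, payload after. */
struct string_list_sketch {
  struct string_list_sketch *next;
  char str[];
};

/* Return the current head node and advance the caller's pointer past it.
 * "list" is really the address of a pointer to the first node. */
void *pop_sketch(void *list)
{
  void *node, *next;

  if (!list) return NULL;
  memcpy(&node, list, sizeof(node));   /* node = *list */
  if (!node) return NULL;
  memcpy(&next, node, sizeof(next));   /* next = node->next (first member) */
  memcpy(list, &next, sizeof(next));   /* *list = next */

  return node;
}

/* Call using() on each node. The node is popped before the call, so
 * using() may free it. */
void traverse_sketch(void *list, void (*using)(void *data))
{
  void *node;

  while ((node = pop_sketch(&list))) using(node);
}

/* Helper for building test data: allocate node and string in one chunk
 * and put it at the front of the list. */
struct string_list_sketch *push_sketch(struct string_list_sketch *head,
  const char *s)
{
  struct string_list_sketch *node = malloc(sizeof(*node) + strlen(s) + 1);

  if (!node) abort();
  node->next = head;
  strcpy(node->str, s);

  return node;
}
```

<p>Since the string lives inside the node, traverse_sketch(list, free) is
enough to release a list of this type.</p>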
|  | 650 |  | 
|  | 651 | <b>Trivia questions:</b> | 
|  | 652 |  | 
|  | 653 | <ul> | 
|  | 654 | <li><p><b>Why do arg_list and double_list contain a char * payload instead of | 
|  | 655 | a void *?</b> - Because you always have to typecast a void * to use it, and | 
|  | 656 | typecasting a char * does no harm. Thus having it default to the most common | 
|  | 657 | pointer type saves a few typecasts (strings are the most common payload), | 
|  | 658 | and doesn't hurt anything otherwise.</p> | 
|  | 659 | </li> | 
|  | 660 |  | 
|  | 661 | <li><p><b>Why do the names ->str, ->arg, and ->data differ?</b> - To force | 
|  | 662 | you to keep track of which one you're using: calling free(node->str) would | 
|  | 663 | be bad, and _failing_ to free(node->arg) leaks memory.</p></li> | 
|  | 664 |  | 
|  | 665 | <li><p><b>Why does llist_pop() take a void * instead of void **?</b> - | 
|  | 666 | Because the stupid compiler complains about "type punned pointers" when | 
|  | 667 | you typecast and dereference on the same line, | 
|  | 668 | due to insane FSF developers hardwiring limitations of their optimizer | 
|  | 669 | into gcc's warning system. Since C automatically typecasts any other | 
|  | 670 | pointer _down_ to a void *, the current code works fine. It's sad that it | 
|  | 671 | won't warn you if you forget the &, but the code crashes pretty quickly in | 
|  | 672 | that case.</p></li> | 
|  | 673 |  | 
|  | 674 | <li><p><b>How do I assemble a singly-linked-list in order?</b> - Use | 
|  | 675 | a double_list, dlist_add() your entries, and then break the circle with | 
|  | 676 | <b>list->prev->next = NULL;</b> when done.</p></li> | 
|  | 677 | </ul> | 
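<p>The last answer can be sketched like this. The functions below are
hypothetical stand-ins with the behavior the text describes for dlist_add(),
not the library's own source.</p>

```c
#include <stdlib.h>

struct double_list_sketch {
  struct double_list_sketch *next, *prev;
  char *data;
};

/* Append to a circular list. *list keeps pointing at the first node, and
 * the new node becomes (*list)->prev. */
struct double_list_sketch *dlist_add_sketch(struct double_list_sketch **list,
  char *data)
{
  struct double_list_sketch *node = malloc(sizeof(*node));

  if (!node) abort();
  node->data = data;
  if (*list) {
    node->next = *list;
    node->prev = (*list)->prev;
    (*list)->prev->next = node;
    (*list)->prev = node;
  } else *list = node->next = node->prev = node;

  return node;
}

/* Break the circle so that walking ->next ends at NULL. */
void dlist_terminate_sketch(struct double_list_sketch *list)
{
  if (list) list->prev->next = NULL;
}
```

<p>After dlist_terminate_sketch() the nodes can be walked through ->next in
insertion order like any singly linked list.</p>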
|  | 678 |  | 
| Rob Landley | 66a69d9 | 2012-01-16 01:44:17 -0600 | [diff] [blame] | 679 | <a name="lib_args" /><h3>lib/args.c</h3> | 
| Rob Landley | 7c04f01 | 2008-01-20 19:00:16 -0600 | [diff] [blame] | 680 |  | 
| Rob Landley | 66a69d9 | 2012-01-16 01:44:17 -0600 | [diff] [blame] | 681 | <p>Toybox's main.c automatically parses command line options before calling the | 
|  | 682 | command's main function.  Option parsing starts in get_optflags(), which stores | 
|  | 683 | results in the global structures "toys" (optflags and optargs) and "this".</p> | 
|  | 684 |  | 
|  | 685 | <p>The option parsing infrastructure stores a bitfield in toys.optflags to | 
|  | 686 | indicate which options the current command line contained.  Arguments | 
|  | 687 | attached to those options are saved into the command's global structure | 
|  | 688 | ("this").  Any remaining command line arguments are collected together into | 
|  | 689 | the null-terminated array toys.optargs, with the length in toys.optc.  (Note | 
|  | 690 | that toys.optargs does not contain the current command name at position zero, | 
|  | 691 | use "toys.which->name" for that.)  The raw command line arguments get_optflags() | 
|  | 692 | parsed are retained unmodified in toys.argv[].</p> | 
|  | 693 |  | 
|  | 694 | <p>Toybox's option parsing logic is controlled by an "optflags" string, using | 
|  | 695 | a format reminiscent of getopt's optstring, but with several important differences. | 
|  | 696 | Toybox does not use the getopt() | 
|  | 697 | function out of the C library: get_optflags() is an independent implementation | 
|  | 698 | which doesn't permute the original arguments (and thus doesn't change how the | 
|  | 699 | command is displayed in ps and top), and has many features not present in | 
|  | 700 | libc getopt() (such as the ability to describe long options in the same string | 
|  | 701 | as normal options).</p> | 
|  | 702 |  | 
|  | 703 | <p>Each command's NEWTOY() macro has an optflags string as its middle argument, | 
|  | 704 | which sets toy_list.options for that command to tell get_optflags() what | 
|  | 705 | command line arguments to look for, and what to do with them. | 
|  | 706 | If a command has no option | 
|  | 707 | definition string (I.E. the argument is NULL), option parsing is skipped | 
|  | 708 | for that command, which must look at the raw data in toys.argv to parse its | 
|  | 709 | own arguments.  (If no currently enabled command uses option parsing, | 
|  | 710 | get_optflags() is optimized out of the resulting binary by the linker's | 
|  | 711 | --gc-sections option.)</p> | 
|  | 712 |  | 
|  | 713 | <p>You don't have to free the option strings, which point into the environment | 
|  | 714 | space (I.E. the string data is not copied).  A TOYFLAG_NOFORK command | 
|  | 715 | that uses the linked list type "*" should free the list objects but not | 
|  | 716 | the data they point to, via "llist_free(TT.mylist, NULL);".  (If it's not | 
|  | 717 | NOFORK, exit() will free all the malloced data anyway unless you want | 
|  | 718 | to implement a CONFIG_TOYBOX_FREE cleanup for it.)</p> | 
|  | 719 |  | 
|  | 720 | <h4>Optflags format string</h4> | 
|  | 721 |  | 
|  | 722 | <p>Note: the optflags option description string format is much more | 
|  | 723 | concisely described by a large comment at the top of lib/args.c.</p> | 
|  | 724 |  | 
|  | 725 | <p>The general theory is that letters set optflags, and punctuation describes | 
|  | 726 | other actions the option parsing logic should take.</p> | 
|  | 727 |  | 
|  | 728 | <p>For example, suppose the command line <b>command -b fruit -d walrus -a 42</b> | 
|  | 729 | is parsed using the optflags string "<b>a#b:c:d</b>".  (I.E. | 
|  | 730 | toys.which->options="a#b:c:d" and argv = ["command", "-b", "fruit", "-d", | 
|  | 731 | "walrus", "-a", "42"]).  When get_optflags() returns, the following data is | 
|  | 732 | available to command_main():</p> | 
|  | 733 |  | 
|  | 734 | <ul> | 
|  | 735 | <li><p>In <b>struct toys</b>: | 
|  | 736 | <ul> | 
|  | 737 | <li>toys.optflags = 13; // -a = 8 | -b = 4 | -d = 1</li> | 
|  | 738 | <li>toys.optargs[0] = "walrus"; // leftover argument</li> | 
|  | 739 | <li>toys.optargs[1] = NULL; // end of list</li> | 
| Rob Landley | b911d4d | 2013-09-21 14:27:26 -0500 | [diff] [blame] | 740 | <li>toys.optc = 1; // there was 1 leftover argument</li> | 
| Rob Landley | 66a69d9 | 2012-01-16 01:44:17 -0600 | [diff] [blame] | 741 | <li>toys.argv[] = {"command", "-b", "fruit", "-d", "walrus", "-a", "42"}; // The original command line arguments</li> | 
|  | 742 | </ul> | 
|  | 743 | </p></li> | 
|  | 744 |  | 
|  | 745 | <li><p>In <b>union this</b> (treated as <b>long this[]</b>): | 
|  | 746 | <ul> | 
|  | 747 | <li>this[0] = NULL; // -c didn't get an argument this time, so get_optflags() didn't change it (toy_init() zeroed "this" during setup).</li> | 
|  | 748 | <li>this[1] = (long)"fruit"; // argument to -b</li> | 
|  | 749 | <li>this[2] = 42; // argument to -a</li> | 
|  | 750 | </ul> | 
|  | 751 | </p></li> | 
|  | 752 | </ul> | 
|  | 753 |  | 
|  | 754 | <p>If the command's globals are:</p> | 
|  | 755 |  | 
|  | 756 | <blockquote><pre> | 
| Rob Landley | c0e56ed | 2012-10-08 00:02:30 -0500 | [diff] [blame] | 757 | GLOBALS( | 
| Rob Landley | 66a69d9 | 2012-01-16 01:44:17 -0600 | [diff] [blame] | 758 | char *c; | 
|  | 759 | char *b; | 
|  | 760 | long a; | 
|  | 761 | ) | 
| Rob Landley | 66a69d9 | 2012-01-16 01:44:17 -0600 | [diff] [blame] | 762 | </pre></blockquote> | 
|  | 763 | <p>That would mean TT.c == NULL, TT.b == "fruit", and TT.a == 42.  (Remember, | 
|  | 764 | each entry that receives an argument must be a long or pointer, to line up | 
|  | 765 | with the array position.  Right to left in the optflags string corresponds to | 
| Rob Landley | c0e56ed | 2012-10-08 00:02:30 -0500 | [diff] [blame] | 766 | top to bottom in GLOBALS().)</p> | 
| Rob Landley | 66a69d9 | 2012-01-16 01:44:17 -0600 | [diff] [blame] | 767 |  | 
| Rob Landley | b911d4d | 2013-09-21 14:27:26 -0500 | [diff] [blame] | 768 | <p>Put globals not filled out by the option parsing logic at the end of the | 
|  | 769 | GLOBALS block. Common practice is to list the options one per line (to | 
|  | 770 | make the ordering explicit, first to last in globals corresponds to right | 
|  | 771 | to left in the option string), then leave a blank line before any non-option | 
|  | 772 | globals.</p> | 
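<p>How an array of longs can line up with the GLOBALS() fields is sketched
below for the "a#b:c:d" example. This is an illustration under assumptions,
not the parser's real code: globals_sketch and store_option_arg() are
invented names, and the overlay only works where long and pointers have the
same size.</p>

```c
/* Hypothetical globals for the "a#b:c:d" example, rightmost option with
 * an argument first. */
struct example_globals {
  char *c;   /* slot 0 */
  char *b;   /* slot 1 */
  long a;    /* slot 2 */

  int unrelated;  /* not an option: listed after the option slots */
};

/* View the same memory either as named fields or as numbered long slots.
 * Assumption: sizeof(long) == sizeof(char *), as on LP64 and ILP32. */
union {
  struct example_globals example;
  long slot[3];
} globals_sketch;

/* What an option parser could do once it has decided which slot an
 * argument belongs in. */
void store_option_arg(int slot, long value)
{
  globals_sketch.slot[slot] = value;
}
```

<p>Storing (long)"fruit" in slot 1 and 42 in slot 2 then leaves example.c
NULL, example.b pointing at "fruit", and example.a equal to 42.</p>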
|  | 773 |  | 
| Rob Landley | 66a69d9 | 2012-01-16 01:44:17 -0600 | [diff] [blame] | 774 | <p><b>unsigned toys.optflags</b></p> | 
|  | 775 |  | 
|  | 776 | <p>Each option in the optflags string corresponds to a bit position in | 
|  | 777 | toys.optflags, with the same value as a corresponding binary digit.  The | 
|  | 778 | rightmost argument is (1<<0), the next to last is (1<<1) and so on.  If | 
| Rob Landley | b4a0efa | 2012-02-06 21:15:19 -0600 | [diff] [blame] | 779 | the option isn't encountered while parsing argv[], its bit remains 0.</p> | 
|  | 780 |  | 
|  | 781 | <p>For example, | 
| Rob Landley | 66a69d9 | 2012-01-16 01:44:17 -0600 | [diff] [blame] | 782 | the optflags string "abcd" would parse the command line argument "-c" to set | 
|  | 783 | optflags to 2, "-a" would set optflags to 8, "-bd" would set optflags to | 
|  | 784 | 5 (I.E. 4|1), and "-a -c" would set optflags to 10 (2|8).</p> | 
|  | 785 |  | 
|  | 786 | <p>Only letters are relevant to optflags, punctuation is skipped: in the | 
|  | 787 | string "a*b:c#d", d=1, c=2, b=4, a=8.  The punctuation after a letter | 
|  | 788 | usually indicates that the option takes an argument.</p> | 
|  | 789 |  | 
| Rob Landley | b4a0efa | 2012-02-06 21:15:19 -0600 | [diff] [blame] | 790 | <p>Since toys.optflags is an unsigned int, it only stores 32 bits.  (Which is | 
|  | 791 | the amount a long would have on 32-bit platforms anyway; 64 bit code on | 
|  | 792 | 32 bit platforms is too expensive to require in common code used by almost | 
|  | 793 | all commands.)  Bit positions beyond 1<<31 aren't recorded, but | 
|  | 794 | parsing higher options can still set global variables.</p> | 
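<p>The bit numbering rule can be sketched as a small function. It is a
hypothetical helper written for this page (get_optflags() does not work this
way internally as far as this document says), and it assumes the string
contains only single letter options and their trailing punctuation, with no
long options.</p>

```c
#include <ctype.h>
#include <string.h>

/* Return the optflags bit for option letter "opt", or 0 if the letter is
 * absent. Letters are counted from the right; everything else is skipped. */
unsigned flag_bit_sketch(const char *optstring, char opt)
{
  unsigned bit = 1;
  size_t i = strlen(optstring);

  while (i--) {
    if (!isalpha((unsigned char)optstring[i])) continue;
    if (optstring[i] == opt) return bit;
    bit <<= 1;   /* shifts to 0 after 32 letters on 32 bit unsigned */
  }

  return 0;
}

/* OR together the bits for every letter in "seen". */
unsigned flags_for_sketch(const char *optstring, const char *seen)
{
  unsigned flags = 0;

  while (*seen) flags |= flag_bit_sketch(optstring, *seen++);

  return flags;
}
```

<p>For the string "a*b:c#d" this gives d=1, c=2, b=4 and a=8, matching the
values listed above.</p>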
|  | 795 |  | 
| Rob Landley | 66a69d9 | 2012-01-16 01:44:17 -0600 | [diff] [blame] | 796 | <p><b>Automatically setting global variables from arguments (union this)</b></p> | 
|  | 797 |  | 
|  | 798 | <p>The following punctuation characters may be appended to an optflags | 
|  | 799 | argument letter, indicating the option takes an additional argument:</p> | 
|  | 800 |  | 
|  | 801 | <ul> | 
|  | 802 | <li><b>:</b> - plus a string argument, keep most recent if more than one.</li> | 
|  | 803 | <li><b>*</b> - plus a string argument, appended to a linked list.</li> | 
| Rob Landley | 66a69d9 | 2012-01-16 01:44:17 -0600 | [diff] [blame] | 804 | <li><b>@</b> - plus an occurrence counter (stored in a long)</li> | 
| Rob Landley | b6063de | 2012-01-29 13:54:13 -0600 | [diff] [blame] | 805 | <li><b>#</b> - plus a signed long argument.</li> | 
| Rob Landley | b911d4d | 2013-09-21 14:27:26 -0500 | [diff] [blame] | 806 | <li><b>-</b> - plus a signed long argument defaulting to negative (start argument with + to force a positive value).</li> | 
| Rob Landley | b6063de | 2012-01-29 13:54:13 -0600 | [diff] [blame] | 807 | <li><b>.</b> - plus a floating point argument (if CFG_TOYBOX_FLOAT).</li> | 
|  | 808 | <ul>The following can be appended to a float or double: | 
|  | 809 | <li><b><123</b> - error if argument is less than this</li> | 
|  | 810 | <li><b>>123</b> - error if argument is greater than this</li> | 
|  | 811 | <li><b>=123</b> - default value if argument not supplied</li> | 
|  | 812 | </ul> | 
| Rob Landley | 66a69d9 | 2012-01-16 01:44:17 -0600 | [diff] [blame] | 813 | </ul> | 
|  | 814 |  | 
| Rob Landley | b911d4d | 2013-09-21 14:27:26 -0500 | [diff] [blame] | 815 | <p>A note about "." and CFG_TOYBOX_FLOAT: option parsing only understands <>= | 
|  | 816 | after . when CFG_TOYBOX_FLOAT | 
|  | 817 | is enabled. (Otherwise the code to determine where floating point constants | 
|  | 818 | end drops out; it requires floating point).  When disabled, it can reserve a | 
|  | 819 | global data slot for the argument (so offsets won't change in your | 
|  | 820 | GLOBALS() block), but will never fill it out. You can handle | 
|  | 821 | this by using the USE_BLAH() macros with C string concatenation, ala: | 
|  | 822 | "abc." USE_TOYBOX_FLOAT("<1.23>4.56=7.89") "def"</p> | 
|  | 823 |  | 
|  | 824 | <p><b>GLOBALS</b></p> | 
| Rob Landley | 66a69d9 | 2012-01-16 01:44:17 -0600 | [diff] [blame] | 825 |  | 
|  | 826 | <p>Options which have an argument fill in the corresponding slot in the global | 
|  | 827 | union "this" (see generated/globals.h), treating it as an array of longs | 
| Rob Landley | b911d4d | 2013-09-21 14:27:26 -0500 | [diff] [blame] | 828 | with the rightmost saved in this[0].  As described above, using "a*b:c#d", | 
|  | 829 | "-c 42" would set this[0] = 42; and "-b 42" would set this[1] = "42"; each | 
|  | 830 | slot is left NULL if the corresponding argument is not encountered.</p> | 
| Rob Landley | 66a69d9 | 2012-01-16 01:44:17 -0600 | [diff] [blame] | 831 |  | 
|  | 832 | <p>This behavior is useful because the LP64 standard ensures long and pointer | 
| Rob Landley | b4a0efa | 2012-02-06 21:15:19 -0600 | [diff] [blame] | 833 | are the same size. C99 guarantees structure members will occur in memory | 
|  | 834 | in the same order they're declared, and that padding won't be inserted between | 
| Rob Landley | 66a69d9 | 2012-01-16 01:44:17 -0600 | [diff] [blame] | 835 | consecutive variables of register size.  Thus the first few entries can | 
|  | 836 | be longs or pointers corresponding to the saved arguments.</p> | 
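<p>As a minimal standalone sketch of that layout trick (this is not toybox
code: the names demo_globals, demo_this and demo_fill are made up for
illustration, and it assumes an LP64 system), here is the "a*b:c#d" example
with the same memory viewed as named fields and as an array of longs:</p>

```c
#include <assert.h>
#include <string.h>

// Refuse to compile where long and pointer differ in size (not LP64).
typedef char demo_lp64_check[sizeof(long) == sizeof(void *) ? 1 : -1];

// Hypothetical argument slots for the optflags string "a*b:c#d".
// The rightmost option that takes an argument (c) comes first.
struct demo_globals {
  long c;   // -c #  signed long          slot 0
  char *b;  // -b :  most recent string   slot 1
  void *a;  // -a *  head of a list       slot 2
};

// The same memory seen as named fields or as an array of longs.
union demo_this {
  struct demo_globals demo;
  long slot[3];
};

// Store what "-c 42 -b 42" would save, the way a parser that only knows
// slot numbers might. Slots for options not seen stay zero (NULL).
void demo_fill(union demo_this *u)
{
  assert(sizeof(u->slot) == sizeof(u->demo));
  memset(u, 0, sizeof(*u));
  u->slot[0] = 42;
  u->slot[1] = (long)"42";
}
```

<p>After demo_fill(&amp;u), the command reads u.demo.c as the number 42 and
u.demo.b as the string "42", while u.demo.a is still NULL because -a was
never given.</p>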
|  | 837 |  | 
| Rob Landley | b911d4d | 2013-09-21 14:27:26 -0500 | [diff] [blame] | 838 | <p>See toys/other/hello.c for a longer example of parsing options into the | 
|  | 839 | GLOBALS block.</p> | 
|  | 840 |  | 
| Rob Landley | 66a69d9 | 2012-01-16 01:44:17 -0600 | [diff] [blame] | 841 | <p><b>char *toys.optargs[]</b></p> | 
|  | 842 |  | 
|  | 843 | <p>Command line arguments in argv[] which are not consumed by option parsing | 
|  | 844 | (I.E. not recognized either as -flags or arguments to -flags) will be copied | 
|  | 845 | to toys.optargs[], with the length of that array in toys.optc. | 
|  | 846 | (When toys.optc is 0, no unrecognized command line arguments remain.) | 
|  | 847 | The order of entries is preserved, and as with argv[] this new array is also | 
|  | 848 | terminated by a NULL entry.</p> | 
|  | 849 |  | 
|  | 850 | <p>Option parsing can require a minimum or maximum number of optargs left | 
|  | 851 | over, by adding "<1" (read "at least one") or ">9" ("at most nine") to the | 
|  | 852 | start of the optflags string.</p> | 
|  | 853 |  | 
|  | 854 | <p>The special argument "--" terminates option parsing, storing all remaining | 
|  | 855 | arguments in optargs.  The "--" itself is consumed.</p> | 
|  | 856 |  | 
|  | 857 | <p><b>Other optflags control characters</b></p> | 
|  | 858 |  | 
|  | 859 | <p>The following characters may occur at the start of each command's | 
|  | 860 | optflags string, before any options that would set a bit in toys.optflags:</p> | 
|  | 861 |  | 
|  | 862 | <ul> | 
|  | 863 | <li><b>^</b> - stop at first nonoption argument (for nice, xargs...)</li> | 
|  | 864 | <li><b>?</b> - allow unknown arguments (pass non-option arguments starting | 
|  | 865 | with - through to optargs instead of erroring out).</li> | 
|  | 866 | <li><b>&</b> - the first argument has imaginary dash (ala tar/ps.  If given twice, all arguments have imaginary dash.)</li> | 
|  | 867 | <li><b><</b> - must be followed by a decimal digit indicating at least this many leftover arguments are needed in optargs (default 0)</li> | 
|  | 868 | <li><b>></b> - must be followed by a decimal digit indicating at most this many leftover arguments allowed (default INT_MAX)</li> | 
|  | 869 | </ul> | 
|  | 870 |  | 
|  | 871 | <p>The following characters may be appended to an option character, but do | 
|  | 872 | not by themselves indicate an extra argument should be saved in this[]. | 
|  | 873 | (Technically any character not recognized as a control character sets an | 
|  | 874 | optflag, but letters are never control characters.)</p> | 
|  | 875 |  | 
|  | 876 | <ul> | 
|  | 877 | <li><b>^</b> - stop parsing options after encountering this option, everything else goes into optargs.</li> | 
|  | 878 | <li><b>|</b> - this option is required.  If more than one marked, only one is required.</li> | 
| Rob Landley | 66a69d9 | 2012-01-16 01:44:17 -0600 | [diff] [blame] | 879 | </ul> | 
|  | 880 |  | 
| Rob Landley | 66a69d9 | 2012-01-16 01:44:17 -0600 | [diff] [blame] | 897 | <p><b>--longopts</b></p> | 
|  | 898 |  | 
|  | 899 | <p>The optflags string can contain long options, which are enclosed in | 
|  | 900 | parentheses.  They may be appended to an existing option character, in | 
|  | 901 | which case the --longopt is a synonym for that option, ala "a:(fred)" | 
|  | 902 | which understands "-a blah" or "--fred blah" as synonyms.</p> | 
|  | 903 |  | 
|  | 904 | <p>Longopts may also appear before any other options in the optflags string, | 
|  | 905 | in which case they have no corresponding short argument, but instead set | 
|  | 906 | their own bit based on position.  So for "(walrus)#(blah)xy:z" "command | 
|  | 907 | --walrus 42" would set toys.optflags = 16 (-z = 1, -y = 2, -x = 4, --blah = 8) | 
|  | 908 | and would assign this[1] = 42;</p> | 
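<p>The right-to-left bit numbering in that example can be sketched as a
one-line helper. This is only an illustration of the arithmetic, not the
parser itself, and demo_optbit is a made-up name:</p>

```c
#include <assert.h>

// Bit value for the option at position pos (0 = leftmost) in an optflags
// string declaring count options, bare longopts included: the rightmost
// option gets bit 1, the next one to its left gets 2, and so on.
unsigned demo_optbit(int count, int pos)
{
  assert(pos >= 0 && pos < count);

  return 1U << (count - 1 - pos);
}
```

<p>For "(walrus)#(blah)xy:z" count is 5, so demo_optbit(5, 0) gives the 16
that --walrus sets and demo_optbit(5, 4) gives the 1 that -z sets.</p>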
|  | 909 |  | 
|  | 910 | <p>A short option may have multiple longopt synonyms, "a(one)(two)", but | 
|  | 911 | each "bare longopt" (ala "(one)(two)abc" before any option characters) | 
|  | 912 | always sets its own bit (although you can group them with +X).</p> | 
| Rob Landley | 7c04f01 | 2008-01-20 19:00:16 -0600 | [diff] [blame] | 913 |  | 
| Rob Landley | b911d4d | 2013-09-21 14:27:26 -0500 | [diff] [blame] | 914 | <p><b>[groups]</b></p> | 
|  | 915 |  | 
|  | 916 | <p>At the end of the option string, square bracket groups can define | 
|  | 917 | relationships between existing options. (This only applies to short | 
|  | 918 | options, bare --longopts can't participate.)</p> | 
|  | 919 |  | 
|  | 920 | <p>The first character of the group defines the type, the remaining | 
|  | 921 | characters are options it applies to:</p> | 
|  | 922 |  | 
|  | 923 | <ul> | 
|  | 924 | <li><b>-</b> - Exclusive, switch off all others in this group.</li> | 
|  | 925 | <li><b>+</b> - Inclusive, switch on all others in this group.</li> | 
|  | 926 | <li><b>!</b> - Error, fail if more than one defined.</li> | 
|  | 927 | </ul> | 
|  | 928 |  | 
|  | 929 | <p>So "abc[-abc]" means -ab = -b, -ba = -a, -abc = -c. "abc[+abc]" | 
|  | 930 | means -ab=-abc, -c=-abc, and "abc[!abc]" means -ab calls error_exit("no -b | 
|  | 931 | with -a"). Note that [-] groups clear the GLOBALS option slot of | 
|  | 932 | options they're switching back off, but [+] won't set options it didn't see | 
|  | 933 | (just the optflags).</p> | 
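<p>A rough model of how the three group types change the flag bits, written
as a standalone sketch rather than the real lib/args.c logic (demo_group is a
hypothetical name, and the GLOBALS slot clearing done by [-] groups is not
modeled):</p>

```c
#include <assert.h>

// Apply one [group] to flags when the option whose bit is seen shows up.
// type is the group's first character, mask has the bit of every option
// in the group. Returns the new flags, or -1 where toybox would error out.
long demo_group(long flags, long seen, char type, long mask)
{
  assert(seen & mask);

  if (type == '-') return (flags & ~mask) | seen;        // others switch off
  if (type == '+') return flags | mask;                  // others switch on
  if (type == '!' && (flags & mask & ~seen)) return -1;  // conflict

  return flags | seen;
}
```

<p>With "abc" giving a=4, b=2, c=1 and a mask of 7, feeding -a then -b
through a '-' group leaves just 2 (-ab = -b), a '+' group turns -c into 7,
and a '!' group returns -1 on the second option.</p>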
|  | 934 |  | 
|  | 935 | <p><b>whitespace</b></p> | 
|  | 936 |  | 
|  | 937 | <p>Arguments may occur with or without a space (I.E. "-a 42" or "-a42"). | 
|  | 938 | The command line argument "-abc" may be interpreted many different ways: | 
|  | 939 | the optflags string "abc" sets toys.optflags = 7, "a:bc" sets toys.optflags=4 | 
|  | 940 | and saves "bc" as the argument to -a, and "ab:c" sets optflags to 6 and saves | 
|  | 941 | "c" as the argument to -b.</p> | 
|  | 942 |  | 
|  | 943 | <p>Note that & changes whitespace handling, so that the command line | 
|  | 944 | "tar cvfCj outfile.tar.bz2 topdir filename" is parsed the same as | 
|  | 945 | "tar filename -c -v -j -f outfile.tar.bz2 -C topdir". Note that "tar -cvfCj | 
|  | 946 | one two three" would equal "tar -c -v -f Cj one two three". (This matches | 
|  | 947 | historical usage.)</p> | 
|  | 948 |  | 
|  | 949 | <p>Appending a space to the option in the option string ("a: b") makes it | 
|  | 950 | require a space, I.E. "-ab" is interpreted as "-a" "-b". That way "kill -stop" | 
|  | 951 | differs from "kill -s top".</p> | 
|  | 952 |  | 
|  | 953 | <p>Appending ; to a longopt in the option string makes its argument optional, | 
|  | 954 | and only settable with =, so in ls "(color):;" can accept "ls --color" and | 
|  | 955 | "ls --color=auto" without complaining that the first has no argument.</p> | 
|  | 956 |  | 
| Rob Landley | c8d0da5 | 2012-07-15 17:47:08 -0500 | [diff] [blame] | 957 | <a name="lib_dirtree"><h3>lib/dirtree.c</h3> | 
|  | 958 |  | 
|  | 959 | <p>The directory tree traversal code should be sufficiently generic | 
|  | 960 | that commands never need to use readdir(), scandir(), or the fts.h family | 
|  | 961 | of functions.</p> | 
|  | 962 |  | 
|  | 963 | <p>These functions do not call chdir() or rely on PATH_MAX. Instead they | 
|  | 964 | use openat() and friends, using one filehandle per directory level to | 
|  | 965 | recurse into subdirectories. (I.E. they can descend 1000 directories deep | 
|  | 966 | if setrlimit(RLIMIT_NOFILE) allows enough open filehandles, and the default | 
|  | 967 | in /proc/self/limits is generally 1024.)</p> | 
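<p>The openat() approach can be illustrated with a small standalone sketch.
This is not the dirtree code, just the same technique in plain POSIX calls,
and demo_count is a made-up name. Each level of recursion holds exactly one
open filehandle and resolves names relative to it, so nothing calls chdir()
or builds a full path:</p>

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <dirent.h>
#include <fcntl.h>
#include <string.h>
#include <unistd.h>

// Count every entry under directory "name", looked up relative to dirfd.
// Returns -1 if name can't be opened as a directory (symlinks included).
long demo_count(int dirfd, const char *name)
{
  int fd;
  DIR *dir;
  struct dirent *entry;
  long count = 0, sub;

  assert(name);
  fd = openat(dirfd, name, O_RDONLY|O_DIRECTORY|O_NOFOLLOW);
  if (fd == -1) return -1;
  if (!(dir = fdopendir(fd))) {
    close(fd);
    return -1;
  }

  while ((entry = readdir(dir))) {
    if (!strcmp(entry->d_name, ".") || !strcmp(entry->d_name, "..")) continue;
    count++;
    // Recurse relative to this level's fd. Non-directories fail to open
    // and return -1, which is ignored here.
    sub = demo_count(fd, entry->d_name);
    if (sub > 0) count += sub;
  }
  closedir(dir);  // also closes fd

  return count;
}
```

<p>Calling demo_count(AT_FDCWD, ".") walks the current directory. The
recursion depth is limited by the number of open filehandles, not by the
length of the path.</p>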
|  | 968 |  | 
|  | 969 | <p>The basic dirtree functions are:</p> | 
|  | 970 |  | 
|  | 971 | <ul> | 
|  | 972 | <li><p><b>dirtree_read(char *path, int (*callback)(struct dirtree *node))</b> - | 
|  | 973 | recursively read directories, either applying callback() or returning | 
|  | 974 | a tree of struct dirtree if callback is NULL.</p></li> | 
|  | 975 |  | 
|  | 976 | <li><p><b>dirtree_path(struct dirtree *node, int *plen)</b> - malloc() a | 
|  | 977 | string containing the path from the root of this tree to this node. If | 
|  | 978 | plen isn't NULL then *plen is how many extra bytes to malloc at the end | 
|  | 979 | of string.</p></li> | 
|  | 980 |  | 
|  | 981 | <li><p><b>dirtree_parentfd(struct dirtree *node)</b> - return fd of | 
|  | 982 | containing directory, for use with openat() and such.</p></li> | 
|  | 983 | </ul> | 
|  | 984 |  | 
|  | 985 | <p>The <b>dirtree_read()</b> function takes two arguments, a starting path for | 
|  | 986 | the root of the tree, and a callback function. The callback takes a | 
|  | 987 | <b>struct dirtree *</b> (from lib/lib.h) as its argument. If the callback is | 
|  | 988 | NULL, the traversal uses a default callback (dirtree_notdotdot()) which | 
|  | 989 | recursively assembles a tree of struct dirtree nodes for all files under | 
|  | 990 | this directory and subdirectories (filtering out "." and ".." entries), | 
|  | 991 | after which dirtree_read() returns the pointer to the root node of this | 
|  | 992 | snapshot tree.</p> | 
|  | 993 |  | 
|  | 994 | <p>Otherwise the callback() is called on each entry in the directory, | 
|  | 995 | with struct dirtree * as its argument. This includes the initial | 
|  | 996 | node created by dirtree_read() at the top of the tree.</p> | 
|  | 997 |  | 
|  | 998 | <p><b>struct dirtree</b></p> | 
|  | 999 |  | 
|  | 1000 | <p>Each struct dirtree node contains <b>char name[]</b> and <b>struct stat | 
|  | 1001 | st</b> entries describing a file, plus a <b>char *symlink</b> | 
|  | 1002 | which is NULL for non-symlinks.</p> | 
|  | 1003 |  | 
|  | 1004 | <p>During a callback function, the <b>int data</b> field of directory nodes | 
|  | 1005 | contains a dirfd (for use with the openat() family of functions). This is | 
|  | 1006 | generally used by calling dirtree_parentfd() on the callback's node argument. | 
|  | 1007 | For symlinks, data contains the length of the symlink string. On the second | 
|  | 1008 | callback from DIRTREE_COMEAGAIN (depth-first traversal) data = -1 for | 
|  | 1009 | all nodes (that's how you can tell it's the second callback).</p> | 
|  | 1010 |  | 
|  | 1011 | <p>Users of this code may put anything they like into the <b>long extra</b> | 
|  | 1012 | field. For example, "cp" and "mv" use this to store a dirfd for the destination | 
|  | 1013 | directory (and use DIRTREE_COMEAGAIN to get the second callback so they can | 
|  | 1014 | close(node->extra) to avoid running out of filehandles). | 
|  | 1015 | This field is not directly used by the dirtree code, and | 
|  | 1016 | thanks to LP64 it's large enough to store a typecast pointer to an | 
|  | 1017 | arbitrary struct.</p> | 
|  | 1018 |  | 
|  | 1019 | <p>The return value of the callback combines flags (with bitwise or) to tell | 
|  | 1020 | the traversal infrastructure how to behave:</p> | 
|  | 1021 |  | 
|  | 1022 | <ul> | 
|  | 1023 | <li><p><b>DIRTREE_SAVE</b> - Save this node, assembling a tree. (Without | 
|  | 1024 | this the struct dirtree is freed after the callback returns. Filtering out | 
|  | 1025 | siblings is fine, but discarding a parent while keeping its child leaks | 
|  | 1026 | memory.)</p></li> | 
|  | 1027 | <li><p><b>DIRTREE_ABORT</b> - Do not examine any more entries in this | 
|  | 1028 | directory. (Does not propagate up tree: to abort entire traversal, | 
|  | 1029 | return DIRTREE_ABORT from parent callbacks too.)</p></li> | 
|  | 1030 | <li><p><b>DIRTREE_RECURSE</b> - Examine directory contents. Ignored for | 
|  | 1031 | non-directory entries. The remaining flags only take effect when | 
|  | 1032 | recursing into the children of a directory.</p></li> | 
|  | 1033 | <li><p><b>DIRTREE_COMEAGAIN</b> - Call the callback a second time after | 
|  | 1034 | examining all directory contents, allowing depth-first traversal. | 
|  | 1035 | On the second call, dirtree->data = -1.</p></li> | 
|  | 1036 | <li><p><b>DIRTREE_SYMFOLLOW</b> - follow symlinks when populating children's | 
|  | 1037 | <b>struct stat st</b> (by feeding a nonzero value to the symfollow argument of | 
|  | 1038 | dirtree_add_node()), which means DIRTREE_RECURSE treats symlinks to | 
|  | 1039 | directories as directories. (Avoiding infinite recursion is the callback's | 
|  | 1040 | problem: the non-NULL dirtree->symlink can still distinguish between | 
|  | 1041 | them.)</p></li> | 
|  | 1042 | </ul> | 
|  | 1043 |  | 
|  | 1044 | <p>Each struct dirtree contains three pointers (next, parent, and child) | 
|  | 1045 | to other struct dirtree.</p> | 
|  | 1046 |  | 
|  | 1047 | <p>The <b>parent</b> pointer indicates the directory | 
|  | 1048 | containing this entry; even when not assembling a persistent tree of | 
|  | 1049 | nodes the parent entries remain live up to the root of the tree while | 
|  | 1050 | child nodes are active. At the top of the tree the parent pointer is | 
|  | 1051 | NULL, meaning the node's name[] is either an absolute path or relative | 
|  | 1052 | to cwd. The function dirtree_parentfd() gets the directory file descriptor | 
|  | 1053 | for use with openat() and friends, returning AT_FDCWD at the top of tree.</p> | 
|  | 1054 |  | 
|  | 1055 | <p>The <b>child</b> pointer points to the first node of the list of contents of | 
|  | 1056 | this directory. If the directory contains no files, or the entry isn't | 
|  | 1057 | a directory, child is NULL.</p> | 
|  | 1058 |  | 
|  | 1059 | <p>The <b>next</b> pointer indicates sibling nodes in the same directory as this | 
|  | 1060 | node, and since it's the first entry in the struct the llist.c traversal | 
|  | 1061 | mechanisms work to iterate over sibling nodes. Each dirtree node is a | 
|  | 1062 | single malloc() (even char *symlink points to memory at the end of the node), | 
|  | 1063 | so llist_free() works but its callback must descend into child nodes (freeing | 
|  | 1064 | a tree, not just a linked list), plus whatever the user stored in extra.</p> | 
|  | 1065 |  | 
|  | 1066 | <p>The <b>dirtree_read</b>() function is a simple wrapper, calling <b>dirtree_add_node</b>() | 
|  | 1067 | to create a root node relative to the current directory, then calling | 
|  | 1068 | <b>handle_callback</b>() on that node (which recurses as instructed by the callback | 
|  | 1069 | return flags). Some commands (such as chgrp) bypass this wrapper, for example | 
|  | 1070 | to control whether or not to follow symlinks to the root node (symlinks | 
|  | 1071 | listed on the command line are often treated differently than symlinks | 
|  | 1072 | encountered during recursive directory traversal).</p> | 
|  | 1073 |  | 
|  | 1074 | <p>The ls command not only bypasses the wrapper, but never returns | 
|  | 1075 | <b>DIRTREE_RECURSE</b> from the callback, instead calling <b>dirtree_recurse</b>() manually | 
|  | 1076 | from elsewhere in the program. This gives ls -lR manual control | 
|  | 1077 | of traversal order, which is neither depth first nor breadth first but | 
|  | 1078 | instead a sort of FIFO order required by the ls standard.</p> | 
|  | 1079 |  | 
| Rob Landley | c0e56ed | 2012-10-08 00:02:30 -0500 | [diff] [blame] | 1080 | <a name="#toys"> | 
|  | 1081 | <h2>Directory toys/</h2> | 
|  | 1082 |  | 
|  | 1083 | <p>This directory contains command implementations. Each command is a single | 
|  | 1084 | self-contained file. Adding a new command involves adding a single | 
|  | 1085 | file, and removing a command involves removing that file. Commands use | 
|  | 1086 | shared infrastructure from the lib/ and generated/ directories.</p> | 
|  | 1087 |  | 
|  | 1088 | <p>Currently there are three subdirectories under "toys/" containing commands | 
|  | 1089 | described in POSIX-2008, the Linux Standard Base 4.1, or "other". The only | 
|  | 1090 | difference this makes is which menu the command shows up in during "make | 
|  | 1091 | menuconfig"; the directories are otherwise identical. Note that the commands | 
|  | 1092 | exist within a single namespace at runtime, so you can't have the same | 
|  | 1093 | command in multiple subdirectories.</p> | 
|  | 1094 |  | 
|  | 1095 | <p>(There are actually four sub-menus in "make menuconfig", the fourth | 
|  | 1096 | contains global configuration options for toybox, and lives in Config.in at | 
|  | 1097 | the top level.)</p> | 
|  | 1098 |  | 
|  | 1099 | <p>See <a href="#adding">adding a new command</a> for details on the | 
|  | 1100 | layout of a command file.</p> | 
|  | 1101 |  | 
| Rob Landley | 81b899d | 2007-12-18 02:02:47 -0600 | [diff] [blame] | 1102 | <h2>Directory scripts/</h2> | 
| Rob Landley | 4e68de1 | 2007-12-13 07:00:27 -0600 | [diff] [blame] | 1103 |  | 
| Rob Landley | 1f4f41a | 2012-10-08 21:31:07 -0500 | [diff] [blame] | 1104 | <p>Build infrastructure. The makefile calls scripts/make.sh for "make" | 
|  | 1105 | and scripts/install.sh for "make install".</p> | 
|  | 1106 |  | 
|  | 1107 | <p>There's also a test suite, "make test" calls scripts/test.sh, which runs all | 
|  | 1108 | the tests in scripts/test/*. You can run individual tests via | 
|  | 1109 | "scripts/test.sh command", or "TEST_HOST=1 scripts/test.sh command" to run | 
|  | 1110 | that test against the host implementation instead of the toybox one.</p> | 
|  | 1111 |  | 
| Rob Landley | 4e68de1 | 2007-12-13 07:00:27 -0600 | [diff] [blame] | 1112 | <h3>scripts/cfg2files.sh</h3> | 
|  | 1113 |  | 
|  | 1114 | <p>Run .config through this filter to get a list of enabled commands, which | 
|  | 1115 | is turned into a list of files in toys via a sed invocation in the top level | 
|  | 1116 | Makefile. | 
|  | 1117 | </p> | 
|  | 1118 |  | 
| Rob Landley | 81b899d | 2007-12-18 02:02:47 -0600 | [diff] [blame] | 1119 | <h2>Directory kconfig/</h2> | 
| Rob Landley | 4e68de1 | 2007-12-13 07:00:27 -0600 | [diff] [blame] | 1120 |  | 
|  | 1121 | <p>Menuconfig infrastructure copied from the Linux kernel.  See the | 
|  | 1122 | Linux kernel's Documentation/kbuild/kconfig-language.txt</p> | 
|  | 1123 |  | 
| Rob Landley | 66a69d9 | 2012-01-16 01:44:17 -0600 | [diff] [blame] | 1124 | <a name="generated"> | 
|  | 1125 | <h2>Directory generated/</h2> | 
|  | 1126 |  | 
|  | 1127 | <p>All the files in this directory except the README are generated by the | 
|  | 1128 | build.  (See scripts/make.sh)</p> | 
|  | 1129 |  | 
|  | 1130 | <ul> | 
|  | 1131 | <li><p><b>config.h</b> - CFG_COMMAND and USE_COMMAND() macros set by menuconfig via .config.</p></li> | 
|  | 1132 |  | 
|  | 1133 | <li><p><b>Config.in</b> - Kconfig entries for each command.  Included by top level Config.in.  The help text in here is used to generate help.h</p></li> | 
|  | 1134 |  | 
|  | 1135 | <li><p><b>help.h</b> - Help text strings for use by "help" command.  Building | 
|  | 1136 | this file requires python on the host system, so the prebuilt file is shipped | 
|  | 1137 | in the build tarball to avoid requiring python to build toybox.</p></li> | 
|  | 1138 |  | 
|  | 1139 | <li><p><b>newtoys.h</b> - List of NEWTOY() or OLDTOY() macros for all available | 
|  | 1140 | commands.  Associates command_main() functions with command names, provides | 
|  | 1141 | option string for command line parsing (<a href="#lib_args">see lib/args.c</a>), | 
|  | 1142 | specifies where to install each command and whether toysh should fork before | 
|  | 1143 | calling it.</p></li> | 
|  | 1144 | </ul> | 
|  | 1145 |  | 
|  | 1146 | <p>Everything in this directory is a derivative file produced from something | 
|  | 1147 | else.  The entire directory is deleted by "make distclean".</p> | 
| Rob Landley | 4e68de1 | 2007-12-13 07:00:27 -0600 | [diff] [blame] | 1148 | <!--#include file="footer.html" --> |