Igor Murashkin | aaebaa0 | 2015-01-26 10:55:53 -0800

Cmdline
===================

Introduction
-------------
This directory contains the classes that perform common command-line tool initialization and
parsing. The long-term goal is for all `art` command-line tools to use these helpers.

----------


## Cmdline Parser
-------------

The `CmdlineParser` class provides a fluent interface that uses a domain-specific language to
quickly generate a type-safe value parser that processes a user-provided list of strings (`argv`).
Currently, it can parse a string into a `VariantMap`, although in the future it might be desirable
to parse into the fields of an arbitrary struct.

To use it, create a `CmdlineParser::Builder` and then chain the `Define` methods together with the
`WithType` and `IntoXX` methods.

### Quick Start
For example, to save the values into a user-defined variant map:

```
#include <cstdlib>   // EXIT_SUCCESS, EXIT_FAILURE
#include <iostream>

struct FruitVariantMap : VariantMap {
  static const Key<int> Apple;
  static const Key<double> Orange;
  static const Key<bool> Help;
};
// Note that some template boilerplate has been omitted for clarity.
// See variant_map_test.cc for how to completely define a custom map.

using FruitParser = CmdlineParser<FruitVariantMap, FruitVariantMap::Key>;

FruitParser MakeParser() {
  auto&& builder = FruitParser::Builder();
  builder
      .Define("--help")
          .IntoKey(FruitVariantMap::Help)
      .Define("--apple:_")
          .WithType<int>()
          .IntoKey(FruitVariantMap::Apple)
      .Define("--orange:_")
          .WithType<double>()
          .WithRange(0.0, 1.0)
          .IntoKey(FruitVariantMap::Orange);

  return builder.Build();
}

int main(int argc, char** argv) {
  auto parser = MakeParser();
  auto result = parser.Parse(argv, argc);
  if (result.IsError()) {
    std::cerr << result.GetMessage() << std::endl;
    return EXIT_FAILURE;
  }
  auto map = parser.GetArgumentsMap();
  std::cout << "Help? " << map.GetOrDefault(FruitVariantMap::Help) << std::endl;
  std::cout << "Apple? " << map.GetOrDefault(FruitVariantMap::Apple) << std::endl;
  std::cout << "Orange? " << map.GetOrDefault(FruitVariantMap::Orange) << std::endl;

  return EXIT_SUCCESS;
}
```

The code sample above defines a parser that can parse something like
`--help --apple:123 --orange:0.456`. It errors out automatically if invalid flags are given, or if
the appropriate flags are given but with a value of the wrong type or range. For example, `--foo`
will not parse (invalid argument), and neither will `--apple:fruit` (`fruit` is not an int) or
`--orange:1234` (1234 is outside the range [0.0, 1.0]).

### Argument Definitions in Detail
#### Define method
The `Define` method takes one or more aliases for the argument. A common example might be
`{"-h", "--help"}`, where both `--help` and `-h` are aliases for the same argument.

The simplest kind of argument just tests for presence, but we often want to parse out a particular
type of value (such as an int or double as in the above `FruitVariantMap` example). To do that, a
_wildcard_ (`_`) must be used to denote the location within the token that the value will be parsed
out of.

For example, with `--orange:_` the parser knows to check all tokens in an `argv` list for the
`--orange:` prefix and then strip it, leaving only the remainder to be parsed.
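
The prefix stripping described above can be sketched in a few lines. This is only an illustration of the idea, not the ART implementation: `MatchWildcard` is a hypothetical helper, and it assumes the wildcard is the last character of the argument name.

```cpp
#include <optional>
#include <string>
#include <string_view>

// Hypothetical helper (not part of the ART sources): matches `token` against an
// argument name such as "--orange:_". Returns the text the wildcard stands for,
// an empty string for an exact match without a wildcard, or std::nullopt when
// the token does not match at all.
std::optional<std::string> MatchWildcard(std::string_view name, std::string_view token) {
  if (name.empty() || name.back() != '_') {
    // No wildcard: the token must equal the name and carries no value.
    if (name == token) {
      return std::string();
    }
    return std::nullopt;
  }
  std::string_view prefix = name.substr(0, name.size() - 1);  // drop the '_'
  if (token.substr(0, prefix.size()) != prefix) {
    return std::nullopt;  // token does not start with the prefix
  }
  return std::string(token.substr(prefix.size()));  // the remainder to be parsed
}
```

With this sketch, `MatchWildcard("--orange:_", "--orange:0.456")` yields `"0.456"`, which would then be handed to the type parser for `double`.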

#### WithType method (optional)
After an argument definition is provided, the parser builder needs to know what type the argument
will be in order to provide type safety and to make sure the rest of the argument definition is
correct as early as possible (in essence, everything but the parsing of the argument name is done
at compile time).

Everything that follows a `WithType<T>()` call is thus type-checked to only take `T` values.

If this call is omitted, the parser generator assumes you are building a `Unit` type (i.e. an
argument that only cares about presence).

#### WithRange method (optional)
Some values do not make sense outside of a `[min, max]` range, so this is an option to quickly add
a range check without writing custom code. The range check is performed after the main parsing and
works for any type that implements the `<=` operator.
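
A range check of this kind needs nothing more than `operator<=`. A minimal sketch, assuming a hypothetical `InRange` helper (the name and signature are illustrative and not taken from the ART sources):

```cpp
#include <string>

// Hypothetical helper: true when min <= value <= max.
// Only operator<= is required of T, so it works for numbers, strings, and any
// user-defined type that provides that operator.
template <typename T>
bool InRange(const T& value, const T& min, const T& max) {
  return min <= value && value <= max;
}
```

For the `--orange:_` definition from the Quick Start, `InRange(0.456, 0.0, 1.0)` holds while `InRange(1234.0, 0.0, 1.0)` does not, which is why `--orange:1234` is rejected.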

#### WithValueMap (optional)
When parsing an enumeration, it can be very convenient to map a list of possible argument string
values into their runtime values.

With something like
```
.Define("-hello:_")
  .WithValueMap({"world", kWorld},
                {"galaxy", kGalaxy})
```
the parser accepts only `-hello:world` or `-hello:galaxy` (and errors out on other variations of
`-hello:whatever`), converting the argument to the type-safe value `kWorld` or `kGalaxy`,
respectively.

This is meant to be another shorthand (like `WithRange`) to avoid writing a custom type parser. In
general it takes a variadic number of `pair<const char* /*arg name*/, T /*value*/>`.
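
Conceptually, the value map is a lookup from the wildcard text to a typed value. The following self-contained sketch shows that lookup; `Greeting`, `LookupValue`, and `ParseGreeting` are hypothetical names chosen for illustration and are not the builder's real internals.

```cpp
#include <initializer_list>
#include <optional>
#include <string>
#include <utility>

enum Greeting { kWorld, kGalaxy };

// Hypothetical helper: returns the value paired with `text`, or std::nullopt
// when `text` is not one of the allowed argument strings.
template <typename T>
std::optional<T> LookupValue(std::initializer_list<std::pair<const char*, T>> value_map,
                             const std::string& text) {
  for (const auto& entry : value_map) {
    if (text == entry.first) {
      return entry.second;
    }
  }
  return std::nullopt;
}

// Parses the part of "-hello:_" that the wildcard stands for.
std::optional<Greeting> ParseGreeting(const std::string& text) {
  return LookupValue<Greeting>({{"world", kWorld}, {"galaxy", kGalaxy}}, text);
}
```

An empty result corresponds to the parse error that the real parser reports for `-hello:whatever`.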

#### WithValues (optional)
When an argument definition has multiple aliases with no wildcards, it might be convenient to
quickly map them into discrete values.

For example:
```
.Define({"-xinterpret", "-xnointerpret"})
  .WithValues({true, false})
```
It will parse `-xinterpret` as `true` and `-xnointerpret` as `false`.

In general, it uses the position of the argument alias to select the value at the same position in
the `WithValues` list.

(Note that this method will not work when the argument definitions have a wildcard because there is
no way to positionally match that).
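
The positional mapping can be pictured as two parallel lists. A minimal sketch under that assumption (`ValueForAlias` and `ParseInterpret` are hypothetical helpers, not part of the ART sources):

```cpp
#include <cstddef>
#include <optional>
#include <string>
#include <vector>

// Hypothetical helper: returns the value stored at the same position as the
// alias that equals `token`, or std::nullopt when no alias matches.
template <typename T>
std::optional<T> ValueForAlias(const std::vector<std::string>& aliases,
                               const std::vector<T>& values,
                               const std::string& token) {
  if (aliases.size() != values.size()) {
    return std::nullopt;  // the two lists must line up one-to-one
  }
  for (std::size_t i = 0; i < aliases.size(); ++i) {
    if (aliases[i] == token) {
      return values[i];
    }
  }
  return std::nullopt;
}

// Mirrors the -xinterpret / -xnointerpret example above.
std::optional<bool> ParseInterpret(const std::string& token) {
  static const std::vector<std::string> kAliases = {"-xinterpret", "-xnointerpret"};
  static const std::vector<bool> kValues = {true, false};
  return ValueForAlias(kAliases, kValues, token);
}
```

Because the match is on the whole alias, there is nothing left over to parse, which is why this shorthand cannot be combined with a wildcard.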

#### AppendValues (optional)
By default, the argument is assumed to appear exactly once, and if the user specifies it more than
once, only the latest value is taken into account (and all previous occurrences of the argument are
ignored).

In some situations, we may want to accumulate the argument values instead of discarding the previous
ones.

For example:
```
.Define("-D_")
  .WithType<std::vector<std::string>>()
  .AppendValues()
```
This will parse something like `-Dhello -Dworld -Dbar -Dbaz` into
`std::vector<std::string>{"hello", "world", "bar", "baz"}`.
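
The difference from the default "latest value wins" behavior is that every occurrence contributes to the result. A self-contained sketch of that accumulation, assuming a hypothetical `CollectAppended` helper that works on a plain prefix rather than on a real argument definition:

```cpp
#include <string>
#include <vector>

// Hypothetical helper: collects, in order, the remainder of every token that
// starts with `prefix`. Tokens that do not start with the prefix are skipped.
std::vector<std::string> CollectAppended(const std::vector<std::string>& tokens,
                                         const std::string& prefix) {
  std::vector<std::string> values;
  for (const std::string& token : tokens) {
    if (token.compare(0, prefix.size(), prefix) == 0) {
      values.push_back(token.substr(prefix.size()));
    }
  }
  return values;
}
```

Calling it with the tokens `-Dhello -Dworld -Dbar -Dbaz` and the prefix `-D` gives the four strings in the order they appeared on the command line.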

### Setting an argument parse target (required)
To complete an argument definition, the parser generator also needs to know where to save values.
Currently, only `IntoKey` is supported, but that may change in the future.

#### IntoKey (required)
This specifies that when a value is parsed, it will get saved into a variant map using the specific
key.

For example,
```
.Define("-help")
  .IntoKey(Map::Help)
```
will save occurrences of the `-help` argument by doing a `Map.Set(Map::Help, ParsedValue("-help"))`,
where `ParsedValue` is an imaginary function that parses the `-help` argument into a specific type
set by `WithType`.
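
To make the role of the key concrete, here is a deliberately tiny stand-in for a typed-key map. It is a sketch under stated assumptions: `Key`, `TinyVariantMap`, `kApple`, and `kHelp` are hypothetical names, and the real `VariantMap` is implemented differently (see variant_map_test.cc for actual usage).

```cpp
#include <any>
#include <map>

// Hypothetical typed key: the type parameter fixes what may be stored under it,
// and the key's address serves as its identity.
template <typename T>
struct Key {
  T default_value;
};

// Hypothetical stand-in for a variant map with type-safe Set/GetOrDefault.
class TinyVariantMap {
 public:
  template <typename T>
  void Set(const Key<T>& key, const T& value) {
    values_[&key] = value;
  }

  template <typename T>
  T GetOrDefault(const Key<T>& key) const {
    auto it = values_.find(&key);
    if (it == values_.end()) {
      return key.default_value;  // nothing was parsed for this key
    }
    return std::any_cast<T>(it->second);
  }

 private:
  std::map<const void*, std::any> values_;
};

const Key<int> kApple{0};
const Key<bool> kHelp{false};
```

In this picture, `IntoKey(kApple)` amounts to calling `Set(kApple, parsed_value)` whenever the argument is seen, and passing a value of the wrong type fails to compile.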

### Ignoring unknown arguments
This is highly discouraged, but for compatibility with `JNI`, which allows unknown arguments to be
ignored, there is an option to ignore any argument tokens that are not known to the parser. This is
done with the `Ignore` function, which takes a list of argument definition names.

It is semantically equivalent to making a series of argument definitions that map to `Unit` but
don't get saved anywhere. Values still get parsed as normal, so it will *not* ignore known arguments
with invalid values, only user arguments for which it could not find a matching argument definition.

### Parsing custom types
Any type can be parsed from a string by specializing the `CmdlineType` class and implementing the
static interface provided by `CmdlineTypeParser`. It is recommended to inherit from
`CmdlineTypeParser` since it already provides default implementations for every method.

The `Parse` method should be implemented for most types. Some types allow appending (such as a
`std::vector<std::string>`) and are meant to be used with `AppendValues`, in which case the
`ParseAndAppend` function should be implemented.

For example:
```
#include <cerrno>
#include <cstdlib>
#include <string>

template <>
struct CmdlineType<double> : CmdlineTypeParser<double> {
  Result Parse(const std::string& str) {
    char* end = nullptr;
    errno = 0;
    double value = strtod(str.c_str(), &end);

    // Reject empty input and trailing characters that are not part of the number.
    if (end == str.c_str() || *end != '\0') {
      return Result::Failure("Failed to parse double from " + str);
    }
    if (errno == ERANGE) {
      return Result::OutOfRange(
          "Failed to parse double from " + str + "; overflow/underflow occurred");
    }

    return Result::Success(value);
  }

  static const char* Name() { return "double"; }
  // note: Name() is just here for more user-friendly errors,
  // but in the future we will use non-standard ways of getting the type name
  // at compile-time and this will no longer be required
};
```
This will parse any non-append argument definition with a type of `double`.

For an appending example:
```
template <>
struct CmdlineType<std::vector<std::string>> : CmdlineTypeParser<std::vector<std::string>> {
  Result ParseAndAppend(const std::string& args,
                        std::vector<std::string>& existing_value) {
    existing_value.push_back(args);
    return Result::SuccessNoValue();
  }
  static const char* Name() { return "std::vector<std::string>"; }
};
```
This will parse multiple instances of the same argument repeatedly into the `existing_value` (which
will be default-constructed to `T{}` for the first occurrence of the argument).

#### What is a `Result`?
`Result` is a typedef for `CmdlineParseResult<T>`, and it acts like a simplified version of
`Either<Left, Right>` in Haskell. In particular, it is similar to
`Either<int ErrorCode, Maybe<T>>`.

There are helpers such as `Result::Success(value)` and `Result::Failure(string message)` to
quickly construct these without caring about the type.

When successfully parsing a single value, use `Result::Success(value)`; when successfully parsing
an appended value, use `Result::SuccessNoValue()` and write the new value back into `existing_value`
as an out-parameter.

When many arguments are parsed, the result is collapsed down to a `CmdlineResult`, which acts as an
`Either<int ErrorCode, Unit>` where the right side simply indicates success. When values are
successfully parsed, the parser automatically saves them into the target destination as a side
effect.
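
The `Either<int ErrorCode, Maybe<T>>` shape can be illustrated with a small standalone class. This is a sketch, not the real `CmdlineParseResult<T>`: the names `ParseResult`, `Status`, `HasValue`, and `GetValue` are assumptions made for the example, and only the four factory helper names come from the text above.

```cpp
#include <optional>
#include <string>
#include <utility>

// Hypothetical status codes standing in for the "int ErrorCode" side.
enum class Status { kSuccess, kFailure, kOutOfRange };

// Hypothetical stand-in: an error (status + message) or a success that may or
// may not carry a value.
template <typename T>
class ParseResult {
 public:
  static ParseResult Success(T value) {
    return ParseResult(Status::kSuccess, std::move(value), "");
  }
  // Used by appending parsers, which write into an out-parameter instead.
  static ParseResult SuccessNoValue() {
    return ParseResult(Status::kSuccess, std::nullopt, "");
  }
  static ParseResult Failure(std::string message) {
    return ParseResult(Status::kFailure, std::nullopt, std::move(message));
  }
  static ParseResult OutOfRange(std::string message) {
    return ParseResult(Status::kOutOfRange, std::nullopt, std::move(message));
  }

  bool IsSuccess() const { return status_ == Status::kSuccess; }
  bool HasValue() const { return value_.has_value(); }
  const T& GetValue() const { return *value_; }  // only valid when HasValue()
  Status GetStatus() const { return status_; }
  const std::string& GetMessage() const { return message_; }

 private:
  ParseResult(Status status, std::optional<T> value, std::string message)
      : status_(status), value_(std::move(value)), message_(std::move(message)) {}

  Status status_;
  std::optional<T> value_;
  std::string message_;
};
```

The `std::optional<T>` member plays the role of `Maybe<T>`: it is filled by `Success(value)` and left empty by `SuccessNoValue()` and by the error helpers.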