=====================
YAML I/O
=====================

.. contents::
   :local:

Introduction to YAML
====================

YAML is a human readable data serialization language.  The full YAML language
spec can be read at `yaml.org
<http://www.yaml.org/spec/1.2/spec.html#Introduction>`_.  The simplest form of
YAML is just "scalars", "mappings", and "sequences".  A scalar is any number
or string.  The pound/hash symbol (#) begins a comment line.  A mapping is
a set of key-value pairs where the key ends with a colon.  For example:

.. code-block:: yaml

    # a mapping
    name: Tom
    hat-size: 7

A sequence is a list of items where each item starts with a leading dash ('-').
For example:

.. code-block:: yaml

    # a sequence
    - x86
    - x86_64
    - PowerPC

You can combine mappings and sequences by indenting.  For example, a sequence
of mappings in which one of the mapping values is itself a sequence:

.. code-block:: yaml

    # a sequence of mappings with one key's value being a sequence
    - name: Tom
      cpus:
       - x86
       - x86_64
    - name: Bob
      cpus:
       - x86
    - name: Dan
      cpus:
       - PowerPC
       - x86

Sometimes sequences are known to be short and the one entry per line is too
verbose, so YAML offers an alternate syntax for sequences called a "Flow
Sequence" in which you put comma separated sequence elements into square
brackets.  The above example could then be simplified to:


.. code-block:: yaml

    # a sequence of mappings with one key's value being a flow sequence
    - name: Tom
      cpus: [ x86, x86_64 ]
    - name: Bob
      cpus: [ x86 ]
    - name: Dan
      cpus: [ PowerPC, x86 ]


Introduction to YAML I/O
========================

The use of indenting makes the YAML easy for a human to read and understand,
but having a program read and write YAML involves a lot of tedious details.
The YAML I/O library structures and simplifies reading and writing YAML
documents.

YAML I/O assumes you have some "native" data structures which you want to be
able to dump as YAML and recreate from YAML.  The first step is to try
writing example YAML for your data structures.  You may find after looking at
possible YAML representations that a direct mapping of your data structures
to YAML is not very readable.  Often the fields are not in the order that
a human would find readable.  Or the same information is replicated in multiple
locations, making it hard for a human to write such YAML correctly.

In relational database theory there is a design step called normalization in
which you reorganize fields and tables.  The same considerations need to
go into the design of your YAML encoding.  But, you may not want to change
your existing native data structures.  Therefore, when writing out YAML
there may be a normalization step, and when reading YAML there would be a
corresponding denormalization step.

YAML I/O uses a non-invasive, traits-based design.  YAML I/O defines some
abstract base templates.  You specialize those templates on your data types.
For instance, if you have an enumerated type FooBar you could specialize
ScalarEnumerationTraits on that type and define the enumeration() method:

.. code-block:: c++

    using llvm::yaml::ScalarEnumerationTraits;
    using llvm::yaml::IO;

    template <>
    struct ScalarEnumerationTraits<FooBar> {
      static void enumeration(IO &io, FooBar &value) {
      ...
      }
    };


As with all YAML I/O template specializations, ScalarEnumerationTraits is used
for both reading and writing YAML.  That is, the mapping between in-memory enum
values and the YAML string representation is only in one place.
This ensures that the code for writing and parsing of YAML stays in sync.

To specify a YAML mapping, you define a specialization on
llvm::yaml::MappingTraits.
If your native data structure happens to be a struct that is already normalized,
then the specialization is simple.  For example:

.. code-block:: c++

    using llvm::yaml::MappingTraits;
    using llvm::yaml::IO;

    template <>
    struct MappingTraits<Person> {
      static void mapping(IO &io, Person &info) {
        io.mapRequired("name", info.name);
        io.mapOptional("hat-size", info.hatSize);
      }
    };


A YAML sequence is automatically inferred if your data type has begin()/end()
iterators and a push_back() method.  Therefore any of the STL containers
(such as std::vector<>) will automatically translate to YAML sequences.

Once you have defined specializations for your data types, you can
programmatically use YAML I/O to write a YAML document:

.. code-block:: c++

    using llvm::yaml::Output;

    Person tom;
    tom.name = "Tom";
    tom.hatSize = 8;
    Person dan;
    dan.name = "Dan";
    dan.hatSize = 7;
    std::vector<Person> persons;
    persons.push_back(tom);
    persons.push_back(dan);

    Output yout(llvm::outs());
    yout << persons;

This would write the following:

.. code-block:: yaml

    - name: Tom
      hat-size: 8
    - name: Dan
      hat-size: 7

And you can also read such YAML documents with the following code:

.. code-block:: c++

    using llvm::yaml::Input;

    typedef std::vector<Person> PersonList;
    std::vector<PersonList> docs;

    Input yin(document.getBuffer());
    yin >> docs;

    if ( yin.error() )
      return;

    // Process read document
    for ( PersonList &pl : docs ) {
      for ( Person &person : pl ) {
        llvm::outs() << "name=" << person.name << "\n";
      }
    }

One other feature of YAML is the ability to define multiple documents in a
single file.  That is why reading YAML produces a vector of your document type.



Error Handling
==============

When parsing a YAML document, if the input does not match your schema (as
expressed in your XxxTraits<> specializations), YAML I/O
will print out an error message and your Input object's error() method will
return true.  For instance, consider the following document:

.. code-block:: yaml

    - name: Tom
      shoe-size: 12
    - name: Dan
      hat-size: 7

It has a key (shoe-size) that is not defined in the schema.  YAML I/O will
automatically generate this error:

.. code-block:: yaml

    YAML:2:2: error: unknown key 'shoe-size'
      shoe-size: 12
      ^~~~~~~~~

Similar errors are produced for other input not conforming to the schema.


Scalars
=======

YAML scalars are just strings (i.e. not a sequence or mapping).  The YAML I/O
library provides support for translating between YAML scalars and specific
C++ types.


Built-in types
--------------
The following types have built-in support in YAML I/O:

* bool
* float
* double
* StringRef
* int64_t
* int32_t
* int16_t
* int8_t
* uint64_t
* uint32_t
* uint16_t
* uint8_t

That is, you can use those types in fields of MappingTraits or as the element
type of a sequence.  When reading, YAML I/O will validate that the string found
is convertible to that type and error out if not.


Unique types
------------
Given that YAML I/O is trait based, the selection of how to convert your data
to YAML is based on the type of your data.  But in C++ type matching, typedefs
do not generate unique type names.  That means if you have two typedefs of
unsigned int, to YAML I/O both types look exactly like unsigned int.  To
facilitate making unique type names, YAML I/O provides a macro which is used
like a typedef on built-in types, but expands to create a class with conversion
operators to and from the base type.  For example:

.. code-block:: c++

    LLVM_YAML_STRONG_TYPEDEF(uint32_t, MyFooFlags)
    LLVM_YAML_STRONG_TYPEDEF(uint32_t, MyBarFlags)

This generates two classes MyFooFlags and MyBarFlags which you can use in your
native data structures instead of uint32_t.  They are implicitly
converted to and from uint32_t.  The point of creating these unique types
is that you can now specify traits on them to get different YAML conversions.
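
To make the idea concrete, here is a minimal standalone sketch of what such a
macro could expand to.  SKETCH_STRONG_TYPEDEF and describe() are hypothetical
names invented for this illustration; this is not the definition of
LLVM_YAML_STRONG_TYPEDEF, only the general wrapper-class technique it
describes.

```cpp
#include <cassert>
#include <cstdint>
#include <string>
#include <type_traits>

// Hypothetical stand-in for a strong-typedef macro: wraps a built-in type in
// a distinct class that converts implicitly to and from the base type.
#define SKETCH_STRONG_TYPEDEF(Base, Name) \
  struct Name { \
    Name() : value(0) {} \
    Name(Base v) : value(v) {} \
    operator const Base &() const { return value; } \
    Base value; \
  };

SKETCH_STRONG_TYPEDEF(uint32_t, MyFooFlags)
SKETCH_STRONG_TYPEDEF(uint32_t, MyBarFlags)

// Because the two wrappers are distinct types, overloads (or trait
// specializations) can treat them differently.
inline std::string describe(MyFooFlags) { return "foo flags"; }
inline std::string describe(MyBarFlags) { return "bar flags"; }
```

With two plain typedefs of uint32_t the two describe() overloads would collide;
with the wrapper classes they coexist, which is what lets a traits
specialization be selected per type.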

Hex types
---------
An example use of a unique type is that YAML I/O provides fixed sized unsigned
integers that are written with YAML I/O as hexadecimal instead of the decimal
format used by the built-in integer types:

* Hex64
* Hex32
* Hex16
* Hex8

You can use llvm::yaml::Hex32 instead of uint32_t and the only difference will
be that when YAML I/O writes out that type it will be formatted in hexadecimal.
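
The following standalone sketch shows the kind of conversion a hex scalar type
implies.  The helpers toHexScalar() and fromHexScalar() are hypothetical, and
the exact text format (the "0x" prefix, uppercase digits, no padding) is an
assumption of this sketch rather than a statement about what YAML I/O emits.

```cpp
#include <cassert>
#include <cstdint>
#include <cstdlib>
#include <sstream>
#include <string>

// Hypothetical helper: format a 32-bit value as a hexadecimal scalar.
// The "0x" prefix and uppercase digits are assumptions of this sketch.
inline std::string toHexScalar(uint32_t value) {
  std::ostringstream out;
  out << "0x" << std::hex << std::uppercase << value;
  return out.str();
}

// Hypothetical helper: parse a scalar back into a 32-bit value.
// Base 0 lets strtoul accept both "0x2A" and plain decimal "42".
inline bool fromHexScalar(const std::string &scalar, uint32_t &value) {
  char *end = nullptr;
  unsigned long parsed = std::strtoul(scalar.c_str(), &end, 0);
  if (scalar.empty() || *end != '\0' || parsed > UINT32_MAX)
    return false;
  value = static_cast<uint32_t>(parsed);
  return true;
}
```

The in-memory value is an ordinary integer in both directions; only the text
representation changes.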


ScalarEnumerationTraits
-----------------------
YAML I/O supports translating between in-memory enumerations and a set of string
values in YAML documents.  This is done by specializing ScalarEnumerationTraits<>
on your enumeration type and defining an enumeration() method.
For instance, suppose you had an enumeration of CPUs and a struct with it as
a field:

.. code-block:: c++

    enum CPUs {
      cpu_x86_64  = 5,
      cpu_x86     = 7,
      cpu_PowerPC = 8
    };

    struct Info {
      CPUs      cpu;
      uint32_t  flags;
    };

To support reading and writing of this enumeration, you can define a
ScalarEnumerationTraits specialization on CPUs, which can then be used
as a field type:

.. code-block:: c++

    using llvm::yaml::ScalarEnumerationTraits;
    using llvm::yaml::MappingTraits;
    using llvm::yaml::IO;

    template <>
    struct ScalarEnumerationTraits<CPUs> {
      static void enumeration(IO &io, CPUs &value) {
        io.enumCase(value, "x86_64",  cpu_x86_64);
        io.enumCase(value, "x86",     cpu_x86);
        io.enumCase(value, "PowerPC", cpu_PowerPC);
      }
    };

    template <>
    struct MappingTraits<Info> {
      static void mapping(IO &io, Info &info) {
        io.mapRequired("cpu",   info.cpu);
        io.mapOptional("flags", info.flags, 0);
      }
    };

When reading YAML, if the string found does not match any of the strings
specified by enumCase() methods, an error is automatically generated.
When writing YAML, if the value being written does not match any of the values
specified by the enumCase() methods, a runtime assertion is triggered.
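
A standalone sketch may help show why a single enumeration() method can serve
both directions.  EnumIOSketch, readCPU() and writeCPU() are hypothetical
names for this illustration only; the real llvm::yaml::IO class is more
involved, and this sketch only mirrors the behavior described above.

```cpp
#include <cassert>
#include <string>

enum CPUs { cpu_x86_64 = 5, cpu_x86 = 7, cpu_PowerPC = 8 };

// Hypothetical miniature of an IO object for a single enum scalar.  It is
// either writing (value -> text) or reading (text -> value).
struct EnumIOSketch {
  bool writing;
  std::string text;  // produced when writing, consumed when reading
  bool matched;

  template <typename T>
  void enumCase(T &value, const char *name, T constant) {
    if (matched)
      return;
    if (writing && value == constant) {
      text = name;
      matched = true;
    } else if (!writing && text == name) {
      value = constant;
      matched = true;
    }
  }
};

// One table of cases serves both directions.
inline void enumeration(EnumIOSketch &io, CPUs &value) {
  io.enumCase(value, "x86_64", cpu_x86_64);
  io.enumCase(value, "x86", cpu_x86);
  io.enumCase(value, "PowerPC", cpu_PowerPC);
}

// Returns false when the text matched no case (the error path when reading).
inline bool readCPU(const std::string &text, CPUs &value) {
  EnumIOSketch io = {false, text, false};
  enumeration(io, value);
  return io.matched;
}

// Asserts when the value has no case (the assertion path when writing).
inline std::string writeCPU(CPUs value) {
  EnumIOSketch io = {true, "", false};
  enumeration(io, value);
  assert(io.matched && "value has no enumCase entry");
  return io.text;
}
```

Because the name/value pairs are listed once, adding a new enumerator means
adding one enumCase() line, and reading and writing cannot drift apart.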


BitValue
--------
Another common data structure in C++ is a field where each bit has a unique
meaning.  This is often used in a "flags" field.  YAML I/O has support for
converting such fields to a flow sequence.  For instance, suppose you
had the following bit flags defined:

.. code-block:: c++

    enum {
      flagsPointy = 1,
      flagsHollow = 2,
      flagsFlat   = 4,
      flagsRound  = 8
    };

    LLVM_YAML_STRONG_TYPEDEF(uint32_t, MyFlags)

To support reading and writing of MyFlags, you specialize ScalarBitSetTraits<>
on MyFlags and provide the bit values and their names.

.. code-block:: c++

    using llvm::yaml::ScalarBitSetTraits;
    using llvm::yaml::MappingTraits;
    using llvm::yaml::IO;

    template <>
    struct ScalarBitSetTraits<MyFlags> {
      static void bitset(IO &io, MyFlags &value) {
        io.bitSetCase(value, "hollow", flagsHollow);
        io.bitSetCase(value, "flat",   flagsFlat);
        io.bitSetCase(value, "round",  flagsRound);
        io.bitSetCase(value, "pointy", flagsPointy);
      }
    };

    struct Info {
      StringRef   name;
      MyFlags     flags;
    };

    template <>
    struct MappingTraits<Info> {
      static void mapping(IO &io, Info& info) {
        io.mapRequired("name",  info.name);
        io.mapRequired("flags", info.flags);
      }
    };

With the above, YAML I/O (when writing) will test each value in the
bitset trait against the flags field using a mask, and each that matches will
cause the corresponding string to be added to the flow sequence.  The opposite
is done when reading, and any unknown string values will result in an error.
With the above schema, a sample valid YAML document is:

.. code-block:: yaml

    name: Tom
    flags: [ pointy, flat ]
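
The mask test described above can be sketched in plain C++ as follows.  The
table kBitNames and the helpers flagsToNames() and namesToFlags() are
hypothetical; they only illustrate the technique and are not part of YAML I/O.

```cpp
#include <cassert>
#include <cstdint>
#include <string>
#include <vector>

enum : uint32_t { flagsPointy = 1, flagsHollow = 2, flagsFlat = 4, flagsRound = 8 };

struct BitName { const char *name; uint32_t bit; };

// Same order as the bitSetCase() calls in the traits above.
static const BitName kBitNames[] = {
  {"hollow", flagsHollow}, {"flat", flagsFlat},
  {"round", flagsRound},   {"pointy", flagsPointy},
};

// Writing: every bit set in flags contributes its name to the sequence.
inline std::vector<std::string> flagsToNames(uint32_t flags) {
  std::vector<std::string> names;
  for (const BitName &entry : kBitNames)
    if (flags & entry.bit)
      names.push_back(entry.name);
  return names;
}

// Reading: OR together the bit of each name; an unknown name is an error.
inline bool namesToFlags(const std::vector<std::string> &names, uint32_t &flags) {
  uint32_t result = 0;
  for (const std::string &name : names) {
    bool known = false;
    for (const BitName &entry : kBitNames)
      if (name == entry.name) { result |= entry.bit; known = true; }
    if (!known)
      return false;
  }
  flags = result;
  return true;
}
```

In this sketch the order of names on output follows the table, while on input
any order is accepted because the bits are simply ORed together.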


Custom Scalar
-------------
Sometimes for readability a scalar needs to be formatted in a custom way.  For
instance, your internal data structure may use an integer for time (seconds
since some epoch), but in YAML it would be much nicer to express that integer in
some time format (e.g. 4-May-2012 10:30pm).  YAML I/O has a way to support
custom formatting and parsing of scalar types by specializing ScalarTraits<> on
your data type.  When writing, YAML I/O will provide the native type and
your specialization must write its text form to the supplied llvm::raw_ostream.
When reading, YAML I/O will provide an llvm::StringRef of the scalar and your
specialization must convert that to your native data type.  An outline of a
custom scalar type looks like:

.. code-block:: c++

    using llvm::yaml::ScalarTraits;
    using llvm::yaml::IO;
    using llvm::StringRef;

    template <>
    struct ScalarTraits<MyCustomType> {
      static void output(const MyCustomType &value, llvm::raw_ostream &out) {
        out << value;  // do custom formatting here
      }
      static StringRef input(StringRef scalar, MyCustomType &value) {
        // do custom parsing here.  Return the empty string on success,
        // or an error message on failure.
        return StringRef();
      }
    };
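
As a concrete illustration of the output/input pair, here is a standalone
sketch for a made-up Duration type written as "M:SS".  The type, the format,
and the free functions outputDuration() and inputDuration() are all
assumptions of this sketch; it uses std::ostream and std::string in place of
the LLVM stream and string classes so that it compiles on its own.

```cpp
#include <cassert>
#include <cstdio>
#include <sstream>
#include <string>

// Hypothetical native type: a duration stored as a count of seconds.
struct Duration { unsigned seconds; };

// Writing: format as "M:SS" (an arbitrary format chosen for this sketch).
inline void outputDuration(const Duration &value, std::ostream &out) {
  char buffer[32];
  std::snprintf(buffer, sizeof(buffer), "%u:%02u", value.seconds / 60,
                value.seconds % 60);
  out << buffer;
}

// Reading: return an empty string on success or an error message on failure,
// mirroring the convention described above.
inline std::string inputDuration(const std::string &scalar, Duration &value) {
  unsigned minutes = 0, seconds = 0;
  char trailing = 0;
  // Exactly two conversions must succeed; a third means trailing garbage.
  if (std::sscanf(scalar.c_str(), "%u:%u%c", &minutes, &seconds, &trailing) != 2)
    return "expected a duration of the form M:SS";
  if (seconds >= 60)
    return "seconds must be below 60";
  value.seconds = minutes * 60 + seconds;
  return std::string();
}
```

Returning the error text rather than a bool lets the caller report why the
scalar was rejected.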


Mappings
========

To be translated to or from a YAML mapping for your type T you must specialize
llvm::yaml::MappingTraits on T and implement the "void mapping(IO &io, T&)"
method.  If your native data structures use pointers to a class everywhere,
you can specialize on the class pointer.  Examples:

.. code-block:: c++

    using llvm::yaml::MappingTraits;
    using llvm::yaml::IO;

    // Example of struct Foo which is used by value
    template <>
    struct MappingTraits<Foo> {
      static void mapping(IO &io, Foo &foo) {
        io.mapOptional("size", foo.size);
      ...
      }
    };

    // Example of struct Bar which is natively always a pointer
    template <>
    struct MappingTraits<Bar*> {
      static void mapping(IO &io, Bar *&bar) {
        io.mapOptional("size", bar->size);
      ...
      }
    };


No Normalization
----------------

The mapping() method is responsible, if needed, for normalizing and
denormalizing.  In a simple case where the native data structure requires no
normalization, the mapping method just uses mapOptional() or mapRequired() to
bind the struct's fields to YAML key names.  For example:

.. code-block:: c++

    using llvm::yaml::MappingTraits;
    using llvm::yaml::IO;

    template <>
    struct MappingTraits<Person> {
      static void mapping(IO &io, Person &info) {
        io.mapRequired("name", info.name);
        io.mapOptional("hat-size", info.hatSize);
      }
    };


Normalization
-------------

When [de]normalization is required, the mapping() method needs a way to access
normalized values as fields.  To help with this, there is
a template MappingNormalization<> which you can then use to automatically
do the normalization and denormalization.  The template is used to create
a local variable in your mapping() method which contains the normalized keys.

Suppose you have a native data type
Polar which specifies a position in polar coordinates (distance, angle):

.. code-block:: c++

    struct Polar {
      float distance;
      float angle;
    };

but you've decided the normalized YAML form should be in x,y coordinates.  That
is, you want the YAML to look like:

.. code-block:: yaml

    x: 10.3
    y: -4.7

You can support this by defining a MappingTraits that normalizes the polar
coordinates to x,y coordinates when writing YAML and denormalizes x,y
coordinates into polar when reading YAML.

.. code-block:: c++

    using llvm::yaml::MappingTraits;
    using llvm::yaml::MappingNormalization;
    using llvm::yaml::IO;

    template <>
    struct MappingTraits<Polar> {

      class NormalizedPolar {
      public:
        NormalizedPolar(IO &io)
          : x(0.0), y(0.0) {
        }
        NormalizedPolar(IO &, Polar &polar)
          : x(polar.distance * cos(polar.angle)),
            y(polar.distance * sin(polar.angle)) {
        }
        Polar denormalize(IO &) {
          // Polar has no constructor, so fill in the fields directly.
          Polar polar;
          polar.distance = sqrt(x*x + y*y);
          polar.angle = atan2(y, x);
          return polar;
        }

        float x;
        float y;
      };

      static void mapping(IO &io, Polar &polar) {
        MappingNormalization<NormalizedPolar, Polar> keys(io, polar);

        io.mapRequired("x", keys->x);
        io.mapRequired("y", keys->y);
      }
    };

When writing YAML, the local variable "keys" will be a stack allocated
instance of NormalizedPolar, constructed from the supplied polar object which
initializes its x and y fields.  The mapRequired() methods then write out the x
and y values as key/value pairs.

When reading YAML, the local variable "keys" will be a stack allocated instance
of NormalizedPolar, constructed by the constructor that takes only an IO
reference.  The mapRequired()
methods will find the matching key in the YAML document and fill in the x and y
fields of the NormalizedPolar object keys.  At the end of the mapping() method
when the local keys variable goes out of scope, the denormalize() method will
automatically be called to convert the read values back to polar coordinates,
and then assigned back to the second parameter to mapping().

In some cases, the normalized class may be a subclass of the native type and
could be returned by the denormalize() method, except that the temporary
normalized instance is stack allocated.  In these cases, the utility template
MappingNormalizationHeap<> can be used instead.  It is just like
MappingNormalization<> except that it heap allocates the normalized object
when reading YAML.  It never destroys the normalized object.  The denormalize()
method can thus return "this".
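
The scope-based behavior can be sketched without any LLVM headers.  In the
code below, NormalizationSketch and readPolarFromXY() are hypothetical names;
the sketch assumes only what the text above states (normalize on construction
when writing, denormalize on destruction when reading) and is not the
implementation of MappingNormalization<>.

```cpp
#include <cassert>
#include <cmath>

struct Polar { float distance; float angle; };

// Normalized (x, y) view of a Polar value.
struct NormalizedPolar {
  NormalizedPolar() : x(0.0f), y(0.0f) {}
  explicit NormalizedPolar(const Polar &polar)
      : x(polar.distance * std::cos(polar.angle)),
        y(polar.distance * std::sin(polar.angle)) {}
  Polar denormalize() const {
    Polar polar;
    polar.distance = std::sqrt(x * x + y * y);
    polar.angle = std::atan2(y, x);
    return polar;
  }
  float x, y;
};

// Hypothetical scope guard: when writing it normalizes on construction; when
// reading it starts empty and denormalizes into the native object on
// destruction.
template <typename TNorm, typename TNative>
class NormalizationSketch {
public:
  NormalizationSketch(bool writing, TNative &native)
      : Writing(writing), Native(native),
        Keys(writing ? TNorm(native) : TNorm()) {}
  ~NormalizationSketch() {
    if (!Writing)
      Native = Keys.denormalize();
  }
  TNorm *operator->() { return &Keys; }

private:
  bool Writing;
  TNative &Native;
  TNorm Keys;
};

// Simulates reading the keys x and y into a Polar.
inline Polar readPolarFromXY(float x, float y) {
  Polar polar = {0.0f, 0.0f};
  {
    NormalizationSketch<NormalizedPolar, Polar> keys(/*writing=*/false, polar);
    keys->x = x;
    keys->y = y;
  } // keys goes out of scope here and polar is filled in
  return polar;
}
```

Tying denormalization to the destructor means the mapping code never has to
call it explicitly, whichever path leaves the scope.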


Default values
--------------
Within a mapping() method, calls to io.mapRequired() mean that the key is
required to exist when parsing YAML documents, otherwise YAML I/O will issue an
error.

On the other hand, keys registered with io.mapOptional() are allowed to not
exist in the YAML document being read.  So what value is put in the field
for those optional keys?
There are two steps to how those optional fields are filled in.  First, the
second parameter to the mapping() method is a reference to a native class.  That
native class must have a default constructor.  Whatever value the default
constructor initially sets for an optional field will be that field's value.
Second, the mapOptional() method has an optional third parameter.  If provided
it is the value that mapOptional() should set that field to if the YAML document
does not have that key.

There is one important difference between those two ways (default constructor
and third parameter to mapOptional()).  When YAML I/O generates a YAML document,
if the mapOptional() third parameter is used and the actual value being written
is the same as (using ==) the default value, then that key/value is not written.
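
The two behaviors can be sketched with a toy document type.  MapIOSketch
below is hypothetical: it stores a flat map of keys to integers and mimics
only the rules stated above, not the real llvm::yaml::IO interface.

```cpp
#include <cassert>
#include <map>
#include <string>

// Hypothetical miniature IO over a flat key -> integer document.
struct MapIOSketch {
  bool writing;
  std::map<std::string, int> doc;

  // Two-argument form: when the key is absent on read, the field keeps
  // whatever the native type's default constructor gave it.
  void mapOptional(const std::string &key, int &field) {
    if (writing) {
      doc[key] = field;
      return;
    }
    auto it = doc.find(key);
    if (it != doc.end())
      field = it->second;
  }

  // Three-argument form: the default fills in a missing key on read, and a
  // field equal to the default is omitted on write.
  void mapOptional(const std::string &key, int &field, int defaultValue) {
    if (writing) {
      if (field != defaultValue)
        doc[key] = field;
      return;
    }
    auto it = doc.find(key);
    field = (it != doc.end()) ? it->second : defaultValue;
  }
};

struct Info {
  Info() : cpu(7), flags(99) {}
  int cpu;
  int flags;
};

inline void mapping(MapIOSketch &io, Info &info) {
  io.mapOptional("cpu", info.cpu);
  io.mapOptional("flags", info.flags, 0);
}
```

Reading an empty document with this sketch leaves cpu at its constructor value
and sets flags to 0; writing an Info whose flags equal 0 emits only the cpu
key.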


Order of Keys
-------------

When writing out a YAML document, the keys are written in the order that the
calls to mapRequired()/mapOptional() are made in the mapping() method.  This
gives you a chance to write the fields in an order that a human reader of
the YAML document would find natural.  This may be different than the order
of the fields in the native class.

When reading in a YAML document, the keys in the document can be in any order,
but they are processed in the order that the calls to mapRequired()/mapOptional()
are made in the mapping() method.  That enables some interesting
functionality.  For instance, if the first field bound is the cpu and the second
field bound is flags, and the flags are cpu specific, you can programmatically
switch how the flags are converted to and from YAML based on the cpu.
This works for both reading and writing.  For example:

.. code-block:: c++

    using llvm::yaml::MappingTraits;
    using llvm::yaml::IO;

    struct Info {
      CPUs        cpu;
      uint32_t    flags;
    };

    template <>
    struct MappingTraits<Info> {
      static void mapping(IO &io, Info &info) {
        io.mapRequired("cpu", info.cpu);
        // flags must come after cpu for this to work when reading yaml
        // (the cast takes the address of flags to view it as the flags type)
        if ( info.cpu == cpu_x86_64 )
          io.mapRequired("flags", *(My86_64Flags*)&info.flags);
        else
          io.mapRequired("flags", *(My86Flags*)&info.flags);
      }
    };


Sequence
========

To be translated to or from a YAML sequence for your type T you must specialize
llvm::yaml::SequenceTraits on T and implement two methods:
``size_t size(IO &io, T&)`` and
``T::value_type& element(IO &io, T&, size_t indx)``.  For example:

.. code-block:: c++

    template <>
    struct SequenceTraits<MySeq> {
      static size_t size(IO &io, MySeq &list) { ... }
      static MySeqEl &element(IO &io, MySeq &list, size_t index) { ... }
    };

The size() method returns how many elements are currently in your sequence.
The element() method returns a reference to the i'th element in the sequence.
When parsing YAML, the element() method may be called with an index one bigger
than the current size.  Your element() method should allocate space for one
more element (using the default constructor if the element is a C++ object) and
return a reference to that newly allocated space.
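
A minimal standalone sketch of the growing behavior, assuming the sequence is
backed by a std::vector.  The free functions sequenceSize() and
sequenceElement() are hypothetical stand-ins for the trait's two methods, and
the IO parameter is omitted so the code compiles on its own.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

struct MySeqEl {
  MySeqEl() : value(0) {}
  int value;
};
typedef std::vector<MySeqEl> MySeq;

// Counterpart of size(): how many elements the sequence currently holds.
inline std::size_t sequenceSize(MySeq &list) { return list.size(); }

// Counterpart of element(): return the slot at index, growing if needed.
inline MySeqEl &sequenceElement(MySeq &list, std::size_t index) {
  // When parsing, index can point just past the end: grow the container so
  // the new slot is default-constructed, then hand back a reference to it.
  if (index >= list.size())
    list.resize(index + 1);
  assert(index < list.size());
  return list[index];
}
```

A reader can then fill an initially empty sequence by asking for elements 0,
1, 2 and so on in turn, without knowing the final length in advance.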
| 658 | |
| 659 | |
| 660 | Flow Sequence |
| 661 | ------------- |
| 662 | A YAML "flow sequence" is a sequence that when written to YAML it uses the |
| 663 | inline notation (e.g [ foo, bar ] ). To specify that a sequence type should |
| 664 | be written in YAML as a flow sequence, your SequenceTraits specialization should |
| 665 | add "static const bool flow = true;". For instance: |
| 666 | |
| 667 | .. code-block:: c++ |
| 668 | |
| 669 | template <> |
| 670 | struct SequenceTraits<MyList> { |
| 671 | static size_t size(IO &io, MyList &list) { ... } |
Rui Ueyama | 539b1df | 2013-09-12 01:43:21 +0000 | [diff] [blame^] | 672 | static MyListEl &element(IO &io, MyList &list, size_t index) { ... } |
Nick Kledzik | f60a927 | 2012-12-12 20:46:15 +0000 | [diff] [blame] | 673 | |
| 674 | // The existence of this member causes YAML I/O to use a flow sequence |
| 675 | static const bool flow = true; |
| 676 | }; |
| 677 | |
| 678 | With the above, if you used MyList as the data type in your native data |
Alex Rosenberg | 7f5af7f | 2013-02-18 02:44:09 +0000 | [diff] [blame] | 679 | structures, then then when converted to YAML, a flow sequence of integers |
Nick Kledzik | f60a927 | 2012-12-12 20:46:15 +0000 | [diff] [blame] | 680 | will be used (e.g. [ 10, -3, 4 ]). |


Utility Macros
--------------
Since a common source of sequences is std::vector<>, YAML I/O provides the
macros LLVM_YAML_IS_SEQUENCE_VECTOR() and LLVM_YAML_IS_FLOW_SEQUENCE_VECTOR(),
which can be used to easily specify SequenceTraits<> on a std::vector type.
YAML I/O does not partially specialize SequenceTraits on std::vector<> because
that would force all vectors to be sequences. An example use of the macros:

.. code-block:: c++

    // Treat std::vector<MyType1> as a sequence and
    // std::vector<MyType2> as a flow sequence.
    LLVM_YAML_IS_SEQUENCE_VECTOR(MyType1)
    LLVM_YAML_IS_FLOW_SEQUENCE_VECTOR(MyType2)
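
To illustrate the opt-in design, here is one plausible shape for such macros as
a standalone sketch. ToySequenceTraits, ToyVectorTraitsBase and the ``TOY_``
macros are hypothetical stand-ins; the real macros are defined by LLVM and may
be implemented differently. Growing the vector inside element() is an
assumption of this sketch, made so that elements can be created on demand.

.. code-block:: c++

    #include <cstddef>
    #include <vector>

    // Primary template is left undefined: a vector type is only a sequence
    // if one of the macros below has been applied to its element type.
    template <typename T> struct ToySequenceTraits;

    // Shared size()/element() logic for any std::vector (hypothetical helper).
    template <typename Vec> struct ToyVectorTraitsBase {
      static std::size_t size(Vec &Seq) { return Seq.size(); }
      static typename Vec::value_type &element(Vec &Seq, std::size_t Index) {
        if (Index >= Seq.size())
          Seq.resize(Index + 1);   // allocate space for a new element
        return Seq[Index];
      }
    };

    #define TOY_IS_SEQUENCE_VECTOR(TYPE)                                   \
      template <> struct ToySequenceTraits<std::vector<TYPE>>              \
          : ToyVectorTraitsBase<std::vector<TYPE>> {};

    #define TOY_IS_FLOW_SEQUENCE_VECTOR(TYPE)                              \
      template <> struct ToySequenceTraits<std::vector<TYPE>>              \
          : ToyVectorTraitsBase<std::vector<TYPE>> {                       \
        static const bool flow = true;                                     \
      };

    struct MyType1 { int Value = 0; };
    struct MyType2 { int Value = 0; };

    TOY_IS_SEQUENCE_VECTOR(MyType1)
    TOY_IS_FLOW_SEQUENCE_VECTOR(MyType2)

Because only the element types named in a macro get a specialization, a
std::vector of any other type is left alone rather than being forced to act as
a sequence.
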



Document List
=============

YAML allows you to define multiple "documents" in a single YAML file. Each
new document starts with a left-aligned "---" token. The end of all documents
is denoted with a left-aligned "..." token. Many users of YAML will never
need multiple documents: the top-level node in their YAML schema
will be a mapping or sequence, and for those cases the following is not needed.
But for cases where you do want multiple documents, you can specify a
trait for your document list type. The trait has the same methods as
SequenceTraits but is named DocumentListTraits. For example:

.. code-block:: c++

    template <>
    struct DocumentListTraits<MyDocList> {
      static size_t size(IO &io, MyDocList &list) { ... }
      static MyDocType &element(IO &io, MyDocList &list, size_t index) { ... }
    };


User Context Data
=================
When an llvm::yaml::Input or llvm::yaml::Output object is created, its
constructor takes an optional "context" parameter. This is a pointer to
whatever state information you might need.

For instance, in a previous example we showed how the conversion type for a
flags field could be determined at runtime based on the value of another field
in the mapping. But what if an inner mapping needs to know some field value
of an outer mapping? That is where the "context" parameter comes in. You
can set values in the context in the outer map's mapping() method and
retrieve those values in the inner map's mapping() method.

The context value is just a void*. All your traits that use the context
and operate on your native data types need to agree on what the context value
actually is. It could be a pointer to an object or struct that your various
traits use to share context-sensitive information.
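
The outer-to-inner hand-off can be modelled without LLVM. In the sketch below,
ToyIO is a hypothetical stand-in that models only context storage through
getContext() and setContext(); SharedState, Outer, Inner, mapOuter and mapInner
are likewise invented for illustration and stand in for your own types and
their mapping() methods.

.. code-block:: c++

    #include <string>

    // Hypothetical stand-in for the IO object: it only carries the context.
    class ToyIO {
    public:
      explicit ToyIO(void *Ctxt = nullptr) : Ctxt(Ctxt) {}
      void *getContext() const { return Ctxt; }
      void setContext(void *C) { Ctxt = C; }

    private:
      void *Ctxt;
    };

    // What every trait agrees the void* context really points to.
    struct SharedState {
      bool IsBigEndian = false;
    };

    struct Inner { std::string ByteOrder; };
    struct Outer { bool BigEndian; Inner Child; };

    // Inner mapping: reads a value that only the outer mapping knows.
    void mapInner(ToyIO &IO, Inner &In) {
      auto *State = static_cast<SharedState *>(IO.getContext());
      In.ByteOrder = State->IsBigEndian ? "big" : "little";
    }

    // Outer mapping: records its field in the context, then maps the child.
    void mapOuter(ToyIO &IO, Outer &Out) {
      auto *State = static_cast<SharedState *>(IO.getContext());
      State->IsBigEndian = Out.BigEndian;
      mapInner(IO, Out.Child);
    }

The cast is only safe because both functions agree that the context is a
SharedState; nothing in the void* itself enforces that.
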


Output
======

The llvm::yaml::Output class is used to generate a YAML document from your
in-memory data structures, using traits defined on your data types.
To instantiate an Output object you need an llvm::raw_ostream, and optionally
a context pointer:

.. code-block:: c++

    class Output : public IO {
    public:
      Output(llvm::raw_ostream &, void *context=NULL);

Once you have an Output object, you can use the C++ stream operator on it
to write your native data as YAML. One thing to recall is that a YAML file
can contain multiple "documents". If the top-level data structure you are
streaming as YAML is a mapping, scalar, or sequence, then Output assumes you
are generating one document and wraps the output with a leading "``---``"
and a trailing "``...``".

.. code-block:: c++

    using llvm::yaml::Output;

    // The stream operator binds to a non-const reference, so the
    // argument is taken as non-const here.
    void dumpMyMapDoc(MyMapType &info) {
      Output yout(llvm::outs());
      yout << info;
    }

The above could produce output like:

.. code-block:: yaml

    ---
    name: Tom
    hat-size: 7
    ...

On the other hand, if the top-level data structure you are streaming as YAML
has a DocumentListTraits specialization, then Output walks through each element
of your DocumentList and generates a "---" before the start of each element
and ends with a "...".

.. code-block:: c++

    using llvm::yaml::Output;

    // The stream operator binds to a non-const reference, so the
    // argument is taken as non-const here.
    void dumpMyMapDoc(MyDocListType &docList) {
      Output yout(llvm::outs());
      yout << docList;
    }

The above could produce output like:

.. code-block:: yaml

    ---
    name: Tom
    hat-size: 7
    ---
    name: Tom
    shoe-size: 11
    ...
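
The document framing described above can be sketched in a few lines of
standalone code. This is a toy writer, not the Output class: ToyDoc, ToyDocList
and writeDocList are hypothetical names, and each document is assumed to be a
flat list of string key/value pairs.

.. code-block:: c++

    #include <sstream>
    #include <string>
    #include <utility>
    #include <vector>

    // A document is modelled as ordered key/value pairs (an assumption).
    using ToyDoc = std::vector<std::pair<std::string, std::string>>;
    using ToyDocList = std::vector<ToyDoc>;

    // Emits "---" before each document and a single "..." after the last one.
    std::string writeDocList(const ToyDocList &Docs) {
      std::ostringstream OS;
      for (const ToyDoc &Doc : Docs) {
        OS << "---\n";
        for (const auto &KV : Doc)
          OS << KV.first << ": " << KV.second << "\n";
      }
      OS << "...\n";
      return OS.str();
    }

Feeding it the two documents from the example output above reproduces that
output with this sketch's single-space formatting.
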

Input
=====

The llvm::yaml::Input class is used to parse YAML document(s) into your native
data structures. To instantiate an Input object you need a StringRef to the
entire YAML file, and optionally a context pointer:

.. code-block:: c++

    class Input : public IO {
    public:
      Input(StringRef inputContent, void *context=NULL);

Once you have an Input object, you can use the C++ stream operator to read
the document(s). If you expect there might be multiple YAML documents in
one file, you'll need to specialize DocumentListTraits on a list of your
document type and stream in that document list type. Otherwise you can
just stream in the document type. You can also check whether there were
any syntax errors in the YAML by calling the error() method on the Input
object. For example:

.. code-block:: c++

    // Reading a single document
    using llvm::yaml::Input;

    Input yin(mb.getBuffer());

    // Parse the YAML file
    MyDocType theDoc;
    yin >> theDoc;

    // Check for error
    if ( yin.error() )
      return;


.. code-block:: c++

    // Reading multiple documents in one file
    using llvm::yaml::Input;

    // The macro takes the element type and applies to std::vector<MyDocType>
    LLVM_YAML_IS_DOCUMENT_LIST_VECTOR(MyDocType)

    Input yin(mb.getBuffer());

    // Parse the YAML file
    std::vector<MyDocType> theDocList;
    yin >> theDocList;

    // Check for error
    if ( yin.error() )
      return;

