Daniel Dunbar | 3b709d5 | 2012-05-08 16:50:35 +0000 | [diff] [blame] | 1 | llvm-ar - LLVM archiver |
| 2 | ======================= |
| 3 | |
| 4 | |
| 5 | SYNOPSIS |
| 6 | -------- |
| 7 | |
| 8 | |
| 9 | **llvm-ar** [-]{dmpqrtx}[Rabfikouz] [relpos] [count] <archive> [files...] |
| 10 | |
| 11 | |
| 12 | DESCRIPTION |
| 13 | ----------- |
| 14 | |
| 15 | |
| 16 | The **llvm-ar** command is similar to the common Unix utility, ``ar``. It |
| 17 | archives several files together into a single file. The intent for this is |
| 18 | to produce archive libraries of LLVM bitcode that can be linked into an |
| 19 | LLVM program. However, the archive can contain any kind of file. By default, |
| 20 | **llvm-ar** generates a symbol table that makes linking faster because |
| 21 | only the symbol table needs to be consulted, not each individual file member |
| 22 | of the archive. |
| 23 | |
| 24 | The **llvm-ar** command can be used to *read* both SVR4 and BSD style archive |
| 25 | files. However, it cannot be used to write them. While the **llvm-ar** command |
| 26 | produces files that are *almost* identical to the format used by other ``ar`` |
| 27 | implementations, it has two significant departures in order to make the |
| 28 | archive appropriate for LLVM. The first departure is that **llvm-ar** only |
| 29 | uses BSD4.4 style long path names (stored immediately after the header) and |
| 30 | never contains a string table for long names. The second departure is that the |
| 31 | symbol table is formatted for efficient construction of an in-memory data |
| 32 | structure that permits rapid (red-black tree) lookups. Consequently, archives |
| 33 | produced with **llvm-ar** usually won't be readable or editable with any other |
| 34 | ``ar`` implementation or useful for linking. Using the ``f`` modifier to flatten |
| 35 | file names will make the archive readable by other ``ar`` implementations |
| 36 | but not for linking because the symbol table format for LLVM is unique. If an |
| 37 | SVR4 or BSD style archive is used with the ``r`` (replace) or ``q`` (quick |
| 38 | update) operations, the archive will be reconstructed in LLVM format. This |
| 39 | means that the string table will be dropped (in deference to BSD 4.4 long names) |
| 40 | and an LLVM symbol table will be added (by default). The system symbol table |
| 41 | will be retained. |
| 42 | |
| 43 | Here's where **llvm-ar** departs from previous ``ar`` implementations: |
| 44 | |
| 45 | |
| 46 | *Symbol Table* |
| 47 | |
| 48 | Since **llvm-ar** is intended to archive bitcode files, the symbol table |
| 49 | won't make much sense to anything but LLVM. Consequently, the symbol table's |
| 50 | format has been simplified. It consists simply of a sequence of pairs |
| 51 | of a file member index number as an LSB 4-byte integer and a null-terminated |
| 52 | string. |
| 53 | |
| 54 | |
| 55 | |
| 56 | *Long Paths* |
| 57 | |
| 58 | Some ``ar`` implementations (SVR4) use a separate file member to record long |
| 59 | path names (> 15 characters). **llvm-ar** takes the BSD 4.4 and Mac OS X |
| 60 | approach which is to simply store the full path name immediately preceding |
| 61 | the data for the file. The path name is null terminated and may contain the |
| 62 | slash (/) character. |
| 63 | |
| 64 | |
| 65 | |
| 66 | *Compression* |
| 67 | |
| 68 | **llvm-ar** can compress the members of an archive to save space. The |
| 69 | compression used depends on what's available on the platform and what choices |
| 70 | the LLVM Compressor utility makes. It generally favors bzip2 but will select |
| 71 | between "no compression" and bzip2 depending on what makes sense for the |
| 72 | file's content. |
| 73 | |
| 74 | |
| 75 | |
| 76 | *Directory Recursion* |
| 77 | |
| 78 | Most ``ar`` implementations do not recurse through directories but simply |
| 79 | ignore directories if they are presented to the program in the *files* |
| 80 | option. **llvm-ar**, however, can recurse through directory structures and |
| 81 | add all the files under a directory, if requested. |
| 82 | |
| 83 | |
| 84 | |
| 85 | *TOC Verbose Output* |
| 86 | |
| 87 | When **llvm-ar** prints out the verbose table of contents (``tv`` option), it |
| 88 | precedes the usual output with a character indicating the basic kind of |
| 89 | content in the file. A blank means the file is a regular file. A 'Z' means |
| 90 | the file is compressed. A 'B' means the file is an LLVM bitcode file. An |
| 91 | 'S' means the file is the symbol table. |
| 92 | |
| 93 | |
| 94 | |
| 95 | |
| 96 | OPTIONS |
| 97 | ------- |
| 98 | |
| 99 | |
| 100 | The options to **llvm-ar** are compatible with other ``ar`` implementations. |
| 101 | However, there are a few modifiers (*zR*) that are not found in other ``ar`` |
| 102 | implementations. The options to **llvm-ar** specify a single basic operation to |
| 103 | perform on the archive, a variety of modifiers for that operation, the name of |
| 104 | the archive file, and an optional list of file names. These options are used to |
| 105 | determine how **llvm-ar** should process the archive file. |
| 106 | |
| 107 | The Operations and Modifiers are explained in the sections below. The minimal |
| 108 | set of options is at least one operation and the name of the archive. Typically |
| 109 | archive files end with a ``.a`` suffix, but this is not required. Following |
| 110 | the *archive-name* comes a list of *files* that indicate the specific members |
| 111 | of the archive to operate on. If the *files* option is not specified, it |
| 112 | generally means either "none" or "all" members, depending on the operation. |
| 113 | |
| 114 | Operations |
| 115 | ~~~~~~~~~~ |
| 116 | |
| 117 | |
| 118 | |
| 119 | d |
| 120 | |
| 121 | Delete files from the archive. No modifiers are applicable to this operation. |
| 122 | The *files* options specify which members should be removed from the |
| 123 | archive. It is not an error if a specified file does not appear in the archive. |
| 124 | If no *files* are specified, the archive is not modified. |
| 125 | |
| 126 | |
| 127 | |
| 128 | m[abi] |
| 129 | |
| 130 | Move files from one location in the archive to another. The *a*, *b*, and |
| 131 | *i* modifiers apply to this operation. The *files* will all be moved |
| 132 | to the location given by the modifiers. If no modifiers are used, the files |
| 133 | will be moved to the end of the archive. If no *files* are specified, the |
| 134 | archive is not modified. |
| 135 | |
| 136 | |
| 137 | |
| 138 | p[k] |
| 139 | |
| 140 | Print files to the standard output. The *k* modifier applies to this |
| 141 | operation. This operation simply prints the *files* indicated to the |
| 142 | standard output. If no *files* are specified, the entire archive is printed. |
| 143 | Printing bitcode files is ill-advised as they might confuse your terminal |
| 144 | settings. The *p* operation never modifies the archive. |
| 145 | |
| 146 | |
| 147 | |
| 148 | q[Rfz] |
| 149 | |
| 150 | Quickly append files to the end of the archive. The *R*, *f*, and *z* |
| 151 | modifiers apply to this operation. This operation quickly adds the |
| 152 | *files* to the archive without checking for duplicates that should be |
| 153 | removed first. If no *files* are specified, the archive is not modified. |
| 154 | Because of the way that **llvm-ar** constructs the archive file, it's dubious |
| 155 | whether the *q* operation is any faster than the *r* operation. |
| 156 | |
| 157 | |
| 158 | |
| 159 | r[Rabfuz] |
| 160 | |
| 161 | Replace or insert file members. The *R*, *a*, *b*, *f*, *u*, and *z* |
| 162 | modifiers apply to this operation. This operation will replace existing |
| 163 | *files* or insert them at the end of the archive if they do not exist. If no |
| 164 | *files* are specified, the archive is not modified. |
| 165 | |
| 166 | |
| 167 | |
| 168 | t[v] |
| 169 | |
| 170 | Print the table of contents. Without any modifiers, this operation just prints |
| 171 | the names of the members to the standard output. With the *v* modifier, |
| 172 | **llvm-ar** also prints out the file type (B=bitcode, Z=compressed, S=symbol |
| 173 | table, blank=regular file), the permission mode, the owner and group, the |
| 174 | size, and the date. If any *files* are specified, the listing is only for |
| 175 | those files. If no *files* are specified, the table of contents for the |
| 176 | whole archive is printed. |
| 177 | |
| 178 | |
| 179 | |
| 180 | x[oP] |
| 181 | |
| 182 | Extract archive members back to files. The *o* and *P* modifiers apply to this |
| 183 | operation. This operation retrieves the indicated *files* from the archive |
| 184 | and writes them back to the operating system's file system. If no |
| 185 | *files* are specified, the entire archive is extracted. |
| 186 | |
| 187 | |
| 188 | |
| 189 | |
| 190 | Modifiers (operation specific) |
| 191 | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ |
| 192 | |
| 193 | |
| 194 | The modifiers below are specific to certain operations. See the Operations |
| 195 | section (above) to determine which modifiers are applicable to which operations. |
| 196 | |
| 197 | |
| 198 | [a] |
| 199 | |
| 200 | When inserting or moving member files, this option specifies the destination of |
| 201 | the new files as being after the *relpos* member. If *relpos* is not found, |
| 202 | the files are placed at the end of the archive. |
| 203 | |
| 204 | |
| 205 | |
| 206 | [b] |
| 207 | |
| 208 | When inserting or moving member files, this option specifies the destination of |
| 209 | the new files as being before the *relpos* member. If *relpos* is not |
| 210 | found, the files are placed at the end of the archive. This modifier is |
Sylvestre Ledru | c8e41c5 | 2012-07-23 08:51:15 +0000 | [diff] [blame^] | 211 | identical to the *i* modifier. |
| 212 | |
| 213 | |
| 214 | |
| 215 | [f] |
| 216 | |
| 217 | Normally, **llvm-ar** stores the full path name to a file as presented to it on |
| 218 | the command line. With this option, truncated (15 characters max) names are |
| 219 | used. This ensures name compatibility with older versions of ``ar`` but may also |
| 220 | thwart correct extraction of the files (duplicates may overwrite). If used with |
| 221 | the *R* option, the directory recursion will be performed but the file names |
| 222 | will all be flattened to simple file names. |
| 223 | |
| 224 | |
| 225 | |
| 226 | [i] |
| 227 | |
| 228 | A synonym for the *b* option. |
| 229 | |
| 230 | |
| 231 | |
| 232 | [k] |
| 233 | |
| 234 | Normally, **llvm-ar** will not print the contents of bitcode files when the |
| 235 | *p* operation is used. This modifier defeats the default and allows the |
| 236 | bitcode members to be printed. |
| 237 | |
| 238 | |
| 239 | |
| 240 | [N] |
| 241 | |
| 242 | This option is ignored by **llvm-ar** but provided for compatibility. |
| 243 | |
| 244 | |
| 245 | |
| 246 | [o] |
| 247 | |
| 248 | When extracting files, this option will cause **llvm-ar** to preserve the |
| 249 | original modification times of the files it writes. |
| 250 | |
| 251 | |
| 252 | |
| 253 | [P] |
| 254 | |
| 255 | Use full path names when matching. |
| 256 | |
| 257 | |
| 258 | |
| 259 | [R] |
| 260 | |
| 261 | This modifier instructs the *r* operation to recursively process directories. |
| 262 | Without *R*, directories are ignored and only those *files* that refer to |
| 263 | files will be added to the archive. When *R* is used, any directories specified |
| 264 | with *files* will be scanned (recursively) to find files to be added to the |
| 265 | archive. Any file whose name begins with a dot will not be added. |
| 266 | |
| 267 | |
| 268 | |
| 269 | [u] |
| 270 | |
| 271 | When replacing existing files in the archive, only replace those files that have |
| 272 | a time stamp newer than the time stamp of the member in the archive. |
| 273 | |
| 274 | |
| 275 | |
| 276 | [z] |
| 277 | |
| 278 | When inserting or replacing any file in the archive, compress the file first. |
| 279 | This modifier is safe to use when (previously) compressed bitcode files are |
| 280 | added to the archive; the compressed bitcode files will not be doubly |
| 281 | compressed. |
| 282 | |
| 283 | |
| 284 | |
| 285 | |
| 286 | Modifiers (generic) |
| 287 | ~~~~~~~~~~~~~~~~~~~ |
| 288 | |
| 289 | |
| 290 | The modifiers below may be applied to any operation. |
| 291 | |
| 292 | |
| 293 | [c] |
| 294 | |
| 295 | For all operations, **llvm-ar** will always create the archive if it doesn't |
| 296 | exist. Normally, **llvm-ar** will print a warning message indicating that the |
| 297 | archive is being created. Using this modifier turns off that warning. |
| 298 | |
| 299 | |
| 300 | |
| 301 | [s] |
| 302 | |
| 303 | This modifier requests that an archive index (or symbol table) be added to the |
| 304 | archive. This is the default mode of operation. The symbol table will contain |
| 305 | all the externally visible functions and global variables defined by all the |
| 306 | bitcode files in the archive. Using this modifier is more efficient than using |
| 307 | **llvm-ranlib**, which also creates the symbol table. |
| 308 | |
| 309 | |
| 310 | |
| 311 | [S] |
| 312 | |
| 313 | This modifier is the opposite of the *s* modifier. It instructs **llvm-ar** to |
| 314 | not build the symbol table. If both *s* and *S* are used, the last modifier to |
| 315 | occur in the options will prevail. |
| 316 | |
| 317 | |
| 318 | |
| 319 | [v] |
| 320 | |
| 321 | This modifier instructs **llvm-ar** to be verbose about what it is doing. Each |
| 322 | editing operation taken against the archive will produce a line of output saying |
| 323 | what is being done. |
| 324 | |
| 325 | |
| 326 | |
| 327 | |
| 328 | |
| 329 | STANDARDS |
| 330 | --------- |
| 331 | |
| 332 | |
| 333 | The **llvm-ar** utility is intended to provide a superset of the IEEE Std 1003.2 |
| 334 | (POSIX.2) functionality for ``ar``. **llvm-ar** can read both SVR4 and BSD4.4 (or |
| 335 | Mac OS X) archives. If the ``f`` modifier is given to the ``q`` or ``r`` operations |
| 336 | then **llvm-ar** will write SVR4 compatible archives. Without this modifier, |
| 337 | **llvm-ar** will write BSD4.4 compatible archives that have long names |
| 338 | immediately after the header and indicated using the "#1/ddd" notation for the |
| 339 | name in the header. |
| 340 | |
| 341 | |
| 342 | FILE FORMAT |
| 343 | ----------- |
| 344 | |
| 345 | |
| 346 | The file format for LLVM Archive files is similar to that of BSD 4.4 or Mac OS X |
| 347 | archive files. In fact, except for the symbol table, the ``ar`` commands on those |
| 348 | operating systems should be able to read LLVM archive files. The details of the |
| 349 | file format follow. |
| 350 | |
| 351 | Each archive begins with the archive magic number which is the eight printable |
| 352 | characters "!<arch>\n" where \n represents the newline character (0x0A). |
| 353 | Following the magic number, the file is composed of even length members that |
| 354 | begin with an archive header and end with a \n padding character if necessary |
| 355 | (to make the length even). Each file member is composed of a header (defined |
| 356 | below), an optional newline-terminated "long file name" and the contents of |
| 357 | the file. |
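The magic number check and the even-length padding rule can be sketched in Python as follows. This is only an illustration of the layout described above. The function names are hypothetical.

```python
ARCHIVE_MAGIC = b"!<arch>\n"  # eight bytes; the last one is newline (0x0A)


def has_archive_magic(data: bytes) -> bool:
    """Check whether the data starts with the archive magic number."""
    return data[:8] == ARCHIVE_MAGIC


def padded_length(length: int) -> int:
    """Round a member length up to the next even number."""
    return length + (length & 1)


def pad_member(member: bytes) -> bytes:
    """Append a single newline if the member has odd length."""
    return member + b"\n" * (len(member) & 1)
```

A reader that walks the archive member by member would advance by the padded length each time, so that every header starts at an even offset.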
| 358 | |
| 359 | The fields of the header are described in the items below. All fields of the |
| 360 | header contain only ASCII characters, are left justified and are right padded |
| 361 | with space characters. |
| 362 | |
| 363 | |
| 364 | name - char[16] |
| 365 | |
| 366 | This field of the header provides the name of the archive member. If the name is |
| 367 | longer than 15 characters or contains a slash (/) character, then this field |
| 368 | contains ``#1/nnn`` where ``nnn`` provides the length of the name and the ``#1/`` |
| 369 | is literal. In this case, the actual name of the file is provided in the ``nnn`` |
| 370 | bytes immediately following the header. If the name is 15 characters or less, it |
| 371 | is contained directly in this field and terminated with a slash (/) character. |
| 372 | |
| 373 | |
| 374 | |
| 375 | date - char[12] |
| 376 | |
| 377 | This field provides the date of modification of the file in the form of a |
| 378 | decimal encoded number that provides the number of seconds since the epoch |
| 379 | (since 00:00:00 Jan 1, 1970) per POSIX specifications. |
| 380 | |
| 381 | |
| 382 | |
| 383 | uid - char[6] |
| 384 | |
| 385 | This field provides the user id of the file encoded as a decimal ASCII string. |
| 386 | This field might not make much sense on non-Unix systems. On Unix, it is the |
| 387 | same value as the st_uid field of the stat structure returned by the stat(2) |
| 388 | operating system call. |
| 389 | |
| 390 | |
| 391 | |
| 392 | gid - char[6] |
| 393 | |
| 394 | This field provides the group id of the file encoded as a decimal ASCII string. |
| 395 | This field might not make much sense on non-Unix systems. On Unix, it is the |
| 396 | same value as the st_gid field of the stat structure returned by the stat(2) |
| 397 | operating system call. |
| 398 | |
| 399 | |
| 400 | |
| 401 | mode - char[8] |
| 402 | |
| 403 | This field provides the access mode of the file encoded as an octal ASCII |
| 404 | string. This field might not make much sense on non-Unix systems. On Unix, it |
| 405 | is the same value as the st_mode field of the stat structure returned by the |
| 406 | stat(2) operating system call. |
| 407 | |
| 408 | |
| 409 | |
| 410 | size - char[10] |
| 411 | |
| 412 | This field provides the size of the file, in bytes, encoded as a decimal ASCII |
| 413 | string. If the size field is negative (starts with a minus sign, 0x2D), then |
| 414 | the archive member is stored in compressed form. The first byte of the archive |
| 415 | member's data indicates the compression type used. A value of 0 (0x30) indicates |
| 416 | that no compression was used. A value of 2 (0x32) indicates that bzip2 |
| 417 | compression was used. |
| 418 | |
| 419 | |
| 420 | |
| 421 | fmag - char[2] |
| 422 | |
| 423 | This field is the archive file member magic number. Its content is always the |
| 424 | two characters back tick (0x60) and newline (0x0A). This provides some measure |
| 425 | of utility in identifying archive files that have been corrupted. |
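The header fields above add up to 60 bytes (16 + 12 + 6 + 6 + 8 + 10 + 2). The following Python sketch shows one way to decode such a header under the rules given in this section. ``MemberHeader`` and ``parse_header`` are hypothetical names, and the sketch assumes well-formed ASCII input.

```python
from typing import NamedTuple

HEADER_SIZE = 16 + 12 + 6 + 6 + 8 + 10 + 2  # 60 bytes in total
FMAG = b"`\n"  # back tick (0x60) followed by newline (0x0A)


class MemberHeader(NamedTuple):
    name: str          # inline name, or "" when a long name follows the header
    name_length: int   # bytes of long name after the header, 0 if inline
    date: int          # seconds since the epoch
    uid: int
    gid: int
    mode: int          # parsed from the octal string
    size: int          # absolute value of the size field
    compressed: bool   # True when the size field was negative


def parse_header(raw: bytes) -> MemberHeader:
    if len(raw) != HEADER_SIZE:
        raise ValueError("header must be exactly 60 bytes")
    if raw[58:60] != FMAG:
        raise ValueError("bad member magic; archive may be corrupted")
    bounds = ((0, 16), (16, 28), (28, 34), (34, 40), (40, 48), (48, 58))
    # Fields are left justified and right padded with spaces.
    name_field, date, uid, gid, mode, size = (
        raw[a:b].decode("ascii").rstrip(" ") for a, b in bounds
    )
    if name_field.startswith("#1/"):
        name, name_length = "", int(name_field[3:])
    else:
        # Short names are terminated with a slash.
        name = name_field[:-1] if name_field.endswith("/") else name_field
        name_length = 0
    size_value = int(size)
    return MemberHeader(name, name_length, int(date), int(uid), int(gid),
                        int(mode, 8), abs(size_value), size_value < 0)
```

When ``name_length`` is non-zero, the caller would read that many bytes after the header to obtain the actual member name.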
| 426 | |
| 427 | |
| 428 | |
| 429 | The LLVM symbol table has the special name "#_LLVM_SYM_TAB_#". It is presumed |
| 430 | that no regular archive member file will want this name. The LLVM symbol table |
| 431 | is simply composed of a sequence of triplets: byte offset, length of symbol, |
| 432 | and the symbol itself. Symbols are not null or newline terminated. Here are |
| 433 | the details on each of these items: |
| 434 | |
| 435 | |
| 436 | offset - vbr encoded 32-bit integer |
| 437 | |
| 438 | The offset item provides the offset into the archive file where the bitcode |
| 439 | member is stored that is associated with the symbol. The offset value is 0 |
| 440 | based at the start of the first "normal" file member. To derive the actual |
| 441 | file offset of the member, you must add the number of bytes occupied by the file |
| 442 | signature (8 bytes) and the symbol tables. The value of this item is encoded |
| 443 | using variable bit rate encoding to reduce the size of the symbol table. |
| 444 | Variable bit rate encoding uses the high bit (0x80) of each byte to indicate |
| 445 | if there are more bytes to follow. The remaining 7 bits in each byte carry bits |
| 446 | from the value. The final byte does not have the high bit set. |
| 447 | |
| 448 | |
| 449 | |
| 450 | length - vbr encoded 32-bit integer |
| 451 | |
| 452 | The length item provides the length of the symbol that follows. Like the |
| 453 | *offset* item, the length is variable bit rate encoded. |
| 454 | |
| 455 | |
| 456 | |
| 457 | symbol - character array |
| 458 | |
| 459 | The symbol item provides the text of the symbol that is associated with the |
| 460 | *offset*. The symbol is not terminated by any character. Its length is provided |
| 461 | by the *length* field. Note that it is allowed (but unwise) to use non-printing |
| 462 | characters (even 0x00) in the symbol. This allows for multiple encodings of |
| 463 | symbol names. |
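The triplet layout and the variable bit rate encoding described above can be sketched in Python. The text does not say in which order the 7-bit groups are stored, so this sketch assumes the least significant group comes first. That ordering and all function names here are assumptions, not a statement of what **llvm-ar** actually does.

```python
def encode_vbr(value: int) -> bytes:
    """Encode a non-negative integer, 7 bits per byte, low group first."""
    out = bytearray()
    while True:
        low = value & 0x7F
        value >>= 7
        if value:
            out.append(low | 0x80)  # high bit set: more bytes follow
        else:
            out.append(low)         # final byte: high bit clear
            return bytes(out)


def decode_vbr(data: bytes, pos: int = 0):
    """Decode one value starting at pos; return (value, next position)."""
    value = 0
    shift = 0
    while True:
        byte = data[pos]
        pos += 1
        value |= (byte & 0x7F) << shift
        shift += 7
        if not byte & 0x80:
            return value, pos


def parse_symbol_table(data: bytes):
    """Split symbol table contents into (offset, symbol) pairs."""
    entries = []
    pos = 0
    while pos < len(data):
        offset, pos = decode_vbr(data, pos)
        length, pos = decode_vbr(data, pos)
        entries.append((offset, data[pos:pos + length]))
        pos += length
    return entries
```

Symbols are kept as raw bytes because the format allows non-printing characters, including 0x00, inside a symbol.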
| 464 | |
| 465 | |
| 466 | |
| 467 | |
| 468 | EXIT STATUS |
| 469 | ----------- |
| 470 | |
| 471 | |
| 472 | If **llvm-ar** succeeds, it will exit with 0. A usage error results |
| 473 | in an exit code of 1. A hard (file system typically) error results in an |
| 474 | exit code of 2. Miscellaneous or unknown errors result in an |
| 475 | exit code of 3. |
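A script that runs **llvm-ar** might translate these exit codes into messages. The Python sketch below only restates the table above. ``describe_exit_status`` is a hypothetical helper.

```python
EXIT_MEANINGS = {
    0: "success",
    1: "usage error",
    2: "hard error (typically file system)",
    3: "miscellaneous or unknown error",
}


def describe_exit_status(code: int) -> str:
    """Map an exit code to the meaning documented above."""
    return EXIT_MEANINGS.get(code, f"undocumented exit code {code}")
```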
| 476 | |
| 477 | |
| 478 | SEE ALSO |
| 479 | -------- |
| 480 | |
| 481 | |
| 482 | llvm-ranlib, ar(1) |