.. Dmitri Gribenko | fee64ee | 2012-11-18 18:40:21 +0000

==============
System Library
==============

Abstract
========

This document provides some details on LLVM's System Library, located in the
source at ``lib/System`` and ``include/llvm/System``. The library's purpose is
to shield LLVM from the differences between operating systems for the few
services LLVM needs from the operating system. Much of LLVM is written using
portability features of standard C++. However, in a few areas,
system-dependent facilities are needed, and the System Library is the wrapper
around those system calls.

By centralizing LLVM's use of operating system interfaces, we make it possible
for the LLVM tool chain and runtime libraries to be more easily ported to new
platforms since (theoretically) only ``lib/System`` needs to be ported. This
library also unclutters the rest of LLVM from ``#ifdef`` use and special cases
for specific operating systems. Such uses are replaced with simple calls to the
interfaces provided in ``include/llvm/System``.

Note that the System Library is not intended to be a complete operating system
wrapper (such as the Adaptive Communications Environment (ACE) or Apache
Portable Runtime (APR)), but only provides the functionality necessary to
support LLVM.

The System Library was written by Reid Spencer, who formulated the design based
on similar work originating from the eXtensible Programming System (XPS).
Several people helped with the effort, especially Jeff Cohen and Henrik Bach
on the Win32 port.

Keeping LLVM Portable
=====================

In order to keep LLVM portable, LLVM developers should adhere to a set of
portability rules associated with the System Library. Adherence to these rules
should help the System Library achieve its goal of shielding LLVM from the
variations in operating system interfaces and doing so efficiently. The
following sections define the rules needed to fulfill this objective.

Don't Include System Headers
----------------------------

Except in ``lib/System``, no LLVM source code should directly ``#include`` a
system header. Care has been taken to remove all such ``#includes`` from LLVM
while ``lib/System`` was being developed. Specifically, this means that header
files like "``unistd.h``", "``windows.h``", "``stdio.h``", and "``string.h``"
are forbidden to be included by LLVM source code outside the implementation of
``lib/System``.

To obtain system-dependent functionality, existing interfaces to the system
found in ``include/llvm/System`` should be used. If an appropriate interface is
not available, it should be added to ``include/llvm/System`` and implemented in
``lib/System`` for all supported platforms.

Don't Expose System Headers
---------------------------

The System Library must shield LLVM from **all** system headers. To obtain
system-level functionality, LLVM source must ``#include "llvm/System/Thing.h"``
and nothing else. This means that ``Thing.h`` cannot expose any system header
files. This protects LLVM from accidentally using system-specific functionality
and only allows it via the ``lib/System`` interface.
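
As a rough illustration of this rule, the sketch below keeps every system
header out of the interface part and confines them to the implementation part.
The file names, the ``sketch`` namespace, and ``getCurrentProcessId`` are
hypothetical, and the compiler-predefined ``_WIN32`` macro stands in for
LLVM's own configuration macros so that the example stays self-contained.

.. code-block:: c++

   // ---- Interface part (think: a hypothetical include/llvm/System/Thing.h)
   // No system header is included here, and no system type (pid_t, DWORD)
   // appears in the signature.
   namespace sketch {
   unsigned long getCurrentProcessId();
   } // namespace sketch

   // ---- Implementation part (think: a hypothetical lib/System/Thing.cpp)
   // System headers are confined to this part.
   #if defined(_WIN32)
   #include <windows.h>
   #else
   #include <unistd.h>
   #endif

   namespace sketch {
   unsigned long getCurrentProcessId() {
   #if defined(_WIN32)
     return static_cast<unsigned long>(::GetCurrentProcessId());
   #else
     return static_cast<unsigned long>(::getpid());
   #endif
   }
   } // namespace sketch

A client that includes only the interface part cannot reach ``getpid`` or
``GetCurrentProcessId`` by accident, because neither declaration is visible to
it.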

Use Standard C Headers
----------------------

The **standard** C headers (the ones beginning with "c") are allowed to be
exposed through the ``lib/System`` interface. These headers and the things they
declare are considered to be platform agnostic. LLVM source files may include
them directly or obtain their inclusion through ``lib/System`` interfaces.

Use Standard C++ Headers
------------------------

The **standard** C++ headers from the standard C++ library and standard
template library may be exposed through the ``lib/System`` interface. These
headers and the things they declare are considered to be platform agnostic.
LLVM source files may include them or obtain their inclusion through
``lib/System`` interfaces.

High Level Interface
--------------------

The entry points specified in the interface of ``lib/System`` must be aimed at
completing some reasonably high-level task needed by LLVM. We do not want to
simply wrap each operating system call. It would be preferable to wrap several
operating system calls that are always used in conjunction with one another by
LLVM.

For example, consider what is needed to execute a program, wait for it to
complete, and return its result code. On Unix, this involves the following
operating system calls: ``getenv``, ``fork``, ``execve``, and ``wait``. The
correct thing for ``lib/System`` to provide is a function, say
``ExecuteProgramAndWait``, that implements the functionality completely. What
we don't want is wrappers for the operating system calls involved.

There must **not** be a one-to-one relationship between operating system
calls and the System Library's interface. Any such interface function should
be treated as suspect.
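
A minimal Unix-only sketch of such a function is shown below. Only the name
``ExecuteProgramAndWait`` comes from the text; the signature and the return
convention are assumptions. The sketch uses ``execvp``, which performs the
``PATH`` lookup itself, and ``waitpid``, rather than the exact
``getenv``/``execve``/``wait`` sequence listed above.

.. code-block:: c++

   #include <cerrno>
   #include <string>
   #include <vector>
   #include <sys/types.h>
   #include <sys/wait.h>
   #include <unistd.h>

   // Hypothetical signature. Runs Program with Args (Args excludes argv[0]),
   // waits for it to finish, and returns its exit code. Returns -1 if the
   // child could not be created or did not exit normally.
   int ExecuteProgramAndWait(const std::string &Program,
                             const std::vector<std::string> &Args) {
     // Build the argv array before forking; exec does not modify the strings.
     std::vector<char *> Argv;
     Argv.push_back(const_cast<char *>(Program.c_str()));
     for (const std::string &A : Args)
       Argv.push_back(const_cast<char *>(A.c_str()));
     Argv.push_back(nullptr);

     pid_t Child = fork();
     if (Child < 0)
       return -1;
     if (Child == 0) {
       // In the child: replace the process image, searching PATH if needed.
       execvp(Program.c_str(), Argv.data());
       _exit(127); // Only reached if execvp failed.
     }

     // In the parent: wait for the child, retrying if a signal interrupts us.
     int Status = 0;
     while (waitpid(Child, &Status, 0) < 0) {
       if (errno != EINTR)
         return -1;
     }
     return WIFEXITED(Status) ? WEXITSTATUS(Status) : -1;
   }

A caller would write something like ``ExecuteProgramAndWait("sh", {"-c",
"exit 3"})`` and never see ``fork`` or ``waitpid``; a Win32 implementation
could offer the same entry point on top of entirely different system calls.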

No Unused Functionality
-----------------------

There must be no functionality specified in the interface of ``lib/System``
that isn't actually used by LLVM. We're not writing a general purpose operating
system wrapper here, just enough to satisfy LLVM's needs. And LLVM doesn't
need much. This design goal aims to keep the ``lib/System`` interface small and
understandable, which should foster its actual use and adoption.

No Duplicate Implementations
----------------------------

The implementation of a function for a given platform must be written exactly
once. This implies that it must be possible to apply a function's
implementation to multiple operating systems if those operating systems can
share the same implementation. This rule applies to the set of operating
systems supported for a given class of operating system (e.g. Unix, Win32).

No Virtual Methods
------------------

The System Library interfaces can be called quite frequently by LLVM. In order
to make those calls as efficient as possible, we discourage the use of virtual
methods. There is no need to use inheritance for implementation differences; it
just adds complexity. The ``#include`` mechanism works just fine.

No Exposed Functions
--------------------

Any functions defined by system libraries (i.e. not defined by ``lib/System``)
must not be exposed through the ``lib/System`` interface, even if the header
file for that function is not exposed. This prevents inadvertent use of
system-specific functionality.

For example, the ``stat`` system call is notorious for having variations in the
data it provides. ``lib/System`` must not declare ``stat`` nor allow it to be
declared. Instead it should provide its own interface for discovering
information about files and directories. Those interfaces may be implemented in
terms of ``stat`` but that is strictly an implementation detail. The interface
provided by the System Library must be implemented on all platforms (even those
without ``stat``).
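
The following Unix-only sketch shows one way such an interface might look. The
``FileStatus`` structure and ``getFileStatus`` function are hypothetical names
chosen for illustration. In a real library the structure and declaration would
live in the public header, while ``<sys/stat.h>`` and the call to ``stat``
would appear only in the implementation file; they are combined here so the
example is self-contained.

.. code-block:: c++

   #include <string>
   #include <sys/stat.h> // Implementation detail; never seen by clients.

   namespace sketch {

   // Hypothetical platform-neutral result. It carries only plain C++ types,
   // so no field of the system's stat structure leaks through the interface.
   struct FileStatus {
     bool Exists = false;
     bool IsDirectory = false;
     unsigned long long Size = 0;
   };

   // Implemented with stat on Unix. This sketch reports every stat failure
   // as "does not exist"; a fuller version would distinguish other causes.
   FileStatus getFileStatus(const std::string &Path) {
     FileStatus Result;
     struct stat Buf;
     if (::stat(Path.c_str(), &Buf) != 0)
       return Result;
     Result.Exists = true;
     Result.IsDirectory = S_ISDIR(Buf.st_mode);
     Result.Size = static_cast<unsigned long long>(Buf.st_size);
     return Result;
   }

   } // namespace sketch

A platform without ``stat`` could fill in the same three fields by other
means, and callers such as ``getFileStatus("/").IsDirectory`` would not need
to change.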

No Exposed Data
---------------

Any data defined by system libraries (i.e. not defined by ``lib/System``) must
not be exposed through the ``lib/System`` interface, even if the header file
for that data is not exposed. As with functions, this prevents inadvertent
use of data that might not exist on all platforms.

Minimize Soft Errors
--------------------

Operating system interfaces will generally provide error results for every
little thing that could go wrong. In almost all cases, you can divide these
error results into two groups: normal/good/soft and abnormal/bad/hard. That is,
some of the errors are simply information like "file not found", "insufficient
privileges", etc., while other errors are much more severe, like "out of
space", "bad disk sector", or "system call interrupted". We'll call the first
group "*soft*" errors and the second group "*hard*" errors.

``lib/System`` must always attempt to minimize soft errors. This is a design
requirement because the minimization of soft errors can affect the granularity
and the nature of the interface. In general, if you find that you're wanting to
throw soft errors, you must review the granularity of the interface because it
is likely you're trying to implement something that is too low level. The rule
of thumb is to provide interface functions that **can't** fail, except when
faced with hard errors.

For a trivial example, suppose we wanted to add an "``OpenFileForWriting``"
function. For many operating systems, if the file doesn't exist, attempting to
open the file will produce an error. However, ``lib/System`` should not simply
throw that error if it occurs because it's a soft error. The problem is that
the interface function, ``OpenFileForWriting``, is too low level. It should be
``OpenOrCreateFileForWriting``. In the case of the soft "doesn't exist" error,
this function would just create the file and then open it for writing.

This design principle needs to be maintained in ``lib/System`` because it
avoids the propagation of soft error handling throughout the rest of LLVM.
Hard errors will generally just cause a termination for an LLVM tool, so don't
be bashful about throwing them.

Rules of thumb:

#. Don't throw soft errors, only hard errors.

#. If you're tempted to throw a soft error, re-think the interface.

#. Handle internally the most common normal/good/soft error conditions
   so the rest of LLVM doesn't have to.
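
The ``OpenOrCreateFileForWriting`` example can be sketched with nothing but
standard C facilities. Only the function name comes from the text; the
signature, the use of ``std::FILE``, and the two-step strategy are assumptions
made for illustration, and the sketch makes no attempt to be race-free.

.. code-block:: c++

   #include <cstdio>
   #include <string>

   // Hypothetical signature. Opens Path for writing, creating the file first
   // if it does not exist. Returns a null pointer only when the file can
   // neither be opened nor created.
   std::FILE *OpenOrCreateFileForWriting(const std::string &Path) {
     // "r+b" opens an existing file for update and fails if it is missing.
     if (std::FILE *F = std::fopen(Path.c_str(), "r+b"))
       return F;
     // The soft "doesn't exist" case is absorbed here: "w+b" creates the file.
     return std::fopen(Path.c_str(), "w+b");
   }

The caller never has to ask whether the file existed beforehand, so that
question does not spread into the rest of the code base.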

No throw Specifications
-----------------------

None of the ``lib/System`` interface functions may be declared with a C++
``throw()`` specification. This requirement makes sure that the compiler does
not insert additional exception handling code into the interface functions.
This is a performance consideration: ``lib/System`` functions are at the
bottom of many call chains and as such can be frequently called. We need them
to be as efficient as possible. Note, however, that no routine in the System
Library should actually throw exceptions.
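
As a small illustration, the sketch below contrasts a declaration that carries
a ``throw()`` specification with one that does not. ``RemoveFile`` and its
``ErrMsg`` parameter are hypothetical; they only show one possible way of
reporting failure without exceptions and without an exception specification.

.. code-block:: c++

   #include <cstdio>
   #include <string>

   // Not allowed by the rule above:
   //   bool RemoveFile(const std::string &Path) throw();

   // Hypothetical alternative: no exception specification. Failure is
   // reported through the return value and an optional message.
   bool RemoveFile(const std::string &Path, std::string *ErrMsg = nullptr) {
     if (std::remove(Path.c_str()) == 0)
       return true;
     if (ErrMsg)
       *ErrMsg = "could not remove '" + Path + "'";
     return false;
   }

The function neither throws nor promises not to throw in its declaration, so
the compiler has no exception specification to enforce at the call boundary.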

Code Organization
-----------------

Implementations of the System Library interface are separated by their general
class of operating system. Currently only Unix and Win32 classes are defined
but more could be added for other operating system classifications. To
distinguish which implementation to compile, the code in ``lib/System`` uses
the ``LLVM_ON_UNIX`` and ``LLVM_ON_WIN32`` ``#defines`` provided via configure
through the ``llvm/Config/config.h`` file. Each source file in ``lib/System``,
after implementing the generic (operating system independent) functionality,
needs to include the correct implementation using a set of
``#if defined(LLVM_ON_XYZ)`` directives. For example, if we had
``lib/System/File.cpp``, we'd expect to see in that file:

.. code-block:: c++

   #if defined(LLVM_ON_UNIX)
   #include "Unix/File.cpp"
   #endif
   #if defined(LLVM_ON_WIN32)
   #include "Win32/File.cpp"
   #endif

The implementation in ``lib/System/Unix/File.cpp`` should handle all Unix
variants. The implementation in ``lib/System/Win32/File.cpp`` should handle all
Win32 variants. What this does is quickly differentiate the basic class of
operating system that will provide the implementation. The specific details for
a given platform must still be determined through the use of ``#ifdef``.
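
To make the last point concrete, the sketch below shows what a
platform-specific detail inside a Unix implementation file might look like.
The function ``getSharedLibrarySuffix`` and the ``sketch`` namespace are
hypothetical; ``__APPLE__`` is a macro predefined by compilers targeting
Apple platforms.

.. code-block:: c++

   #include <string>

   // Think of this as part of a hypothetical lib/System/Unix/File.cpp: the
   // file is shared by all Unix variants, and #if is used only where a
   // specific variant differs.
   namespace sketch {
   std::string getSharedLibrarySuffix() {
   #if defined(__APPLE__)
     return ".dylib";
   #else
     return ".so";
   #endif
   }
   } // namespace sketch

The choice between the Unix and Win32 classes has already been made by the
time this file is compiled, so the conditional only has to distinguish between
Unix variants.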

Consistent Semantics
--------------------

The implementation of a ``lib/System`` interface can vary drastically between
platforms. That's okay as long as the end result of the interface function is
the same. For example, a function to create a directory is pretty
straightforward on all operating systems. System V IPC, on the other hand,
isn't even supported on all platforms. Instead of "supporting" System V IPC,
``lib/System`` should provide an interface to the basic concept of
inter-process communications. The implementations might use System V IPC if
that is available, or named pipes, or whatever gets the job done effectively
for a given operating system. In all cases, the interface and the
implementation must be semantically consistent.
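
The directory example can be sketched as follows. ``createDirectory`` and its
return convention are hypothetical, and the compiler-predefined ``_WIN32``
macro is used in place of LLVM's configuration macros to keep the example
self-contained. The point is that both branches deliver the same result to the
caller even though the underlying calls differ.

.. code-block:: c++

   #include <cerrno>
   #include <string>
   #if defined(_WIN32)
   #include <direct.h>
   #else
   #include <sys/stat.h>
   #include <sys/types.h>
   #endif

   namespace sketch {
   // Hypothetical interface function. Returns true if the directory was
   // created or something with that name already existed, false otherwise.
   // The same contract holds on both classes of operating system. (EEXIST is
   // also reported when Path names a non-directory; a fuller version would
   // check for that.)
   bool createDirectory(const std::string &Path) {
   #if defined(_WIN32)
     int RC = ::_mkdir(Path.c_str());
   #else
     int RC = ::mkdir(Path.c_str(), 0777);
   #endif
     return RC == 0 || errno == EEXIST;
   }
   } // namespace sketch

Calling ``createDirectory`` twice on the same path succeeds both times on
either platform, which is the kind of consistency this rule asks for.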