//===---------------------------------------------------------------------===//
// Random Notes
//===---------------------------------------------------------------------===//

C90/C99/C++ Comparisons:
http://david.tribble.com/text/cdiffs.htm

//===---------------------------------------------------------------------===//
Extensions:

 * "#define_target X Y"
   This preprocessor directive works exactly the same way as #define, but it
   notes that 'X' is a target-specific preprocessor macro. When 'X' is used, a
   diagnostic is emitted indicating that the translation unit is non-portable.

   If a target-define is #undef'd before use, no diagnostic is emitted. If 'X'
   was previously a normal #define macro, the macro is tainted. If 'X' is
   subsequently #defined as a non-target-specific define, the taint bit is
   cleared.

 * "#define_other_target X"
   This preprocessor directive takes a single identifier argument. It notes
   that this identifier is a target-specific #define for some target other than
   the current one. Use of this identifier will result in a diagnostic.

   If 'X' is later #undef'd or #define'd, the taint bit is cleared. If 'X' is
   already defined, X is marked as a target-specific define.
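
As a rough illustration of the taint rules above, the per-identifier
bookkeeping could look something like the following C sketch. MacroState and
every function name here are made up for this note; none of this is existing
clang code, and the real preprocessor would keep this state on its macro info.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical per-identifier state; not a clang data structure. */
typedef struct {
    bool defined;      /* identifier currently has a macro definition */
    bool target_taint; /* marked as a target-specific define          */
    bool other_taint;  /* noted as a define for some other target     */
} MacroState;

/* Plain #define: (re)defines X and clears any taint. */
static void define_normal(MacroState *m) {
    assert(m);
    *m = (MacroState){ .defined = true };
}

/* #define_target X Y: defines X and marks it target-specific. */
static void define_target(MacroState *m) {
    assert(m);
    *m = (MacroState){ .defined = true, .target_taint = true };
}

/* #define_other_target X: an existing definition becomes target-specific;
   otherwise the bare identifier is remembered as belonging elsewhere. */
static void define_other_target(MacroState *m) {
    assert(m);
    if (m->defined)
        m->target_taint = true;
    else
        m->other_taint = true;
}

/* #undef X: drops the definition and all taint. */
static void undefine(MacroState *m) {
    assert(m);
    *m = (MacroState){ 0 };
}

/* True when a use of X should produce the non-portability diagnostic. */
static bool use_is_nonportable(const MacroState *m) {
    assert(m);
    return m->target_taint || m->other_taint;
}
```

The point of the sketch is only that each directive is a small state
transition, and the diagnostic is decided at the point of use.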

//===---------------------------------------------------------------------===//

When we go to reimplement <tgmath.h>, we should do it more intelligently than
the GCC-supplied header. EDG has an interesting __generic builtin that provides
overloading for C:
http://www.edg.com/docs/edg_cpp.pdf

For example, they have:
 #define sin(x) __generic(x,,, sin, sinf, sinl, csin, csinf, csinl)(x)

It's unclear to me why you couldn't just have a builtin like:
 __builtin_overload(1, arg1, impl1, impl2, impl3)
 __builtin_overload(2, arg1, arg2, impl1, impl2, impl3)
 __builtin_overload(3, arg1, arg2, arg3, impl1, impl2, impl3)

Where the compiler would just pick the right "impl" based on the arguments
provided. One nasty detail is that some arithmetic promotions must be done for
use by the tgmath.h stuff, but it would be nice to be able to handle vectors
etc. as well without huge globs of macros. With the above scheme, you could
use:

 #define sin(x) __builtin_overload(1, x, sin, sinf, sinl, csin, csinf, csinl)(x)

and not need to keep track of which argument to "__generic" corresponds to which
type, etc.
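
For comparison, C11 standardized _Generic selection, which gives this kind of
pick-an-impl-by-argument-type dispatch. A minimal one-argument sketch follows.
The impl_* stubs and the overload1 macro are placeholders invented here (real
math functions would replace the stubs), and this is not the proposed
__builtin_overload itself:

```c
#include <assert.h>

/* Placeholder implementations; each returns a tag naming itself. */
enum impl { IMPL_FLOAT, IMPL_DOUBLE, IMPL_LDOUBLE };

static enum impl impl_float(float x)         { (void)x; return IMPL_FLOAT; }
static enum impl impl_double(double x)       { (void)x; return IMPL_DOUBLE; }
static enum impl impl_ldouble(long double x) { (void)x; return IMPL_LDOUBLE; }

/* Select an implementation from the type of x, then call it.
   Integer arguments fall through to the double version. */
#define overload1(x) _Generic((x),        \
        float:       impl_float,          \
        long double: impl_ldouble,        \
        default:     impl_double)(x)

/* Usage: the argument type alone decides which stub runs. */
static inline void overload1_check(void) {
    assert(overload1(1.0f) == IMPL_FLOAT);
    assert(overload1(1.0L) == IMPL_LDOUBLE);
    assert(overload1(2) == IMPL_DOUBLE);
}
```

The default branch is what stands in for the "integers are treated as double"
promotion; multi-argument functions need nested selections, which is where the
macro globs start to come back.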

//===---------------------------------------------------------------------===//

To time GCC preprocessing speed without output, use:
   "time gcc -MM file"
This is similar to -Eonly.


//===---------------------------------------------------------------------===//

  C++ Template Instantiation benchmark:
     http://users.rcn.com/abrahams/instantiation_speed/index.html

//===---------------------------------------------------------------------===//

TODO: File Manager Speedup:

 We currently do a lot of stat'ing for files that don't exist, particularly
 when lots of -I paths exist (e.g. see the <iostream> example, check for
 failures in stat in FileManager::getFile). It would be far better to make
 the following changes:
   1. FileEntry contains a sys::Path instead of a std::string for Name.
   2. sys::Path contains timestamp and size, lazily computed. Eliminate them
      from FileEntry.
   3. File UIDs are created on request, not when files are opened.
 These changes make it possible to efficiently have FileEntry objects for
 files that exist on the file system, but have not been used yet.

 Once this is done:
   1. DirectoryEntry gets a boolean value "has read entries". When false, not
      all entries in the directory are in the file mgr; when true, they are.
   2. Instead of stat'ing a file that has no FileEntry in
      FileManager::getFile, check to see if the dir has been read. If so, fail
      immediately; if not, read the dir, then retry.
   3. Reading the dir uses the getdirentries syscall, creating a FileEntry
      for all files found.
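
 A rough sketch of the "has read entries" lookup, in plain C. It makes several
 assumptions that are not part of the plan above: it uses POSIX
 opendir/readdir in place of getdirentries (swapped in because it is the
 portable interface), it caches bare names in a flat array instead of creating
 FileEntry objects, and DirCache, dir_read and dir_contains are hypothetical
 names. Cleanup of the cache is omitted.

```c
#include <assert.h>
#include <dirent.h>
#include <stdbool.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for DirectoryEntry. */
typedef struct {
    const char *path;      /* directory to search                   */
    bool has_read_entries; /* true once every entry has been cached */
    char **names;          /* cached entry names                    */
    size_t count;
} DirCache;

/* Read the whole directory once, caching every name found. */
static bool dir_read(DirCache *d) {
    DIR *dir = opendir(d->path);
    if (!dir)
        return false;
    bool ok = true;
    struct dirent *e;
    while (ok && (e = readdir(dir)) != NULL) {
        size_t len = strlen(e->d_name) + 1;
        char **grown = realloc(d->names, (d->count + 1) * sizeof *grown);
        char *copy = malloc(len);
        if (grown)
            d->names = grown;
        if (!grown || !copy) {
            free(copy);
            ok = false;
        } else {
            memcpy(copy, e->d_name, len);
            d->names[d->count++] = copy;
        }
    }
    closedir(dir);
    d->has_read_entries = ok;
    return ok;
}

/* Lookup without stat: consult the cache, reading the dir on first use. */
static bool dir_contains(DirCache *d, const char *name) {
    assert(d && name);
    if (!d->has_read_entries && !dir_read(d))
        return false;
    for (size_t i = 0; i < d->count; i++)
        if (strcmp(d->names[i], name) == 0)
            return true;
    return false; /* dir fully read and name absent: fail immediately */
}
```

 With many -I paths, each directory is read at most once, and every later miss
 in that directory costs a cache lookup instead of a failed stat.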

//===---------------------------------------------------------------------===//

TODO: Fast #Import:

 * Get frameworks that don't use #import to do so, e.g.
   DirectoryService, AudioToolbox, CoreFoundation, etc. Why aren't they using
   #import? Because they work in C mode? C has #import.
 * Have the lexer return a token for #import instead of handling it itself.
   - Create a new preprocessor object with no external state (no -D/U options
     from the command line, etc). Alternatively, keep track of exactly which
     external state is used by a #import: declare it somehow.
 * When reading a #import file, keep track of whether we have seen any
   "configuration" macros (and/or which ones). Various cases:
   - Uses of target args (__POWERPC__, __i386): Header has to be parsed
     multiple times, per-target. What about #ifndef checks? How do we know?
   - "Configuration" preprocessor macros not defined: POWERPC, etc. What about
     things like __STDC__ etc? What is and what isn't allowed?
 * Special handling for "umbrella" headers, which just contain #import stmts:
   - Cocoa.h/AppKit.h - Contain pointers to digests instead of entire digests
     themselves? Foundation.h isn't pure umbrella!
 * Frameworks digests:
   - Can put "digest" of a framework-worth of headers into the framework
     itself. To open AppKit, just mmap
     /System/Library/Frameworks/AppKit.framework/"digest", which provides a
     symbol table in a well defined format. Lazily unstream stuff that is
     needed. Contains declarations, macros, and debug information.
   - System frameworks ship with digests. How do we handle configuration
     information? How do we handle stuff like:
       #if MAC_OS_X_VERSION_MAX_ALLOWED >= MAC_OS_X_VERSION_10_2
     which guards a bunch of decls? Should there be a couple of default
     configs, then have the UI fall back to building/caching its own?
   - GUI automatically builds digests when UI is idle, both of system
     frameworks if they aren't available in the right config, and of app
     frameworks.
   - GUI builds dependence graph of frameworks/digests based on #imports. If a
     digest is out of date, dependent digests are automatically invalidated.

 * New constraints on #import for objc-v3:
   - #imported file must not define non-inline function bodies.
     - Alternatively, they can, and these bodies get compiled/linked *once*
       per app into a dylib. What about building user dylibs?
   - Restrictions on ObjC grammar: can't #import the body of a for stmt or fn.
     - Compiler must detect and reject these cases.
   - #defines defined within a #import have two behaviors:
     - By default, they escape the header. These macros *cannot* be #undef'd
       by other code: this is enforced by the front-end.
     - Optionally, user can specify what macros escape (whitelist) or can use
       #undef.

//===---------------------------------------------------------------------===//

TODO: New language feature: Configuration queries:
  - Instead of #ifdef __POWERPC__, use "if (strcmp(`cpu`, __POWERPC__))", or
    some other, better, syntax.
  - Use it to increase the number of "architecture-clean" #import'd files,
    allowing a single index to be used for all fat slices.

//===---------------------------------------------------------------------===//
// Specifying targets: -triple and -arch
//===---------------------------------------------------------------------===//

Clang supports "-triple" and "-arch" options. At most one -triple option may
be specified, while multiple -arch options can be specified. Both are optional.

The "selection of target" behavior is defined as follows:

(1) If the user does not specify -triple:

    (a) If no -arch options are specified, the target triple used is the host
        triple (in llvm/Config/config.h).

    (b) If one or more -arch's are specified (and no -triple), then there is
        one triple for each -arch, where the specified arch is substituted
        for the arch in the host triple. Example:

        host triple  = i686-apple-darwin9
        command:       clang -arch ppc -arch ppc64 ...
        triples used:  ppc-apple-darwin9 ppc64-apple-darwin9

(2) If the user does specify a -triple (only one allowed):

    (a) If no -arch options are specified, the triple specified by -triple
        is used. E.g. clang -triple i686-apple-darwin9

    (b) If one or more -arch options are specified, then the triple specified
        by -triple is used as the primary target, and the arch's specified
        by -arch are used to create secondary targets. For example:

        clang -triple i686-apple-darwin9 -arch ppc -arch ppc64

        has the following targets:

        i686-apple-darwin9   (primary target)
        ppc-apple-darwin9    (secondary target)
        ppc64-apple-darwin9  (secondary target)

The secondary targets are used in the 'portability' model (see below).
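
The selection rules above can be written down as a small C sketch. This is
not the driver's code: select_targets, with_arch and MAX_TRIPLE are names
invented for illustration, and it assumes that in case (2b) each -arch is
substituted into the -triple value (the example above does not distinguish
that from substituting into the host triple).

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

#define MAX_TRIPLE 64

/* Copy `triple` into `out` with its arch component (the text before the
   first '-') replaced by `arch`. */
static void with_arch(const char *triple, const char *arch,
                      char out[MAX_TRIPLE]) {
    const char *rest = strchr(triple, '-');
    snprintf(out, MAX_TRIPLE, "%s%s", arch, rest ? rest : "");
}

/* `triple` is the -triple value or NULL; `archs` holds the -arch values.
   Writes up to max_out targets (primary first) and returns the count. */
static size_t select_targets(const char *host, const char *triple,
                             const char *const *archs, size_t n_archs,
                             char out[][MAX_TRIPLE], size_t max_out) {
    assert(host && out);
    const char *base = triple ? triple : host;
    size_t n = 0;

    /* Cases (1a), (2a), (2b): the base triple itself is a target. */
    if ((triple || n_archs == 0) && n < max_out)
        snprintf(out[n++], MAX_TRIPLE, "%s", base);

    /* Cases (1b), (2b): one further triple per -arch. */
    for (size_t i = 0; i < n_archs && n < max_out; i++)
        with_arch(base, archs[i], out[n++]);

    return n;
}
```

Calling it with host i686-apple-darwin9, no -triple, and the archs ppc and
ppc64 reproduces example (1b); passing the triple as well reproduces (2b),
with the primary target in slot 0.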

//===---------------------------------------------------------------------===//

The 'portability' model in clang is sufficient to catch translation units (or
their parts) that are not portable, but it doesn't help if the system headers
are non-portable and not fixed. An alternative model that would be easy to use
is a 'tainting' scheme. Consider:

int32_t
OSHostByteOrder(void) {
#if defined(__LITTLE_ENDIAN__)
    return OSLittleEndian;
#elif defined(__BIG_ENDIAN__)
    return OSBigEndian;
#else
    return OSUnknownByteOrder;
#endif
}

It would be trivial to mark 'OSHostByteOrder' as being non-portable (tainted)
instead of marking the entire translation unit. Then, if OSHostByteOrder is
never called/used by the current translation unit, the t-u wouldn't be marked
non-portable. However, there is no good way to handle stuff like:

extern int X, Y;

#ifndef __POWERPC__
#define X Y
#endif

int bar() { return X; }

When compiling for powerpc, the #define is skipped, so the compiler doesn't
know that bar uses a #define that is set on some other target. In practice,
limited cases could be handled by scanning the skipped region of a #if, but the
fully general case cannot be implemented efficiently. In this case, for
example, the #define in the protected region could be turned into either a
#define_target or #define_other_target as appropriate.

The harder case is code like this (from OSByteOrder.h):

  #if (defined(__ppc__) || defined(__ppc64__))
  #include <libkern/ppc/OSByteOrder.h>
  #elif (defined(__i386__) || defined(__x86_64__))
  #include <libkern/i386/OSByteOrder.h>
  #else
  #include <libkern/machine/OSByteOrder.h>
  #endif

The realistic way to fix this is by having an initial #ifdef __llvm__ that
defines its contents in terms of the llvm bswap intrinsics. Other things should
be handled on a case-by-case basis.
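
A minimal sketch of what such an #ifdef __llvm__ path could look like for a
32-bit swap. The helper name swap_u32 is made up here (OSByteOrder.h has its
own API), __builtin_bswap32 is the GCC/Clang builtin, and the shift-based
fallback is added only so the sketch also compiles with other compilers:

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper; one target-independent definition for llvm. */
static inline uint32_t swap_u32(uint32_t x) {
#ifdef __llvm__
    /* Compiler builtin; no per-architecture header is needed. */
    return __builtin_bswap32(x);
#else
    /* Portable fallback for other compilers. */
    return (x >> 24) | ((x >> 8) & 0x0000ff00u) |
           ((x << 8) & 0x00ff0000u) | (x << 24);
#endif
}

/* Quick self-check: bytes are reversed, and swapping twice is a no-op. */
static inline void swap_u32_check(void) {
    assert(swap_u32(0x11223344u) == 0x44332211u);
    assert(swap_u32(swap_u32(0xdeadbeefu)) == 0xdeadbeefu);
}
```

Since the llvm branch mentions no target macro, a header written this way
would not need to taint anything that uses it.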


We probably have to do something smarter like this in the future. The C++ header
<limits> contains a lot of code like this:

   static const int digits10 = __LDBL_DIG__;
   static const int min_exponent = __LDBL_MIN_EXP__;
   static const int min_exponent10 = __LDBL_MIN_10_EXP__;
   static const float_denorm_style has_denorm
     = bool(__LDBL_DENORM_MIN__) ? denorm_present : denorm_absent;

 ... since this isn't being used in an #ifdef, it should be easy enough to taint
the decl for these static members.


/usr/include/sys/cdefs.h contains stuff like this:

#if defined(__ppc__)
#  if defined(__LDBL_MANT_DIG__) && defined(__DBL_MANT_DIG__) && \
      __LDBL_MANT_DIG__ > __DBL_MANT_DIG__
#    if __ENVIRONMENT_MAC_OS_X_VERSION_MIN_REQUIRED__-0 < 1040
#      define __DARWIN_LDBL_COMPAT(x) __asm("_" __STRING(x) "$LDBLStub")
#    else
#      define __DARWIN_LDBL_COMPAT(x) __asm("_" __STRING(x) "$LDBL128")
#    endif
#    define __DARWIN_LDBL_COMPAT2(x) __asm("_" __STRING(x) "$LDBL128")
#    define __DARWIN_LONG_DOUBLE_IS_DOUBLE 0
#  else
#    define __DARWIN_LDBL_COMPAT(x) /* nothing */
#    define __DARWIN_LDBL_COMPAT2(x) /* nothing */
#    define __DARWIN_LONG_DOUBLE_IS_DOUBLE 1
#  endif
#elif defined(__i386__) || defined(__ppc64__) || defined(__x86_64__)
#  define __DARWIN_LDBL_COMPAT(x) /* nothing */
#  define __DARWIN_LDBL_COMPAT2(x) /* nothing */
#  define __DARWIN_LONG_DOUBLE_IS_DOUBLE 0
#else
#  error Unknown architecture
#endif

An ideal way to solve this issue is to mark __DARWIN_LDBL_COMPAT /
__DARWIN_LDBL_COMPAT2 / __DARWIN_LONG_DOUBLE_IS_DOUBLE as being non-portable
because they depend on non-portable macros. In practice though, this may end
up being a serious problem: every use of printf will mark the translation unit
non-portable if targeting ppc32 and something else.

//===---------------------------------------------------------------------===//