//===---------------------------------------------------------------------===//
// Random Notes
//===---------------------------------------------------------------------===//

//===---------------------------------------------------------------------===//
Extensions:

 * "#define_target X Y"
   This preprocessor directive works exactly the same way as #define, but it
   notes that 'X' is a target-specific preprocessor directive. When used, a
   diagnostic is emitted indicating that the translation unit is non-portable.

   If a target-define is #undef'd before use, no diagnostic is emitted. If 'X'
   was previously a normal #define macro, the macro is tainted. If 'X' is
   subsequently #defined as a non-target-specific define, the taint bit is
   cleared.

 * "#define_other_target X"
   This preprocessor directive takes a single identifier argument. It notes
   that this identifier is a target-specific #define for some target other than
   the current one. Use of this identifier will result in a diagnostic.

   If 'X' is later #undef'd or #define'd, the taint bit is cleared. If 'X' is
   already defined, X is marked as a target-specific define.
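
 The taint rules for both directives can be modeled with one bit per macro.
 The following is only a sketch of those rules, not clang's implementation:
 MacroState, TaintTable and all of their members are hypothetical names
 invented for illustration.

```cpp
#include <string>
#include <unordered_map>

// Hypothetical model of the taint bit described above; these are not clang's
// actual data structures.
struct MacroState {
  bool Defined;         // Does the macro currently have a definition?
  bool TargetSpecific;  // The "taint" bit.
};

class TaintTable {
  std::unordered_map<std::string, MacroState> Macros;
  bool NonPortable = false;

public:
  // #define X ...: an ordinary definition clears any taint on X.
  void define(const std::string &Name) {
    Macros[Name] = MacroState{true, false};
  }

  // #define_target X Y: define X and mark it target-specific.
  void defineTarget(const std::string &Name) {
    Macros[Name] = MacroState{true, true};
  }

  // #define_other_target X: taint X without changing whether it is defined.
  void defineOtherTarget(const std::string &Name) {
    Macros[Name].TargetSpecific = true;
  }

  // #undef X: the definition and the taint both go away.
  void undef(const std::string &Name) { Macros.erase(Name); }

  // Any use of X. Returns true if a diagnostic would be emitted, and records
  // that the translation unit is non-portable.
  bool use(const std::string &Name) {
    auto It = Macros.find(Name);
    if (It == Macros.end() || !It->second.TargetSpecific)
      return false;
    NonPortable = true;
    return true;
  }

  bool isNonPortable() const { return NonPortable; }
};
```

 With this model, a #define_target followed by #undef leaves no trace, while
 using a macro named by #define_other_target marks the translation unit
 non-portable until the macro is redefined normally.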

//===---------------------------------------------------------------------===//

To time GCC preprocessing speed without output, use:
   "time gcc -MM file"
This is similar to -Eonly.


//===---------------------------------------------------------------------===//

  C++ Template Instantiation benchmark:
     http://users.rcn.com/abrahams/instantiation_speed/index.html

//===---------------------------------------------------------------------===//

TODO: File Manager Speedup:

 We currently do a lot of stat'ing for files that don't exist, particularly
 when lots of -I paths exist (e.g. see the <iostream> example, check for
 failures in stat in FileManager::getFile). It would be far better to make
 the following changes:
   1. FileEntry contains a sys::Path instead of a std::string for Name.
   2. sys::Path contains timestamp and size, lazily computed. Eliminate from
      FileEntry.
   3. File UIDs are created on request, not when files are opened.
 These changes make it possible to efficiently have FileEntry objects for
 files that exist on the file system, but have not been used yet.

 Once this is done:
   1. DirectoryEntry gets a boolean value "has read entries". When false, not
      all entries in the directory are in the file mgr; when true, they are.
   2. Instead of stat'ing the file in FileManager::getFile, check to see if
      the dir has been read. If so, fail immediately; if not, read the dir,
      then retry.
   3. Reading the dir uses the getdirentries syscall, creating a FileEntry
      for all files found.

//===---------------------------------------------------------------------===//

TODO: Fast #Import:

 * Get frameworks that don't use #import to do so, e.g.
   DirectoryService, AudioToolbox, CoreFoundation, etc. Why aren't they using
   #import? Because they work in C mode? C has #import.
 * Have the lexer return a token for #import instead of handling it itself.
   - Create a new preprocessor object with no external state (no -D/U options
     from the command line, etc). Alternatively, keep track of exactly which
     external state is used by a #import: declare it somehow.
 * When reading a #import file, keep track of whether we have seen any
   "configuration" macros (and/or which ones). Various cases:
   - Uses of target args (__POWERPC__, __i386): Header has to be parsed
     multiple times, per-target. What about #ifndef checks? How do we know?
   - "Configuration" preprocessor macros not defined: POWERPC, etc. What about
     things like __STDC__ etc? What is and what isn't allowed?
 * Special handling for "umbrella" headers, which just contain #import stmts:
   - Cocoa.h/AppKit.h - Contain pointers to digests instead of entire digests
     themselves? Foundation.h isn't pure umbrella!
 * Frameworks digests:
   - Can put "digest" of a framework-worth of headers into the framework
     itself. To open AppKit, just mmap
     /System/Library/Frameworks/AppKit.framework/"digest", which provides a
     symbol table in a well defined format. Lazily unstream stuff that is
     needed. Contains declarations, macros, and debug information.
   - System frameworks ship with digests. How do we handle configuration
     information? How do we handle stuff like:
       #if MAC_OS_X_VERSION_MAX_ALLOWED >= MAC_OS_X_VERSION_10_2
     which guards a bunch of decls? Should there be a couple of default
     configs, then have the UI fall back to building/caching its own?
   - GUI automatically builds digests when UI is idle, both of system
     frameworks if they aren't available in the right config, and of app
     frameworks.
   - GUI builds dependence graph of frameworks/digests based on #imports. If a
     digest is out of date, dependent digests are automatically invalidated.

 * New constraints on #import for objc-v3:
   - #imported file must not define non-inline function bodies.
     - Alternatively, they can, and these bodies get compiled/linked *once*
       per app into a dylib. What about building user dylibs?
   - Restrictions on ObjC grammar: can't #import the body of a for stmt or fn.
     - Compiler must detect and reject these cases.
   - #defines defined within a #import have two behaviors:
     - By default, they escape the header. These macros *cannot* be #undef'd
       by other code: this is enforced by the front-end.
     - Optionally, user can specify what macros escape (whitelist) or can use
       #undef.
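
 The digest dependence graph mentioned under "Frameworks digests" could
 invalidate dependents with a simple worklist walk. This is a sketch under
 assumptions: DigestGraph and its members are hypothetical names, and digests
 are identified by plain strings.

```cpp
#include <string>
#include <unordered_map>
#include <unordered_set>
#include <vector>

// Hypothetical dependence graph of digests, built from #import edges.
class DigestGraph {
  // Dependents[X] lists the digests that #import X.
  std::unordered_map<std::string, std::vector<std::string>> Dependents;
  std::unordered_set<std::string> Stale;

public:
  // Record that Importer #imports Imported.
  void addImport(const std::string &Importer, const std::string &Imported) {
    Dependents[Imported].push_back(Importer);
  }

  // Mark Name out of date and transitively invalidate every digest that
  // depends on it, directly or indirectly.
  void invalidate(const std::string &Name) {
    std::vector<std::string> Worklist{Name};
    while (!Worklist.empty()) {
      std::string Cur = Worklist.back();
      Worklist.pop_back();
      if (!Stale.insert(Cur).second)
        continue;  // Already invalidated; this also guards against cycles.
      auto It = Dependents.find(Cur);
      if (It != Dependents.end())
        Worklist.insert(Worklist.end(), It->second.begin(), It->second.end());
    }
  }

  bool isStale(const std::string &Name) const {
    return Stale.count(Name) != 0;
  }
};
```

 Invalidation only flows from an imported digest to its importers, so a stale
 digest never forces a rebuild of the digests it depends on.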

//===---------------------------------------------------------------------===//

TODO: New language feature: Configuration queries:
  - Instead of #ifdef __POWERPC__, use "if (strcmp(`cpu`, __POWERPC__))", or
    some other, better, syntax.
  - Use it to increase the number of "architecture-clean" #import'd files,
    allowing a single index to be used for all fat slices.

//===---------------------------------------------------------------------===//

The 'portability' model in clang is sufficient to catch translation units (or
their parts) that are not portable, but it doesn't help if the system headers
are non-portable and not fixed. An alternative model that would be easy to use
is a 'tainting' scheme. Consider:

int32_t
OSHostByteOrder(void) {
#if defined(__LITTLE_ENDIAN__)
    return OSLittleEndian;
#elif defined(__BIG_ENDIAN__)
    return OSBigEndian;
#else
    return OSUnknownByteOrder;
#endif
}

It would be trivial to mark 'OSHostByteOrder' as being non-portable (tainted)
instead of marking the entire translation unit. Then, if OSHostByteOrder is
never called/used by the current translation unit, the t-u wouldn't be marked
non-portable. However, there is no good way to handle stuff like:

extern int X, Y;

#ifndef __POWERPC__
#define X Y
#endif

int bar() { return X; }

When compiling for powerpc, the #define is skipped, so the compiler doesn't
know that bar uses a #define that is set on some other target. In practice,
limited cases could be handled by scanning the skipped region of a #if, but the
fully general case cannot be implemented efficiently. In this case, for
example, the #define in the protected region could be turned into either a
#define_target or #define_other_target as appropriate. The harder case is code
like this (from OSByteOrder.h):

  #if (defined(__ppc__) || defined(__ppc64__))
  #include <libkern/ppc/OSByteOrder.h>
  #elif (defined(__i386__) || defined(__x86_64__))
  #include <libkern/i386/OSByteOrder.h>
  #else
  #include <libkern/machine/OSByteOrder.h>
  #endif

The realistic way to fix this is by having an initial #ifdef __llvm__ that
defines its contents in terms of the llvm bswap intrinsics. Other things should
be handled on a case-by-case basis.
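
A rough sketch of what such an #ifdef __llvm__ block might look like is shown
here. The function names sketchSwapInt32 and sketchSwapInt64 are hypothetical
and do not reproduce the real OSByteOrder.h entry points. The sketch assumes
the __builtin_bswap32 and __builtin_bswap64 builtins that clang provides, and
adds a portable shift-and-mask fallback so that it compiles elsewhere.

```cpp
#include <cstdint>

#ifdef __llvm__
// Target-independent path: let the compiler pick the right instruction.
static inline std::uint32_t sketchSwapInt32(std::uint32_t V) {
  return __builtin_bswap32(V);
}
static inline std::uint64_t sketchSwapInt64(std::uint64_t V) {
  return __builtin_bswap64(V);
}
#else
// Portable fallback for compilers without the builtins.
static inline std::uint32_t sketchSwapInt32(std::uint32_t V) {
  return (V >> 24) | ((V >> 8) & 0x0000FF00u) |
         ((V << 8) & 0x00FF0000u) | (V << 24);
}
static inline std::uint64_t sketchSwapInt64(std::uint64_t V) {
  // Swap each 32-bit half, then exchange the halves.
  return (std::uint64_t(sketchSwapInt32(std::uint32_t(V))) << 32) |
         sketchSwapInt32(std::uint32_t(V >> 32));
}
#endif
```

Neither branch names a target architecture, so a header written this way would
not need the per-target #include chain shown above.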


We probably have to do something smarter like this in the future. The C++ header
<limits> contains a lot of code like this:

   static const int digits10 = __LDBL_DIG__;
   static const int min_exponent = __LDBL_MIN_EXP__;
   static const int min_exponent10 = __LDBL_MIN_10_EXP__;
   static const float_denorm_style has_denorm
     = bool(__LDBL_DENORM_MIN__) ? denorm_present : denorm_absent;

 ... since these macros aren't being used in an #ifdef, it should be easy enough
to taint the decls for these static members.


/usr/include/sys/cdefs.h contains stuff like this:

#if defined(__ppc__)
#  if defined(__LDBL_MANT_DIG__) && defined(__DBL_MANT_DIG__) && \
      __LDBL_MANT_DIG__ > __DBL_MANT_DIG__
#    if __ENVIRONMENT_MAC_OS_X_VERSION_MIN_REQUIRED__-0 < 1040
#      define __DARWIN_LDBL_COMPAT(x) __asm("_" __STRING(x) "$LDBLStub")
#    else
#      define __DARWIN_LDBL_COMPAT(x) __asm("_" __STRING(x) "$LDBL128")
#    endif
#    define __DARWIN_LDBL_COMPAT2(x) __asm("_" __STRING(x) "$LDBL128")
#    define __DARWIN_LONG_DOUBLE_IS_DOUBLE 0
#  else
#    define __DARWIN_LDBL_COMPAT(x) /* nothing */
#    define __DARWIN_LDBL_COMPAT2(x) /* nothing */
#    define __DARWIN_LONG_DOUBLE_IS_DOUBLE 1
#  endif
#elif defined(__i386__) || defined(__ppc64__) || defined(__x86_64__)
#  define __DARWIN_LDBL_COMPAT(x) /* nothing */
#  define __DARWIN_LDBL_COMPAT2(x) /* nothing */
#  define __DARWIN_LONG_DOUBLE_IS_DOUBLE 0
#else
#  error Unknown architecture
#endif

An ideal way to solve this issue is to mark __DARWIN_LDBL_COMPAT /
__DARWIN_LDBL_COMPAT2 / __DARWIN_LONG_DOUBLE_IS_DOUBLE as being non-portable
because they depend on non-portable macros. In practice though, this may end
up being a serious problem: every use of printf will mark the translation unit
non-portable if targeting ppc32 and something else.

//===---------------------------------------------------------------------===//