Following are change highlights associated with official releases. Important
bug fixes are all mentioned, but some internal enhancements are omitted here for
brevity. Much more detail can be found in the git revision history:

    https://github.com/jemalloc/jemalloc

* 5.1.0 (May 4th, 2018)

  This release is primarily about fine-tuning, ranging from several new features
  to numerous notable performance and portability enhancements. The release and
  prior dev versions have been running in multiple large-scale applications for
  months, and the cumulative improvements are substantial in many cases.

  Given the long and successful production runs, this release is likely a good
  upgrade candidate for applications running jemalloc 5.0 or any earlier
  version. For performance-critical applications, the newly added TUNING.md
  provides guidelines on jemalloc tuning.

  New features:
  - Implement transparent huge page support for internal metadata. (@interwq)
  - Add opt.thp to allow enabling / disabling transparent huge pages for all
    mappings. (@interwq)
  - Add maximum background thread count option. (@djwatson)
  - Allow prof_active to control opt.lg_prof_interval and prof.gdump.
    (@interwq)
  - Allow arena index lookup based on allocation addresses via mallctl.
    (@lionkov)
  - Allow disabling initial-exec TLS model. (@davidtgoldblatt, @KenMacD)
  - Add opt.lg_extent_max_active_fit to set the max ratio between the size of
    the active extent selected (to split off from) and the size of the requested
    allocation. (@interwq, @davidtgoldblatt)
  - Add retain_grow_limit to set the max size when growing virtual address
    space. (@interwq)
  - Add mallctl interfaces:
    + arena.<i>.retain_grow_limit (@interwq)
    + arenas.lookup (@lionkov)
    + max_background_threads (@djwatson)
    + opt.lg_extent_max_active_fit (@interwq)
    + opt.max_background_threads (@djwatson)
    + opt.metadata_thp (@interwq)
    + opt.thp (@interwq)
    + stats.metadata_thp (@interwq)

  Portability improvements:
  - Support GNU/kFreeBSD configuration. (@paravoid)
  - Support m68k, nios2 and SH3 architectures. (@paravoid)
  - Fall back to FD_CLOEXEC when O_CLOEXEC is unavailable. (@zonyitoo)
  - Fix symbol listing for cross-compiling. (@tamird)
  - Fix high bits computation on ARM. (@davidtgoldblatt, @paravoid)
  - Disable the CPU_SPINWAIT macro for Power. (@davidtgoldblatt, @marxin)
  - Fix MSVC 2015 & 2017 builds. (@rustyx)
  - Improve RISC-V support. (@EdSchouten)
  - Set name mangling script in strict mode. (@nicolov)
  - Avoid MADV_HUGEPAGE on ARM. (@marxin)
  - Modify configure to determine return value of strerror_r.
    (@davidtgoldblatt, @cferris1000)
  - Make sure CXXFLAGS is tested with CPP compiler. (@nehaljwani)
  - Fix 32-bit build on MSVC. (@rustyx)
  - Fix external symbol on MSVC. (@maksqwe)
  - Avoid a printf format specifier warning. (@jasone)
  - Add configure option --disable-initial-exec-tls which can allow jemalloc to
    be dynamically loaded after program startup. (@davidtgoldblatt, @KenMacD)
  - AArch64: Add ILP32 support. (@cmuellner)
  - Add --with-lg-vaddr configure option to support cross compiling.
    (@cmuellner, @davidtgoldblatt)

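  The O_CLOEXEC fallback listed above can be illustrated with a minimal
  sketch. This is not jemalloc's code: open_cloexec() is a hypothetical helper
  name, and the sketch only assumes a POSIX system that provides open(2) and
  fcntl(2). It also validates the descriptor before using it.

```c
#include <fcntl.h>
#include <unistd.h>

/*
 * Hypothetical helper (not a jemalloc function): open a file read-only with
 * the close-on-exec flag set, even on systems that lack O_CLOEXEC.
 * Returns a file descriptor, or -1 on failure.
 */
int open_cloexec(const char *path) {
#ifdef O_CLOEXEC
    /* Preferred path: the flag is applied atomically by open(2). */
    return open(path, O_RDONLY | O_CLOEXEC);
#else
    /* Fallback path: set FD_CLOEXEC after the fact with fcntl(2). */
    int fd = open(path, O_RDONLY);
    if (fd != -1) {
        int flags = fcntl(fd, F_GETFD);
        if (flags == -1 || fcntl(fd, F_SETFD, flags | FD_CLOEXEC) == -1) {
            close(fd);
            return -1;
        }
    }
    return fd;
#endif
}
```

  The fallback is not atomic: another thread could fork and exec between
  open(2) and fcntl(2), which is why O_CLOEXEC is preferred when it exists.
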
  Optimizations and refactors:
  - Improve active extent fit with extent_max_active_fit. This considerably
    reduces fragmentation over time and improves virtual memory and metadata
    usage. (@davidtgoldblatt, @interwq)
  - Eagerly coalesce large extents to reduce fragmentation. (@interwq)
  - sdallocx: only read size info when page aligned (i.e. possibly sampled),
    which speeds up the sized deallocation path significantly. (@interwq)
  - Avoid attempting new mappings for in place expansion with retain, since
    it rarely succeeds in practice and causes high overhead. (@interwq)
  - Refactor OOM handling in newImpl. (@wqfish)
  - Add internal fine-grained logging functionality for debugging use.
    (@davidtgoldblatt)
  - Refactor arena / tcache interactions. (@davidtgoldblatt)
  - Refactor extent management with dumpable flag. (@davidtgoldblatt)
  - Add runtime detection of lazy purging. (@interwq)
  - Use pairing heap instead of red-black tree for extents_avail. (@djwatson)
  - Use sysctl on startup in FreeBSD. (@trasz)
  - Use thread local prng state instead of atomic. (@djwatson)
  - Make decay always purge one more extent than before, because in
    practice large extents are usually the ones that cross the decay threshold.
    Purging the additional extent helps save memory as well as reduce VM
    fragmentation. (@interwq)
  - Fast division by dynamic values. (@davidtgoldblatt)
  - Improve the fit for aligned allocation. (@interwq, @edwinsmith)
  - Refactor extent_t bitpacking. (@rkmisra)
  - Optimize the generated assembly for ticker operations. (@davidtgoldblatt)
  - Convert stats printing to use a structured text emitter. (@davidtgoldblatt)
  - Remove preserve_lru feature for extents management. (@djwatson)
  - Consolidate two memory loads into one on the fast deallocation path.
    (@davidtgoldblatt, @interwq)

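  To give a feel for what "fast division by dynamic values" can mean, here is a
  sketch of one well-known technique: precompute a multiplier for a divisor
  that is only known at run time, then replace each division with a multiply
  and a shift. The names div_info_t, div_init() and div_compute() and the
  stated preconditions are assumptions of this sketch, not a description of
  jemalloc's internals.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Precomputed state for dividing by one fixed, runtime-chosen divisor. */
typedef struct {
    uint32_t magic; /* ceil(2^32 / d) */
} div_info_t;

/* Precondition (assumed by this sketch): 2 <= d <= UINT32_MAX. */
void div_init(div_info_t *info, size_t d) {
    const uint64_t two_to_32 = (uint64_t)1 << 32;
    uint64_t magic;

    assert(d >= 2 && d <= UINT32_MAX);
    magic = two_to_32 / d;
    if (two_to_32 % d != 0) {
        magic++; /* round up */
    }
    info->magic = (uint32_t)magic;
}

/*
 * Returns n / d using one multiplication and one shift.
 * Preconditions (assumed by this sketch): n is an exact multiple of d and
 * n <= UINT32_MAX. Under those conditions the result is exact.
 */
size_t div_compute(const div_info_t *info, size_t n) {
    assert(n <= UINT32_MAX);
    return (size_t)(((uint64_t)n * info->magic) >> 32);
}
```

  The exact-multiple restriction fits cases such as turning a byte offset
  inside a slab into a region index, where the offset is always a multiple of
  the region size.
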
  Bug fixes (most of the issues are only relevant to jemalloc 5.0):
  - Fix deadlock with multithreaded fork in OS X. (@davidtgoldblatt)
  - Validate returned file descriptor before use. (@zonyitoo)
  - Fix a few background thread initialization and shutdown issues. (@interwq)
  - Fix an extent coalesce + decay race by taking both coalescing extents off
    the LRU list. (@interwq)
  - Fix potentially unbounded increase during decay, caused by one thread that
    keeps stashing memory to purge while other threads generate new pages. The
    number of pages to purge is checked to prevent this. (@interwq)
  - Fix a FreeBSD bootstrap assertion. (@strejda, @interwq)
  - Handle 32 bit mutex counters. (@rkmisra)
  - Fix an indexing bug when creating background threads. (@davidtgoldblatt,
    @binliu19)
  - Fix arguments passed to extent_init. (@yuleniwo, @interwq)
  - Fix addresses used for ordering mutexes. (@rkmisra)
  - Fix abort_conf processing during bootstrap. (@interwq)
  - Fix include path order for out-of-tree builds. (@cmuellner)

  Incompatible changes:
  - Remove --disable-thp. (@interwq)
  - Remove mallctl interfaces:
    + config.thp (@interwq)

  Documentation:
  - Add TUNING.md. (@interwq, @davidtgoldblatt, @djwatson)

* 5.0.1 (July 1, 2017)

  This bugfix release fixes several issues, most of which are obscure enough
  that typical applications are not impacted.

  Bug fixes:
  - Update decay->nunpurged before purging, in order to avoid potential update
    races and subsequent incorrect purging volume. (@interwq)
  - Only abort on dlsym(3) error if the failure impacts an enabled feature (lazy
    locking and/or background threads). This mitigates an initialization
    failure bug for which we still do not have a clear reproduction test case.
    (@interwq)
  - Modify tsd management so that it neither crashes nor leaks if a thread's
    only allocation activity is to call free() after TLS destructors have been
    executed. This behavior was observed when operating with GNU libc, and is
    unlikely to be an issue with other libc implementations. (@interwq)
  - Mask signals during background thread creation. This prevents signals from
    being inadvertently delivered to background threads. (@jasone,
    @davidtgoldblatt, @interwq)
  - Avoid inactivity checks within background threads, in order to prevent
    recursive mutex acquisition. (@interwq)
  - Fix extent_grow_retained() to use the specified hooks when the
    arena.<i>.extent_hooks mallctl is used to override the default hooks.
    (@interwq)
  - Add missing reentrancy support for custom extent hooks which allocate.
    (@interwq)
  - Post-fork(2), re-initialize the list of tcaches associated with each arena
    to contain no tcaches except the forking thread's. (@interwq)
  - Add missing post-fork(2) mutex reinitialization for extent_grow_mtx. This
    fixes potential deadlocks after fork(2). (@interwq)
  - Enforce minimum autoconf version (currently 2.68), since 2.63 is known to
    generate corrupt configure scripts. (@jasone)
  - Ensure that the configured page size (--with-lg-page) is no larger than the
    configured huge page size (--with-lg-hugepage). (@jasone)

* 5.0.0 (June 13, 2017)

  Unlike all previous jemalloc releases, this release does not use naturally
  aligned "chunks" for virtual memory management, and instead uses page-aligned
  "extents". This change has few externally visible effects, but the internal
  impacts are... extensive. Many other internal changes combine to make this
  the most cohesively designed version of jemalloc so far, with ample
  opportunity for further enhancements.

  Continuous integration is now an integral aspect of development thanks to the
  efforts of @davidtgoldblatt, and the dev branch tends to remain reasonably
  stable on the tested platforms (Linux, FreeBSD, macOS, and Windows). As a
  side effect the official release frequency may decrease over time.

  New features:
  - Implement optional per-CPU arena support; threads choose which arena to use
    based on current CPU rather than on fixed thread-->arena associations.
    (@interwq)
  - Implement two-phase decay of unused dirty pages. Pages transition from
    dirty-->muzzy-->clean, where the first phase transition relies on
    madvise(... MADV_FREE) semantics, and the second phase transition discards
    pages such that they are replaced with demand-zeroed pages on next access.
    (@jasone)
  - Increase decay time resolution from seconds to milliseconds. (@jasone)
  - Implement opt-in per CPU background threads, and use them for asynchronous
    decay-driven unused dirty page purging. (@interwq)
  - Add mutex profiling, which collects a variety of statistics useful for
    diagnosing overhead/contention issues. (@interwq)
  - Add C++ new/delete operator bindings. (@djwatson)
  - Support manually created arena destruction, such that all data and metadata
    are discarded. Add MALLCTL_ARENAS_DESTROYED for accessing merged stats
    associated with destroyed arenas. (@jasone)
  - Add MALLCTL_ARENAS_ALL as a fixed index for use in accessing
    merged/destroyed arena statistics via mallctl. (@jasone)
  - Add opt.abort_conf to optionally abort if invalid configuration options are
    detected during initialization. (@interwq)
  - Add opt.stats_print_opts, so that e.g. JSON output can be selected for the
    stats dumped during exit if opt.stats_print is true. (@jasone)
  - Add --with-version=VERSION for use when embedding jemalloc into another
    project's git repository. (@jasone)
  - Add --disable-thp to support cross compiling. (@jasone)
  - Add --with-lg-hugepage to support cross compiling. (@jasone)
  - Add mallctl interfaces (various authors):
    + background_thread
    + opt.abort_conf
    + opt.retain
    + opt.percpu_arena
    + opt.background_thread
    + opt.{dirty,muzzy}_decay_ms
    + opt.stats_print_opts
    + arena.<i>.initialized
    + arena.<i>.destroy
    + arena.<i>.{dirty,muzzy}_decay_ms
    + arena.<i>.extent_hooks
    + arenas.{dirty,muzzy}_decay_ms
    + arenas.bin.<i>.slab_size
    + arenas.nlextents
    + arenas.lextent.<i>.size
    + arenas.create
    + stats.background_thread.{num_threads,num_runs,run_interval}
    + stats.mutexes.{ctl,background_thread,prof,reset}.
      {num_ops,num_spin_acq,num_wait,max_wait_time,total_wait_time,max_num_thds,
      num_owner_switch}
    + stats.arenas.<i>.{dirty,muzzy}_decay_ms
    + stats.arenas.<i>.uptime
    + stats.arenas.<i>.{pmuzzy,base,internal,resident}
    + stats.arenas.<i>.{dirty,muzzy}_{npurge,nmadvise,purged}
    + stats.arenas.<i>.bins.<j>.{nslabs,reslabs,curslabs}
    + stats.arenas.<i>.bins.<j>.mutex.
      {num_ops,num_spin_acq,num_wait,max_wait_time,total_wait_time,max_num_thds,
      num_owner_switch}
    + stats.arenas.<i>.lextents.<j>.{nmalloc,ndalloc,nrequests,curlextents}
    + stats.arenas.<i>.mutexes.{large,extent_avail,extents_dirty,extents_muzzy,
      extents_retained,decay_dirty,decay_muzzy,base,tcache_list}.
      {num_ops,num_spin_acq,num_wait,max_wait_time,total_wait_time,max_num_thds,
      num_owner_switch}

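  The two decay phases described above can be observed with plain madvise(2)
  calls. The sketch below is an illustration under stated assumptions, not
  jemalloc code: it assumes Linux, where MADV_DONTNEED on a private anonymous
  mapping makes the next access see zero-filled pages, and it uses
  MADV_DONTNEED as the discarding step (the ChangeLog entry does not name the
  call used for the second phase). purge_demo() is a hypothetical name.

```c
#ifndef _DEFAULT_SOURCE
#define _DEFAULT_SOURCE 1 /* for MAP_ANONYMOUS and madvise() on glibc */
#endif
#include <stddef.h>
#include <string.h>
#include <sys/mman.h>
#include <unistd.h>

/*
 * Hypothetical demo: dirty one page, hint that it is reclaimable, discard it,
 * and check what the next read sees.
 * Returns 1 if the page reads back as all zeroes, 0 if not, -1 on error.
 */
int purge_demo(void) {
    size_t len = (size_t)sysconf(_SC_PAGESIZE);
    unsigned char *p = mmap(NULL, len, PROT_READ | PROT_WRITE,
        MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    int zeroed = 1;
    size_t i;

    if (p == MAP_FAILED) {
        return -1;
    }
    memset(p, 0xa5, len); /* the page is now dirty */
#ifdef MADV_FREE
    /* Phase 1 (dirty to muzzy): the kernel may reclaim the page lazily. */
    /* The call can fail on kernels without MADV_FREE; that is ignored here. */
    (void)madvise(p, len, MADV_FREE);
#endif
    /* Phase 2 (muzzy to clean): discard the contents right away. */
    if (madvise(p, len, MADV_DONTNEED) != 0) {
        munmap(p, len);
        return -1;
    }
    for (i = 0; i < len; i++) {
        if (p[i] != 0) {
            zeroed = 0;
            break;
        }
    }
    munmap(p, len);
    return zeroed;
}
```

  After MADV_FREE alone, a read may return either the old contents or zeroes,
  depending on whether the kernel has reclaimed the page yet. That uncertainty
  is what distinguishes the intermediate muzzy state from the clean one.
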
  Portability improvements:
  - Improve reentrant allocation support, such that deadlock is less likely if
    e.g. a system library call in turn allocates memory. (@davidtgoldblatt,
    @interwq)
  - Support static linking of jemalloc with glibc. (@djwatson)

  Optimizations and refactors:
  - Organize virtual memory as "extents" of virtual memory pages, rather than as
    naturally aligned "chunks", and store all metadata in arbitrarily distant
    locations. This reduces virtual memory external fragmentation, and will
    interact better with huge pages (not yet explicitly supported). (@jasone)
  - Fold large and huge size classes together; only small and large size classes
    remain. (@jasone)
  - Unify the allocation paths, and merge most fast-path branching decisions.
    (@davidtgoldblatt, @interwq)
  - Embed per thread automatic tcache into thread-specific data, which reduces
    conditional branches and dereferences. Also reorganize tcache to increase
    fast-path data locality. (@interwq)
  - Rewrite atomics to closely model the C11 API, convert various
    synchronization from mutex-based to atomic, and use the explicit memory
    ordering control to resolve various hypothetical races without increasing
    synchronization overhead. (@davidtgoldblatt)
  - Extensively optimize rtree via various methods:
    + Add multiple layers of rtree lookup caching, since rtree lookups are now
      part of fast-path deallocation. (@interwq)
    + Determine rtree layout at compile time. (@jasone)
    + Make the tree shallower for common configurations. (@jasone)
    + Embed the root node in the top-level rtree data structure, thus avoiding
      one level of indirection. (@jasone)
    + Further specialize leaf elements as compared to internal node elements,
      and directly embed extent metadata needed for fast-path deallocation.
      (@jasone)
    + Ignore leading always-zero address bits (architecture-specific).
      (@jasone)
  - Reorganize headers (ongoing work) to make them hermetic, and disentangle
    various module dependencies. (@davidtgoldblatt)
  - Convert various internal data structures such as size class metadata from
    boot-time-initialized to compile-time-initialized. Propagate resulting data
    structure simplifications, such as making arena metadata fixed-size.
    (@jasone)
  - Simplify size class lookups when constrained to size classes that are
    multiples of the page size. This speeds lookups, but the primary benefit is
    complexity reduction in code that was the source of numerous regressions.
    (@jasone)
  - Lock individual extents when possible for localized extent operations,
    rather than relying on a top-level arena lock. (@davidtgoldblatt, @jasone)
  - Use first fit layout policy instead of best fit, in order to improve
    packing. (@jasone)
  - If munmap(2) is not in use, use an exponential series to grow each arena's
    virtual memory, so that the number of disjoint virtual memory mappings
    remains low. (@jasone)
  - Implement per arena base allocators, so that arenas never share any virtual
    memory pages. (@jasone)
  - Automatically generate private symbol name mangling macros. (@jasone)

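  A small calculation shows why growing virtual memory along an exponential
  series keeps the mapping count low. The sketch assumes a pure doubling
  series, which is a simplification chosen for illustration; the ChangeLog
  does not specify the series jemalloc uses. mappings_needed_doubling() is a
  hypothetical name.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helper: count how many mappings are needed to cover `total`
 * bytes when the first mapping is `first` bytes and each later mapping is
 * twice the size of the previous one. Returns 0 if `first` is 0.
 */
size_t mappings_needed_doubling(size_t first, size_t total) {
    size_t count = 0;
    size_t covered = 0;
    size_t next = first;

    if (first == 0) {
        return 0;
    }
    while (covered < total) {
        /* Saturate instead of overflowing on very large requests. */
        covered = (next > SIZE_MAX - covered) ? SIZE_MAX : covered + next;
        count++;
        if (next <= SIZE_MAX / 2) {
            next *= 2;
        }
    }
    return count;
}
```

  Starting at 2 MiB, covering 1 GiB takes 10 doubling mappings, whereas
  fixed-size 2 MiB mappings would need 512.
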
  Incompatible changes:
  - Replace chunk hooks with an expanded/normalized set of extent hooks.
    (@jasone)
  - Remove ratio-based purging. (@jasone)
  - Remove --disable-tcache. (@jasone)
  - Remove --disable-tls. (@jasone)
  - Remove --enable-ivsalloc. (@jasone)
  - Remove --with-lg-size-class-group. (@jasone)
  - Remove --with-lg-tiny-min. (@jasone)
  - Remove --disable-cc-silence. (@jasone)
  - Remove --enable-code-coverage. (@jasone)
  - Remove --disable-munmap (replaced by opt.retain). (@jasone)
  - Remove Valgrind support. (@jasone)
  - Remove quarantine support. (@jasone)
  - Remove redzone support. (@jasone)
  - Remove mallctl interfaces (various authors):
    + config.munmap
    + config.tcache
    + config.tls
    + config.valgrind
    + opt.lg_chunk
    + opt.purge
    + opt.lg_dirty_mult
    + opt.decay_time
    + opt.quarantine
    + opt.redzone
    + opt.thp
    + arena.<i>.lg_dirty_mult
    + arena.<i>.decay_time
    + arena.<i>.chunk_hooks
    + arenas.initialized
    + arenas.lg_dirty_mult
    + arenas.decay_time
    + arenas.bin.<i>.run_size
    + arenas.nlruns
    + arenas.lrun.<i>.size
    + arenas.nhchunks
    + arenas.hchunk.<i>.size
    + arenas.extend
    + stats.cactive
    + stats.arenas.<i>.lg_dirty_mult
    + stats.arenas.<i>.decay_time
    + stats.arenas.<i>.metadata.{mapped,allocated}
    + stats.arenas.<i>.{npurge,nmadvise,purged}
    + stats.arenas.<i>.huge.{allocated,nmalloc,ndalloc,nrequests}
    + stats.arenas.<i>.bins.<j>.{nruns,reruns,curruns}
    + stats.arenas.<i>.lruns.<j>.{nmalloc,ndalloc,nrequests,curruns}
    + stats.arenas.<i>.hchunks.<j>.{nmalloc,ndalloc,nrequests,curhchunks}

  Bug fixes:
  - Improve interval-based profile dump triggering to dump only one profile when
    a single allocation's size exceeds the interval. (@jasone)
  - Use prefixed function names (as controlled by --with-jemalloc-prefix) when
    pruning backtrace frames in jeprof. (@jasone)

* 4.5.0 (February 28, 2017)

  This is the first release to benefit from much broader continuous integration
  testing, thanks to @davidtgoldblatt. Had we had this testing infrastructure
  in place for prior releases, it would have caught all of the most serious
  regressions fixed by this release.

  New features:
  - Add --disable-thp and the opt.thp mallctl to provide opt-out mechanisms for
    transparent huge page integration. (@jasone)
  - Update zone allocator integration to work with macOS 10.12. (@glandium)
  - Restructure *CFLAGS configuration, so that CFLAGS behaves typically, and
    EXTRA_CFLAGS provides a way to specify e.g. -Werror during building, but not
    during configuration. (@jasone, @ronawho)

  Bug fixes:
  - Fix DSS (sbrk(2)-based) allocation. This regression was first released in
    4.3.0. (@jasone)
  - Handle race in per size class utilization computation. This functionality
    was first released in 4.0.0. (@interwq)
  - Fix lock order reversal during gdump. (@jasone)
  - Fix/refactor tcache synchronization. This regression was first released in
    4.0.0. (@jasone)
  - Fix various JSON-formatted malloc_stats_print() bugs. This functionality
    was first released in 4.3.0. (@jasone)
  - Fix huge-aligned allocation. This regression was first released in 4.4.0.
    (@jasone)
  - When transparent huge page integration is enabled, detect what state pages
    start in according to the kernel's current operating mode, and only convert
    arena chunks to non-huge during purging if that is not their initial state.
    This functionality was first released in 4.4.0. (@jasone)
  - Fix lg_chunk clamping for the --enable-cache-oblivious --disable-fill case.
    This regression was first released in 4.0.0. (@jasone, @428desmo)
  - Properly detect sparc64 when building for Linux. (@glaubitz)

* 4.4.0 (December 3, 2016)

  New features:
  - Add configure support for *-*-linux-android. (@cferris1000, @jasone)
  - Add the --disable-syscall configure option, for use on systems that place
    security-motivated limitations on syscall(2). (@jasone)
  - Add support for Debian GNU/kFreeBSD. (@thesam)

  Optimizations:
  - Add extent serial numbers and use them where appropriate as a sort key that
    is higher priority than address, so that the allocation policy prefers older
    extents. This tends to improve locality (decrease fragmentation) when
    memory grows downward. (@jasone)
  - Refactor madvise(2) configuration so that MADV_FREE is detected and utilized
    on Linux 4.5 and newer. (@jasone)
  - Mark partially purged arena chunks as non-huge-page. This improves
    interaction with Linux's transparent huge page functionality. (@jasone)

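  The serial-number ordering described above amounts to a two-level sort key:
  compare serial numbers first and fall back to the address only on a tie. A
  minimal sketch follows; extent_key_t and extent_key_cmp() are hypothetical
  names, and a smaller serial number is assumed to mean an older extent.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical sort key: serial number (age) plus base address. */
typedef struct {
    size_t sn;      /* serial number; smaller is assumed to mean older */
    uintptr_t addr; /* base address of the extent */
} extent_key_t;

/* qsort(3)-compatible comparator: serial number first, then address. */
int extent_key_cmp(const void *a, const void *b) {
    const extent_key_t *x = a;
    const extent_key_t *y = b;

    if (x->sn != y->sn) {
        return (x->sn < y->sn) ? -1 : 1;
    }
    /* Tie on serial number: order by address without overflow-prone math. */
    return (x->addr > y->addr) - (x->addr < y->addr);
}
```

  With this ordering, a policy that takes the first suitable element prefers
  the oldest extent even when a newer one sits at a lower address.
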
  Bug fixes:
  - Fix size class computations for edge conditions involving extremely large
    allocations. This regression was first released in 4.0.0. (@jasone,
    @ingvarha)
  - Remove overly restrictive assertions related to the cactive statistic. This
    regression was first released in 4.1.0. (@jasone)
  - Implement a more reliable detection scheme for os_unfair_lock on macOS.
    (@jszakmeister)

* 4.3.1 (November 7, 2016)

  Bug fixes:
  - Fix a severe virtual memory leak. This regression was first released in
    4.3.0. (@interwq, @jasone)
  - Refactor atomic and prng APIs to restore support for 32-bit platforms that
    use pre-C11 toolchains, e.g. FreeBSD's mips. (@jasone)

* 4.3.0 (November 4, 2016)

  This is the first release that passes the test suite for multiple Windows
  configurations, thanks in large part to @glandium setting up continuous
  integration via AppVeyor (and Travis CI for Linux and OS X).

  New features:
  - Add "J" (JSON) support to malloc_stats_print(). (@jasone)
  - Add Cray compiler support. (@ronawho)

  Optimizations:
  - Add/use adaptive spinning for bootstrapping and radix tree node
    initialization. (@jasone)

  Bug fixes:
  - Fix large allocation to search starting in the optimal size class heap,
    which can substantially reduce virtual memory churn and fragmentation. This
    regression was first released in 4.0.0. (@mjp41, @jasone)
  - Fix stats.arenas.<i>.nthreads accounting. (@interwq)
  - Fix and simplify decay-based purging. (@jasone)
  - Make DSS (sbrk(2)-related) operations lockless, which resolves potential
    deadlocks during thread exit. (@jasone)
  - Fix over-sized allocation of radix tree leaf nodes. (@mjp41, @ogaun,
    @jasone)
  - Fix over-sized allocation of arena_t (plus associated stats) data
    structures. (@jasone, @interwq)
  - Fix EXTRA_CFLAGS to not affect configuration. (@jasone)
  - Fix a Valgrind integration bug. (@ronawho)
  - Disallow 0x5a junk filling when running in Valgrind. (@jasone)
  - Fix a file descriptor leak on Linux. This regression was first released in
    4.2.0. (@vsarunas, @jasone)
  - Fix static linking of jemalloc with glibc. (@djwatson)
  - Use syscall(2) rather than {open,read,close}(2) during boot on Linux. This
    works around other libraries' system call wrappers performing reentrant
    allocation. (@kspinka, @Whissi, @jasone)
  - Fix OS X default zone replacement to work with OS X 10.12. (@glandium,
    @jasone)
  - Fix cached memory management to avoid needless commit/decommit operations
    during purging, which resolves permanent virtual memory map fragmentation
    issues on Windows. (@mjp41, @jasone)
  - Fix TSD fetches to avoid (recursive) allocation. This is relevant to
    non-TLS and Windows configurations. (@jasone)
  - Fix malloc_conf overriding to work on Windows. (@jasone)
  - Forcibly disable lazy-lock on Windows (was forcibly *enabled*). (@jasone)

* 4.2.1 (June 8, 2016)

  Bug fixes:
  - Fix bootstrapping issues for configurations that require allocation during
    tsd initialization (e.g. --disable-tls). (@cferris1000, @jasone)
  - Fix gettimeofday() version of nstime_update(). (@ronawho)
  - Fix Valgrind regressions in calloc() and chunk_alloc_wrapper(). (@ronawho)
  - Fix potential VM map fragmentation regression. (@jasone)
  - Fix opt_zero-triggered in-place huge reallocation zeroing. (@jasone)
  - Fix heap profiling context leaks in reallocation edge cases. (@jasone)

* 4.2.0 (May 12, 2016)

  New features:
  - Add the arena.<i>.reset mallctl, which makes it possible to discard all of
    an arena's allocations in a single operation. (@jasone)
  - Add the stats.retained and stats.arenas.<i>.retained statistics. (@jasone)
  - Add the --with-version configure option. (@jasone)
  - Support --with-lg-page values larger than actual page size. (@jasone)

  Optimizations:
  - Use pairing heaps rather than red-black trees for various hot data
    structures. (@djwatson, @jasone)
  - Streamline fast paths of rtree operations. (@jasone)
  - Optimize the fast paths of calloc() and [m,d,sd]allocx(). (@jasone)
  - Decommit unused virtual memory if the OS does not overcommit. (@jasone)
  - Specify MAP_NORESERVE on Linux if [heuristic] overcommit is active, in order
    to avoid unfortunate interactions during fork(2). (@jasone)

  Bug fixes:
  - Fix chunk accounting related to triggering gdump profiles. (@jasone)
  - Link against librt for clock_gettime(2) if glibc < 2.17. (@jasone)
  - Scale leak report summary according to sampling probability. (@jasone)

* 4.1.1 (May 3, 2016)

  This bugfix release resolves a variety of mostly minor issues, though the
  bitmap fix is critical for 64-bit Windows.

  Bug fixes:
  - Fix the linear scan version of bitmap_sfu() to shift by the proper amount
    even when sizeof(long) is not the same as sizeof(void *), as on 64-bit
    Windows. (@jasone)
  - Fix hashing functions to avoid unaligned memory accesses (and resulting
    crashes). This is relevant at least to some ARM-based platforms.
    (@rkmisra)
  - Fix fork()-related lock rank ordering reversals. These reversals were
    unlikely to cause deadlocks in practice except when heap profiling was
    enabled and active. (@jasone)
  - Fix various chunk leaks in OOM code paths. (@jasone)
  - Fix malloc_stats_print() to print opt.narenas correctly. (@jasone)
  - Fix MSVC-specific build/test issues. (@rustyx, @yuslepukhin)
  - Fix a variety of test failures that were due to test fragility rather than
    core bugs. (@jasone)

* 4.1.0 (February 28, 2016)

  This release is primarily about optimizations, but it also incorporates a lot
  of portability-motivated refactoring and enhancements. Many people worked on
  this release. Even with minor changes omitted here (see git revision
  history), along with the people who reported and diagnosed issues, so much of
  the work was contributed that, starting with this release, changes are
  annotated with author credits to help reflect the collaborative effort
  involved.

  New features:
  - Implement decay-based unused dirty page purging, a major optimization with
    mallctl API impact. This is an alternative to the existing ratio-based
    unused dirty page purging, and is intended to eventually become the sole
    purging mechanism. New mallctls:
    + opt.purge
    + opt.decay_time
    + arena.<i>.decay
    + arena.<i>.decay_time
    + arenas.decay_time
    + stats.arenas.<i>.decay_time
    (@jasone, @cevans87)
  - Add --with-malloc-conf, which makes it possible to embed a default
    options string during configuration. This was motivated by the desire to
    specify --with-malloc-conf=purge:decay, since the default must remain
    purge:ratio until the 5.0.0 release. (@jasone)
  - Add MS Visual Studio 2015 support. (@rustyx, @yuslepukhin)
  - Make *allocx() size class overflow behavior defined. The maximum
    size class is now less than PTRDIFF_MAX to protect applications against
    numerical overflow, and all allocation functions are guaranteed to indicate
    errors rather than potentially crashing if the request size exceeds the
    maximum size class. (@jasone)
  - jeprof:
    + Add raw heap profile support. (@jasone)
    + Add --retain and --exclude for backtrace symbol filtering. (@jasone)

  Optimizations:
  - Optimize the fast path to combine various bootstrapping and configuration
    checks and execute more streamlined code in the common case. (@interwq)
  - Use linear scan for small bitmaps (used for small object tracking). In
    addition to speeding up bitmap operations on 64-bit systems, this reduces
    allocator metadata overhead by approximately 0.2%. (@djwatson)
  - Separate arena_avail trees, which substantially speeds up run tree
    operations. (@djwatson)
  - Use memoization (boot-time-computed table) for run quantization. Separate
    arena_avail trees reduced the importance of this optimization. (@jasone)
  - Attempt mmap-based in-place huge reallocation. This can dramatically speed
    up incremental huge reallocation. (@jasone)

  Incompatible changes:
  - Make opt.narenas unsigned rather than size_t. (@jasone)

  Bug fixes:
  - Fix stats.cactive accounting regression. (@rustyx, @jasone)
  - Handle unaligned keys in hash(). This caused problems for some ARM systems.
    (@jasone, @cferris1000)
  - Refactor arenas array. In addition to fixing a fork-related deadlock, this
    makes arena lookups faster and simpler. (@jasone)
  - Move retained memory allocation out of the default chunk allocation
    function, to a location that gets executed even if the application installs
    a custom chunk allocation function. This resolves a virtual memory leak.
    (@buchgr)
  - Fix a potential tsd cleanup leak. (@cferris1000, @jasone)
  - Fix run quantization. In practice this bug had no impact unless
    applications requested memory with alignment exceeding one page.
    (@jasone, @djwatson)
  - Fix LinuxThreads-specific bootstrapping deadlock. (Cosmin Paraschiv)
  - jeprof:
    + Don't discard curl options if timeout is not defined. (@djwatson)
    + Detect failed profile fetches. (@djwatson)
  - Fix stats.arenas.<i>.{dss,lg_dirty_mult,decay_time,pactive,pdirty} for
    --disable-stats case. (@jasone)

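  The unaligned-key item above is an instance of a general portability rule:
  dereferencing a uint32_t pointer that is not 4-byte aligned can trap on some
  architectures. One common way to read a block from an arbitrary key address
  is to copy it out with memcpy(). The helper below is a sketch of that
  technique under a hypothetical name (read_block32); it is not presented as
  the code jemalloc uses.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/*
 * Read the i-th 32-bit block of key without assuming that key is
 * 4-byte aligned. memcpy() has no alignment requirement on its source,
 * and compilers typically lower it to a plain load where that is safe.
 */
static uint32_t read_block32(const void *key, size_t i) {
    uint32_t v;
    memcpy(&v, (const unsigned char *)key + i * sizeof(v), sizeof(v));
    return v;
}
```

  A hash function built on such a helper can accept keys at any byte offset,
  for example a substring in the middle of a larger buffer.
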
* 4.0.4 (October 24, 2015)

  This bugfix release fixes another xallocx() regression. No other regressions
  have come to light in over a month, so this is likely a good starting point
  for people who prefer to wait for "dot one" releases with all the major issues
  shaken out.

  Bug fixes:
  - Fix xallocx(..., MALLOCX_ZERO) to zero the last full trailing page of large
    allocations that have been randomly assigned an offset of 0 when the
    --enable-cache-oblivious configure option is enabled.

* 4.0.3 (September 24, 2015)

  This bugfix release continues the trend of xallocx() and heap profiling fixes.

  Bug fixes:
  - Fix xallocx(..., MALLOCX_ZERO) to zero all trailing bytes of large
    allocations when --enable-cache-oblivious configure option is enabled.
  - Fix xallocx(..., MALLOCX_ZERO) to zero trailing bytes of huge allocations
    when resizing from/to a size class that is not a multiple of the chunk size.
  - Fix prof_tctx_dump_iter() to filter out nodes that were created after heap
    profile dumping started.
  - Work around a potentially bad thread-specific data initialization
    interaction with NPTL (glibc's pthreads implementation).

* 4.0.2 (September 21, 2015)

  This bugfix release addresses a few bugs specific to heap profiling.

  Bug fixes:
  - Fix ixallocx_prof_sample() to never modify nor create sampled small
    allocations. xallocx() is in general incapable of moving small allocations,
    so this fix removes buggy code without loss of generality.
  - Fix irallocx_prof_sample() to always allocate large regions, even when
    alignment is non-zero.
  - Fix prof_alloc_rollback() to read tdata from thread-specific data rather
    than dereferencing a potentially invalid tctx.

* 4.0.1 (September 15, 2015)

  This is a bugfix release that is somewhat high risk due to the amount of
  refactoring required to address deep xallocx() problems. As a side effect of
  these fixes, xallocx() now tries harder to partially fulfill requests for
  optional extra space. Note that a couple of minor heap profiling
  optimizations are included, but these are better thought of as performance
  fixes that were integral to discovering most of the other bugs.

  Optimizations:
  - Avoid a chunk metadata read in arena_prof_tctx_set(), since it is in the
    fast path when heap profiling is enabled. Additionally, split a special
    case out into arena_prof_tctx_reset(), which also avoids chunk metadata
    reads.
  - Optimize irallocx_prof() to optimistically update the sampler state. The
    prior implementation appears to have been a holdover from when
    rallocx()/xallocx() functionality was combined as rallocm().

  Bug fixes:
  - Fix TLS configuration such that it is enabled by default for platforms on
    which it works correctly.
  - Fix arenas_cache_cleanup() and arena_get_hard() to handle
    allocation/deallocation within the application's thread-specific data
    cleanup functions even after arenas_cache is torn down.
  - Fix xallocx() bugs related to size+extra exceeding HUGE_MAXCLASS.
  - Fix chunk purge hook calls for in-place huge shrinking reallocation to
    specify the old chunk size rather than the new chunk size. This bug caused
    no correctness issues for the default chunk purge function, but was
    visible to custom functions set via the "arena.<i>.chunk_hooks" mallctl.
  - Fix heap profiling bugs:
    + Fix heap profiling to distinguish among otherwise identical sample sites
      with interposed resets (triggered via the "prof.reset" mallctl). This bug
      could cause data structure corruption that would most likely result in a
      segfault.
    + Fix irealloc_prof() to prof_alloc_rollback() on OOM.
    + Make one call to prof_active_get_unlocked() per allocation event, and use
      the result throughout the relevant functions that handle an allocation
      event. Also add a missing check in prof_realloc(). These fixes protect
      allocation events against concurrent prof_active changes.
    + Fix ixallocx_prof() to pass usize_max and zero to ixallocx_prof_sample()
      in the correct order.
    + Fix prof_realloc() to call prof_free_sampled_object() after calling
      prof_malloc_sample_object(). Prior to this fix, if tctx and old_tctx were
      the same, the tctx could have been prematurely destroyed.
  - Fix portability bugs:
    + Don't bitshift by negative amounts when encoding/decoding run sizes in
      chunk header maps. This affected systems with page sizes greater than 8
      KiB.
    + Rename index_t to szind_t to avoid an existing type on Solaris.
    + Add JEMALLOC_CXX_THROW to the memalign() function prototype, in order to
      match glibc and avoid compilation errors when including both
      jemalloc/jemalloc.h and malloc.h in C++ code.
    + Don't assume that /bin/sh is appropriate when running size_classes.sh
      during configuration.
    + Consider __sparcv9 a synonym for __sparc64__ when defining LG_QUANTUM.
    + Link tests to librt if it contains clock_gettime(2).

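  The negative bitshift item above reflects a C rule: shifting by a negative
  count is undefined behavior, so a shift amount computed as the difference of
  two bit positions must pick its direction explicitly. The sketch below
  assumes a hypothetical layout in which a size field starts at bit
  FIELD_SHIFT (13 here, chosen only so that the sign flips for pages larger
  than 8 KiB). shift_signed(), encode_size() and decode_size() are
  illustrative names, not jemalloc's.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical bit position of the size field within a map entry. */
#define FIELD_SHIFT 13

/*
 * Shift x left by diff bits when diff >= 0, otherwise right by -diff.
 * The magnitude of diff must be smaller than the width of size_t.
 */
static size_t shift_signed(size_t x, int diff) {
    return (diff >= 0) ? (x << diff) : (x >> -diff);
}

/* size is a multiple of the page size, 1 << lg_page. */
static size_t encode_size(size_t size, int lg_page) {
    return shift_signed(size, FIELD_SHIFT - lg_page);
}

static size_t decode_size(size_t bits, int lg_page) {
    return shift_signed(bits, lg_page - FIELD_SHIFT);
}
```

  With 4 KiB pages (lg_page of 12) the encode shift is one bit to the left.
  With 64 KiB pages (lg_page of 16) it becomes three bits to the right, which
  is the case a single unconditional shift would get wrong.
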
* 4.0.0 (August 17, 2015)

  This version contains many speed and space optimizations, both minor and
  major. The major themes are generalization, unification, and simplification.
  Although many of these optimizations cause no visible behavior change, their
  cumulative effect is substantial.

  New features:
  - Normalize size class spacing to be consistent across the complete size
    range. By default there are four size classes per size doubling, but this
    is now configurable via the --with-lg-size-class-group option. Also add the
    --with-lg-page, --with-lg-page-sizes, --with-lg-quantum, and
    --with-lg-tiny-min options, which can be used to tweak page and size class
    settings. Impacts:
    + Worst case performance for incrementally growing/shrinking reallocation
      is improved because there are far fewer size classes, and therefore
      copying happens less often.
    + Internal fragmentation is limited to 20% for all but the smallest size
      classes (those less than four times the quantum). Requests of
      (1B + 4 KiB) and (1B + 4 MiB) previously suffered nearly 50% internal
      fragmentation.
    + Chunk fragmentation tends to be lower because there are fewer distinct run
      sizes to pack.
  - Add support for explicit tcaches. The "tcache.create", "tcache.flush", and
    "tcache.destroy" mallctls control tcache lifetime and flushing, and the
    MALLOCX_TCACHE(tc) and MALLOCX_TCACHE_NONE flags to the *allocx() API
    control which tcache is used for each operation.
  - Implement per thread heap profiling, as well as the ability to
    enable/disable heap profiling on a per thread basis. Add the "prof.reset",
    "prof.lg_sample", "thread.prof.name", "thread.prof.active",
    "opt.prof_thread_active_init", and "prof.thread_active_init" mallctls.
  - Add support for per arena application-specified chunk allocators, configured
    via the "arena.<i>.chunk_hooks" mallctl.
  - Refactor huge allocation to be managed by arenas, so that arenas now
    function as general purpose independent allocators. This is important in
    the context of user-specified chunk allocators, aside from the scalability
    benefits. Related new statistics:
    + The "stats.arenas.<i>.huge.allocated", "stats.arenas.<i>.huge.nmalloc",
      "stats.arenas.<i>.huge.ndalloc", and "stats.arenas.<i>.huge.nrequests"
      mallctls provide high level per arena huge allocation statistics.
    + The "arenas.nhchunks", "arenas.hchunk.<i>.size",
      "stats.arenas.<i>.hchunks.<j>.nmalloc",
      "stats.arenas.<i>.hchunks.<j>.ndalloc",
      "stats.arenas.<i>.hchunks.<j>.nrequests", and
      "stats.arenas.<i>.hchunks.<j>.curhchunks" mallctls provide per size class
      statistics.
  - Add the 'util' column to malloc_stats_print() output, which reports the
    proportion of available regions that are currently in use for each small
    size class.
  - Add "alloc" and "free" modes for junk filling (see the "opt.junk"
    mallctl), so that it is possible to separately enable junk filling for
    allocation versus deallocation.
  - Add the jemalloc-config script, which provides information about how
    jemalloc was configured, and how to integrate it into application builds.
  - Add metadata statistics, which are accessible via the "stats.metadata",
    "stats.arenas.<i>.metadata.mapped", and
    "stats.arenas.<i>.metadata.allocated" mallctls.
  - Add the "stats.resident" mallctl, which reports the upper limit of
    physically resident memory mapped by the allocator.
  - Add per arena control over unused dirty page purging, via the
    "arenas.lg_dirty_mult", "arena.<i>.lg_dirty_mult", and
    "stats.arenas.<i>.lg_dirty_mult" mallctls.
  - Add the "prof.gdump" mallctl, which makes it possible to toggle the gdump
    feature on/off during program execution.
  - Add sdallocx(), which implements sized deallocation. The primary
    optimization over dallocx() is the removal of a metadata read, which often
    suffers an L1 cache miss.
  - Add missing header includes in jemalloc/jemalloc.h, so that applications
    only have to #include <jemalloc/jemalloc.h>.
  - Add support for additional platforms:
    + Bitrig
    + Cygwin
    + DragonFlyBSD
    + iOS
    + OpenBSD
    + OpenRISC/or1k

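  The 20% internal fragmentation bound stated in the size class item of this
  list follows from the spacing rule and can be checked with a short sketch.
  It assumes a 16-byte quantum and the default of four size classes per
  doubling. LG_QUANTUM, LG_SIZE_CLASS_GROUP, sketch_lg_floor() and
  round_to_size_class() are names chosen for this illustration, and the sketch
  ignores tiny size classes and overflow for requests near SIZE_MAX.

```c
#include <assert.h>
#include <stddef.h>

#define LG_QUANTUM 4          /* assumed 16-byte quantum */
#define LG_SIZE_CLASS_GROUP 2 /* four size classes per size doubling */

/* floor(log2(x)) for x != 0. */
static unsigned sketch_lg_floor(size_t x) {
    unsigned lg = 0;
    assert(x != 0);
    while (x >>= 1) {
        lg++;
    }
    return lg;
}

/* Round a request up to the nearest size class under the rule above. */
static size_t round_to_size_class(size_t size) {
    assert(size != 0);
    /* x is ceil(log2(size)): the doubling that contains the request. */
    unsigned x = sketch_lg_floor((size << 1) - 1);
    /* Each doubling is split into 2^LG_SIZE_CLASS_GROUP equal steps. */
    unsigned lg_delta = (x < LG_SIZE_CLASS_GROUP + LG_QUANTUM + 1)
        ? LG_QUANTUM
        : x - LG_SIZE_CLASS_GROUP - 1;
    size_t delta = (size_t)1 << lg_delta;
    return (size + delta - 1) & ~(delta - 1);
}
```

  Under these assumptions a request of 4097 bytes (1B + 4 KiB) rounds up to
  5120 bytes, wasting 1023 of 5120 bytes, which is just under 20%. A request
  of 1B + 4 MiB likewise rounds up to 5 MiB.
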
  Optimizations:
  - Maintain dirty runs in per arena LRUs rather than in per arena trees of
    dirty-run-containing chunks. In practice this change significantly reduces
    dirty page purging volume.
  - Integrate whole chunks into the unused dirty page purging machinery. This
    reduces the cost of repeated huge allocation/deallocation, because it
    effectively introduces a cache of chunks.
  - Split the arena chunk map into two separate arrays, in order to increase
    cache locality for the frequently accessed bits.
  - Move small run metadata out of runs, into arena chunk headers. This reduces
    run fragmentation, smaller runs reduce external fragmentation for small size
    classes, and packed (less uniformly aligned) metadata layout improves CPU
    cache set distribution.
  - Randomly distribute large allocation base pointer alignment relative to page
    boundaries in order to more uniformly utilize CPU cache sets. This can be
    disabled via the --disable-cache-oblivious configure option, and queried via
    the "config.cache_oblivious" mallctl.
  - Micro-optimize the fast paths for the public API functions.
  - Refactor thread-specific data to reside in a single structure. This assures
    that only a single TLS read is necessary per call into the public API.
  - Implement in-place huge allocation growing and shrinking.
  - Refactor rtree (radix tree for chunk lookups) to be lock-free, and make
    additional optimizations that reduce maximum lookup depth to one or two
    levels. This resolves what was a concurrency bottleneck for per arena huge
    allocation, because a global data structure is critical for determining
    which arenas own which huge allocations.

  Incompatible changes:
  - Replace --enable-cc-silence with --disable-cc-silence to suppress spurious
    warnings by default.
  - Assure that the constness of malloc_usable_size()'s return type matches that
    of the system implementation.
  - Change the heap profile dump format to support per thread heap profiling,
    rename pprof to jeprof, and enhance it with the --thread=<n> option. As a
    result, the bundled jeprof must now be used rather than the upstream
    (gperftools) pprof.
  - Disable "opt.prof_final" by default, in order to avoid atexit(3), which can
    internally deadlock on some platforms.
  - Change the "arenas.nlruns" mallctl type from size_t to unsigned.
  - Replace the "stats.arenas.<i>.bins.<j>.allocated" mallctl with
    "stats.arenas.<i>.bins.<j>.curregs".
  - Ignore MALLOC_CONF in set{uid,gid,cap} binaries.
  - Ignore MALLOCX_ARENA(a) in dallocx(), in favor of using the
    MALLOCX_TCACHE(tc) and MALLOCX_TCACHE_NONE flags to control tcache usage.

  Removed features:
  - Remove the *allocm() API, which is superseded by the *allocx() API.
  - Remove the --enable-dss option, and make dss non-optional on all platforms
    which support sbrk(2).
  - Remove the "arenas.purge" mallctl, which was obsoleted by the
    "arena.<i>.purge" mallctl in 3.1.0.
  - Remove the unnecessary "opt.valgrind" mallctl; jemalloc automatically
    detects whether it is running inside Valgrind.
  - Remove the "stats.huge.allocated", "stats.huge.nmalloc", and
    "stats.huge.ndalloc" mallctls.
  - Remove the --enable-mremap option.
  - Remove the "stats.chunks.current", "stats.chunks.total", and
    "stats.chunks.high" mallctls.

  Bug fixes:
  - Fix the cactive statistic to decrease (rather than increase) when active
    memory decreases. This regression was first released in 3.5.0.
  - Fix OOM handling in memalign() and valloc(). A variant of this bug existed
    in all releases since 2.0.0, which introduced these functions.
  - Fix an OOM-related regression in arena_tcache_fill_small(), which could
    cause cache corruption on OOM. This regression was present in all releases
    from 2.2.0 through 3.6.0.
  - Fix size class overflow handling for malloc(), posix_memalign(), memalign(),
    calloc(), and realloc() when profiling is enabled.
  - Fix the "arena.<i>.dss" mallctl to return an error if "primary" or
    "secondary" precedence is specified, but sbrk(2) is not supported.
  - Fix fallback lg_floor() implementations to handle extremely large inputs.
  - Ensure the default purgeable zone is after the default zone on OS X.
  - Fix latent bugs in atomic_*().
  - Fix the "arena.<i>.dss" mallctl to handle read-only calls.
  - Fix tls_model configuration to enable the initial-exec model when possible.
  - Mark malloc_conf as a weak symbol so that the application can override it.
  - Correctly detect glibc's adaptive pthread mutexes.
  - Fix the --without-export configure option.

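  The lg_floor() item above does not say what went wrong for extremely large
  inputs. One plausible hazard in a portable fallback, assumed here purely for
  illustration, is an intermediate "round up to a power of two" step that
  wraps to zero once the top bit of the input is set. The sketch below
  (lg_floor_sketch() is a hypothetical name) handles that case before
  incrementing.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>
#include <stdint.h>

/* floor(log2(x)) for x != 0, without compiler intrinsics. */
static unsigned lg_floor_sketch(size_t x) {
    unsigned lg = 0;

    assert(x != 0);
    /* Smear the highest set bit into every lower position. */
    x |= x >> 1;
    x |= x >> 2;
    x |= x >> 4;
    x |= x >> 8;
    x |= x >> 16;
#if SIZE_MAX > 0xffffffffu
    x |= x >> 32;
#endif
    /* With the top bit set, x + 1 would wrap to 0; answer directly. */
    if (x == SIZE_MAX) {
        return (unsigned)(sizeof(size_t) * CHAR_BIT - 1);
    }
    x++; /* Now x is 2^(lg + 1) for the result lg being computed. */
    while (x > 2) {
        x >>= 1;
        lg++;
    }
    return lg;
}
```

  The early return is the part that matters for inputs at or above half of
  SIZE_MAX; every smaller input takes the increment path.
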
* 3.6.0 (March 31, 2014)

  This version contains a critical bug fix for a regression present in 3.5.0 and
  3.5.1.

  Bug fixes:
  - Fix a regression in arena_chunk_alloc() that caused crashes during
    small/large allocation if chunk allocation failed. In the absence of this
    bug, chunk allocation failure would result in allocation failure, e.g. NULL
    return from malloc(). This regression was introduced in 3.5.0.
  - Fix gcc intrinsics-based backtracing by specifying
    -fno-omit-frame-pointer to gcc. Note that the application (and all the
    libraries it links to) must also be compiled with this option for
    backtracing to be reliable.
  - Use dss allocation precedence for huge allocations as well as small/large
    allocations.
  - Fix test assertion failure message formatting. This bug did not manifest on
    x86_64 systems because of implementation subtleties in va_list.
  - Fix inconsequential test failures for hash and SFMT code.

  New features:
  - Support heap profiling on FreeBSD. This feature depends on the proc
    filesystem being mounted during heap profile dumping.

* 3.5.1 (February 25, 2014)

  This version primarily addresses minor bugs in test code.

  Bug fixes:
  - Configure Solaris/Illumos to use MADV_FREE.
  - Fix junk filling for mremap(2)-based huge reallocation. This is only
    relevant if configuring with the --enable-mremap option specified.
  - Avoid compilation failure if 'restrict' C99 keyword is not supported by the
    compiler.
  - Add a configure test for SSE2 rather than assuming it is usable on i686
    systems. This fixes test compilation errors, especially on 32-bit Linux
    systems.
  - Fix mallctl argument size mismatches (size_t vs. uint64_t) in the stats unit
    test.
  - Fix/remove flawed alignment-related overflow tests.
  - Prevent compiler optimizations that could change backtraces in the
    prof_accum unit test.

* 3.5.0 (January 22, 2014)

  This version focuses on refactoring and automated testing, though it also
  includes some non-trivial heap profiling optimizations not mentioned below.

  New features:
  - Add the *allocx() API, which is a successor to the experimental *allocm()
    API. The *allocx() functions are slightly simpler to use because they have
    fewer parameters, they directly return the results of primary interest, and
    mallocx()/rallocx() avoid the strict aliasing pitfall that
    allocm()/rallocm() share with posix_memalign(). Note that *allocm() is
    slated for removal in the next non-bugfix release.
  - Add support for LinuxThreads.

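  The strict aliasing pitfall mentioned for posix_memalign() comes from its
  void ** out-parameter: callers are tempted to pass a cast such as
  (void **)&typed_ptr, which accesses a typed pointer object through an
  incompatible lvalue. The sketch below uses only posix_memalign() itself and
  shows the safe pattern with a void * temporary. alloc_aligned_ints() is a
  hypothetical helper, and the 64-byte alignment is an arbitrary choice for
  the example.

```c
#define _POSIX_C_SOURCE 200112L
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Returns a 64-byte-aligned array of n ints, or NULL on failure. */
static int *alloc_aligned_ints(size_t n) {
    void *p = NULL; /* correctly typed out-parameter */

    /*
     * Pitfall avoided: declaring "int *ints" and passing (void **)&ints
     * would write an int * object through a void ** lvalue.
     */
    if (posix_memalign(&p, 64, n * sizeof(int)) != 0) {
        return NULL;
    }
    return (int *)p;
}
```

  An interface that returns the pointer directly, as the item above says
  mallocx() and rallocx() do, needs no temporary and leaves no room for the
  cast.
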
  Bug fixes:
  - Unless heap profiling is enabled, disable floating point code and don't link
    with libm. This, in combination with e.g. EXTRA_CFLAGS=-mno-sse on x64
    systems, makes it possible to completely disable floating point register
    use. Some versions of glibc neglect to save/restore caller-saved floating
    point registers during dynamic lazy symbol loading, and the symbol loading
    code uses whatever malloc the application happens to have linked/loaded
    with, the result being potential floating point register corruption.
  - Report ENOMEM rather than EINVAL if an OOM occurs during heap profiling
    backtrace creation in imemalign(). This bug impacted posix_memalign() and
    aligned_alloc().
  - Fix a file descriptor leak in a prof_dump_maps() error path.
  - Fix prof_dump() to close the dump file descriptor for all relevant error
    paths.
  - Fix rallocm() to use the arena specified by the ALLOCM_ARENA(s) flag for
    allocation, not just deallocation.
  - Fix a data race for large allocation stats counters.
  - Fix a potential infinite loop during thread exit. This bug occurred on
    Solaris, and could affect other platforms with similar pthreads TSD
    implementations.
  - Don't junk-fill reallocations unless usable size changes. This fixes a
    violation of the *allocx()/*allocm() semantics.
  - Fix growing large reallocation to junk fill new space.
  - Fix huge deallocation to junk fill when munmap is disabled.
  - Change the default private namespace prefix from empty to je_, and change
    --with-private-namespace-prefix so that it prepends an additional prefix
    rather than replacing je_. This reduces the likelihood of applications
    which statically link jemalloc experiencing symbol name collisions.
  - Add missing private namespace mangling (relevant when
    --with-private-namespace is specified).
  - Add and use JEMALLOC_INLINE_C so that static inline functions are marked as
    static even for debug builds.
  - Add a missing mutex unlock in a malloc_init_hard() error path. In practice
    this error path is never executed.
  - Fix numerous bugs in malloc_strtoumax() error handling/reporting. These
    bugs had no impact except for malformed inputs.
  - Fix numerous bugs in malloc_snprintf(). These bugs were not exercised by
    existing calls, so they had no impact.

* 3.4.1 (October 20, 2013)

  Bug fixes:
  - Fix a race in the "arenas.extend" mallctl that could cause memory corruption
    of internal data structures and subsequent crashes.
  - Fix Valgrind integration flaws that caused Valgrind warnings about reads of
    uninitialized memory in:
    + arena chunk headers
    + internal zero-initialized data structures (relevant to tcache and prof
      code)
  - Preserve errno during the first allocation. A readlink(2) call during
    initialization fails unless /etc/malloc.conf exists, so errno was typically
    set during the first allocation prior to this fix.
  - Fix compilation warnings reported by gcc 4.8.1.

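  The errno item above describes a pattern that applies to any library
  initialization that probes optional files: a failed readlink(2) sets errno,
  and the caller of an otherwise successful operation should not observe that.
  The sketch below saves and restores errno around the probe.
  probe_optional_link() is a hypothetical helper written for this example, not
  jemalloc's initialization code.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <unistd.h>

/*
 * Try to read an optional configuration symlink into buf (buflen > 0).
 * Returns 0 and NUL-terminates buf on success, or -1 if the link is
 * missing or unreadable, in which case the caller's errno is left
 * exactly as it was on entry.
 */
static int probe_optional_link(const char *path, char *buf, size_t buflen) {
    int saved_errno = errno;
    ssize_t n = readlink(path, buf, buflen - 1);

    if (n < 0) {
        /* A missing optional link is not an error for the caller. */
        errno = saved_errno;
        return -1;
    }
    buf[n] = '\0';
    return 0;
}
```

  A caller that sets errno to 0 before its first allocation, then checks it
  afterwards, sees no spurious ENOENT from a probe written this way.
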
* 3.4.0 (June 2, 2013)

  This version is essentially a small bugfix release, but the addition of
  aarch64 support requires that the minor version be incremented.

  Bug fixes:
  - Fix race-triggered deadlocks in chunk_record(). These deadlocks were
    typically triggered by multiple threads concurrently deallocating huge
    objects.

  New features:
  - Add support for the aarch64 architecture.

* 3.3.1 (March 6, 2013)

  This version fixes bugs that are typically encountered only when utilizing
  custom run-time options.

  Bug fixes:
  - Fix a locking order bug that could cause deadlock during fork if heap
    profiling were enabled.
  - Fix a chunk recycling bug that could cause the allocator to lose track of
    whether a chunk was zeroed. On FreeBSD, NetBSD, and OS X, it could cause
    corruption if allocating via sbrk(2) (unlikely unless running with the
    "dss:primary" option specified). This was completely harmless on Linux
    unless using mlockall(2) (and unlikely even then, unless the
    --disable-munmap configure option or the "dss:primary" option was
    specified). This regression was introduced in 3.1.0 by the
    mlockall(2)/madvise(2) interaction fix.
  - Fix TLS-related memory corruption that could occur during thread exit if the
    thread never allocated memory. Only the quarantine and prof facilities were
    susceptible.
  - Fix two quarantine bugs:
    + Internal reallocation of the quarantined object array leaked the old
      array.
    + Reallocation failure for internal reallocation of the quarantined object
      array (very unlikely) resulted in memory corruption.
  - Fix Valgrind integration to annotate all internally allocated memory in a
    way that keeps Valgrind happy about internal data structure access.
  - Fix building for s390 systems.

Jason Evans | b5681fb | 2013-01-22 22:45:09 -0800 | [diff] [blame] | 994 | * 3.3.0 (January 23, 2013) |
| 995 | |
| 996 | This version includes a few minor performance improvements in addition to the |
| 997 | listed new features and bug fixes. |
| 998 | |
| 999 | New features: |
| 1000 | - Add clipping support to lg_chunk option processing. |
| 1001 | - Add the --enable-ivsalloc option. |
| 1002 | - Add the --without-export option. |
| 1003 | - Add the --disable-zone-allocator option. |
Jason Evans | 6eb84fb | 2012-11-29 22:13:04 -0800 | [diff] [blame] | 1004 | |
| 1005 | Bug fixes: |
| 1006 | - Fix "arenas.extend" mallctl to output the number of arenas. |
Jason Evans | a33488d | 2013-10-03 14:38:39 -0700 | [diff] [blame] | 1007 | - Fix chunk_recycle() to unconditionally inform Valgrind that returned memory |
Jason Evans | 1271185 | 2012-12-12 10:12:18 -0800 | [diff] [blame] | 1008 | is undefined. |
Jason Evans | b5681fb | 2013-01-22 22:45:09 -0800 | [diff] [blame] | 1009 | - Fix build break on FreeBSD related to alloca.h. |
Jason Evans | 6eb84fb | 2012-11-29 22:13:04 -0800 | [diff] [blame] | 1010 | |
Jason Evans | 556ddc7 | 2012-11-07 15:16:29 -0800 | [diff] [blame] | 1011 | * 3.2.0 (November 9, 2012) |
| 1012 | |
| 1013 | In addition to a couple of bug fixes, this version modifies page run |
| 1014 | allocation and dirty page purging algorithms in order to better control |
| 1015 | page-level virtual memory fragmentation. |
Jason Evans | 12efefb | 2012-10-16 22:06:56 -0700 | [diff] [blame] | 1016 | |
Jason Evans | e3d1306 | 2012-10-30 15:42:37 -0700 | [diff] [blame] | 1017 | Incompatible changes: |
Jason Evans | 556ddc7 | 2012-11-07 15:16:29 -0800 | [diff] [blame] | 1018 | - Change the "opt.lg_dirty_mult" default from 5 to 3 (32:1 to 8:1). |
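
The ratios in parentheses follow from "opt.lg_dirty_mult" being a base-2 logarithm: a value of n corresponds to an active:dirty page ratio of 2^n:1. The sketch below works through that arithmetic. max_dirty_pages() is a hypothetical helper written for illustration, not a jemalloc function, and it assumes that the dirty page limit is simply the active page count shifted right by the option value, ignoring any lower bound the allocator may apply.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>

/*
 * Hypothetical helper: number of dirty pages tolerated for a given number of
 * active pages when the active:dirty ratio is 2^lg_dirty_mult:1.
 */
static size_t
max_dirty_pages(size_t nactive, unsigned lg_dirty_mult)
{
    /* Shifting by the full width of size_t or more is undefined in C. */
    assert(lg_dirty_mult < sizeof(size_t) * CHAR_BIT);
    return nactive >> lg_dirty_mult;
}
```

Under these assumptions, 4096 active pages allow 4096 / 32 = 128 dirty pages with the old default of 5, and 4096 / 8 = 512 dirty pages with the new default of 3.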

  Bug fixes:
  - Fix dss/mmap allocation precedence code to use recyclable mmap memory only
    after primary dss allocation fails.
  - Fix deadlock in the "arenas.purge" mallctl. This regression was introduced
    in 3.1.0 by the addition of the "arena.<i>.purge" mallctl.

* 3.1.0 (October 16, 2012)

  New features:
  - Auto-detect whether running inside Valgrind, thus removing the need to
    manually specify MALLOC_CONF=valgrind:true.
  - Add the "arenas.extend" mallctl, which allows applications to create
    manually managed arenas.
  - Add the ALLOCM_ARENA() flag for {,r,d}allocm().
  - Add the "opt.dss", "arena.<i>.dss", and "stats.arenas.<i>.dss" mallctls,
    which provide control over dss/mmap precedence.
  - Add the "arena.<i>.purge" mallctl, which obsoletes "arenas.purge".
  - Define LG_QUANTUM for hppa.

  Incompatible changes:
  - Disable tcache by default if running inside Valgrind, in order to avoid
    making unallocated objects appear reachable to Valgrind.
  - Drop const from malloc_usable_size() argument on Linux.

  Bug fixes:
  - Fix heap profiling crash if sampled object is freed via realloc(p, 0).
  - Remove const from __*_hook variable declarations, so that glibc can modify
    them during process forking.
  - Fix mlockall(2)/madvise(2) interaction.
  - Fix fork(2)-related deadlocks.
  - Fix error return value for "thread.tcache.enabled" mallctl.

* 3.0.0 (May 11, 2012)

  Although this version adds some major new features, the primary focus is on
  internal code cleanup that facilitates maintainability and portability, most
  of which is not reflected in the ChangeLog. This is the first release to
  incorporate substantial contributions from numerous other developers, and the
  result is a more broadly useful allocator (see the git revision history for
  contribution details). Note that the license has been unified, thanks to
  Facebook granting a license under the same terms as the other copyright
  holders (see COPYING).

  New features:
  - Implement Valgrind support, redzones, and quarantine.
  - Add support for additional platforms:
    + FreeBSD
    + Mac OS X Lion
    + MinGW
    + Windows (no support yet for replacing the system malloc)
  - Add support for additional architectures:
    + MIPS
    + SH4
    + Tilera
  - Add support for cross compiling.
  - Add nallocm(), which rounds a request size up to the nearest size class
    without actually allocating.
  - Implement aligned_alloc() (blame C11).
  - Add the "thread.tcache.enabled" mallctl.
  - Add the "opt.prof_final" mallctl.
  - Update pprof (from gperftools 2.0).
  - Add the --with-mangling option.
  - Add the --disable-experimental option.
  - Add the --disable-munmap option, and make it the default on Linux.
  - Add the --enable-mremap option, and disable use of mremap(2) by default.
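
As a usage sketch for the aligned_alloc() entry in the list above, the following shows the standard C11 call as an application might make it. alloc_aligned() and is_aligned() are hypothetical helpers introduced here, not part of jemalloc. The wrapper assumes the alignment is a power of two and rounds the size up to a multiple of the alignment, as C11 asks of the caller.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/*
 * Hypothetical wrapper around C11 aligned_alloc(). The alignment must be a
 * power of two; the size is rounded up to a multiple of the alignment.
 * Overflow for sizes near SIZE_MAX is not handled in this sketch.
 */
static void *
alloc_aligned(size_t alignment, size_t size)
{
    size_t rounded;

    assert(alignment != 0 && (alignment & (alignment - 1)) == 0);
    rounded = (size + alignment - 1) & ~(alignment - 1);
    return aligned_alloc(alignment, rounded);
}

/* Returns 1 if p is a multiple of alignment (a power of two), else 0. */
static int
is_aligned(const void *p, size_t alignment)
{
    return ((uintptr_t)p & (alignment - 1)) == 0;
}
```

A request such as alloc_aligned(64, 100) is rounded up to 128 bytes, and a non-NULL result satisfies is_aligned(p, 64). The memory is released with free(), as for malloc().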

  Incompatible changes:
  - Enable stats by default.
  - Enable fill by default.
  - Disable lazy locking by default.
  - Rename the "tcache.flush" mallctl to "thread.tcache.flush".
  - Rename the "arenas.pagesize" mallctl to "arenas.page".
  - Change the "opt.lg_prof_sample" default from 0 to 19 (1 B to 512 KiB).
  - Change the "opt.prof_accum" default from true to false.

  Removed features:
  - Remove the swap feature, including the "config.swap", "swap.avail",
    "swap.prezeroed", "swap.nfds", and "swap.fds" mallctls.
  - Remove highruns statistics, including the
    "stats.arenas.<i>.bins.<j>.highruns" and
    "stats.arenas.<i>.lruns.<j>.highruns" mallctls.
  - As part of small size class refactoring, remove the "opt.lg_[qc]space_max",
    "arenas.cacheline", "arenas.subpage", "arenas.[tqcs]space_{min,max}", and
    "arenas.[tqcs]bins" mallctls.
  - Remove the "arenas.chunksize" mallctl.
  - Remove the "opt.lg_prof_tcmax" option.
  - Remove the "opt.lg_prof_bt_max" option.
  - Remove the "opt.lg_tcache_gc_sweep" option.
  - Remove the --disable-tiny option, including the "config.tiny" mallctl.
  - Remove the --enable-dynamic-page-shift configure option.
  - Remove the --enable-sysv configure option.

  Bug fixes:
  - Fix a statistics-related bug in the "thread.arena" mallctl that could cause
    invalid statistics and crashes.
  - Work around TLS deallocation via free() on Linux. This bug could cause
    write-after-free memory corruption.
  - Fix a potential deadlock that could occur during interval- and
    growth-triggered heap profile dumps.
  - Fix large calloc() zeroing bugs due to dropping chunk map unzeroed flags.
  - Fix chunk_alloc_dss() to stop claiming memory is zeroed. This bug could
    cause memory corruption and crashes with --enable-dss specified.
  - Fix fork-related bugs that could cause deadlock in children between fork
    and exec.
  - Fix malloc_stats_print() to honor 'b' and 'l' in the opts parameter.
  - Fix realloc(p, 0) to act like free(p).
  - Do not enforce minimum alignment in memalign().
  - Check for NULL pointer in malloc_usable_size().
  - Fix an off-by-one heap profile statistics bug that could be observed in
    interval- and growth-triggered heap profiles.
  - Fix the "epoch" mallctl to update cached stats even if the passed in epoch
    is 0.
  - Fix bin->runcur management to fix a layout policy bug. This bug did not
    affect correctness.
  - Fix a bug in choose_arena_hard() that potentially caused more arenas to be
    initialized than necessary.
  - Add missing "opt.lg_tcache_max" mallctl implementation.
  - Use glibc allocator hooks to make mixed allocator usage less likely.
  - Fix build issues for --disable-tcache.
  - Don't mangle pthread_create() when --with-private-namespace is specified.

* 2.2.5 (November 14, 2011)

  Bug fixes:
  - Fix huge_ralloc() race when using mremap(2). This is a serious bug that
    could cause memory corruption and/or crashes.
  - Fix huge_ralloc() to maintain chunk statistics.
  - Fix malloc_stats_print(..., "a") output.

* 2.2.4 (November 5, 2011)

  Bug fixes:
  - Initialize arenas_tsd before using it. This bug existed for 2.2.[0-3], as
    well as for --disable-tls builds in earlier releases.
  - Do not assume a 4 KiB page size in test/rallocm.c.

* 2.2.3 (August 31, 2011)

  This version fixes numerous bugs related to heap profiling.

  Bug fixes:
  - Fix a prof-related race condition. This bug could cause memory corruption,
    but only occurred in non-default configurations (prof_accum:false).
  - Fix off-by-one backtracing issues (make sure that prof_alloc_prep() is
    excluded from backtraces).
  - Fix a prof-related bug in realloc() (only triggered by OOM errors).
  - Fix prof-related bugs in allocm() and rallocm().
  - Fix prof_tdata_cleanup() for --disable-tls builds.
  - Fix a relative include path, to fix objdir builds.

* 2.2.2 (July 30, 2011)

  Bug fixes:
  - Fix a build error for --disable-tcache.
  - Fix assertions in arena_purge() (for real this time).
  - Add the --with-private-namespace option. This is a workaround for symbol
    conflicts that can inadvertently arise when using static libraries.

* 2.2.1 (March 30, 2011)

  Bug fixes:
  - Implement atomic operations for x86/x64. This fixes compilation failures
    for versions of gcc that are still in wide use.
  - Fix an assertion in arena_purge().

* 2.2.0 (March 22, 2011)

  This version incorporates several improvements to algorithms and data
  structures that tend to reduce fragmentation and increase speed.

  New features:
  - Add the "stats.cactive" mallctl.
  - Update pprof (from google-perftools 1.7).
  - Improve backtracing-related configuration logic, and add the
    --disable-prof-libgcc option.

  Bug fixes:
  - Change default symbol visibility from "internal" to "hidden", which
    decreases the overhead of library-internal function calls.
  - Fix symbol visibility so that it is also set on OS X.
  - Fix a build dependency regression caused by the introduction of the .pic.o
    suffix for PIC object files.
  - Add missing checks for mutex initialization failures.
  - Don't use libgcc-based backtracing except on x64, where it is known to work.
  - Fix deadlocks on OS X that were due to memory allocation in
    pthread_mutex_lock().
  - Heap profiling-specific fixes:
    + Fix memory corruption due to integer overflow in small region index
      computation, when using a small enough sample interval that profiling
      context pointers are stored in small run headers.
    + Fix a bootstrap ordering bug that only occurred with TLS disabled.
    + Fix a rallocm() rsize bug.
    + Fix error detection bugs for aligned memory allocation.

* 2.1.3 (March 14, 2011)

  Bug fixes:
  - Fix a cpp logic regression (due to the "thread.{de,}allocatedp" mallctl fix
    for OS X in 2.1.2).
  - Fix a "thread.arena" mallctl bug.
  - Fix a thread cache stats merging bug.

* 2.1.2 (March 2, 2011)

  Bug fixes:
  - Fix "thread.{de,}allocatedp" mallctl for OS X.
  - Add missing jemalloc.a to build system.

* 2.1.1 (January 31, 2011)

  Bug fixes:
  - Fix aligned huge reallocation (affected allocm()).
  - Fix the ALLOCM_LG_ALIGN macro definition.
  - Fix a heap dumping deadlock.
  - Fix a "thread.arena" mallctl bug.

* 2.1.0 (December 3, 2010)

  This version incorporates some optimizations that can't quite be considered
  bug fixes.

  New features:
  - Use Linux's mremap(2) for huge object reallocation when possible.
  - Avoid locking in mallctl*() when possible.
  - Add the "thread.[de]allocatedp" mallctls.
  - Convert the manual page source from roff to DocBook, and generate both roff
    and HTML manuals.

  Bug fixes:
  - Fix a crash due to incorrect bootstrap ordering. This only impacted
    --enable-debug --enable-dss configurations.
  - Fix a minor statistics bug for mallctl("swap.avail", ...).

* 2.0.1 (October 29, 2010)

  Bug fixes:
  - Fix a race condition in heap profiling that could cause undefined behavior
    if "opt.prof_accum" were disabled.
  - Add missing mutex unlocks for some OOM error paths in the heap profiling
    code.
  - Fix a compilation error for non-C99 builds.

* 2.0.0 (October 24, 2010)

  This version focuses on the experimental *allocm() API, and on improved
  run-time configuration/introspection. Nonetheless, numerous performance
  improvements are also included.

  New features:
  - Implement the experimental {,r,s,d}allocm() API, which provides a superset
    of the functionality available via malloc(), calloc(), posix_memalign(),
    realloc(), malloc_usable_size(), and free(). These functions can be used to
    allocate/reallocate aligned zeroed memory, ask for optional extra memory
    during reallocation, prevent object movement during reallocation, etc.
  - Replace JEMALLOC_OPTIONS/JEMALLOC_PROF_PREFIX with MALLOC_CONF, which is
    more human-readable, and more flexible. For example:
      JEMALLOC_OPTIONS=AJP
    is now:
      MALLOC_CONF=abort:true,fill:true,stats_print:true
  - Port to Apple OS X. Sponsored by Mozilla.
  - Make it possible for the application to control thread-->arena mappings via
    the "thread.arena" mallctl.
  - Add compile-time support for all TLS-related functionality via pthreads TSD.
    This is mainly of interest for OS X, which does not support TLS, but has a
    TSD implementation with similar performance.
  - Override memalign() and valloc() if they are provided by the system.
  - Add the "arenas.purge" mallctl, which can be used to synchronously purge all
    dirty unused pages.
  - Make cumulative heap profiling data optional, so that it is possible to
    limit the amount of memory consumed by heap profiling data structures.
  - Add per-thread allocation counters that can be accessed via the
    "thread.allocated" and "thread.deallocated" mallctls.
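
The MALLOC_CONF value shown in the list above is a comma-separated list of key:value pairs. The following minimal sketch looks up one key in such a string. conf_lookup() is a hypothetical helper written for this example; it is not jemalloc's option parser and performs no validation of keys or values.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper: find key in a "key:value,key:value" string and copy
 * its value into out. Returns 1 on success, 0 if the key is absent or the
 * value does not fit in out (including the terminating NUL).
 */
static int
conf_lookup(const char *conf, const char *key, char *out, size_t outlen)
{
    size_t klen;
    const char *p;

    assert(conf != NULL && key != NULL && out != NULL);
    klen = strlen(key);
    p = conf;
    while (*p != '\0') {
        /* The current pair ends at the next comma or at the end of string. */
        const char *end = strchr(p, ',');
        size_t len = (end != NULL) ? (size_t)(end - p) : strlen(p);

        if (len > klen && strncmp(p, key, klen) == 0 && p[klen] == ':') {
            size_t vlen = len - klen - 1;
            if (vlen >= outlen) {
                return 0;
            }
            memcpy(out, p + klen + 1, vlen);
            out[vlen] = '\0';
            return 1;
        }
        if (end == NULL) {
            break;
        }
        p = end + 1;
    }
    return 0;
}
```

For the string "abort:true,fill:true,stats_print:true", looking up "fill" copies "true" into the output buffer and returns 1, while looking up a key that is not present returns 0.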

  Incompatible changes:
  - Remove JEMALLOC_OPTIONS and malloc_options (see MALLOC_CONF above).
  - Increase default backtrace depth from 4 to 128 for heap profiling.
  - Disable interval-based profile dumps by default.

  Bug fixes:
  - Remove bad assertions in fork handler functions. These assertions could
    cause aborts for some combinations of configure settings.
  - Fix strerror_r() usage to deal with non-standard semantics in GNU libc.
  - Fix leak context reporting. This bug tended to cause the number of contexts
    to be underreported (though the reported numbers of objects and bytes were
    correct).
  - Fix a realloc() bug for large in-place growing reallocation. This bug could
    cause memory corruption, but it was hard to trigger.
  - Fix an allocation bug for small allocations that could be triggered if
    multiple threads raced to create a new run of backing pages.
  - Enhance the heap profiler to trigger samples based on usable size, rather
    than request size.
  - Fix a heap profiling bug due to sometimes losing track of requested object
    size for sampled objects.

* 1.0.3 (August 12, 2010)

  Bug fixes:
  - Fix the libunwind-based implementation of stack backtracing (used for heap
    profiling). This bug could cause zero-length backtraces to be reported.
  - Add a missing mutex unlock in library initialization code. If multiple
    threads raced to initialize malloc, some of them could end up permanently
    blocked.

* 1.0.2 (May 11, 2010)

  Bug fixes:
  - Fix junk filling of large objects, which could cause memory corruption.
  - Add MAP_NORESERVE support for chunk mapping, because otherwise virtual
    memory limits could cause swap file configuration to fail. Contributed by
    Jordan DeLong.

* 1.0.1 (April 14, 2010)

  Bug fixes:
  - Fix compilation when --enable-fill is specified.
  - Fix threads-related profiling bugs that affected accuracy and caused memory
    to be leaked during thread exit.
  - Fix dirty page purging race conditions that could cause crashes.
  - Fix crash in tcache flushing code during thread destruction.

* 1.0.0 (April 11, 2010)

  This release focuses on speed and run-time introspection. Numerous
  algorithmic improvements make this release substantially faster than its
  predecessors.

  New features:
  - Implement autoconf-based configuration system.
  - Add mallctl*(), for the purposes of introspection and run-time
    configuration.
  - Make it possible for the application to manually flush a thread's cache, via
    the "tcache.flush" mallctl.
  - Base maximum dirty page count on proportion of active memory.
  - Compute various additional run-time statistics, including per size class
    statistics for large objects.
  - Expose malloc_stats_print(), which can be called repeatedly by the
    application.
  - Simplify the malloc_message() signature to only take one string argument,
    and incorporate an opaque data pointer argument for use by the application
    in combination with malloc_stats_print().
  - Add support for allocation backed by one or more swap files, and allow the
    application to disable over-commit if swap files are in use.
  - Implement allocation profiling and leak checking.

  Removed features:
  - Remove the dynamic arena rebalancing code, since thread-specific caching
    reduces its utility.

  Bug fixes:
  - Modify chunk allocation to work when address space layout randomization
    (ASLR) is in use.
  - Fix thread cleanup bugs related to TLS destruction.
  - Handle 0-size allocation requests in posix_memalign().
  - Fix a chunk leak. The leaked chunks were never touched, so this impacted
    virtual memory usage, but not physical memory usage.

* linux_2008082[78]a (August 27/28, 2008)

  These snapshot releases are the simple result of incorporating Linux-specific
  support into the FreeBSD malloc sources.

--------------------------------------------------------------------------------
vim:filetype=text:textwidth=80