| .\" Copyright (c) 2009 Facebook, Inc. All rights reserved. |
| .\" Copyright (c) 2006-2008 Jason Evans <jasone@canonware.com>. |
| .\" All rights reserved. |
| .\" Copyright (c) 1980, 1991, 1993 |
| .\" The Regents of the University of California. All rights reserved. |
| .\" |
| .\" This code is derived from software contributed to Berkeley by |
| .\" the American National Standards Committee X3, on Information |
| .\" Processing Systems. |
| .\" |
| .\" Redistribution and use in source and binary forms, with or without |
| .\" modification, are permitted provided that the following conditions |
| .\" are met: |
| .\" 1. Redistributions of source code must retain the above copyright |
| .\" notice, this list of conditions and the following disclaimer. |
| .\" 2. Redistributions in binary form must reproduce the above copyright |
| .\" notice, this list of conditions and the following disclaimer in the |
| .\" documentation and/or other materials provided with the distribution. |
| .\" 3. Neither the name of the University nor the names of its contributors |
| .\" may be used to endorse or promote products derived from this software |
| .\" without specific prior written permission. |
| .\" |
| .\" THIS SOFTWARE IS PROVIDED BY THE REGENTS AND CONTRIBUTORS ``AS IS'' AND |
| .\" ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE |
| .\" IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE |
| .\" ARE DISCLAIMED. IN NO EVENT SHALL THE REGENTS OR CONTRIBUTORS BE LIABLE |
| .\" FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL |
| .\" DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS |
| .\" OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) |
| .\" HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT |
| .\" LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY |
| .\" OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF |
| .\" SUCH DAMAGE. |
| .\" |
| .\" @(#)malloc.3 8.1 (Berkeley) 6/4/93 |
| .\" $FreeBSD: head/lib/libc/stdlib/malloc.3 182225 2008-08-27 02:00:53Z jasone $ |
| .\" |
| .Dd June 22, 2009 |
| .Dt JEMALLOC 3 |
| .Os |
| .Sh NAME |
| .Nm malloc , calloc , posix_memalign , realloc , free , malloc_usable_size |
| .Nd general purpose memory allocation functions |
| .Sh LIBRARY |
| .Lb libjemalloc |
| .Sh SYNOPSIS |
| .In stdlib.h |
| .Ft void * |
| .Fn malloc "size_t size" |
| .Ft void * |
| .Fn calloc "size_t number" "size_t size" |
| .Ft int |
| .Fn posix_memalign "void **ptr" "size_t alignment" "size_t size" |
| .Ft void * |
| .Fn realloc "void *ptr" "size_t size" |
| .Ft void |
| .Fn free "void *ptr" |
| .In jemalloc.h |
| .Ft size_t |
| .Fn malloc_usable_size "const void *ptr" |
| .Ft const char * |
| .Va jemalloc_options ; |
| .Ft void |
| .Fo \*(lp*jemalloc_message\*(rp |
| .Fa "const char *p1" "const char *p2" "const char *p3" "const char *p4" |
| .Fc |
| .Sh DESCRIPTION |
| The |
| .Fn malloc |
| function allocates |
| .Fa size |
| bytes of uninitialized memory. |
| The allocated space is suitably aligned |
| @roff_tiny@(after possible pointer coercion) |
| for storage of any type of object. |
| .Pp |
| The |
| .Fn calloc |
| function allocates space for |
| .Fa number |
| objects, |
| each |
| .Fa size |
| bytes in length. |
| The result is identical to calling |
| .Fn malloc |
| with an argument of |
| .Dq "number * size" , |
| with the exception that the allocated memory is explicitly initialized |
| to zero bytes. |
| .Pp |
| The |
| .Fn posix_memalign |
| function allocates |
| .Fa size |
| bytes of memory such that the allocation's base address is a multiple of |
| .Fa alignment , |
| and returns the allocation in the value pointed to by |
| .Fa ptr . |
| The requested |
| .Fa alignment |
| must be a power of 2 at least as large as |
| .Fn sizeof "void *" . |
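| .Pp |
| As a minimal sketch of the alignment requirement, the hypothetical helper |
| .Fn alloc_aligned |
| below (not part of this library) wraps |
| .Fn posix_memalign |
| and returns |
| .Dv NULL |
| whenever the call reports an error: |

```c
#include <stddef.h>
#include <stdlib.h>

/*
 * Hypothetical helper, for illustration only.
 * Returns memory aligned to `alignment`, or NULL if posix_memalign()
 * fails. Note that posix_memalign() reports failure through its return
 * value (EINVAL or ENOMEM) rather than through errno.
 */
void *
alloc_aligned(size_t alignment, size_t size)
{
    void *ptr = NULL;

    if (posix_memalign(&ptr, alignment, size) != 0)
        return (NULL);
    return (ptr);
}
```

| The returned pointer is released with |
| .Fn free , |
| like any other allocation. |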
| .Pp |
| The |
| .Fn realloc |
| function changes the size of the previously allocated memory referenced by |
| .Fa ptr |
| to |
| .Fa size |
| bytes. |
| The contents of the memory are unchanged up to the lesser of the new and |
| old sizes. |
| If the new size is larger, |
| the contents of the newly allocated portion of the memory are undefined. |
| Upon success, the memory referenced by |
| .Fa ptr |
| is freed and a pointer to the newly allocated memory is returned. |
| Note that |
| .Fn realloc |
| may move the memory allocation, resulting in a different return value than |
| .Fa ptr . |
| If |
| .Fa ptr |
| is |
| .Dv NULL , |
| the |
| .Fn realloc |
| function behaves identically to |
| .Fn malloc |
| for the specified size. |
| .Pp |
| The |
| .Fn free |
| function causes the allocated memory referenced by |
| .Fa ptr |
| to be made available for future allocations. |
| If |
| .Fa ptr |
| is |
| .Dv NULL , |
| no action occurs. |
| .Pp |
| The |
| .Fn malloc_usable_size |
| function returns the usable size of the allocation pointed to by |
| .Fa ptr . |
| The return value may be larger than the size that was requested during |
| allocation. |
| The |
| .Fn malloc_usable_size |
| function is not a mechanism for in-place |
| .Fn realloc ; |
| rather it is provided solely as a tool for introspection purposes. |
| Any discrepancy between the requested allocation size and the size reported by |
| .Fn malloc_usable_size |
| should not be depended on, since such behavior is entirely |
| implementation-dependent. |
| .Sh TUNING |
| The first time one of these memory allocation routines is called, various |
| flags are set or reset, and these flags affect the workings of this |
| allocator implementation. |
| .Pp |
| The |
| .Dq name |
| of the file referenced by the symbolic link named |
| .Pa /etc/jemalloc.conf , |
| the value of the environment variable |
| .Ev JEMALLOC_OPTIONS , |
| and the string pointed to by the global variable |
| .Va jemalloc_options |
| will be interpreted, in that order, from left to right as flags. |
| .Pp |
| Each flag is a single letter, optionally prefixed by a non-negative base 10 |
| integer repetition count. |
| For example, |
| .Dq 3N |
| is equivalent to |
| .Dq NNN . |
| Some flags control parameter magnitudes, where uppercase increases the |
| magnitude, and lowercase decreases the magnitude. |
| Other flags control boolean parameters, where uppercase indicates that a |
| behavior is set, or on, and lowercase means that a behavior is not set, or off. |
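| .Pp |
| The repetition count syntax can be illustrated with a short sketch. |
| The function |
| .Fn expand_flags |
| below is hypothetical and is not how the allocator itself parses options; |
| it only rewrites a flag string so that each counted flag is spelled out. |

```c
#include <ctype.h>
#include <stddef.h>

/*
 * Hypothetical illustration of the "<count><flag>" syntax.
 * Writes the expanded flag string to `out` (NUL-terminated).
 * Returns 0 on success, or -1 if `out` is too small or the input ends
 * with a count that has no flag after it.
 */
int
expand_flags(const char *opts, char *out, size_t out_size)
{
    size_t used = 0;

    if (out_size == 0)
        return (-1);
    while (*opts != '\0') {
        unsigned long count = 1;

        if (isdigit((unsigned char)*opts)) {
            /* Parse the base 10 repetition count. */
            count = 0;
            while (isdigit((unsigned char)*opts)) {
                count = count * 10 + (unsigned long)(*opts - '0');
                opts++;
            }
            if (*opts == '\0')
                return (-1);
        }
        /* Emit the flag letter `count` times (possibly zero times). */
        for (unsigned long i = 0; i < count; i++) {
            if (used + 1 >= out_size)
                return (-1);
            out[used++] = *opts;
        }
        opts++;
    }
    out[used] = '\0';
    return (0);
}
```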
| .Bl -tag -width indent |
| .It A |
| All warnings (except for the warning about unknown |
| flags being set) become fatal. |
| The process will call |
| .Xr abort 3 |
| in these cases. |
| @roff_balance@@roff_tls@.It B |
| @roff_balance@@roff_tls@Double/halve the per-arena lock contention threshold at |
| @roff_balance@@roff_tls@which a thread is randomly re-assigned to an arena. |
| @roff_balance@@roff_tls@This dynamic load balancing tends to push threads away |
| @roff_balance@@roff_tls@from highly contended arenas, which avoids worst case |
| @roff_balance@@roff_tls@contention scenarios in which threads disproportionately |
| @roff_balance@@roff_tls@utilize arenas. |
| @roff_balance@@roff_tls@However, due to the highly dynamic load that |
| @roff_balance@@roff_tls@applications may place on the allocator, it is |
| @roff_balance@@roff_tls@impossible for the allocator to know in advance how |
| @roff_balance@@roff_tls@sensitive it should be to contention over arenas. |
| @roff_balance@@roff_tls@Therefore, some applications may benefit from increasing |
| @roff_balance@@roff_tls@or decreasing this threshold parameter. |
| .It C |
| Double/halve the size of the maximum size class that is a multiple of the |
| cacheline size (64). |
| Above this size, subpage spacing (256 bytes) is used for size classes. |
| The default value is 512 bytes. |
| @roff_dss@.It D |
| @roff_dss@Use |
| @roff_dss@.Xr sbrk 2 |
| @roff_dss@to acquire memory in the data storage segment (DSS). |
| @roff_dss@This option is enabled by default. |
| @roff_dss@See the |
| @roff_dss@.Dq M |
| @roff_dss@option for related information and interactions. |
| .It F |
| Double/halve the per-arena maximum number of dirty unused pages that are |
| allowed to accumulate before informing the kernel about at least half of those |
| pages via |
| .Xr madvise 2 . |
| This provides the kernel with sufficient information to recycle dirty pages if |
| physical memory becomes scarce and the pages remain unused. |
| The default is 512 pages per arena; |
| .Ev JEMALLOC_OPTIONS=10f |
| will prevent any dirty unused pages from accumulating. |
| @roff_mag@@roff_tls@.It G |
| @roff_mag@@roff_tls@When there are multiple threads, use thread-specific caching |
| @roff_mag@@roff_tls@for objects that are smaller than one page. |
| @roff_mag@@roff_tls@This option is enabled by default. |
| @roff_mag@@roff_tls@Thread-specific caching allows many allocations to be |
| @roff_mag@@roff_tls@satisfied without performing any thread synchronization, at |
| @roff_mag@@roff_tls@the cost of increased memory use. |
| @roff_mag@@roff_tls@See the |
| @roff_mag@@roff_tls@.Dq R |
| @roff_mag@@roff_tls@option for related tuning information. |
| @roff_fill@.It J |
| @roff_fill@Each byte of new memory allocated by |
| @roff_fill@.Fn malloc |
| @roff_fill@or |
| @roff_fill@.Fn realloc |
| @roff_fill@will be initialized to 0xa5. |
| @roff_fill@All memory returned by |
| @roff_fill@.Fn free |
| @roff_fill@or |
| @roff_fill@.Fn realloc |
| @roff_fill@will be initialized to 0x5a. |
| @roff_fill@This is intended for debugging and will impact performance |
| @roff_fill@negatively. |
| .It K |
| Double/halve the virtual memory chunk size. |
| The default chunk size is 1 MB. |
| @roff_dss@.It M |
| @roff_dss@Use |
| @roff_dss@.Xr mmap 2 |
| @roff_dss@to acquire anonymously mapped memory. |
| @roff_dss@This option is enabled by default. |
| @roff_dss@If both the |
| @roff_dss@.Dq D |
| @roff_dss@and |
| @roff_dss@.Dq M |
| @roff_dss@options are enabled, the allocator prefers the DSS over anonymous |
| @roff_dss@mappings, but allocation only fails if memory cannot be acquired via |
| @roff_dss@either method. |
| @roff_dss@If neither option is enabled, then the |
| @roff_dss@.Dq M |
| @roff_dss@option is implicitly enabled in order to assure that there is a method |
| @roff_dss@for acquiring memory. |
| .It N |
| Double/halve the number of arenas. |
| The default number of arenas is two times the number of CPUs, or one if there |
| is a single CPU. |
| .It P |
| Various statistics are printed at program exit via an |
| .Xr atexit 3 |
| function. |
| This has the potential to cause deadlock for a multi-threaded process that exits |
| while one or more threads are executing in the memory allocation functions. |
| Therefore, this option should only be used with care; it is primarily intended |
| as a performance tuning aid during application development. |
| .It Q |
| Double/halve the size of the maximum size class that is a multiple of the |
| quantum (8 or 16 bytes, depending on architecture). |
| Above this size, cacheline spacing is used for size classes. |
| The default value is 128 bytes. |
| @roff_mag@@roff_tls@.It R |
| @roff_mag@@roff_tls@Double/halve magazine size, which approximately |
| @roff_mag@@roff_tls@doubles/halves the number of rounds in each magazine. |
| @roff_mag@@roff_tls@Magazines are used by the thread-specific caching machinery |
| @roff_mag@@roff_tls@to acquire and release objects in bulk. |
| @roff_mag@@roff_tls@Increasing the magazine size decreases locking overhead, at |
| @roff_mag@@roff_tls@the expense of increased memory usage. |
| @roff_stats@.It U |
| @roff_stats@Generate a verbose trace log via |
| @roff_stats@.Fn jemalloc_message |
| @roff_stats@for all allocation operations. |
| @roff_sysv@.It V |
| @roff_sysv@Attempting to allocate zero bytes will return a |
| @roff_sysv@.Dv NULL |
| @roff_sysv@pointer instead of a valid pointer. |
| @roff_sysv@(The default behavior is to make a minimal allocation and return a |
| @roff_sysv@pointer to it.) |
| @roff_sysv@This option is provided for System V compatibility. |
| @roff_sysv@@roff_xmalloc@This option is incompatible with the |
| @roff_sysv@@roff_xmalloc@.Dq X |
| @roff_sysv@@roff_xmalloc@option. |
| @roff_xmalloc@.It X |
| @roff_xmalloc@Rather than return failure for any allocation function, display a |
| @roff_xmalloc@diagnostic message on |
| @roff_xmalloc@.Dv stderr |
| @roff_xmalloc@and cause the program to drop core (using |
| @roff_xmalloc@.Xr abort 3 ) . |
| @roff_xmalloc@This option should be set at compile time by including the |
| @roff_xmalloc@following in the source code: |
| @roff_xmalloc@.Bd -literal -offset indent |
| @roff_xmalloc@jemalloc_options = "X"; |
| @roff_xmalloc@.Ed |
| @roff_fill@.It Z |
| @roff_fill@Each byte of new memory allocated by |
| @roff_fill@.Fn malloc |
| @roff_fill@or |
| @roff_fill@.Fn realloc |
| @roff_fill@will be initialized to 0. |
| @roff_fill@Note that this initialization only happens once for each byte, so |
| @roff_fill@.Fn realloc |
| @roff_fill@calls do not zero memory that was previously allocated. |
| @roff_fill@This is intended for debugging and will impact performance |
| @roff_fill@negatively. |
| .El |
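| .Pp |
| The arithmetic behind |
| .Ev JEMALLOC_OPTIONS=10f |
| can be sketched as follows. |
| The function |
| .Fn dirty_max_after |
| is hypothetical; it assumes the documented default of 512 pages and assumes |
| that halving is plain integer division, so ten halvings reach zero. |

```c
#include <stddef.h>

/* Default from the F entry above: 512 dirty pages per arena. */
#define DIRTY_MAX_DEFAULT 512

/*
 * Hypothetical illustration: apply an already expanded flag string to the
 * dirty page threshold. Each 'F' doubles it, each 'f' halves it, and all
 * other letters are ignored.
 * Assumption: halving is integer division, so the value bottoms out at 0.
 */
size_t
dirty_max_after(const char *flags)
{
    size_t dirty_max = DIRTY_MAX_DEFAULT;

    for (; *flags != '\0'; flags++) {
        if (*flags == 'F')
            dirty_max *= 2;
        else if (*flags == 'f')
            dirty_max /= 2;
    }
    return (dirty_max);
}
```

| Since 512 is 2 to the 9th power, nine halvings leave a threshold of one page |
| and the tenth leaves none. |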
| .Pp |
| @roff_fill@The |
| @roff_fill@.Dq J |
| @roff_fill@and |
| @roff_fill@.Dq Z |
| @roff_fill@options are intended for testing and debugging. |
| @roff_fill@An application which changes its behavior when these options are used |
| @roff_fill@is flawed. |
| .Sh IMPLEMENTATION NOTES |
| @roff_dss@Traditionally, allocators have used |
| @roff_dss@.Xr sbrk 2 |
| @roff_dss@to obtain memory, which is suboptimal for several reasons, including |
| @roff_dss@race conditions, increased fragmentation, and artificial limitations |
| @roff_dss@on maximum usable memory. |
| @roff_dss@This allocator uses both |
| @roff_dss@.Xr sbrk 2 |
| @roff_dss@and |
| @roff_dss@.Xr mmap 2 |
| @roff_dss@by default, but it can be configured at run time to use only one or |
| @roff_dss@the other. |
| .Pp |
| This allocator uses multiple arenas in order to reduce lock contention for |
| threaded programs on multi-processor systems. |
| This works well with regard to threading scalability, but incurs some costs. |
| There is a small fixed per-arena overhead, and additionally, arenas manage |
| memory completely independently of each other, which means a small fixed |
| increase in overall memory fragmentation. |
| These overheads are not generally an issue, given the number of arenas normally |
| used. |
| Note that using substantially more arenas than the default is not likely to |
| improve performance, mainly due to reduced cache performance. |
| However, it may make sense to reduce the number of arenas if an application |
| does not make much use of the allocation functions. |
| .Pp |
| @roff_mag@In addition to multiple arenas, this allocator supports |
| @roff_mag@thread-specific caching for small objects (smaller than one page), in |
| @roff_mag@order to make it possible to completely avoid synchronization for most |
| @roff_mag@small allocation requests. |
| @roff_mag@Such caching allows very fast allocation in the common case, but it |
| @roff_mag@increases memory usage and fragmentation, since a bounded number of |
| @roff_mag@objects can remain allocated in each thread cache. |
| @roff_mag@.Pp |
| Memory is conceptually broken into equal-sized chunks, where the chunk size is |
| a power of two that is greater than the page size. |
| Chunks are always aligned to multiples of the chunk size. |
| This alignment makes it possible to find metadata for user objects very |
| quickly. |
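| .Pp |
| A minimal sketch of why chunk alignment makes lookup cheap: given any |
| address inside a chunk, masking off the low bits yields the chunk's base |
| address. |
| The function name |
| .Fn chunk_base |
| is an assumption made for this illustration. |

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical illustration: return the base address of the chunk that
 * contains `addr`, given a power-of-two chunk size.
 */
uintptr_t
chunk_base(uintptr_t addr, size_t chunk_size)
{
    /* The mask trick only works for a power-of-two chunk size. */
    assert(chunk_size != 0 && (chunk_size & (chunk_size - 1)) == 0);
    return (addr & ~((uintptr_t)chunk_size - 1));
}
```

| With the default 1 MB chunk size, the address 0x12345678 maps to the chunk |
| base 0x12300000. |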
| .Pp |
| User objects are broken into three categories according to size: small, large, |
| and huge. |
| Small objects are smaller than one page. |
| Large objects are smaller than the chunk size. |
| Huge objects are a multiple of the chunk size. |
| Small and large objects are managed by arenas; huge objects are managed |
| separately in a single data structure that is shared by all threads. |
| Huge objects are used by applications infrequently enough that this single |
| data structure is not a scalability issue. |
| .Pp |
| Each chunk that is managed by an arena tracks its contents as runs of |
| contiguous pages (unused, backing a set of small objects, or backing one large |
| object). |
| The combination of chunk alignment and chunk page maps makes it possible to |
| determine all metadata regarding small and large allocations in constant time. |
| .Pp |
| Small objects are managed in groups by page runs. |
| Each run maintains a bitmap that tracks which regions are in use. |
| @roff_tiny@Allocation requests that are no more than half the quantum (8 or 16, |
| @roff_tiny@depending on architecture) are rounded up to the nearest power of |
| @roff_tiny@two. |
| Allocation requests that are |
| @roff_tiny@more than half the quantum, but |
| no more than the minimum cacheline-multiple size class (see the |
| .Dq Q |
| option) are rounded up to the nearest multiple of the |
| @roff_tiny@quantum. |
| @roff_no_tiny@quantum (8 or 16, depending on architecture). |
| Allocation requests that are more than the minimum cacheline-multiple size |
| class, but no more than the minimum subpage-multiple size class (see the |
| .Dq C |
| option) are rounded up to the nearest multiple of the cacheline size (64). |
| Allocation requests that are more than the minimum subpage-multiple size class |
| are rounded up to the nearest multiple of the subpage size (256). |
| Allocation requests that are more than one page, but small enough to fit in |
| an arena-managed chunk (see the |
| .Dq K |
| option), are rounded up to the nearest run size. |
| Allocation requests that are too large to fit in an arena-managed chunk are |
| rounded up to the nearest multiple of the chunk size. |
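| .Pp |
| The rounding rules for small requests can be sketched as below. |
| This is an illustration, not the allocator's code: it assumes a 16 byte |
| quantum, the default |
| .Dq Q |
| and |
| .Dq C |
| values of 128 and 512 bytes as the range boundaries, and it ignores the |
| sub-quantum power-of-two classes as well as requests of a page or more. |

```c
#include <assert.h>
#include <stddef.h>

/* Assumed values, taken from the defaults documented in this page. */
#define QUANTUM     16   /* 8 on some architectures */
#define QSPACE_MAX  128  /* default for the Q option */
#define CACHELINE   64
#define CSPACE_MAX  512  /* default for the C option */
#define SUBPAGE     256

/* Round n up to the nearest multiple of m. */
static size_t
round_up(size_t n, size_t m)
{
    return ((n + m - 1) / m * m);
}

/*
 * Hypothetical illustration: the size class a small request is rounded
 * up to under the assumptions listed above.
 */
size_t
small_size_class(size_t size)
{
    assert(size > 0);
    if (size <= QSPACE_MAX)
        return (round_up(size, QUANTUM));
    if (size <= CSPACE_MAX)
        return (round_up(size, CACHELINE));
    return (round_up(size, SUBPAGE));
}
```

| Under these assumptions a 100 byte request occupies 112 bytes, a 129 byte |
| request occupies 192 bytes, and a 513 byte request occupies 768 bytes. |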
| .Pp |
| Allocations are packed tightly together, which can be an issue for |
| multi-threaded applications. |
| If you need to ensure that allocations do not suffer from cacheline sharing, |
| round your allocation requests up to the nearest multiple of the cacheline |
| size. |
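| .Pp |
| One way to apply this advice is sketched below. |
| The names |
| .Fn cacheline_ceiling |
| and |
| .Fn malloc_padded |
| are hypothetical, and the 64 byte cacheline size is the value this page |
| assumes elsewhere. |

```c
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

#define CACHELINE 64  /* assumed cacheline size, as used in this page */

/* Round size up to the nearest multiple of the cacheline size. */
size_t
cacheline_ceiling(size_t size)
{
    return ((size + CACHELINE - 1) & ~((size_t)CACHELINE - 1));
}

/*
 * Hypothetical wrapper: allocate `size` bytes, padded so that the request
 * is a whole number of cachelines. Returns NULL on failure or overflow.
 */
void *
malloc_padded(size_t size)
{
    if (size > SIZE_MAX - (CACHELINE - 1))
        return (NULL);
    return (malloc(cacheline_ceiling(size)));
}
```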
| .Sh DEBUGGING MALLOC PROBLEMS |
| The first thing to do is to set the |
| .Dq A |
| option. |
| This option forces a coredump (if possible) at the first sign of trouble, |
| rather than the normal policy of trying to continue if at all possible. |
| .Pp |
| It is probably also a good idea to recompile the program with suitable |
| options and symbols for debugger support. |
| .Pp |
| @roff_fill@If the program starts to give unusual results, coredump or generally |
| @roff_fill@behave differently without emitting any of the messages mentioned in |
| @roff_fill@the next section, it is likely because it depends on the storage |
| @roff_fill@being filled with zero bytes. |
| @roff_fill@Try running it with the |
| @roff_fill@.Dq Z |
| @roff_fill@option set; |
| @roff_fill@if that improves the situation, this diagnosis has been confirmed. |
| @roff_fill@If the program still misbehaves, |
| @roff_fill@the likely problem is accessing memory outside the allocated area. |
| @roff_fill@.Pp |
| @roff_fill@Alternatively, if the symptoms are not easy to reproduce, setting the |
| @roff_fill@.Dq J |
| @roff_fill@option may help provoke the problem. |
| @roff_fill@.Pp |
| @roff_stats@In truly difficult cases, the |
| @roff_stats@.Dq U |
| @roff_stats@option can provide a detailed trace of all calls made to these |
| @roff_stats@functions. |
| @roff_stats@.Pp |
| Unfortunately this implementation does not provide much detail about |
| the problems it detects; the performance impact for storing such information |
| would be prohibitive. |
| There are a number of allocator implementations available on the Internet |
| which focus on detecting and pinpointing problems by trading performance for |
| extra sanity checks and detailed diagnostics. |
| .Sh DIAGNOSTIC MESSAGES |
| If any of the memory allocation/deallocation functions detect an error or |
| warning condition, a message will be printed to file descriptor |
| .Dv STDERR_FILENO . |
| Errors will result in the process dumping core. |
| If the |
| .Dq A |
| option is set, all warnings are treated as errors. |
| .Pp |
| The |
| .Va jemalloc_message |
| variable allows the programmer to override the function which emits |
| the text strings forming the errors and warnings if for some reason |
| the |
| .Dv STDERR_FILENO |
| file descriptor is not suitable for this. |
| Please note that doing anything which tries to allocate memory in |
| this function is likely to result in a crash or deadlock. |
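| .Pp |
| A replacement message function might look like the sketch below. |
| The names |
| .Fn log_message |
| and |
| .Va log_fd |
| are hypothetical; only the four-string signature comes from the SYNOPSIS. |
| The function uses |
| .Xr write 2 |
| directly so that it never allocates memory. |

```c
#include <string.h>
#include <unistd.h>

/* Hypothetical destination descriptor; defaults to standard error. */
static int log_fd = STDERR_FILENO;

/* Write a whole string with write(2); no buffering, no allocation. */
static void
write_str(int fd, const char *s)
{
    size_t len;

    if (s == NULL)
        return;
    len = strlen(s);
    while (len > 0) {
        ssize_t n = write(fd, s, len);

        if (n <= 0)
            return;
        s += n;
        len -= (size_t)n;
    }
}

/*
 * Hypothetical handler matching the jemalloc_message signature.
 * When linking against the library, it would be installed with:
 *     jemalloc_message = log_message;
 */
void
log_message(const char *p1, const char *p2, const char *p3, const char *p4)
{
    write_str(log_fd, p1);
    write_str(log_fd, p2);
    write_str(log_fd, p3);
    write_str(log_fd, p4);
}
```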
| .Pp |
| All messages are prefixed by |
| .Dq <jemalloc>: . |
| .Sh RETURN VALUES |
| The |
| .Fn malloc |
| and |
| .Fn calloc |
| functions return a pointer to the allocated memory if successful; otherwise |
| a |
| .Dv NULL |
| pointer is returned and |
| .Va errno |
| is set to |
| .Er ENOMEM . |
| .Pp |
| The |
| .Fn posix_memalign |
| function returns the value 0 if successful; otherwise it returns an error value. |
| The |
| .Fn posix_memalign |
| function will fail if: |
| .Bl -tag -width Er |
| .It Bq Er EINVAL |
| The |
| .Fa alignment |
| parameter is not a power of 2 at least as large as |
| .Fn sizeof "void *" . |
| .It Bq Er ENOMEM |
| Memory allocation error. |
| .El |
| .Pp |
| The |
| .Fn realloc |
| function returns a pointer, possibly identical to |
| .Fa ptr , |
| to the allocated memory |
| if successful; otherwise a |
| .Dv NULL |
| pointer is returned, and |
| .Va errno |
| is set to |
| .Er ENOMEM |
| if the error was the result of an allocation failure. |
| The |
| .Fn realloc |
| function always leaves the original buffer intact |
| when an error occurs. |
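| .Pp |
| Because the original buffer survives a failed call, the result of |
| .Fn realloc |
| should be stored in a temporary before the old pointer is overwritten. |
| The helper |
| .Fn grow_buffer |
| below is a hypothetical sketch of that pattern. |

```c
#include <stddef.h>
#include <stdlib.h>

/*
 * Hypothetical helper: resize *buf to new_size bytes.
 * Returns 0 on success. On failure returns -1 and leaves *buf pointing
 * at the original, still valid, allocation.
 */
int
grow_buffer(char **buf, size_t new_size)
{
    char *tmp = realloc(*buf, new_size);

    if (tmp == NULL)
        return (-1);
    *buf = tmp;
    return (0);
}
```

| Assigning the result of |
| .Fn realloc |
| straight back to the only copy of the pointer would leak the original |
| allocation on failure. |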
| .Pp |
| The |
| .Fn free |
| function returns no value. |
| .Pp |
| The |
| .Fn malloc_usable_size |
| function returns the usable size of the allocation pointed to by |
| .Fa ptr . |
| .Sh ENVIRONMENT |
| The following environment variables affect the execution of the allocation |
| functions: |
| .Bl -tag -width ".Ev JEMALLOC_OPTIONS" |
| .It Ev JEMALLOC_OPTIONS |
| If the environment variable |
| .Ev JEMALLOC_OPTIONS |
| is set, the characters it contains will be interpreted as flags to the |
| allocation functions. |
| .El |
| .Sh EXAMPLES |
| To dump core whenever a problem occurs: |
| .Pp |
| .Bd -literal -offset indent |
| ln -s 'A' /etc/jemalloc.conf |
| .Ed |
| .Pp |
| To specify in the source that a program does no return value checking |
| on calls to these functions: |
| .Bd -literal -offset indent |
| jemalloc_options = "X"; |
| .Ed |
| .Sh SEE ALSO |
| .Xr madvise 2 , |
| .Xr mmap 2 , |
| .Xr sbrk 2 , |
| .Xr alloca 3 , |
| .Xr atexit 3 , |
| .Xr getpagesize 3 |
| .Sh STANDARDS |
| The |
| .Fn malloc , |
| .Fn calloc , |
| .Fn realloc |
| and |
| .Fn free |
| functions conform to |
| .St -isoC . |
| .Pp |
| The |
| .Fn posix_memalign |
| function conforms to |
| .St -p1003.1-2001 . |