Jonathan Corbet | 6c19efb | 2009-09-08 17:49:37 -0600 | [diff] [blame] | 1 | Using flexible arrays in the kernel |
Jonathan Corbet | 1243ba9 | 2009-10-14 12:43:22 -0600 | [diff] [blame] | 2 | Last updated for 2.6.32 |
Jonathan Corbet | 6c19efb | 2009-09-08 17:49:37 -0600 | [diff] [blame] | 3 | Jonathan Corbet <corbet@lwn.net> |
| 4 | |
| 5 | Large contiguous memory allocations can be unreliable in the Linux kernel. |
| 6 | Kernel programmers will sometimes respond to this problem by allocating |
| 7 | pages with vmalloc(). This solution not ideal, though. On 32-bit systems, |
| 8 | memory from vmalloc() must be mapped into a relatively small address space; |
| 9 | it's easy to run out. On SMP systems, the page table changes required by |
| 10 | vmalloc() allocations can require expensive cross-processor interrupts on |
| 11 | all CPUs. And, on all systems, use of space in the vmalloc() range |
| 12 | increases pressure on the translation lookaside buffer (TLB), reducing the |
| 13 | performance of the system. |
| 14 | |
| 15 | In many cases, the need for memory from vmalloc() can be eliminated by |
| 16 | piecing together an array from smaller parts; the flexible array library |
| 17 | exists to make this task easier. |
| 18 | |
| 19 | A flexible array holds an arbitrary (within limits) number of fixed-sized |
| 20 | objects, accessed via an integer index. Sparse arrays are handled |
| 21 | reasonably well. Only single-page allocations are made, so memory |
| 22 | allocation failures should be relatively rare. The down sides are that the |
| 23 | arrays cannot be indexed directly, individual object size cannot exceed the |
| 24 | system page size, and putting data into a flexible array requires a copy |
| 25 | operation. It's also worth noting that flexible arrays do no internal |
| 26 | locking at all; if concurrent access to an array is possible, then the |
| 27 | caller must arrange for appropriate mutual exclusion. |
| 28 | |
| 29 | The creation of a flexible array is done with: |
| 30 | |
| 31 | #include <linux/flex_array.h> |
| 32 | |
| 33 | struct flex_array *flex_array_alloc(int element_size, |
| 34 | unsigned int total, |
| 35 | gfp_t flags); |
| 36 | |
| 37 | The individual object size is provided by element_size, while total is the |
| 38 | maximum number of objects which can be stored in the array. The flags |
| 39 | argument is passed directly to the internal memory allocation calls. With |
| 40 | the current code, using flags to ask for high memory is likely to lead to |
| 41 | notably unpleasant side effects. |
| 42 | |
Jonathan Corbet | 1243ba9 | 2009-10-14 12:43:22 -0600 | [diff] [blame] | 43 | It is also possible to define flexible arrays at compile time with: |
| 44 | |
| 45 | DEFINE_FLEX_ARRAY(name, element_size, total); |
| 46 | |
| 47 | This macro will result in a definition of an array with the given name; the |
| 48 | element size and total will be checked for validity at compile time. |
| 49 | |
Jonathan Corbet | 6c19efb | 2009-09-08 17:49:37 -0600 | [diff] [blame] | 50 | Storing data into a flexible array is accomplished with a call to: |
| 51 | |
| 52 | int flex_array_put(struct flex_array *array, unsigned int element_nr, |
| 53 | void *src, gfp_t flags); |
| 54 | |
| 55 | This call will copy the data from src into the array, in the position |
| 56 | indicated by element_nr (which must be less than the maximum specified when |
| 57 | the array was created). If any memory allocations must be performed, flags |
| 58 | will be used. The return value is zero on success, a negative error code |
| 59 | otherwise. |
| 60 | |
| 61 | There might possibly be a need to store data into a flexible array while |
| 62 | running in some sort of atomic context; in this situation, sleeping in the |
| 63 | memory allocator would be a bad thing. That can be avoided by using |
| 64 | GFP_ATOMIC for the flags value, but, often, there is a better way. The |
| 65 | trick is to ensure that any needed memory allocations are done before |
| 66 | entering atomic context, using: |
| 67 | |
| 68 | int flex_array_prealloc(struct flex_array *array, unsigned int start, |
Eric Paris | 5d30b10 | 2011-04-28 15:55:52 -0400 | [diff] [blame] | 69 | unsigned int nr_elements, gfp_t flags); |
Jonathan Corbet | 6c19efb | 2009-09-08 17:49:37 -0600 | [diff] [blame] | 70 | |
| 71 | This function will ensure that memory for the elements indexed in the range |
Eric Paris | 5d30b10 | 2011-04-28 15:55:52 -0400 | [diff] [blame] | 72 | defined by start and nr_elements has been allocated. Thereafter, a |
Jonathan Corbet | 6c19efb | 2009-09-08 17:49:37 -0600 | [diff] [blame] | 73 | flex_array_put() call on an element in that range is guaranteed not to |
| 74 | block. |
| 75 | |
| 76 | Getting data back out of the array is done with: |
| 77 | |
| 78 | void *flex_array_get(struct flex_array *fa, unsigned int element_nr); |
| 79 | |
| 80 | The return value is a pointer to the data element, or NULL if that |
| 81 | particular element has never been allocated. |
| 82 | |
| 83 | Note that it is possible to get back a valid pointer for an element which |
| 84 | has never been stored in the array. Memory for array elements is allocated |
| 85 | one page at a time; a single allocation could provide memory for several |
Jonathan Corbet | 1243ba9 | 2009-10-14 12:43:22 -0600 | [diff] [blame] | 86 | adjacent elements. Flexible array elements are normally initialized to the |
| 87 | value FLEX_ARRAY_FREE (defined as 0x6c in <linux/poison.h>), so errors |
| 88 | involving that number probably result from use of unstored array entries. |
| 89 | Note that, if array elements are allocated with __GFP_ZERO, they will be |
| 90 | initialized to zero and this poisoning will not happen. |
Jonathan Corbet | 6c19efb | 2009-09-08 17:49:37 -0600 | [diff] [blame] | 91 | |
Jonathan Corbet | 1243ba9 | 2009-10-14 12:43:22 -0600 | [diff] [blame] | 92 | Individual elements in the array can be cleared with: |
| 93 | |
| 94 | int flex_array_clear(struct flex_array *array, unsigned int element_nr); |
| 95 | |
| 96 | This function will set the given element to FLEX_ARRAY_FREE and return |
| 97 | zero. If storage for the indicated element is not allocated for the array, |
| 98 | flex_array_clear() will return -EINVAL instead. Note that clearing an |
| 99 | element does not release the storage associated with it; to reduce the |
| 100 | allocated size of an array, call: |
| 101 | |
| 102 | int flex_array_shrink(struct flex_array *array); |
| 103 | |
| 104 | The return value will be the number of pages of memory actually freed. |
| 105 | This function works by scanning the array for pages containing nothing but |
| 106 | FLEX_ARRAY_FREE bytes, so (1) it can be expensive, and (2) it will not work |
| 107 | if the array's pages are allocated with __GFP_ZERO. |
| 108 | |
| 109 | It is possible to remove all elements of an array with a call to: |
Jonathan Corbet | 6c19efb | 2009-09-08 17:49:37 -0600 | [diff] [blame] | 110 | |
| 111 | void flex_array_free_parts(struct flex_array *array); |
| 112 | |
| 113 | This call frees all elements, but leaves the array itself in place. |
| 114 | Freeing the entire array is done with: |
| 115 | |
| 116 | void flex_array_free(struct flex_array *array); |
| 117 | |
| 118 | As of this writing, there are no users of flexible arrays in the mainline |
| 119 | kernel. The functions described here are also not exported to modules; |
| 120 | that will probably be fixed when somebody comes up with a need for it. |