Kirill A. Shutemov | 33041a0 | 2014-06-06 14:38:23 -0700

The remap_file_pages() system call is used to create a nonlinear mapping,
that is, a mapping in which the pages of the file are mapped into a
nonsequential order in memory. The advantage of using remap_file_pages()
over using repeated calls to mmap(2) is that the former approach does not
require the kernel to create additional VMA (Virtual Memory Area) data
structures.

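As an illustration of what a nonlinear mapping looks like from userspace,
here is a minimal sketch that maps a four-page file and then rearranges the
windows so that the pages appear in reverse order. The helper names
make_tagged_file() and map_reversed() are made up for this example, and the
temporary file under /tmp is an assumption. Only mmap(2) and
remap_file_pages(2) are real interfaces. The call may fail on systems that
do not provide the syscall, in which case map_reversed() returns NULL.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <stdlib.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: create an unlinked temp file of npages pages,
 * where the first byte of page i is 'A' + i. Returns an fd or -1. */
static int make_tagged_file(size_t npages, size_t page)
{
    char path[] = "/tmp/rfp-demo-XXXXXX";
    int fd = mkstemp(path);
    if (fd < 0)
        return -1;
    unlink(path);
    if (ftruncate(fd, (off_t)(npages * page)) != 0) {
        close(fd);
        return -1;
    }
    for (size_t i = 0; i < npages; i++) {
        char tag = (char)('A' + i);
        if (pwrite(fd, &tag, 1, (off_t)(i * page)) != 1) {
            close(fd);
            return -1;
        }
    }
    return fd;
}

/* Hypothetical helper: map the file linearly, then make window i show
 * file page (npages - 1 - i). Returns the mapping or NULL on failure. */
static char *map_reversed(int fd, size_t npages, size_t page)
{
    assert(npages > 0);
    /* remap_file_pages() only works on shared mappings. */
    char *base = mmap(NULL, npages * page, PROT_READ, MAP_SHARED, fd, 0);
    if (base == MAP_FAILED)
        return NULL;
    for (size_t i = 0; i < npages; i++) {
        /* prot must be 0; pgoff is counted in pages, not bytes. */
        if (remap_file_pages(base + i * page, page, 0,
                             npages - 1 - i, 0) != 0) {
            munmap(base, npages * page);
            return NULL;
        }
    }
    return base;
}
```

With a four-page file, reading the first byte of each window of the
returned mapping should yield the tags of file pages 3, 2, 1 and 0, in
that order.
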
Supporting nonlinear mappings requires a significant amount of non-trivial
code in the kernel virtual memory subsystem, including in hot paths. Also,
for nonlinear mappings to work, the kernel needs a way to distinguish
normal page table entries from entries that hold a file offset (pte_file).
The kernel reserves a flag in the PTE for this purpose. PTE flags are a
scarce resource, especially on some CPU architectures. It would be nice to
free up the flag for other uses.

Fortunately, there are not many users of remap_file_pages() in the wild.
The only known user is one enterprise RDBMS implementation, which uses the
syscall on 32-bit systems to map files larger than can fit linearly into
the 32-bit virtual address space. This use case is no longer critical
since 64-bit systems are widely available.

The plan is to deprecate the syscall and replace it with an emulation.
The emulation will create new VMAs instead of nonlinear mappings. It will
be slower for the rare users of remap_file_pages(), but the ABI is
preserved.

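The idea of trading nonlinear mappings for extra VMAs can be sketched in
userspace with mmap(2) and MAP_FIXED. This is not the kernel's emulation
code, only an analogy under stated assumptions: emulated_remap(),
map_reversed_emulated() and make_tagged_file() are hypothetical names, and
the mapping is read-only for simplicity. Each call replaces one window of
an existing mapping with a fresh mapping at a different file offset, which
typically costs one VMA per window.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <stdlib.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: create an unlinked temp file of npages pages,
 * where the first byte of page i is 'A' + i. Returns an fd or -1. */
static int make_tagged_file(size_t npages, size_t page)
{
    char path[] = "/tmp/rfp-emu-XXXXXX";
    int fd = mkstemp(path);
    if (fd < 0)
        return -1;
    unlink(path);
    if (ftruncate(fd, (off_t)(npages * page)) != 0) {
        close(fd);
        return -1;
    }
    for (size_t i = 0; i < npages; i++) {
        char tag = (char)('A' + i);
        if (pwrite(fd, &tag, 1, (off_t)(i * page)) != 1) {
            close(fd);
            return -1;
        }
    }
    return fd;
}

/* Hypothetical stand-in for remap_file_pages(): overlay the window at
 * addr with a new shared mapping that starts at file page pgoff. */
static int emulated_remap(void *addr, size_t size, int fd,
                          size_t pgoff, size_t page)
{
    void *p = mmap(addr, size, PROT_READ, MAP_SHARED | MAP_FIXED,
                   fd, (off_t)(pgoff * page));
    return p == MAP_FAILED ? -1 : 0;
}

/* Map the file linearly, then reverse the page order window by window. */
static char *map_reversed_emulated(int fd, size_t npages, size_t page)
{
    assert(npages > 0);
    char *base = mmap(NULL, npages * page, PROT_READ, MAP_SHARED, fd, 0);
    if (base == MAP_FAILED)
        return NULL;
    for (size_t i = 0; i < npages; i++) {
        if (emulated_remap(base + i * page, page, fd,
                           npages - 1 - i, page) != 0) {
            munmap(base, npages * page);
            return NULL;
        }
    }
    return base;
}
```

The result is the same view of the file as with a nonlinear mapping, but
the address range is now backed by several separate mappings rather than
one.
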
One side effect of the emulation (apart from performance) is that a user
can hit the vm.max_map_count limit more easily due to the additional VMAs.
See the comment for DEFAULT_MAX_MAP_COUNT for more details on the limit.
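
To judge how close a process is to the limit, one rough approach is to
compare the number of lines in /proc/self/maps with the value of the
vm.max_map_count sysctl. The sketch below assumes procfs is mounted at
/proc and that each line of the maps file corresponds to roughly one VMA.
The function names count_vmas(), read_max_map_count() and vma_headroom()
are hypothetical.

```c
#include <assert.h>
#include <stdio.h>

/* Count the lines of /proc/self/maps (roughly one line per VMA).
 * Returns -1 if the file cannot be opened. */
static long count_vmas(void)
{
    FILE *f = fopen("/proc/self/maps", "r");
    if (!f)
        return -1;
    long n = 0;
    int c;
    while ((c = fgetc(f)) != EOF)
        if (c == '\n')
            n++;
    fclose(f);
    return n;
}

/* Read the vm.max_map_count sysctl through procfs.
 * Returns -1 if it is unavailable or cannot be parsed. */
static long read_max_map_count(void)
{
    FILE *f = fopen("/proc/sys/vm/max_map_count", "r");
    if (!f)
        return -1;
    long v = -1;
    if (fscanf(f, "%ld", &v) != 1)
        v = -1;
    fclose(f);
    return v;
}

/* Approximate number of mappings left before the limit is reached,
 * or -1 if either value is unavailable. */
static long vma_headroom(void)
{
    long used = count_vmas();
    long limit = read_max_map_count();
    if (used < 0 || limit < 0)
        return -1;
    return limit - used;
}
```

A program that relies on remap_file_pages() for many small windows could
check vma_headroom() before remapping and fall back or report an error
when the headroom is too small.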