.. _soft_dirty:

===============
Soft-Dirty PTEs
===============

The soft-dirty bit is a bit in a PTE that helps track which pages a task
writes to. To do this tracking, one should:

  1. Clear the soft-dirty bits from the task's PTEs.

     This is done by writing "4" into the ``/proc/PID/clear_refs`` file of the
     task in question.

  2. Wait some time.

  3. Read the soft-dirty bits from the PTEs.

     This is done by reading from ``/proc/PID/pagemap``. Bit 55 of each
     64-bit entry is the soft-dirty bit. If it is set, the respective PTE was
     written to since step 1.
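
The three steps above can be sketched in Python roughly as follows. This is a
minimal illustration, not a reference implementation. The function names are
hypothetical. It also assumes the usual pagemap layout of one native-endian
64-bit entry per virtual page, indexed by virtual page number, which the text
above does not spell out, and a kernel built with soft-dirty support.

```python
import os
import struct

SOFT_DIRTY_BIT = 55      # bit 55 of each pagemap entry is the soft-dirty bit
PAGEMAP_ENTRY_SIZE = 8   # assumption: one 64-bit entry per virtual page


def is_soft_dirty(entry: int) -> bool:
    """Return True if bit 55 of a 64-bit pagemap entry is set."""
    return bool((entry >> SOFT_DIRTY_BIT) & 1)


def pagemap_offset(vaddr: int, page_size: int) -> int:
    """Byte offset in the pagemap file of the entry that covers vaddr."""
    return (vaddr // page_size) * PAGEMAP_ENTRY_SIZE


def clear_soft_dirty(pid: int) -> None:
    """Step 1: write "4" into /proc/PID/clear_refs."""
    with open(f"/proc/{pid}/clear_refs", "w") as f:
        f.write("4")


def read_soft_dirty(pid: int, vaddr: int) -> bool:
    """Step 3: read the pagemap entry for vaddr and test bit 55."""
    page_size = os.sysconf("SC_PAGE_SIZE")
    with open(f"/proc/{pid}/pagemap", "rb") as f:
        f.seek(pagemap_offset(vaddr, page_size))
        data = f.read(PAGEMAP_ENTRY_SIZE)
    if len(data) != PAGEMAP_ENTRY_SIZE:
        raise OSError("short read from pagemap")
    (entry,) = struct.unpack("=Q", data)  # native byte order, 8 bytes
    return is_soft_dirty(entry)
```

A tracker would call ``clear_soft_dirty(pid)``, wait, and then call
``read_soft_dirty(pid, vaddr)`` for each page of interest. Access to another
task's ``/proc/PID`` files is subject to the usual permission checks.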

Internally, to do this tracking, the writable bit is cleared from PTEs
when the soft-dirty bit is cleared. After this, when the task tries to
modify a page at some virtual address, a page fault (#PF) occurs and the
kernel sets the soft-dirty bit on the respective PTE.

Note that although the task's entire address space is marked read-only after
the soft-dirty bits are cleared, the page faults that occur after that are
processed quickly. This is because the pages are still mapped to physical
memory, so all the kernel has to do is detect this and set both the writable
and soft-dirty bits on the PTE.

While in most cases tracking memory changes by page faults is more than
enough, there is still a scenario in which soft-dirty bits can be lost: a task
unmaps a previously mapped memory region and then maps a new one at exactly
the same place. When unmap is called, the kernel internally clears PTE values,
including soft-dirty bits. To notify user-space applications about such
memory region renewal, the kernel always marks new memory regions (and
expanded regions) as soft-dirty.
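
The behaviour described above can be illustrated with a small toy model. This
is only a sketch of the bookkeeping, not kernel code. ``ToyPTE``, ``ToyTask``
and ``demo`` are hypothetical names, and the model assumes that every mapping
is writable by the task.

```python
from dataclasses import dataclass, field


@dataclass
class ToyPTE:
    """Toy stand-in for a page table entry; not a kernel structure."""
    writable: bool = True
    soft_dirty: bool = True  # new regions are always marked soft-dirty


@dataclass
class ToyTask:
    ptes: dict = field(default_factory=dict)  # page number -> ToyPTE
    faults: int = 0                           # write-protect faults taken

    def mmap(self, page: int) -> None:
        # A new mapping gets a fresh PTE, which starts out soft-dirty.
        self.ptes[page] = ToyPTE()

    def munmap(self, page: int) -> None:
        # Unmapping discards the PTE, including its soft-dirty bit.
        del self.ptes[page]

    def clear_refs(self) -> None:
        # Clearing soft-dirty also clears the writable bit.
        for pte in self.ptes.values():
            pte.soft_dirty = False
            pte.writable = False

    def write(self, page: int) -> None:
        pte = self.ptes[page]
        if not pte.writable:
            # The page is still mapped, so the fault handler only has to
            # set the writable and soft-dirty bits again.
            self.faults += 1
            pte.writable = True
            pte.soft_dirty = True

    def soft_dirty_pages(self) -> list:
        return sorted(p for p, pte in self.ptes.items() if pte.soft_dirty)


def demo() -> tuple:
    task = ToyTask()
    for page in (0, 1, 2):
        task.mmap(page)
    task.clear_refs()
    task.write(1)    # first write after clearing: takes a fault
    task.write(1)    # PTE is writable again: no fault
    task.munmap(2)
    task.mmap(2)     # same place, new region: reported as soft-dirty
    return task.soft_dirty_pages(), task.faults
```

In this model ``demo()`` returns ``([1, 2], 1)``: page 1 is soft-dirty because
it was written to, page 2 because it was remapped, and only the first write to
page 1 took a fault.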

This feature is actively used by the checkpoint-restore project. You
can find more details about it at http://criu.org


-- Pavel Emelyanov, Apr 9, 2013