Per-task statistics interface
-----------------------------


Taskstats is a netlink-based interface for sending per-task and
per-process statistics from the kernel to userspace.

Taskstats was designed for the following benefits:

- efficiently provide statistics during the lifetime of a task and on its exit
- unified interface for multiple accounting subsystems
- extensibility for use by future accounting patches

Terminology
-----------

"pid", "tid" and "task" are used interchangeably and refer to the standard
Linux task defined by struct task_struct. Per-pid stats are the same as
per-task stats.

"tgid", "process" and "thread group" are used interchangeably and refer to the
tasks that share an mm_struct, i.e. the traditional Unix process. Despite the
use of tgid, there is no special treatment for the task that is the thread
group leader - a process is deemed alive as long as it has any task belonging
to it.

Usage
-----

To get statistics during a task's lifetime, userspace opens a unicast netlink
socket (NETLINK_GENERIC family) and sends commands specifying a pid or a tgid.
The response contains statistics for a task (if pid is specified) or the sum of
statistics for all tasks of the process (if tgid is specified).

To obtain statistics for tasks which are exiting, userspace opens a multicast
netlink socket. Each time a task exits, its per-pid statistics are always sent
by the kernel to each listener on the multicast socket. In addition, if it is
the last thread exiting its thread group, an additional record containing the
per-tgid stats is also sent. The latter contains the sum of per-pid stats for
all threads in the thread group, both past and present.

getdelays.c is a simple utility demonstrating usage of the taskstats interface
for reporting delay accounting statistics.

Interface
---------

The user-kernel interface is encapsulated in include/linux/taskstats.h.

To avoid this documentation becoming obsolete as the interface evolves, only
an outline of the current version is given. taskstats.h always overrides the
description here.

struct taskstats is the common accounting structure for both per-pid and
per-tgid data. It is versioned and can be extended by each accounting subsystem
that is added to the kernel. The fields and their semantics are defined in the
taskstats.h file.

The data exchanged between user and kernel space is a netlink message belonging
to the NETLINK_GENERIC family and using the netlink attributes interface.
The messages are in the format:

    +----------+- - -+-------------+-------------------+
    | nlmsghdr | Pad | genlmsghdr  | taskstats payload |
    +----------+- - -+-------------+-------------------+
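
As an illustration of the layout above, the following is a minimal sketch of
filling a buffer with a request for one pid. It is not taken from getdelays.c.
The helper name build_pid_query is hypothetical, and the sketch assumes the
caller has already looked up the generic netlink family id of "TASKSTATS"
(it is assigned at runtime) and supplies a 4-byte aligned buffer. The header
macros and the TASKSTATS_* constants come from the kernel's exported
netlink.h, genetlink.h and taskstats.h headers.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>
#include <linux/netlink.h>
#include <linux/genetlink.h>
#include <linux/taskstats.h>

/*
 * Hypothetical helper: fill buf with a TASKSTATS_CMD_GET request for one pid.
 * family_id is assumed to have been resolved beforehand (not shown here).
 * buf must be 4-byte aligned. Returns the message length, or 0 if buf is
 * too small.
 */
size_t build_pid_query(void *buf, size_t buflen, uint16_t family_id,
                       uint32_t seq, uint32_t pid)
{
    size_t attr_len = NLA_HDRLEN + sizeof(pid);
    size_t total = NLMSG_LENGTH(GENL_HDRLEN) + NLA_ALIGN(attr_len);
    struct nlmsghdr *nlh;
    struct genlmsghdr *genl;
    struct nlattr *nla;

    assert(buf != NULL);
    if (buflen < total)
        return 0;
    memset(buf, 0, total);

    /* nlmsghdr: total length, family id as message type, request flag */
    nlh = buf;
    nlh->nlmsg_len = (uint32_t)total;
    nlh->nlmsg_type = family_id;
    nlh->nlmsg_flags = NLM_F_REQUEST;
    nlh->nlmsg_seq = seq;

    /* genlmsghdr follows the (padded) nlmsghdr */
    genl = (struct genlmsghdr *)NLMSG_DATA(nlh);
    genl->cmd = TASKSTATS_CMD_GET;
    genl->version = TASKSTATS_GENL_VERSION;

    /* taskstats payload: a single attribute carrying the u32 pid */
    nla = (struct nlattr *)((char *)genl + GENL_HDRLEN);
    nla->nla_type = TASKSTATS_CMD_ATTR_PID;
    nla->nla_len = (uint16_t)attr_len;
    memcpy((char *)nla + NLA_HDRLEN, &pid, sizeof(pid));

    return total;
}
```

The returned length is what a caller would pass to send() or sendto() on the
netlink socket. Querying a tgid would differ only in the attribute type
(TASKSTATS_CMD_ATTR_TGID).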


The taskstats payload is one of the following three kinds:

1. Commands: sent from user to kernel. The payload is one attribute, of type
TASKSTATS_CMD_ATTR_PID/TGID, containing a u32 pid or tgid in the attribute
payload. The pid/tgid denotes the task/process for which userspace wants
statistics.

2. Response for a command: sent from the kernel in response to a userspace
command. The payload is a series of three attributes of type:

a) TASKSTATS_TYPE_AGGR_PID/TGID: attribute containing no payload; it indicates
that a pid/tgid will be followed by some stats.

b) TASKSTATS_TYPE_PID/TGID: attribute whose payload is the pid/tgid whose stats
are being returned.

c) TASKSTATS_TYPE_STATS: attribute with a struct taskstats as payload. The
same structure is used for both per-pid and per-tgid stats.

3. New message: sent by the kernel whenever a task exits. The payload consists
   of a series of attributes of the following type:

a) TASKSTATS_TYPE_AGGR_PID: indicates next two attributes will be pid+stats
b) TASKSTATS_TYPE_PID: contains exiting task's pid
c) TASKSTATS_TYPE_STATS: contains the exiting task's per-pid stats
d) TASKSTATS_TYPE_AGGR_TGID: indicates next two attributes will be tgid+stats
e) TASKSTATS_TYPE_TGID: contains tgid of process to which task belongs
f) TASKSTATS_TYPE_STATS: contains the per-tgid stats for exiting task's process
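
The attribute sequences above can be walked with a small loop. The following
is a sketch only: struct ts_result and parse_taskstats_attrs are hypothetical
names, not part of taskstats.h. As an assumption, an AGGR attribute that does
carry a payload is treated as a container and its contents are walked as well,
so the loop copes with either a flat or a nested layout. Stats are copied with
memcpy and truncated to the reader's struct size, since the sender may use a
different version of struct taskstats.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>
#include <linux/netlink.h>
#include <linux/taskstats.h>

/* Hypothetical result holder for this sketch; not part of taskstats.h. */
struct ts_result {
    int have_id;             /* nonzero once a pid or tgid was seen */
    int is_tgid;             /* nonzero if id is a tgid */
    uint32_t id;
    int have_stats;          /* nonzero once a stats attribute was seen */
    struct taskstats stats;
};

/*
 * Walk the attributes in [data, data + len).
 * Returns 0 on success, -1 on a malformed attribute.
 */
int parse_taskstats_attrs(const void *data, size_t len, struct ts_result *out)
{
    const char *p = data;

    assert(out != NULL);
    while (len >= (size_t)NLA_HDRLEN) {
        struct nlattr nla;
        size_t payload, step;

        memcpy(&nla, p, sizeof(nla));
        if (nla.nla_len < NLA_HDRLEN || nla.nla_len > len)
            return -1;
        payload = nla.nla_len - NLA_HDRLEN;

        switch (nla.nla_type) {
        case TASKSTATS_TYPE_AGGR_PID:
        case TASKSTATS_TYPE_AGGR_TGID:
            /* Assumption: a non-empty AGGR attribute nests what follows. */
            if (payload &&
                parse_taskstats_attrs(p + NLA_HDRLEN, payload, out) != 0)
                return -1;
            break;
        case TASKSTATS_TYPE_PID:
        case TASKSTATS_TYPE_TGID:
            if (payload < sizeof(out->id))
                return -1;
            memcpy(&out->id, p + NLA_HDRLEN, sizeof(out->id));
            out->is_tgid = (nla.nla_type == TASKSTATS_TYPE_TGID);
            out->have_id = 1;
            break;
        case TASKSTATS_TYPE_STATS:
            memset(&out->stats, 0, sizeof(out->stats));
            memcpy(&out->stats, p + NLA_HDRLEN,
                   payload < sizeof(out->stats) ? payload : sizeof(out->stats));
            out->have_stats = 1;
            break;
        default:
            break;           /* unknown attribute types are skipped */
        }

        /* Advance to the next attribute, honouring 4-byte alignment. */
        step = NLA_ALIGN(nla.nla_len);
        if (step > len)
            step = len;
        p += step;
        len -= step;
    }
    return 0;
}
```

A caller would pass the bytes that follow the genlmsghdr. Because the holder
keeps only the last id and stats seen, an exit message carrying both a per-pid
and a per-tgid record would need two holders or a callback per record.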


per-tgid stats
--------------

Taskstats provides per-process stats, in addition to per-task stats, since
resource management is often done at a process granularity and aggregating task
stats in userspace alone is inefficient and potentially inaccurate (due to lack
of atomicity).

However, maintaining per-process, in addition to per-task stats, within the
kernel has space and time overheads. To address this, the taskstats code
accumulates each exiting task's statistics into a process-wide data structure.
When the last task of a process exits, the accumulated process-level data also
gets sent to userspace (along with the per-task data).

When a user queries to get per-tgid data, the stats of all live threads in
the group are summed and added to the accumulated total for previously exited
threads of the same thread group.
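
The accumulation scheme described above can be modelled in a few lines. This
is a userspace sketch, not the kernel's code: struct mini_stats and the
function names are hypothetical, and only two counters (named after fields of
struct taskstats) stand in for the full structure.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, reduced stand-in for the real statistics. */
struct mini_stats {
    uint64_t cpu_delay_total;
    uint64_t blkio_delay_total;
};

static void mini_stats_add(struct mini_stats *dst, const struct mini_stats *src)
{
    dst->cpu_delay_total += src->cpu_delay_total;
    dst->blkio_delay_total += src->blkio_delay_total;
}

/* On thread exit: fold the thread's numbers into the process-wide total. */
void on_thread_exit(struct mini_stats *exited_total,
                    const struct mini_stats *thread)
{
    assert(exited_total != NULL && thread != NULL);
    mini_stats_add(exited_total, thread);
}

/* Per-tgid query: total of exited threads plus every thread still alive. */
struct mini_stats query_tgid(const struct mini_stats *exited_total,
                             const struct mini_stats *live, size_t nlive)
{
    struct mini_stats sum;
    size_t i;

    assert(exited_total != NULL);
    sum = *exited_total;
    for (i = 0; i < nlive; i++)
        mini_stats_add(&sum, &live[i]);
    return sum;
}
```

Only the exit path writes to the process-wide total, so the per-process cost
during a task's lifetime is nil. The summation over live threads is paid only
when a per-tgid query actually arrives.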

Extending taskstats
-------------------

There are two ways to extend the taskstats interface to export more
per-task/process stats as patches to collect them get added to the kernel
in future:

1. Adding more fields to the end of the existing struct taskstats. Backward
   compatibility is ensured by the version number within the
   structure. Userspace will use only the fields of the struct that correspond
   to the version it is using.

2. Defining separate statistic structs and using the netlink attributes
   interface to return them. Since userspace processes each netlink attribute
   independently, it can always ignore attributes whose type it does not
   understand (because it is using an older version of the interface).


Choosing between 1. and 2. is a matter of trading off flexibility and
overhead. If only a few fields need to be added, then 1. is the preferable
path since the kernel and userspace don't need to incur the overhead of
processing new netlink attributes. But if the new fields expand the existing
struct too much, requiring disparate userspace accounting utilities to
unnecessarily receive large structures whose fields are of no interest, then
extending the attributes structure would be worthwhile.
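
To make option 1 concrete, here is a sketch of a reader that tolerates both
older and newer senders. Everything in it is hypothetical: struct demo_stats
is a made-up two-version structure, not struct taskstats, and the version
number attached to new_counter is invented for the example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical version in which new_counter was appended. */
#define DEMO_VERSION_WITH_NEW_COUNTER 2

/* Hypothetical versioned struct; fields are only ever appended at the end. */
struct demo_stats {
    uint16_t version;
    uint64_t cpu_delay_total;   /* present since version 1 */
    uint64_t new_counter;       /* appended in version 2 */
};

/*
 * Copy a payload of any version into the reader's struct. A shorter (older)
 * payload leaves the missing tail fields zeroed; a longer (newer) payload is
 * truncated to the fields the reader knows about.
 */
void demo_stats_load(struct demo_stats *dst, const void *payload, size_t len)
{
    assert(dst != NULL && payload != NULL);
    memset(dst, 0, sizeof(*dst));
    memcpy(dst, payload, len < sizeof(*dst) ? len : sizeof(*dst));
}

/*
 * Read the appended field only if the sender's version covers it.
 * Returns 1 and stores the value on success, 0 otherwise.
 */
int demo_stats_new_counter(const struct demo_stats *s, uint64_t *value)
{
    assert(s != NULL && value != NULL);
    if (s->version < DEMO_VERSION_WITH_NEW_COUNTER)
        return 0;
    *value = s->new_counter;
    return 1;
}
```

Option 2 needs no such check on the struct itself: a reader that walks
attributes one at a time can skip any attribute type it does not recognise.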

----