========
CPU load
========

Linux exports various bits of information via ``/proc/stat`` and
``/proc/uptime`` that userland tools, such as top(1), use to calculate
the average time the system spent in a particular state, for example::

  $ iostat
  Linux 2.6.18.3-exp (linmac)     02/20/2007

  avg-cpu:  %user   %nice %system %iowait  %steal   %idle
            10.01    0.00    2.92    5.44    0.00   81.63

  ...

Here the system thinks that over the default sampling period it spent
10.01% of the time doing work in user space, 2.92% in the kernel, and
was idle 81.63% of the time overall.

In most cases the ``/proc/stat`` information reflects reality quite
closely. However, due to the nature of how and when the kernel collects
this data, sometimes it cannot be trusted at all.

So how is this information collected? Whenever a timer interrupt is
signalled, the kernel looks at what kind of task was running at that
moment and increments the counter that corresponds to this task's
kind/state. The problem with this is that the system could have
switched between various states multiple times between two timer
interrupts, yet the counter is incremented only for the last state.


Example
-------

Imagine a system with one task that periodically burns cycles in the
following manner::

     time line between two timer interrupts
    |--------------------------------------|
     ^                                    ^
     |_ something begins working          |
                                          |_ something goes to sleep
                                         (only to be awakened quite soon)

In the above situation the system will be 0% loaded according to
``/proc/stat`` (since the timer interrupt will always happen when the
system is executing the idle handler), but in reality the load is
closer to 99%.

One can imagine many more situations where this behavior of the kernel
will lead to quite erratic information inside ``/proc/stat``::


  /* gcc -o hog smallhog.c */
  #include <time.h>
  #include <limits.h>
  #include <signal.h>
  #include <sys/time.h>
  #define HIST 10

  /* Set by the SIGALRM handler to end the current busy loop. */
  static volatile sig_atomic_t stop;

  static void sighandler (int signr)
  {
      (void) signr;
      stop = 1;
  }

  /* Spin for at most niters iterations; return how many were left. */
  static unsigned long hog (unsigned long niters)
  {
      stop = 0;
      while (!stop && --niters);
      return niters;
  }

  int main (void)
  {
      int i;
      /* Request a periodic SIGALRM with a 1 usec interval. */
      struct itimerval it = { .it_interval = { .tv_sec = 0, .tv_usec = 1 },
                              .it_value    = { .tv_sec = 0, .tv_usec = 1 } };
      sigset_t set;
      unsigned long v[HIST];
      double tmp = 0.0;
      unsigned long n;

      signal (SIGALRM, &sighandler);
      setitimer (ITIMER_REAL, &it, NULL);

      /* Calibrate: iterations completed between two SIGALRMs,
         averaged over HIST runs. */
      hog (ULONG_MAX);
      for (i = 0; i < HIST; ++i) v[i] = ULONG_MAX - hog (ULONG_MAX);
      for (i = 0; i < HIST; ++i) tmp += v[i];
      tmp /= HIST;
      /* Burn about two thirds of a timer period per cycle. */
      n = tmp - (tmp / 3.0);

      sigemptyset (&set);
      sigaddset (&set, SIGALRM);

      /* Spin, then sleep until the next SIGALRM, forever. */
      for (;;) {
          hog (n);
          sigwait (&set, &i);
      }
      return 0;
  }


References
----------

- http://lkml.org/lkml/2007/2/12/6
- Documentation/filesystems/proc.txt (1.8)


Thanks
------

Con Kolivas, Pavel Machek