| Demonstrations of cpuunclaimed, the Linux eBPF/bcc version. |
| |
| |
| This tool samples the length of the CPU run queues and determine when there are |
| idle CPUs, yet queued threads waiting their turn. It reports the amount of idle |
| (yet unclaimed by waiting threads) CPU as a system-wide percentage. For |
| example: |
| |
| # ./cpuunclaimed.py |
| Sampling run queues... Output every 1 seconds. Hit Ctrl-C to end. |
| %CPU 83.00%, unclaimed idle 0.12% |
| %CPU 87.25%, unclaimed idle 0.38% |
| %CPU 85.00%, unclaimed idle 0.25% |
| %CPU 85.00%, unclaimed idle 0.25% |
| %CPU 80.88%, unclaimed idle 0.00% |
| %CPU 82.25%, unclaimed idle 0.00% |
| %CPU 83.50%, unclaimed idle 0.12% |
| %CPU 81.50%, unclaimed idle 0.00% |
| %CPU 81.38%, unclaimed idle 0.00% |
| [...] |
| |
| This shows a system running at over 80% CPU utilization, and with less than |
| 0.5% unclaimed idle CPUs. |
| |
| Unclaimed idle CPUs can happen for a number of reasons: |
| |
| - An application has been bound to some, but not all, CPUs, and has runnable |
| threads that cannot migrate to other CPUs due to this configuration. |
| - CPU affinity: an optimization that leaves threads on CPUs where the CPU |
| caches are warm, even if this means short periods of waiting while other |
| CPUs are idle. The wait period is tunale (see sysctl, kernel.sched*). |
| - Scheduler bugs. |
| |
| An unclaimed idle of < 1% is likely to be CPU affinity, and not usually a |
| cause for concern. By leaving the CPU idle, overall throughput of the system |
| may be improved. This tool is best for identifying larger issues, > 2%, due |
| to the coarseness of its 99 Hertz samples. |
| |
| |
| This is an 8 CPU system, with an 8 CPU-bound threaded application running that |
| has been bound to one CPU (via taskset): |
| |
| # ./cpuunclaimed.py |
| Sampling run queues... Output every 1 seconds. Hit Ctrl-C to end. |
| %CPU 12.63%, unclaimed idle 86.36% |
| %CPU 12.50%, unclaimed idle 87.50% |
| %CPU 12.63%, unclaimed idle 87.37% |
| %CPU 12.75%, unclaimed idle 87.25% |
| %CPU 12.50%, unclaimed idle 87.50% |
| %CPU 12.63%, unclaimed idle 87.37% |
| %CPU 12.50%, unclaimed idle 87.50% |
| %CPU 12.50%, unclaimed idle 87.50% |
| [...] |
| |
| It shows that 7 of the 8 CPUs (87.5%) are idle at the same time there are |
| queued threads waiting to run on CPU. This is an artifical situation caused |
| by binding threads to the same CPU, to demonstrate how the tool works. |
| |
| |
| This is an 8 CPU system running a Linux kernel build with "make -j8", and -T |
| to print timestamps: |
| |
| # ./cpuunclaimed.py -T |
| Sampling run queues... Output every 1 seconds. Hit Ctrl-C to end. |
| 22:25:55 %CPU 98.88%, unclaimed idle 0.12% |
| 22:25:56 %CPU 99.75%, unclaimed idle 0.25% |
| 22:25:57 %CPU 99.50%, unclaimed idle 0.50% |
| 22:25:58 %CPU 99.25%, unclaimed idle 0.75% |
| 22:25:59 %CPU 99.75%, unclaimed idle 0.25% |
| 22:26:00 %CPU 99.50%, unclaimed idle 0.50% |
| 22:26:01 %CPU 99.25%, unclaimed idle 0.75% |
| 22:26:02 %CPU 99.25%, unclaimed idle 0.75% |
| 22:26:03 %CPU 99.01%, unclaimed idle 0.87% |
| 22:26:04 %CPU 99.88%, unclaimed idle 0.12% |
| 22:26:05 %CPU 99.38%, unclaimed idle 0.62% |
| |
| There's now a consistent, yet small, amount of unclaimed idle CPU. This is |
| expected to be deliberate: CPU affinity, as mentioned earlier. |
| |
| |
| The -j option will print raw samples: around one hundred lines of output |
| every second. For the same system with a Linux kernel build of "make -j8": |
| |
| # ./cpuunclaimed.py -j |
| TIMESTAMP_ns,CPU0,CPU1,CPU2,CPU3,CPU4,CPU5,CPU6,CPU7 |
| 514606928954752,1,1,1,1,1,1,1,1 |
| 514606939054312,1,1,1,1,1,1,1,2 |
| 514606949156518,1,1,1,1,1,1,1,1 |
| 514606959256596,2,2,1,1,1,1,1,1 |
| 514606969357989,1,1,1,1,1,2,1,1 |
| 514606979459700,1,2,1,1,1,2,1,1 |
| 514606989560481,1,1,1,1,1,1,1,1 |
| 514606999661396,1,1,1,1,1,1,2,1 |
| 514607009795601,1,1,1,1,1,1,1,2 |
| 514607019862711,1,1,1,1,1,1,1,1 |
| 514607029963734,1,1,1,1,1,1,1,1 |
| 514607040062372,1,1,1,1,1,1,1,1 |
| 514607050197735,1,1,1,2,1,1,1,1 |
| 514607060266464,1,1,1,1,1,1,1,2 |
| 514607070368025,1,1,1,1,1,2,1,1 |
| 514607080468375,1,1,1,1,1,1,1,2 |
| 514607090570292,3,2,1,1,1,1,1,1 |
| 514607100670725,1,1,1,1,1,2,1,1 |
| 514607110771946,1,2,1,1,1,1,1,1 |
| 514607120873489,1,1,1,1,2,1,2,1 |
| 514607130973857,2,1,1,1,3,1,1,1 |
| 514607141080056,0,1,1,1,1,2,1,3 |
| 514607151176312,1,1,1,2,1,1,1,1 |
| 514607161277753,1,1,1,1,1,1,2,1 |
| 514607171379095,1,1,1,1,1,1,1,1 |
| 514607181479262,1,1,1,1,1,1,1,1 |
| 514607191580794,3,1,1,1,1,1,1,1 |
| 514607201680952,1,1,1,1,1,1,2,1 |
| 514607211783683,1,1,1,1,1,1,1,1 |
| 514607221883274,1,1,1,1,1,1,0,1 |
| 514607231984244,1,1,1,1,1,1,1,1 |
| 514607242085698,1,1,1,1,1,1,1,1 |
| 514607252216898,1,2,1,1,1,1,1,1 |
| 514607262289420,1,1,1,1,1,2,1,1 |
| 514607272389922,1,1,1,1,1,1,1,1 |
| 514607282489413,1,1,1,1,1,1,1,1 |
| 514607292589950,1,3,1,1,1,1,1,1 |
| 514607302693367,1,1,1,1,2,1,1,1 |
| 514607312793792,1,1,1,1,1,1,1,1 |
| 514607322895249,1,1,1,3,1,1,3,1 |
| 514607332994278,1,0,1,1,1,2,1,2 |
| 514607343095836,1,1,1,1,1,2,1,1 |
| 514607353196533,1,1,1,1,2,1,1,1 |
| 514607363297749,1,1,1,1,1,1,1,2 |
| 514607373399011,1,1,1,1,1,1,1,2 |
| 514607383499730,1,1,1,1,1,1,1,2 |
| 514607393601510,1,1,1,1,1,1,1,2 |
| 514607403704117,2,1,1,1,1,1,1,2 |
| 514607413802700,1,1,1,1,2,1,0,1 |
| 514607423904559,1,1,1,1,1,1,1,1 |
| [...] |
| |
| The output is verbose: printing out a timestamp, and then the length of each |
| CPU's run queue. The second last line, of timestamp 514607413802700, is an |
| example of what this tool detects: CPU 4 has a run queue length of 4, which |
| means one thread running and one thread queued, while CPU 6 has a run queue |
| length of 0: idle. The very next sample shows all CPUs busy. |
| |
| |
| The -J option prints raw samples with time offsets showing when the samples |
| were collected on each CPU. It's mostly useful for debugging the tool itself. |
| For example, during a Linux kernel build: |
| |
| # ./cpuunclaimed.py -J |
| TIMESTAMP_ns,CPU0,CPU1,CPU2,CPU3,CPU4,CPU5,CPU6,CPU7,OFFSET_ns_CPU0,OFFSET_ns_CPU1,OFFSET_ns_CPU2,OFFSET_ns_CPU3,OFFSET_ns_CPU4,OFFSET_ns_CPU5,OFFSET_ns_CPU6,OFFSET_ns_CPU7 |
| 514722625198188,1,1,1,1,1,1,1,2,0,28321,51655,73396,89654,111172,132803,159792 |
| 514722635299034,1,1,1,1,1,2,1,1,0,28809,51999,74183,89552,110011,131995,153519 |
| 514722645400274,1,1,1,1,1,1,1,2,0,28024,51333,73652,88964,110075,131973,153568 |
| 514722655501816,1,2,1,1,1,1,1,1,0,28893,51671,75233,89496,109430,131945,153694 |
| 514722665602594,1,1,2,1,1,2,1,1,0,28623,50988,73866,89383,109186,131786,154555 |
| 514722675703498,1,1,1,1,1,1,1,1,0,27379,51031,73175,89625,110380,131482,104811 |
| 514722685804942,1,1,1,1,1,2,1,1,0,27213,50501,72183,88797,108780,130659,152153 |
| 514722695906294,1,1,1,1,1,1,1,1,0,27036,51182,73420,87861,109585,130364,155089 |
| 514722706005778,1,1,1,1,1,1,1,1,0,28492,51851,74138,89744,110208,132462,154060 |
| 514722716060705,1,1,1,1,1,1,1,1,0,154499,152528,155232,155046,154502,178746,200001 |
| 514722726209615,1,1,1,1,1,1,1,1,0,28170,49580,72605,87741,108144,130723,152138 |
| 514722736309475,1,2,1,1,1,1,1,1,0,27421,51386,73061,89358,109457,131273,153005 |
| 514722746410845,1,2,1,1,1,2,1,1,0,27788,50840,72720,88920,109111,131143,152979 |
| 514722756511363,1,1,1,1,1,1,2,1,0,28280,50977,73559,89848,109659,131579,152693 |
| 514722766613044,1,1,1,1,1,1,1,1,0,28046,50812,72754,89160,110108,130735,152948 |
| 514722776712932,1,1,1,2,1,1,1,1,0,28586,51177,73974,89588,109947,132376,154162 |
| 514722786815477,1,1,1,1,1,1,1,1,0,27973,71104,72539,88302,108896,130414,152236 |
| 514722796914955,1,1,1,1,1,1,1,1,0,29054,52354,74214,89592,110615,132586,153925 |
| 514722807044060,1,1,1,1,1,1,1,1,1587130,0,24079,46633,61787,82325,104706,125278 |
| 514722817117432,2,1,2,1,1,1,1,1,0,27628,51038,75138,89724,109340,132426,155348 |
| 514722827218254,1,1,1,1,1,1,2,1,0,29111,51868,74347,88904,109911,132764,153851 |
| 514722837340158,1,1,1,1,1,1,1,1,0,7366,30760,53528,68622,89317,111095,132319 |
| 514722847421305,1,1,1,1,1,1,1,1,0,28257,51105,73841,89037,110820,131605,153368 |
| 514722857521112,1,1,1,1,1,1,1,1,0,28544,51441,73857,89530,110497,131915,153513 |
| 514722867626129,0,2,1,1,1,1,1,1,0,24621,47917,70568,85391,106670,128081,150329 |
| 514722877727183,2,1,1,1,1,1,1,1,0,24869,47630,71547,84761,106048,128444,149285 |
| 514722887824589,1,1,1,1,1,1,2,1,0,28793,51212,73863,89584,109773,132348,153194 |
| 514722897925481,1,1,1,1,1,1,2,1,0,29278,51163,73961,89774,109592,132029,153715 |
| 514722908026097,1,1,1,1,1,1,1,1,0,30630,35595,36210,188001,190815,190072,190732 |
| 514722918127439,1,1,1,1,1,1,1,1,0,28544,51885,73948,89987,109763,132632,154083 |
| 514722928227399,1,1,1,1,1,1,1,1,0,31882,51574,74769,89939,110578,132951,154356 |
| 514722938329471,1,1,1,1,1,1,1,1,0,28498,51304,74101,89670,110278,132653,153176 |
| 514722948430589,1,1,1,1,1,1,1,1,0,27868,50925,73477,89676,109583,132360,153014 |
| 514722958531802,1,1,1,1,1,1,1,1,0,28505,50886,73729,89919,109618,131988,152896 |
| 514722968632181,1,1,1,1,1,1,1,1,0,28492,51749,73977,90334,109816,132897,152890 |
| 514722978733584,1,1,1,1,1,1,1,1,0,28847,50957,74121,90014,110019,132377,152978 |
| 514722988834321,1,1,1,1,1,1,1,1,0,28601,51437,74021,89968,110252,132233,153623 |
| 514722998937170,1,1,2,1,1,1,1,1,0,27007,50044,73259,87725,108663,132194,152459 |
| 514723009036821,1,2,1,2,1,1,1,1,0,28226,50937,73983,89110,110476,131740,153663 |
| 514723019137577,1,1,1,1,1,1,1,1,0,30261,52357,75657,87803,61823,131850,153585 |
| 514723029238745,1,1,1,1,1,1,1,1,0,28030,50752,74452,89240,110791,132187,153327 |
| 514723039339069,1,1,1,1,1,1,1,1,0,29791,52636,75996,90475,110414,132232,154714 |
| 514723049439822,1,1,1,1,2,1,1,1,0,29133,56662,74153,89520,110683,132740,154708 |
| 514723059541617,1,1,1,1,1,1,1,1,0,27932,51480,74644,89656,109176,131499,153732 |
| 514723069642500,1,1,2,1,1,1,2,1,0,27678,51509,73984,90136,110124,131554,153459 |
| 514723079743525,2,1,1,1,1,1,1,1,0,28029,51424,74394,90056,110087,132383,152963 |
| 514723089844091,2,1,1,2,1,1,1,1,0,28944,51692,74440,90339,110402,132722,154083 |
| 514723099945957,1,1,2,1,1,1,1,1,0,28425,51267,73164,89322,115048,114630,115187 |
| 514723110047020,1,1,2,0,1,1,1,2,0,28192,50811,76814,89835,109370,131265,153511 |
| 514723120216662,1,1,2,1,1,2,1,1,29,34,0,4514,19268,40293,62674,84009 |
| [...] |
| |
| This is a Xen guest system, and it shows that CPU 0 usually completes first (an |
| offset of 0), followed by CPU 1 around 28000 nanoseconds later, and so on. |
| The spread of offsets is triggered by the bcc Python library that initializes |
| the timers, which steps through the CPUs in sequence, with a small delay |
| between them merely from executing its own loop code. |
| |
| Here's more output during a Linux kernel build: |
| |
| # ./cpuunclaimed.py -J |
| TIMESTAMP_ns,CPU0,CPU1,CPU2,CPU3,CPU4,CPU5,CPU6,CPU7,OFFSET_ns_CPU0,OFFSET_ns_CPU1,OFFSET_ns_CPU2,OFFSET_ns_CPU3,OFFSET_ns_CPU4,OFFSET_ns_CPU5,OFFSET_ns_CPU6,OFFSET_ns_CPU7 |
| 514722625198188,1,1,1,1,1,1,1,2,0,28321,51655,73396,89654,111172,132803,159792 |
| 515700745758947,2,1,1,1,1,1,1,1,0,19835,34891,49397,59364,71988,87571,102769 |
| 515700755860451,2,1,1,1,1,1,1,2,0,19946,34323,49855,59844,72741,87925,102891 |
| 515700765960560,1,1,1,1,1,1,1,1,0,20805,35339,50436,59677,73557,88661,104796 |
| 515700776061744,1,1,1,1,1,1,1,1,1626,77,0,190,153452,154665,178218,154116 |
| 515700786162017,1,1,1,1,1,1,1,1,0,20497,35361,51552,59787,74451,147789,104545 |
| 515700796262811,1,1,1,1,1,1,1,2,0,20910,35657,50805,60175,73953,88492,103527 |
| 515700806364951,1,1,1,1,1,1,1,1,0,20140,35023,50074,59726,72757,88040,102421 |
| 515700816465253,1,1,1,1,1,1,2,1,0,20952,34899,50262,60048,72890,88067,103545 |
| 515700826566573,1,1,1,1,1,1,1,1,0,20898,35490,50609,59805,74060,88550,103354 |
| 515700836667480,1,1,1,1,1,1,2,1,0,20548,34760,50959,59490,73059,87820,103006 |
| 515700846768182,1,1,1,1,1,1,2,1,0,20571,35113,50777,59962,74139,88543,103192 |
| 515700856869468,1,1,2,1,1,2,2,1,0,20932,35382,50510,60106,73739,91818,103684 |
| 515700866971905,1,1,1,2,1,1,1,1,0,19780,33018,49075,58375,71949,86537,102136 |
| 515700877073459,2,1,1,1,1,1,1,1,0,20065,73966,48989,58832,71408,85714,101067 |
| 515700887172772,1,1,1,1,1,1,1,1,0,20909,34608,51493,59890,73564,88668,103454 |
| 515700897273292,1,2,1,1,1,1,1,1,0,20353,35292,50114,59773,73948,88615,103383 |
| 515700907374341,1,1,2,1,1,1,1,1,0,20816,35206,50915,60062,73878,88857,103794 |
| 515700917475331,1,1,6,1,1,2,1,1,0,20752,34931,50280,59764,73781,88329,103234 |
| 515700927576958,1,1,1,1,1,1,1,1,0,19929,34703,50181,59364,73004,88053,103127 |
| 515700937677298,1,1,2,2,1,1,1,1,0,21178,34724,50740,61193,73452,89030,103390 |
| 515700947778409,2,1,1,1,1,1,1,1,0,21059,35604,50853,60098,73919,88675,103506 |
| 515700957879196,2,1,1,1,1,1,1,1,0,21326,35939,51492,60083,74249,89474,103761 |
| [...] |
| |
| Notice the tighter range of offsets? I began executing cpuunclaimed when the |
| system was idle, and it initialized the CPU timers more quickly, and then I |
| began the Linux kernel build. |
| |
| Here's some different output, this time from a physical system with 4 CPUs, |
| also doing a kernel build, |
| |
| # ./cpuunclaimed.py -J |
| TIMESTAMP_ns,CPU0,CPU1,CPU2,CPU3,OFFSET_ns_CPU0,OFFSET_ns_CPU1,OFFSET_ns_CPU2,OFFSET_ns_CPU3 |
| 4429382557480,1,1,1,1,0,6011,10895,16018 |
| 4429392655042,2,1,1,1,0,8217,13661,19378 |
| 4429402757604,1,1,1,1,0,6879,12433,18000 |
| 4429412857809,1,1,1,1,0,8303,13190,18719 |
| 4429422960709,2,1,1,1,0,6095,11234,17079 |
| 4429433060391,1,1,1,2,0,6747,12480,18070 |
| 4429443161699,1,1,1,1,0,6560,12264,17945 |
| 4429453262002,1,2,1,1,0,6992,12644,18341 |
| 4429463363706,1,2,1,1,0,6211,12071,17853 |
| 4429473465571,1,1,1,1,0,5766,11495,17638 |
| 4429483566920,1,1,1,1,0,5223,11736,16358 |
| 4429493666279,1,1,1,1,0,6964,12653,18410 |
| 4429503769113,1,1,1,1,0,5161,11399,16612 |
| 4429513870744,1,1,1,1,0,5943,10583,15768 |
| 4429523969826,1,1,1,1,0,6533,12336,18189 |
| 4429534070311,1,1,1,1,0,6834,12816,18488 |
| 4429544170456,1,1,1,1,0,7284,13401,19129 |
| 4429554274467,1,2,1,1,0,5941,11160,16594 |
| 4429564372365,1,2,1,1,0,7514,13618,19190 |
| 4429574474406,1,2,1,1,0,6687,12650,18248 |
| 4429584574220,1,2,1,1,0,7912,13705,19136 |
| [...] |
| |
| If the offset range becomes too great, we can no longer conclude about when |
| some CPUs were idle and others had queued work. The tool will detect this, |
| and print an error message and exit. |
| |
| |
| Some systems can power down CPUs when idle, and when they wake up again the |
| timed samples may resume from different offsets. If this happens, this tool |
| can no longer draw conclusions about when some CPUs were idle and others |
| had queued work, so it prints an error, and exits. Eg: |
| |
| # ./cpuunclaimed.py 1 |
| Sampling run queues... Output every 1 seconds. Hit Ctrl-C to end. |
| %CPU 0.25%, unclaimed idle 0.00% |
| %CPU 0.75%, unclaimed idle 0.00% |
| %CPU 0.25%, unclaimed idle 0.00% |
| %CPU 0.00%, unclaimed idle 0.00% |
| %CPU 0.00%, unclaimed idle 0.00% |
| %CPU 0.12%, unclaimed idle 0.00% |
| %CPU 0.00%, unclaimed idle 0.00% |
| %CPU 0.25%, unclaimed idle 0.00% |
| %CPU 0.00%, unclaimed idle 0.00% |
| %CPU 0.12%, unclaimed idle 0.00% |
| %CPU 0.13%, unclaimed idle 0.00% |
| %CPU 0.12%, unclaimed idle 0.00% |
| %CPU 0.00%, unclaimed idle 0.00% |
| %CPU 0.00%, unclaimed idle 0.00% |
| %CPU 0.00%, unclaimed idle 0.00% |
| %CPU 0.00%, unclaimed idle 0.00% |
| ERROR: CPU samples arrived at skewed offsets (CPUs may have powered down when idle), spanning 4328176 ns (expected < 4040404 ns). Debug with -J, and see the man page. As output may begin to be unreliable, exiting. |
| |
| It's expected that this will only really occur on idle systems. |
| |
| USAGE: |
| |
| # ./cpuunclaimed.py -h |
| usage: cpuunclaimed.py [-h] [-j] [-J] [-T] [interval] [count] |
| |
| Sample CPU run queues and calculate unclaimed idle CPU |
| |
| positional arguments: |
| interval output interval, in seconds |
| count number of outputs |
| |
| optional arguments: |
| -h, --help show this help message and exit |
| -j, --csv print sample summaries (verbose) as comma-separated values |
| -J, --fullcsv print sample summaries with extra fields: CPU sample |
| offsets |
| -T, --timestamp include timestamp on output |
| |
| examples: |
| ./cpuunclaimed # sample and calculate unclaimed idle CPUs, |
| # output every 1 second (default) |
| ./cpuunclaimed 5 10 # print 5 second summaries, 10 times |
| ./cpuunclaimed -T 1 # 1s summaries and timestamps |
| ./cpuunclaimed -j # raw dump of all samples (verbose), CSV |