Demonstrations of syscount, the Linux/eBPF version.


syscount summarizes syscall counts across the system or a specific process,
with optional latency information. It is very useful for general workload
characterization, for example:

# syscount
Tracing syscalls, printing top 10... Ctrl+C to quit.
[09:39:04]
SYSCALL                   COUNT
write                     10739
read                      10584
wait4                      1460
nanosleep                  1457
select                      795
rt_sigprocmask              689
clock_gettime               653
rt_sigaction                128
futex                        86
ioctl                        83
^C

These are the top 10 entries; use the -T switch to print more. Here, the
output shows that the write and read syscalls were by far the most common,
followed by wait4, nanosleep, and so on. By default, syscount counts across
the entire system, but we can point it at a specific process of interest:

# syscount -p $(pidof dd)
Tracing syscalls, printing top 10... Ctrl+C to quit.
[09:40:21]
SYSCALL                   COUNT
read                    7878397
write                   7878397
^C

Indeed, dd's workload is a bit easier to characterize. Sometimes the syscall
count alone is not enough, and you also want the aggregate latency:

# syscount -L
Tracing syscalls, printing top 10... Ctrl+C to quit.
[09:41:32]
SYSCALL                   COUNT        TIME (us)
select                       16      3415860.022
nanosleep                   291        12038.707
ftruncate                     1          122.939
write                         4           63.389
stat                          1           23.431
fstat                         1            5.088
[unknown: 321]               32            4.965
timerfd_settime               1            4.830
ioctl                         3            4.802
kill                          1            4.342
^C

The select and nanosleep calls account for most of the time, but remember
that these are blocking calls, so most of that time is spent waiting. This
output was taken from a mostly idle system. Note the "unknown" entry --
syscall 321 is the bpf() syscall on x86_64, which is not in the table used by
this tool (borrowed from strace sources).
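
To put the aggregate TIME column in perspective, it can help to divide it by
COUNT. The following is a minimal sketch, not part of syscount itself: it
parses a few rows copied from the -L output above and computes a mean latency
per call. The names parse_latency_rows and mean_latency_us are hypothetical,
and splitting from the right (so that a name such as "[unknown: 321]" stays
intact) is an assumption about how one might parse this text.

```python
def parse_latency_rows(text):
    """Parse 'SYSCALL COUNT TIME' rows into (name, count, total_us) tuples."""
    rows = []
    for line in text.strip().splitlines():
        # Split from the right: the name itself may contain spaces,
        # as in "[unknown: 321]".
        name, count, total_us = line.rsplit(None, 2)
        rows.append((name.strip(), int(count), float(total_us)))
    return rows


def mean_latency_us(rows):
    """Return {name: total_us / count} for each parsed row."""
    return {name: total_us / count for name, count, total_us in rows}


# Rows copied from the -L output shown above.
SAMPLE = """
select                       16      3415860.022
nanosleep                   291        12038.707
[unknown: 321]               32            4.965
"""

if __name__ == "__main__":
    for name, mean in mean_latency_us(parse_latency_rows(SAMPLE)).items():
        print("%-22s %14.3f us/call" % (name, mean))
```

On these rows, select works out to roughly 213 ms per call while nanosleep
averages about 41 us, which is the kind of contrast the raw totals hide.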

Another direction is to find out which processes are making a lot of
syscalls and are thus responsible for a lot of activity. This is what the -P
switch does:

# syscount -P
Tracing syscalls, printing top 10... Ctrl+C to quit.
[09:58:13]
PID    COMM                COUNT
13820  vim                   548
30216  sshd                  149
29633  bash                   72
25188  screen                 70
25776  mysqld                 30
31285  python                 10
529    systemd-udevd           9
1      systemd                 8
494    systemd-journal         5
^C

This is again from a mostly idle system over an interval of a few seconds.
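
The difference between the default view and the -P view comes down to the
aggregation key. The sketch below is a user-space illustration only: it
assumes a hypothetical list of (pid, comm, syscall) events, and the names
count_events and by_process are invented here rather than taken from the
tool, which does its counting with eBPF.

```python
from collections import Counter


def count_events(events, by_process=False):
    """Count events keyed by syscall name, or by (pid, comm) if by_process."""
    counts = Counter()
    for pid, comm, syscall in events:
        key = (pid, comm) if by_process else syscall
        counts[key] += 1
    return counts


# Hypothetical events, invented for illustration.
EVENTS = [
    (13820, "vim", "read"),
    (13820, "vim", "write"),
    (13820, "vim", "read"),
    (30216, "sshd", "select"),
    (30216, "sshd", "read"),
]

if __name__ == "__main__":
    # Default view: one row per syscall.
    print(count_events(EVENTS).most_common(10))
    # -P style view: one row per process.
    print(count_events(EVENTS, by_process=True).most_common(10))
```

The same events produce both tables; only the key changes, which is why the
two views cannot be combined into a single run in this sketch.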

Sometimes you only care about failed syscalls -- these are the ones that
might be worth investigating with follow-up tools like opensnoop, execsnoop,
or trace. Use the -x switch for this; the following example also demonstrates
the -i switch, which prints a summary at a fixed interval (here, every 5
seconds):

# syscount -x -i 5
Tracing failed syscalls, printing top 10... Ctrl+C to quit.
[09:44:16]
SYSCALL                   COUNT
futex                        13
getxattr                     10
stat                          8
open                          6
wait4                         3
access                        2
[unknown: 321]                1

[09:44:21]
SYSCALL                   COUNT
futex                        12
getxattr                     10
[unknown: 321]                2
wait4                         1
access                        1
pause                         1
^C
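
The -x and -i switches combine a filter with time bucketing. As a rough
user-space sketch (the event tuples, the failed_counts_by_interval name, and
the bucketing by integer division are all assumptions for illustration, not
the tool's implementation), a failure is a call whose return value is
negative, and each failure is counted in the interval its timestamp falls
into.

```python
from collections import Counter, defaultdict


def failed_counts_by_interval(events, interval):
    """Group failed syscalls (ret < 0) into fixed-width time buckets.

    events: iterable of (timestamp_seconds, syscall_name, return_value).
    Returns {bucket_index: Counter({syscall_name: count})}.
    """
    buckets = defaultdict(Counter)
    for ts, name, ret in events:
        if ret < 0:  # keep only failures
            buckets[int(ts // interval)][name] += 1
    return dict(buckets)


# Hypothetical events; the negative values stand in for -errno results.
EVENTS = [
    (0.5, "futex", -11),
    (1.2, "open", -2),
    (2.0, "read", 128),   # succeeded, so it is ignored
    (4.9, "futex", -11),
    (6.1, "futex", -110),
    (7.3, "access", -2),
]

if __name__ == "__main__":
    for bucket, counts in sorted(failed_counts_by_interval(EVENTS, 5).items()):
        print("interval %d: %s" % (bucket, counts.most_common(10)))
```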

USAGE:
# syscount -h
usage: syscount.py [-h] [-p PID] [-i INTERVAL] [-T TOP] [-x] [-L] [-m] [-P]
                   [-l]

Summarize syscall counts and latencies.

optional arguments:
  -h, --help            show this help message and exit
  -p PID, --pid PID     trace only this pid
  -i INTERVAL, --interval INTERVAL
                        print summary at this interval (seconds)
  -T TOP, --top TOP     print only the top syscalls by count or latency
  -x, --failures        trace only failed syscalls (return < 0)
  -L, --latency         collect syscall latency
  -m, --milliseconds    display latency in milliseconds (default:
                        microseconds)
  -P, --process         count by process and not by syscall
  -l, --list            print list of recognized syscalls and exit
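
The "[unknown: 321]" rows seen earlier are what a lookup with a fallback
produces when a syscall number is missing from the name table. The sketch
below assumes a tiny, hand-written table of x86_64 syscall numbers (the real
tool uses a much larger table, which -l prints); SYSCALL_TABLE and
syscall_name are hypothetical names chosen for illustration.

```python
# A few x86_64 syscall numbers; deliberately incomplete, for illustration.
SYSCALL_TABLE = {
    0: "read",
    1: "write",
    23: "select",
    35: "nanosleep",
}


def syscall_name(nr, table=SYSCALL_TABLE):
    """Return the name for syscall number nr, or a placeholder if unknown."""
    return table.get(nr, "[unknown: %d]" % nr)


if __name__ == "__main__":
    print(syscall_name(1))
    print(syscall_name(321))

    # Adding the missing entry to a copy of the table resolves the name.
    extended = dict(SYSCALL_TABLE)
    extended[321] = "bpf"
    print(syscall_name(321, extended))
```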