<!doctype linuxdoc system>

<article>

<title>SS Utility: Quick Intro
<author>Alexey Kuznetsov, <tt/kuznet@ms2.inr.ac.ru/
<date>some_negative_number, 20 Sep 2001
<abstract>
<tt/ss/ is yet another utility to investigate sockets.
Functionally it is NOT better than <tt/netstat/ combined
with some perl/awk scripts, and though it is surely faster,
that is not enough to make it much better. :-)
So, stop reading this now and do not waste your time.
Well, certainly, it offers some functionality which current
netstat is still not able to provide, but surely will soon.
</abstract>

<sect>Why?

<p> The <tt>/proc</tt> interface is inadequate, unfortunately.
When the number of sockets is large enough, <tt/netstat/ or even
plain <tt>cat /proc/net/tcp</tt> causes nothing but pain and curses.
In linux-2.4 the disease became worse: even if the number
of sockets is small, reading <tt>/proc/net/tcp</tt> is rather slow.

This utility presents a new approach, which is supposed to scale
well. I am not going to describe technical details here and
will concentrate on describing the command.
The only important thing to say is that it is not a bad idea
to load the module <tt/tcp_diag/, which can be found in the directory
<tt/Modules/ of <tt/iproute2/. If you do not do this, <tt/ss/
will still work, but it falls back to <tt>/proc</tt> and becomes slow
like <tt/netstat/; well, still a bit faster (see section "Some numbers").

<sect>Old news

<p>
In its simplest form <tt/ss/ is equivalent to netstat
with some small deviations.

<itemize>
<item><tt/ss -t -a/ dumps all TCP sockets
<item><tt/ss -u -a/ dumps all UDP sockets
<item><tt/ss -w -a/ dumps all RAW sockets
<item><tt/ss -x -a/ dumps all UNIX sockets
</itemize>

<p>
Option <tt/-o/ shows TCP timer state.
Option <tt/-e/ shows some extended information.
Etc. etc. etc. It seems that all the socket-related options of netstat
are supported, though not AX.25 and other oddities. :-)
If someone wants to, they can add support for decnet and ipx.
Some rudimentary support for them is already present in iproute2 libutils,
and I will be glad to see these new members.

<p>
However, standard functionality is a bit different:

<p>
The first: without option <tt/-a/, sockets in states
<tt/TIME-WAIT/ and <tt/SYN-RECV/ are skipped too.
It is a more reasonable default, I think.

<p>
The second: the format of UNIX sockets is different. It coincides
with tcp/udp. Though the standard kernel still does not allow one to
see write/read queues and the peer address of connected UNIX sockets,
a patch doing this exists.

<p>
The third: the default is to dump only TCP sockets, rather than all of the types.

<p>
The next: by default it does not resolve numeric host addresses (like <tt/ip/)!
Resolving is enabled with option <tt/-r/. Service names, usually stored
in local files, are resolved by default. Also, if the service database
does not contain a reference to a port, <tt/ss/ queries the system
<tt/rpcbind/. RPC services are prefixed with <tt/rpc./
Resolution of services may be suppressed with option <tt/-n/.

<p>
It does not accept "long" options (I dislike them, sorry).
So, the address family is given with a family identifier following
option <tt/-f/, to be aligned with iproute2 conventions.
Mostly, this is to allow the option parser to parse
addresses correctly, but as a side effect it really limits dumping
to sockets supporting only the given family. Option <tt/-A/ followed
by a list of socket tables to dump is also supported.
Logically, the id of a socket table is different from an _address_ family, which is
another point of incompatibility. So, the id is one of
<tt/all/, <tt/tcp/, <tt/udp/,
<tt/raw/, <tt/inet/, <tt/unix/, <tt/packet/, <tt/netlink/. See?
Well, <tt/inet/ is just an abbreviation for <tt/tcp|udp|raw/
and it is not difficult to guess that <tt/packet/ allows one
to look at packet sockets. Actually, there are also some other abbreviations,
e.g. <tt/unix_dgram/ selects only datagram UNIX sockets.

<p>
The next: well, I still do not know. :-)


<sect>Time to talk about new functionality.

<p>It is built-in filtering of socket lists.

<sect1> Filtering by state.

<p>
<tt/ss/ allows filtering by socket state, using the keywords
<tt/state/ and <tt/exclude/, each followed by a state
identifier.

<p>
State identifiers are the standard TCP state names (not listed;
they are useless to you if you do not already know them)
or abbreviations:

<itemize>
<item><tt/all/ - for all the states
<item><tt/bucket/ - for TCP minisockets (<tt/TIME-WAIT|SYN-RECV/)
<item><tt/big/ - all except for minisockets
<item><tt/connected/ - not closed and not listening
<item><tt/synchronized/ - connected and not <tt/SYN-SENT/
</itemize>

<p>
   For example, to dump all tcp sockets except <tt/SYN-RECV/:

<tscreen><verb>
   ss exclude SYN-RECV
</verb></tscreen>

<p>
   If neither <tt/state/ nor <tt/exclude/ directives
   are present,
   the state filter defaults to <tt/all/ with option <tt/-a/,
   or otherwise to <tt/all/
   excluding listening, syn-recv, time-wait and closed sockets.

<sect1> Filtering by addresses and ports.

<p>
The option list may contain an address/port filter.
It is a boolean expression which consists of the boolean operations
<tt/or/, <tt/and/, <tt/not/ and predicates.
Actually, all the flavors of names for boolean operations are eaten:
<tt/&amp/, <tt/&amp&amp/, <tt/|/, <tt/||/, <tt/!/, but do not forget
about the special meaning given to these symbols by unix shells and escape
them correctly when they are used on the command line.

<p>
Predicates may be of the following kinds:

<itemize>
<item>A. Address/port match, where the address is checked against a mask
      and the port is either a wildcard or exact. It is one of:

<tscreen><verb>
        dst prefix:port
        src prefix:port
        src unix:STRING
        src link:protocol:ifindex
        src nl:channel:pid
</verb></tscreen>

      Both prefix and port may be absent or replaced with <tt/*/,
      which means wildcard. UNIX sockets use a more powerful scheme,
      matching socket names against shell wildcards. Also, the prefixes
      unix: and link: may be omitted if the address family is evident
      from context (with option <tt/-x/ or with <tt/-f unix/
      or with the <tt/unix/ keyword).

<p>
      For example,

<tscreen><verb>
        dst 10.0.0.1
        dst 10.0.0.1:
        dst 10.0.0.1/32:
        dst 10.0.0.1:*
</verb></tscreen>
      are equivalent and mean sockets connected to
      any port on host 10.0.0.1.

<tscreen><verb>
        dst 10.0.0.0/24:22
</verb></tscreen>
      means sockets connected to port 22 on network
      10.0.0.0...255.

<p>
      Note that the port is separated from the address by a colon, which creates
      trouble with IPv6 addresses. Generally, we interpret the last
      colon as splitting off the port. To allow IPv6 addresses to be given,
      a trick like the one used in IPv6 HTTP URLs may be used:

<tscreen><verb>
        dst [::1]
</verb></tscreen>
      are sockets connected to ::1 on any port.

<p>
      Another way is <tt>dst ::1/128</tt>. The / helps <tt/ss/ to understand that
      the colon is part of the IPv6 address.

<p>
      Now we can add another alias for <tt/dst 10.0.0.1/:
      <tt/dst [10.0.0.1]/. :-)

<p>   An address may be a DNS name. In this case all the addresses are looked
      up (in all the address families, if not limited by option <tt/-f/
      or a special address prefix <tt/inet:/, <tt/inet6:/) and the resulting
      expression is <tt/or/ over all of them.

<item> B. Port expressions:
<tscreen><verb>
        dport &gt= :1024
        dport != :22
        sport &lt :32000
</verb></tscreen>
      etc.

      All the relations: <tt/&lt/, <tt/&gt/, <tt/=/, <tt/&gt=/, <tt/&lt=/, <tt/==/,
      <tt/!=/, <tt/eq/, <tt/ge/, <tt/lt/, <tt/ne/...
      Use the variant you like more, but do not forget to escape special
      characters when typing them on the command line. :-)

      Note that the port number syntactically coincides with case A!
      You may even add an IP address, but it will not participate
      in the comparison, except for <tt/==/ and <tt/!=/, which are equivalent
      to the corresponding predicates of type A. For example,
<p>
<tt/dst 10.0.0.1:22/
    is equivalent to <tt/dport eq 10.0.0.1:22/
    and
    <tt/not dst 10.0.0.1:22/ is equivalent to
    <tt/dport neq 10.0.0.1:22/

<item>C. Keyword <tt/autobound/. It matches sockets bound automatically
      on the local system.

</itemize>
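
<p>
The last-colon rule for predicates of kind A can be modelled in a few
lines. What follows is only a sketch in Python, not the parser used by
<tt/ss/: the helper name <tt/split_pattern/ is hypothetical, DNS names are
not handled, and treating a slash after the last colon as "this colon
belongs to the address" is a guess based on the description above.

<tscreen><verb>
```python
import ipaddress


def split_pattern(pattern):
    """Split 'prefix:port' into (network, port); None stands for a wildcard.

    Illustrative model only: names are not resolved and the port is kept
    as a string, since it may be a service name such as 'http'.
    """
    if pattern.startswith("["):
        # Bracketed form, e.g. [::1] or [::1]:22
        end = pattern.index("]")
        host, rest = pattern[1:end], pattern[end + 1:]
        port = rest[1:] if rest.startswith(":") else ""
    else:
        # The last colon splits off the port, unless what follows it
        # contains '/', in which case the colon belongs to the address
        # (an assumption of this sketch).
        head, colon, tail = pattern.rpartition(":")
        if colon and "/" not in tail:
            host, port = head, tail
        else:
            host, port = pattern, ""
    network = None if host in ("", "*") else ipaddress.ip_network(host, strict=False)
    return network, (None if port in ("", "*") else port)


forms = ["10.0.0.1", "10.0.0.1:", "10.0.0.1/32:", "10.0.0.1:*", "[10.0.0.1]"]
print({split_pattern(f) for f in forms})   # all five spellings collapse to one value
print(split_pattern("10.0.0.0/24:22"))
print(split_pattern("[::1]"), split_pattern("::1/128"))
```
</verb></tscreen>

A bare <tt/::1/ is deliberately left out of the sample calls: under this
rule its last colon would be taken as the port separator, which is exactly
the ambiguity that the brackets and the explicit prefix length avoid.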


<sect> Examples

<p>
<itemize>
<item>1. List all the tcp sockets in state <tt/FIN-WAIT-1/ for our apache
        to network 193.233.7/24 and look at their timers:

<tscreen><verb>
   ss -o state fin-wait-1 \( sport = :http or sport = :https \) \
                          dst 193.233.7/24
</verb></tscreen>

        Oops, forgot to say that a missing logical operation is
        equivalent to <tt/and/.

<item> 2. Well, now look at the rest...

<tscreen><verb>
   ss -o excl fin-wait-1
   ss state fin-wait-1 \( sport neq :http and sport neq :https \) \
                       or not dst 193.233.7/24
</verb></tscreen>

        Note that we have to make _two_ calls to ss to do this.
        The state match is always anded with the address/port match.
        The reason for this is purely technical: ss quickly skips
        non-matching states before parsing addresses, and I consider the
        ability to quickly skip gobs of time-wait and syn-recv sockets
        more important than logical generality.

<item> 3. So, let's look at all our sockets using autobound ports:

<tscreen><verb>
   ss -a -A all autobound
</verb></tscreen>


<item> 4. And finally, find all the local processes connected
        to local X servers:

<tscreen><verb>
   ss -xp dst "/tmp/.X11-unix/*"
</verb></tscreen>

        Pardon, this does not work with the current kernel; patching is required.
        But we can still look at the server side:

<tscreen><verb>
   ss -x src "/tmp/.X11-unix/*"
</verb></tscreen>

</itemize>
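
<p>
That the two commands of example 2 really cover "the rest" of example 1
can be checked mechanically. The sketch below is a Python model of the
three filters, not of <tt/ss/ itself; the port numbers 80 and 443 for
<tt/:http/ and <tt/:https/, the sample addresses and the function names
are assumptions made for illustration.

<tscreen><verb>
```python
import ipaddress
from itertools import product

NET = ipaddress.ip_network("193.233.7.0/24")   # written 193.233.7/24 in the text
WEB = {80, 443}                                # assumed values of :http and :https


def example1(state, sport, dst):
    # ss -o state fin-wait-1 \( sport = :http or sport = :https \) dst NET
    return state == "fin-wait-1" and sport in WEB and ipaddress.ip_address(dst) in NET


def example2_first(state, sport, dst):
    # ss -o excl fin-wait-1
    return state != "fin-wait-1"


def example2_second(state, sport, dst):
    # ss state fin-wait-1 \( sport neq :http and sport neq :https \) or not dst NET
    in_net = ipaddress.ip_address(dst) in NET
    return state == "fin-wait-1" and (sport not in WEB or not in_net)


# Every combination of a few sample states, local ports and peer addresses.
sockets = list(product(["fin-wait-1", "established", "time-wait"],
                       [80, 443, 22],
                       ["193.233.7.5", "10.0.0.1"]))
hits = [(example1(*s), example2_first(*s), example2_second(*s)) for s in sockets]

# Each sample socket is selected by exactly one of the three commands.
print(all(sum(h) == 1 for h in hits))
```
</verb></tscreen>

The state test appears as a separate conjunct in every function, which
mirrors the remark that the state match is always anded with the
address/port match.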


<sect> Returning to ground: real manual

<p>
<sect1> Command arguments

<p> The general format of arguments to <tt/ss/ is:

<tscreen><verb>
   ss [ OPTIONS ] [ STATE-FILTER ] [ ADDRESS-FILTER ]
</verb></tscreen>

<sect2><tt/OPTIONS/
<p> <tt/OPTIONS/ is a list of single-letter options, using common unix
conventions.

<itemize>
<item><tt/-h/ - show help page
<item><tt/-?/ - the same, of course
<item><tt/-v/, <tt/-V/ - print version of <tt/ss/ and exit
<item><tt/-s/ - print summary statistics. This option does not parse
socket lists; it obtains the summary from various other sources. It is useful
when the number of sockets is so huge that parsing <tt>/proc/net/tcp</tt>
is painful.
<item><tt/-D FILE/ - do not display anything, just dump raw information
about TCP sockets to <tt/FILE/ after applying filters. If <tt/FILE/ is <tt/-/,
<tt/stdout/ is used.
<item><tt/-F FILE/ - read continuation of filter from <tt/FILE/.
Each line of <tt/FILE/ is interpreted like a single command line option.
If <tt/FILE/ is <tt/-/, <tt/stdin/ is used.
<item><tt/-r/ - try to resolve numeric addresses/ports
<item><tt/-n/ - do not try to resolve ports
<item><tt/-o/ - show some optional information, e.g. TCP timers
<item><tt/-i/ - show some information specific to TCP (RTO, congestion
window, slow start threshold etc.)
<item><tt/-e/ - show even more optional information
<item><tt/-m/ - show extended information on memory used by the socket.
It is available only with <tt/tcp_diag/ enabled.
<item><tt/-p/ - show list of processes owning the socket
<item><tt/-f FAMILY/ - default address family used for parsing addresses.
                 Also this option limits listing to sockets supporting
                 the given address family. Currently the following families
                 are supported: <tt/unix/, <tt/inet/, <tt/inet6/, <tt/link/,
                 <tt/netlink/.
<item><tt/-4/ - alias for <tt/-f inet/
<item><tt/-6/ - alias for <tt/-f inet6/
<item><tt/-0/ - alias for <tt/-f link/
<item><tt/-A LIST-OF-TABLES/ - list of socket tables to dump, separated
                 by commas. The following identifiers are understood:
                 <tt/all/, <tt/inet/, <tt/tcp/, <tt/udp/, <tt/raw/,
                 <tt/unix/, <tt/packet/, <tt/netlink/, <tt/unix_dgram/,
                 <tt/unix_stream/, <tt/packet_raw/, <tt/packet_dgram/.
<item><tt/-x/ - alias for <tt/-A unix/
<item><tt/-t/ - alias for <tt/-A tcp/
<item><tt/-u/ - alias for <tt/-A udp/
<item><tt/-w/ - alias for <tt/-A raw/
<item><tt/-a/ - show sockets of all the states. By default sockets
                 in states <tt/LISTEN/, <tt/TIME-WAIT/, <tt/SYN-RECV/
                 and <tt/CLOSE/ are skipped.
<item><tt/-l/ - show only sockets in state <tt/LISTEN/
</itemize>

<sect2><tt/STATE-FILTER/

<p><tt/STATE-FILTER/ allows an arbitrary set of states to match
to be constructed. Its syntax is a sequence of the keywords <tt/state/
and <tt/exclude/, each followed by a state identifier.
Available identifiers are:

<p>
<itemize>
<item> All standard TCP states: <tt/established/, <tt/syn-sent/,
<tt/syn-recv/, <tt/fin-wait-1/, <tt/fin-wait-2/, <tt/time-wait/,
<tt/closed/, <tt/close-wait/, <tt/last-ack/, <tt/listen/ and <tt/closing/.

<item><tt/all/ - for all the states
<item><tt/connected/ - all the states except for <tt/listen/ and <tt/closed/
<item><tt/synchronized/ - all the <tt/connected/ states except for
<tt/syn-sent/
<item><tt/bucket/ - states which are maintained as minisockets, i.e.
<tt/time-wait/ and <tt/syn-recv/.
<item><tt/big/ - opposite of <tt/bucket/
</itemize>
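
<p>
The identifiers above are just names for sets of states, so the whole
<tt/STATE-FILTER/ can be modelled with set arithmetic. The following is a
sketch in Python, not code taken from <tt/ss/. The names <tt/expand/ and
<tt/state_filter/ are hypothetical, and the way several keywords
accumulate (the first <tt/state/ starts from the empty set, the first
<tt/exclude/ starts from <tt/all/) is an assumption that merely agrees
with the <tt/ss exclude SYN-RECV/ example.

<tscreen><verb>
```python
TCP_STATES = frozenset({
    "established", "syn-sent", "syn-recv", "fin-wait-1", "fin-wait-2",
    "time-wait", "closed", "close-wait", "last-ack", "listen", "closing",
})

# Abbreviations, defined exactly as in the list above.
ALIASES = {"all": TCP_STATES, "bucket": frozenset({"time-wait", "syn-recv"})}
ALIASES["big"] = TCP_STATES - ALIASES["bucket"]
ALIASES["connected"] = TCP_STATES - {"listen", "closed"}
ALIASES["synchronized"] = ALIASES["connected"] - {"syn-sent"}


def expand(identifier):
    """Turn one state identifier into the set of states it names."""
    name = identifier.lower()
    if name in ALIASES:
        return ALIASES[name]
    if name in TCP_STATES:
        return frozenset({name})
    raise ValueError("unknown state identifier: " + identifier)


def state_filter(words, show_all=False):
    """words is a flat list such as ['state', 'connected', 'exclude', 'syn-sent']."""
    if not words:
        # Defaults from the text: everything with -a, otherwise everything
        # except listen, syn-recv, time-wait and closed.
        skipped = {"listen", "syn-recv", "time-wait", "closed"}
        return TCP_STATES if show_all else TCP_STATES - skipped
    selected = None
    for keyword, identifier in zip(words[0::2], words[1::2]):
        if keyword == "state":
            # Assumption: the first 'state' starts from the empty set.
            selected = (selected or frozenset()) | expand(identifier)
        elif keyword == "exclude":
            # Assumption: the first 'exclude' starts from all states.
            selected = (TCP_STATES if selected is None else selected) - expand(identifier)
        else:
            raise ValueError("expected 'state' or 'exclude', got " + keyword)
    return selected


print(sorted(state_filter(["exclude", "SYN-RECV"])))
print(sorted(state_filter([])))
```
</verb></tscreen>

How an explicit filter interacts with option <tt/-a/ is not modelled here;
the sketch only covers the cases spelled out in the text.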

<sect2><tt/ADDRESS-FILTER/

<p><tt/ADDRESS-FILTER/ is a boolean expression with the operations <tt/and/, <tt/or/
and <tt/not/, which can be abbreviated in C style, e.g. as <tt/&amp/,
<tt/&amp&amp/.

<p>
Predicates check socket addresses, both local and remote.
There are the following kinds of predicates:

<itemize>
<item> <tt/dst ADDRESS_PATTERN/ - matches remote address and port
<item> <tt/src ADDRESS_PATTERN/ - matches local address and port
<item> <tt/dport RELOP PORT/    - compares remote port to a number
<item> <tt/sport RELOP PORT/    - compares local port to a number
<item> <tt/autobound/           - checks that socket is bound to an ephemeral
                                  port
</itemize>

<p><tt/RELOP/ is one of <tt/&lt=/, <tt/&gt=/, <tt/==/ etc.
To make this more convenient for use in a unix shell, alphabetic
FORTRAN-like notations <tt/le/, <tt/gt/ etc. are accepted as well.

<p>The format and semantics of <tt/ADDRESS_PATTERN/ depend on the address
family.

<itemize>
<item><tt/inet/ - <tt/ADDRESS_PATTERN/ consists of an IP prefix, optionally
followed by a colon and port. If the prefix or port part is absent or replaced
with <tt/*/, this means wildcard match.
<item><tt/inet6/ - The same as <tt/inet/, only the prefix refers to an IPv6
address. Unlike <tt/inet/, the colon becomes ambiguous, so <tt/ss/ allows
the scheme used in URLs, where the address is surrounded with
<tt/[/ ... <tt/]/.
<item><tt/unix/ - <tt/ADDRESS_PATTERN/ is a shell-style wildcard.
<item><tt/packet/ - the format looks like <tt/inet/, only the interface index
stands in place of the port and the link layer protocol id in place of the address.
<item><tt/netlink/ - the format looks like <tt/inet/, only the socket pid
stands in place of the port and the netlink channel in place of the address.
</itemize>

<p><tt/PORT/ is syntactically an <tt/ADDRESS_PATTERN/ with a wildcard
address part. Certainly, it is undefined for UNIX sockets.

<sect1> Environment variables

<p>
<tt/ss/ allows the source of information to be changed using various
environment variables:

<p>
<itemize>
<item> <tt/PROC_SLABINFO/  to override <tt>/proc/slabinfo</tt>
<item> <tt/PROC_NET_TCP/  to override <tt>/proc/net/tcp</tt>
<item> <tt/PROC_NET_UDP/  to override <tt>/proc/net/udp</tt>
<item> etc.
</itemize>

<p>
Variable <tt/PROC_ROOT/ allows the root of the whole <tt>/proc/</tt>
hierarchy to be changed.

<p>
Variable <tt/TCPDIAG_FILE/ tells <tt/ss/ to open a file instead of
requesting the kernel to dump information about TCP sockets.


<p> This option is used mainly to investigate bug reports,
when dumps of files usually found in <tt>/proc/</tt> are received
by e-mail.

<sect1> Output format

<p>Six columns. The first is <tt/Netid/; it denotes the socket type and
transport protocol, when it is ambiguous: <tt/tcp/, <tt/udp/, <tt/raw/,
<tt/u_str/ is an abbreviation for <tt/unix_stream/, <tt/u_dgr/ for UNIX
datagram sockets, <tt/nl/ for netlink, <tt/p_raw/ and <tt/p_dgr/ for
raw and datagram packet sockets. This column is optional; it will
be hidden if the filter selects a unique netid.

<p>
The second column is <tt/State/. The socket state is displayed here.
The names are the standard TCP names, except for <tt/UNCONN/, which
cannot happen for TCP, but is normal for unconnected sockets
of other types. Again, this column can be hidden.

<p>
Then come two columns (<tt/Recv-Q/ and <tt/Send-Q/) showing the amount of data
queued for receive and transmit.

<p>
And the last two columns display the local address and port of the socket
and its peer address, if the socket is connected.

<p>
If options <tt/-o/, <tt/-e/ or <tt/-p/ were given, options are
displayed not in fixed positions but as space-separated pairs:
<tt/option:value/. If the value is not a single number, it is presented
as a list of values, enclosed in <tt/(/ ... <tt/)/ and separated with
commas. For example,

<tscreen><verb>
   timer:(keepalive,111min,0)
</verb></tscreen>
is the typical format for a TCP timer (option <tt/-o/).

<tscreen><verb>
   users:((X,113,3))
</verb></tscreen>
is typical for a list of users (option <tt/-p/).
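
<p>
For scripts that post-process this output, the <tt/option:value/ pairs are
simple enough to parse by hand. The sketch below, in Python, handles only
the two shapes shown above (a bare token, or a possibly nested list in
parentheses); the function names are hypothetical and nothing here is
taken from <tt/ss/ itself.

<tscreen><verb>
```python
def parse_value(text):
    """Parse a value: a bare token, or a (possibly nested) parenthesised list."""

    def parse(pos):
        if text[pos] != "(":
            # Bare token: runs up to the next comma or parenthesis.
            end = pos
            while end < len(text) and text[end] not in ",()":
                end += 1
            return text[pos:end], end
        # List: items separated by commas, closed by ')'.
        items, pos = [], pos + 1
        while text[pos] != ")":
            item, pos = parse(pos)
            items.append(item)
            if text[pos] == ",":
                pos += 1
        return items, pos + 1

    try:
        value, end = parse(0)
    except IndexError:
        raise ValueError("unbalanced or empty value: " + repr(text))
    if end != len(text):
        raise ValueError("trailing characters in " + repr(text))
    return value


def parse_pair(field):
    """Split 'option:value' at the first colon and parse the value part."""
    name, _, value = field.partition(":")
    return name, parse_value(value)


print(parse_pair("timer:(keepalive,111min,0)"))
print(parse_pair("users:((X,113,3))"))
```
</verb></tscreen>

All leaves are kept as strings; converting <tt/111min/ or the numeric
fields to something more useful is left to the caller.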


<sect>Some numbers

<p>
Well, let us use <tt/pidentd/ and a tool <tt/ibench/ to measure
its performance. It is 30 requests per second here. Nothing to test,
it is too slow. OK, let us patch pidentd with the patch from the directory
Patches. After this it handles about 4300 requests per second
and becomes a handy tool to pollute socket tables with lots of timewait
buckets.

<p>
So, each test starts by polluting the tables with 30000 sockets,
then does a full dump of the table piped to wc, measuring
timings with time:

<p>Results:

<itemize>
<item> <tt/netstat -at/ - 15.6 seconds
<item> <tt/ss -atr/, but without <tt/tcp_diag/ - 5.4 seconds
<item> <tt/ss -atr/ with <tt/tcp_diag/ - 0.47 seconds
</itemize>

No comments. Though one comment is necessary: most of the time
without <tt/tcp_diag/ is wasted inside the kernel with networking completely
blocked. More than 10 seconds, yes. <tt/tcp_diag/
does the same work in 100 milliseconds of system time.
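
<p>
To put the three results in proportion: <tt/netstat/ takes roughly 33
times as long as <tt/ss/ with <tt/tcp_diag/, and <tt/ss/ without
<tt/tcp_diag/ about 11 times as long. The arithmetic, as a trivial Python
sketch using only the timings listed above:

<tscreen><verb>
```python
# Wall-clock timings from the results list, in seconds.
timings = {
    "netstat -at": 15.6,
    "ss -atr without tcp_diag": 5.4,
    "ss -atr with tcp_diag": 0.47,
}
fastest = min(timings.values())
for command, seconds in timings.items():
    print(f"{command}: {seconds:.2f} s, {seconds / fastest:.1f}x the fastest run")
```
</verb></tscreen>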

</article>