======================
RxRPC NETWORK PROTOCOL
======================
				4
				5	The RxRPC protocol driver provides a reliable two-phase transport on top of UDP
				6	that can be used to perform RxRPC remote operations. This is done over sockets
				7	of AF_RXRPC family, using sendmsg() and recvmsg() with control data to send and
				8	receive data, aborts and errors.
				9
				10	Contents of this document:
				11
				12	(*) Overview.
				13
				14	(*) RxRPC protocol summary.
				15
				16	(*) AF_RXRPC driver model.
				17
				18	(*) Control messages.
				19
				20	(*) Socket options.
				21
				22	(*) Security.
				23
				24	(*) Example client usage.
				25
				26	(*) Example server usage.
				27
 (*) AF_RXRPC kernel interface.
				29
				31	========
				32	OVERVIEW
				33	========
				34
				35	RxRPC is a two-layer protocol. There is a session layer which provides
				36	reliable virtual connections using UDP over IPv4 (or IPv6) as the transport
				37	layer, but implements a real network protocol; and there's the presentation
				38	layer which renders structured data to binary blobs and back again using XDR
				39	(as does SunRPC):
				40
                +-------------+
                | Application |
                +-------------+
                |     XDR     |         Presentation
                +-------------+
                |    RxRPC    |         Session
                +-------------+
                |     UDP     |         Transport
                +-------------+
				50
				51
				52	AF_RXRPC provides:
				53
				54	(1) Part of an RxRPC facility for both kernel and userspace applications by
				55	making the session part of it a Linux network protocol (AF_RXRPC).
				56
				57	(2) A two-phase protocol. The client transmits a blob (the request) and then
				58	receives a blob (the reply), and the server receives the request and then
				59	transmits the reply.
				60
				61	(3) Retention of the reusable bits of the transport system set up for one call
				62	to speed up subsequent calls.
				63
				64	(4) A secure protocol, using the Linux kernel's key retention facility to
				65	manage security on the client end. The server end must of necessity be
				66	more active in security negotiations.
				67
				68	AF_RXRPC does not provide XDR marshalling/presentation facilities. That is
				69	left to the application. AF_RXRPC only deals in blobs. Even the operation ID
				70	is just the first four bytes of the request blob, and as such is beyond the
				71	kernel's interest.
				72
				73
				74	Sockets of AF_RXRPC family are:
				75
				76	(1) created as type SOCK_DGRAM;
				77
				78	(2) provided with a protocol of the type of underlying transport they're going
				79	to use - currently only PF_INET is supported.
				80
				81
				82	The Andrew File System (AFS) is an example of an application that uses this and
				83	that has both kernel (filesystem) and userspace (utility) components.
				84
				85
				86	======================
				87	RXRPC PROTOCOL SUMMARY
				88	======================
				89
				90	An overview of the RxRPC protocol:
				91
				92	(*) RxRPC sits on top of another networking protocol (UDP is the only option
				93	currently), and uses this to provide network transport. UDP ports, for
				94	example, provide transport endpoints.
				95
				96	(*) RxRPC supports multiple virtual "connections" from any given transport
				97	endpoint, thus allowing the endpoints to be shared, even to the same
				98	remote endpoint.
				99
				100	(*) Each connection goes to a particular "service". A connection may not go
				101	to multiple services. A service may be considered the RxRPC equivalent of
				102	a port number. AF_RXRPC permits multiple services to share an endpoint.
				103
				104	(*) Client-originating packets are marked, thus a transport endpoint can be
				105	shared between client and server connections (connections have a
				106	direction).
				107
				108	(*) Up to a billion connections may be supported concurrently between one
				109	local transport endpoint and one service on one remote endpoint. An RxRPC
				110	connection is described by seven numbers:
				111
        Local address   }
        Local port      } Transport (UDP) address
        Remote address  }
        Remote port     }
        Direction
        Connection ID
        Service ID
				119
				120	(*) Each RxRPC operation is a "call". A connection may make up to four
				121	billion calls, but only up to four calls may be in progress on a
				122	connection at any one time.
				123
				124	(*) Calls are two-phase and asymmetric: the client sends its request data,
				125	which the service receives; then the service transmits the reply data
				126	which the client receives.
				127
 (*) The data blobs are of indefinite size; the end of a phase is marked with a
     flag in the packet. The number of packets of data making up one blob may
     not exceed 4 billion, however, as this would cause the sequence number to
     wrap.
				132
				133	(*) The first four bytes of the request data are the service operation ID.
				134
				135	(*) Security is negotiated on a per-connection basis. The connection is
				136	initiated by the first data packet on it arriving. If security is
				137	requested, the server then issues a "challenge" and then the client
				138	replies with a "response". If the response is successful, the security is
				139	set for the lifetime of that connection, and all subsequent calls made
				140	upon it use that same security. In the event that the server lets a
				141	connection lapse before the client, the security will be renegotiated if
				142	the client uses the connection again.
				143
				144	(*) Calls use ACK packets to handle reliability. Data packets are also
				145	explicitly sequenced per call.
				146
				147	(*) There are two types of positive acknowledgement: hard-ACKs and soft-ACKs.
				148	A hard-ACK indicates to the far side that all the data received to a point
				149	has been received and processed; a soft-ACK indicates that the data has
				150	been received but may yet be discarded and re-requested. The sender may
				151	not discard any transmittable packets until they've been hard-ACK'd.
				152
				153	(*) Reception of a reply data packet implicitly hard-ACK's all the data
				154	packets that make up the request.
				155
 (*) A call is complete when the request has been sent, the reply has been
     received and the final hard-ACK on the last packet of the reply has
     reached the server.

 (*) A call may be aborted by either end at any time up to its completion.
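
To make the point about the operation ID concrete, the sketch below writes and
reads the first four bytes of a request blob. It assumes the usual XDR
encoding of a 32-bit unsigned integer (big-endian); the helper names are
hypothetical and are not part of AF_RXRPC, which never looks at these bytes.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: store a 32-bit operation ID at the start of a request
 * blob in big-endian order, as XDR encodes unsigned integers.
 * Returns the number of bytes written, or 0 if the buffer is too small. */
static size_t put_operation_id(uint8_t *blob, size_t size, uint32_t op_id)
{
	if (size < 4)
		return 0;
	blob[0] = (uint8_t)(op_id >> 24);
	blob[1] = (uint8_t)(op_id >> 16);
	blob[2] = (uint8_t)(op_id >> 8);
	blob[3] = (uint8_t)op_id;
	return 4;
}

/* Hypothetical helper: read the operation ID back from a request blob.
 * Returns 0 on success, -1 if the blob is shorter than four bytes. */
static int get_operation_id(const uint8_t *blob, size_t len, uint32_t *op_id)
{
	if (len < 4)
		return -1;
	*op_id = ((uint32_t)blob[0] << 24) | ((uint32_t)blob[1] << 16) |
		 ((uint32_t)blob[2] << 8) | (uint32_t)blob[3];
	return 0;
}
```

A client would call put_operation_id() before marshalling the arguments that
follow, and a server would call get_operation_id() on the first piece of
request data to decide how to unmarshal the rest.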
				161
				162
				163	=====================
				164	AF_RXRPC DRIVER MODEL
				165	=====================
				166
				167	About the AF_RXRPC driver:
				168
				169	(*) The AF_RXRPC protocol transparently uses internal sockets of the transport
				170	protocol to represent transport endpoints.
				171
				172	(*) AF_RXRPC sockets map onto RxRPC connection bundles. Actual RxRPC
				173	connections are handled transparently. One client socket may be used to
				174	make multiple simultaneous calls to the same service. One server socket
				175	may handle calls from many clients.
				176
				177	(*) Additional parallel client connections will be initiated to support extra
				178	concurrent calls, up to a tunable limit.
				179
				180	(*) Each connection is retained for a certain amount of time [tunable] after
				181	the last call currently using it has completed in case a new call is made
				182	that could reuse it.
				183
 (*) Each internal UDP socket is retained for a certain amount of time
     [tunable] after the last connection using it is discarded, in case a new
     connection is made that could use it.
				187
 (*) A client-side connection is only shared between calls if they have the
     same key struct describing their security (and assuming the calls would
     otherwise share the connection). Non-secured calls would also be able to
     share connections with each other.
				192
				193	(*) A server-side connection is shared if the client says it is.
				194
				195	(*) ACK'ing is handled by the protocol driver automatically, including ping
				196	replying.
				197
				198	(*) SO_KEEPALIVE automatically pings the other side to keep the connection
				199	alive [TODO].
				200
				201	(*) If an ICMP error is received, all calls affected by that error will be
				202	aborted with an appropriate network error passed through recvmsg().
				203
				204
				205	Interaction with the user of the RxRPC socket:
				206
				207	(*) A socket is made into a server socket by binding an address with a
				208	non-zero service ID.
				209
				210	(*) In the client, sending a request is achieved with one or more sendmsgs,
				211	followed by the reply being received with one or more recvmsgs.
				212
				213	(*) The first sendmsg for a request to be sent from a client contains a tag to
				214	be used in all other sendmsgs or recvmsgs associated with that call. The
				215	tag is carried in the control data.
				216
				217	(*) connect() is used to supply a default destination address for a client
				218	socket. This may be overridden by supplying an alternate address to the
				219	first sendmsg() of a call (struct msghdr::msg_name).
				220
 (*) If connect() is called on an unbound client, a random local port will be
     bound before the operation takes place.
				223
				224	(*) A server socket may also be used to make client calls. To do this, the
				225	first sendmsg() of the call must specify the target address. The server's
				226	transport endpoint is used to send the packets.
				227
				228	(*) Once the application has received the last message associated with a call,
				229	the tag is guaranteed not to be seen again, and so it can be used to pin
				230	client resources. A new call can then be initiated with the same tag
				231	without fear of interference.
				232
 (*) In the server, a request is received with one or more recvmsgs, then the
     reply is transmitted with one or more sendmsgs, and then the final ACK
     is received with a last recvmsg.
				236
				237	(*) When sending data for a call, sendmsg is given MSG_MORE if there's more
				238	data to come on that call.
				239
				240	(*) When receiving data for a call, recvmsg flags MSG_MORE if there's more
				241	data to come for that call.
				242
				243	(*) When receiving data or messages for a call, MSG_EOR is flagged by recvmsg
				244	to indicate the terminal message for that call.
				245
				246	(*) A call may be aborted by adding an abort control message to the control
				247	data. Issuing an abort terminates the kernel's use of that call's tag.
				248	Any messages waiting in the receive queue for that call will be discarded.
				249
				250	(*) Aborts, busy notifications and challenge packets are delivered by recvmsg,
				251	and control data messages will be set to indicate the context. Receiving
				252	an abort or a busy message terminates the kernel's use of that call's tag.
				253
				254	(*) The control data part of the msghdr struct is used for a number of things:
				255
				256	(*) The tag of the intended or affected call.
				257
				258	(*) Sending or receiving errors, aborts and busy notifications.
				259
				260	(*) Notifications of incoming calls.
				261
				262	(*) Sending debug requests and receiving debug replies [TODO].
				263
 (*) When the kernel has received and set up an incoming call, it sends a
     message to the server application to let it know there's a new call
     awaiting its acceptance [recvmsg reports a special control message]. The
     server application then uses sendmsg to assign a tag to the new call.
     Once that is done, the first part of the request data will be delivered
     by recvmsg.
				269
				270	(*) The server application has to provide the server socket with a keyring of
				271	secret keys corresponding to the security types it permits. When a secure
				272	connection is being set up, the kernel looks up the appropriate secret key
				273	in the keyring and then sends a challenge packet to the client and
				274	receives a response packet. The kernel then checks the authorisation of
				275	the packet and either aborts the connection or sets up the security.
				276
				277	(*) The name of the key a client will use to secure its communications is
				278	nominated by a socket option.
				279
				280
				281	Notes on recvmsg:
				282
 (1) If there's a sequence of data messages belonging to a particular call on
     the receive queue, then recvmsg will keep working through them until:
				285
				286	(a) it meets the end of that call's received data,
				287
				288	(b) it meets a non-data message,
				289
				290	(c) it meets a message belonging to a different call, or
				291
				292	(d) it fills the user buffer.
				293
				294	If recvmsg is called in blocking mode, it will keep sleeping, awaiting the
				295	reception of further data, until one of the above four conditions is met.
				296
				297	(2) MSG_PEEK operates similarly, but will return immediately if it has put any
				298	data in the buffer rather than sleeping until it can fill the buffer.
				299
				300	(3) If a data message is only partially consumed in filling a user buffer,
				301	then the remainder of that message will be left on the front of the queue
				302	for the next taker. MSG_TRUNC will never be flagged.
				303
				304	(4) If there is more data to be had on a call (it hasn't copied the last byte
				305	of the last data message in that phase yet), then MSG_MORE will be
				306	flagged.
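
The MSG_MORE behaviour described in these notes suggests a simple read loop.
The following is a minimal sketch, not a definitive implementation: it assumes
that only one call is active on the socket (so the control data identifying
the call is not inspected) and the function name is hypothetical.

```c
#include <stddef.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <sys/uio.h>

/* Sketch: read one phase of a single call into buf by looping on recvmsg()
 * until MSG_MORE is no longer flagged, MSG_EOR is seen, or buf is full.
 * Returns the number of bytes read, or -1 on error (errno set by recvmsg). */
static ssize_t read_call_phase(int fd, void *buf, size_t size)
{
	size_t used = 0;

	while (used < size) {
		struct iovec iov = {
			.iov_base = (char *)buf + used,
			.iov_len = size - used,
		};
		struct msghdr msg;
		ssize_t n;

		memset(&msg, 0, sizeof(msg));
		msg.msg_iov = &iov;
		msg.msg_iovlen = 1;

		n = recvmsg(fd, &msg, 0);
		if (n < 0)
			return -1;
		used += (size_t)n;

		/* MSG_MORE clear: the last byte of this phase was copied.
		 * MSG_EOR set: this was the terminal message for the call. */
		if (!(msg.msg_flags & MSG_MORE) || (msg.msg_flags & MSG_EOR))
			break;
	}
	return (ssize_t)used;
}
```

A real application handling several calls at once would also supply a control
buffer in msg_control and check the RXRPC_USER_CALL_ID attached to each
message before appending the data to a per-call buffer.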
				307
				308
				309	================
				310	CONTROL MESSAGES
				311	================
				312
				313	AF_RXRPC makes use of control messages in sendmsg() and recvmsg() to multiplex
				314	calls, to invoke certain actions and to report certain conditions. These are:
				315
        MESSAGE ID              SRT DATA        MEANING
        ======================= === =========== ===============================
        RXRPC_USER_CALL_ID      sr- User ID     App's call specifier
        RXRPC_ABORT             srt Abort code  Abort code to issue/received
        RXRPC_ACK               -rt n/a         Final ACK received
        RXRPC_NET_ERROR         -rt error num   Network error on call
        RXRPC_BUSY              -rt n/a         Call rejected (server busy)
        RXRPC_LOCAL_ERROR       -rt error num   Local error encountered
        RXRPC_NEW_CALL          -r- n/a         New call received
        RXRPC_ACCEPT            s-- n/a         Accept new call

        (SRT = usable in Sendmsg / delivered by Recvmsg / Terminal message)
				328
				329	(*) RXRPC_USER_CALL_ID
				330
     This is used to indicate the application's call ID. It's an unsigned long
     that the app specifies in the client by attaching it to the first data
     message or in the server by passing it in association with an RXRPC_ACCEPT
     message. recvmsg() passes it in conjunction with all messages except
     RXRPC_NEW_CALL messages.
				336
				337	(*) RXRPC_ABORT
				338
     This can be used by an application to abort a call by passing it to
     sendmsg, or it can be delivered by recvmsg to indicate a remote abort was
     received. Either way, it must be associated with an RXRPC_USER_CALL_ID to
     specify the call affected. If an abort is being sent, then error EBADSLT
     will be returned if there is no call with that user ID.
				344
				345	(*) RXRPC_ACK
				346
				347	This is delivered to a server application to indicate that the final ACK
				348	of a call was received from the client. It will be associated with an
				349	RXRPC_USER_CALL_ID to indicate the call that's now complete.
				350
				351	(*) RXRPC_NET_ERROR
				352
				353	This is delivered to an application to indicate that an ICMP error message
				354	was encountered in the process of trying to talk to the peer. An
				355	errno-class integer value will be included in the control message data
				356	indicating the problem, and an RXRPC_USER_CALL_ID will indicate the call
				357	affected.
				358
				359	(*) RXRPC_BUSY
				360
				361	This is delivered to a client application to indicate that a call was
				362	rejected by the server due to the server being busy. It will be
				363	associated with an RXRPC_USER_CALL_ID to indicate the rejected call.
				364
				365	(*) RXRPC_LOCAL_ERROR
				366
				367	This is delivered to an application to indicate that a local error was
				368	encountered and that a call has been aborted because of it. An
				369	errno-class integer value will be included in the control message data
				370	indicating the problem, and an RXRPC_USER_CALL_ID will indicate the call
				371	affected.
				372
				373	(*) RXRPC_NEW_CALL
				374
				375	This is delivered to indicate to a server application that a new call has
				376	arrived and is awaiting acceptance. No user ID is associated with this,
				377	as a user ID must subsequently be assigned by doing an RXRPC_ACCEPT.
				378
				379	(*) RXRPC_ACCEPT
				380
				381	This is used by a server application to attempt to accept a call and
				382	assign it a user ID. It should be associated with an RXRPC_USER_CALL_ID
				383	to indicate the user ID to be assigned. If there is no call to be
				384	accepted (it may have timed out, been aborted, etc.), then sendmsg will
				385	return error ENODATA. If the user ID is already in use by another call,
				386	then error EBADSLT will be returned.
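
The control messages described in this section can be picked out of a received
msghdr with the standard CMSG_*() macros. The sketch below is one way of doing
that, under stated assumptions: the numeric values of SOL_RXRPC and the
RXRPC_* message IDs are written out by hand and are assumed to match the
kernel's rxrpc header (prefer that header where it is available), and struct
call_event and parse_rxrpc_cmsgs() are hypothetical names.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>
#include <sys/socket.h>

/* Assumed values; take them from the kernel's rxrpc header if possible. */
#ifndef SOL_RXRPC
#define SOL_RXRPC		272
#endif
#define RXRPC_USER_CALL_ID	1
#define RXRPC_ABORT		2
#define RXRPC_ACK		3
#define RXRPC_NET_ERROR		5
#define RXRPC_BUSY		6
#define RXRPC_LOCAL_ERROR	7
#define RXRPC_NEW_CALL		8

/* Hypothetical summary of what one recvmsg() reported. */
struct call_event {
	int		have_call_id;	/* RXRPC_USER_CALL_ID was attached */
	unsigned long	call_id;	/* the application's call ID */
	int		type;		/* 0 = plain data, else an RXRPC_* ID */
	uint32_t	code;		/* abort code or errno-class value */
};

/* Walk the control data of a received message and summarise it in ev. */
static void parse_rxrpc_cmsgs(struct msghdr *msg, struct call_event *ev)
{
	struct cmsghdr *cmsg;

	memset(ev, 0, sizeof(*ev));
	for (cmsg = CMSG_FIRSTHDR(msg); cmsg; cmsg = CMSG_NXTHDR(msg, cmsg)) {
		if (cmsg->cmsg_level != SOL_RXRPC)
			continue;
		switch (cmsg->cmsg_type) {
		case RXRPC_USER_CALL_ID:
			if (cmsg->cmsg_len >= CMSG_LEN(sizeof(ev->call_id))) {
				memcpy(&ev->call_id, CMSG_DATA(cmsg),
				       sizeof(ev->call_id));
				ev->have_call_id = 1;
			}
			break;
		case RXRPC_ABORT:
		case RXRPC_NET_ERROR:
		case RXRPC_LOCAL_ERROR:
			/* These carry a 4-byte abort code or error number. */
			if (cmsg->cmsg_len >= CMSG_LEN(sizeof(ev->code)))
				memcpy(&ev->code, CMSG_DATA(cmsg),
				       sizeof(ev->code));
			ev->type = cmsg->cmsg_type;
			break;
		case RXRPC_ACK:
		case RXRPC_BUSY:
		case RXRPC_NEW_CALL:
			/* Dataless notifications. */
			ev->type = cmsg->cmsg_type;
			break;
		}
	}
}
```

After a recvmsg() that supplied a control buffer, the caller would pass the
same msghdr to parse_rxrpc_cmsgs() and then dispatch on ev.type, treating the
terminal types in the table above as the end of the call.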
				387
				388
				389	==============
				390	SOCKET OPTIONS
				391	==============
				392
				393	AF_RXRPC sockets support a few socket options at the SOL_RXRPC level:
				394
				395	(*) RXRPC_SECURITY_KEY
				396
				397	This is used to specify the description of the key to be used. The key is
				398	extracted from the calling process's keyrings with request_key() and
				399	should be of "rxrpc" type.
				400
				401	The optval pointer points to the description string, and optlen indicates
				402	how long the string is, without the NUL terminator.
				403
				404	(*) RXRPC_SECURITY_KEYRING
				405
				406	Similar to above but specifies a keyring of server secret keys to use (key
				407	type "keyring"). See the "Security" section.
				408
				409	(*) RXRPC_EXCLUSIVE_CONNECTION
				410
				411	This is used to request that new connections should be used for each call
				412	made subsequently on this socket. optval should be NULL and optlen 0.
				413
				414	(*) RXRPC_MIN_SECURITY_LEVEL
				415
				416	This is used to specify the minimum security level required for calls on
				417	this socket. optval must point to an int containing one of the following
				418	values:
				419
				420	(a) RXRPC_SECURITY_PLAIN
				421
				422	Encrypted checksum only.
				423
				424	(b) RXRPC_SECURITY_AUTH
				425
				426	Encrypted checksum plus packet padded and first eight bytes of packet
				427	encrypted - which includes the actual packet length.
				428
				429	(c) RXRPC_SECURITY_ENCRYPTED
				430
				431	Encrypted checksum plus entire packet padded and encrypted, including
				432	actual packet length.
				433
				434
				435	========
				436	SECURITY
				437	========
				438
				439	Currently, only the kerberos 4 equivalent protocol has been implemented
				440	(security index 2 - rxkad). This requires the rxkad module to be loaded and,
				441	on the client, tickets of the appropriate type to be obtained from the AFS
				442	kaserver or the kerberos server and installed as "rxrpc" type keys. This is
				443	normally done using the klog program. An example simple klog program can be
				444	found at:
				445
				446	http://people.redhat.com/~dhowells/rxrpc/klog.c
				447
				448	The payload provided to add_key() on the client should be of the following
				449	form:
				450
	struct rxrpc_key_sec2_v1 {
		uint16_t	security_index;	/* 2 */
		uint16_t	ticket_length;	/* length of ticket[] */
		uint32_t	expiry;		/* time at which expires */
		uint8_t		kvno;		/* key version number */
		uint8_t		__pad[3];
		uint8_t		session_key[8];	/* DES session key */
		uint8_t		ticket[0];	/* the encrypted ticket */
	};
				460
				461	Where the ticket blob is just appended to the above structure.
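
Building such a payload amounts to allocating the fixed header plus the ticket
length and copying the ticket in behind it. The sketch below follows the
structure exactly as documented above and has not been checked against the
kernel's key parser; it assumes the fields are stored in host byte order, and
build_rxkad_payload() is a hypothetical helper name.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Layout as documented above, with a C99 flexible array for the ticket. */
struct rxrpc_key_sec2_v1 {
	uint16_t	security_index;	/* 2 */
	uint16_t	ticket_length;	/* length of ticket[] */
	uint32_t	expiry;		/* time at which expires */
	uint8_t		kvno;		/* key version number */
	uint8_t		__pad[3];
	uint8_t		session_key[8];	/* DES session key */
	uint8_t		ticket[];	/* the encrypted ticket */
};

/* Hypothetical helper: allocate a payload with the ticket appended.
 * On success returns the buffer (to be freed by the caller) and stores its
 * total size in *payload_len; returns NULL if allocation fails. */
static struct rxrpc_key_sec2_v1 *
build_rxkad_payload(uint8_t kvno, uint32_t expiry,
		    const uint8_t session_key[8],
		    const void *ticket, uint16_t ticket_len,
		    size_t *payload_len)
{
	size_t len = sizeof(struct rxrpc_key_sec2_v1) + ticket_len;
	struct rxrpc_key_sec2_v1 *p = calloc(1, len);

	if (!p)
		return NULL;
	p->security_index = 2;		/* rxkad */
	p->ticket_length = ticket_len;
	p->expiry = expiry;
	p->kvno = kvno;
	memcpy(p->session_key, session_key, sizeof(p->session_key));
	if (ticket_len)
		memcpy(p->ticket, ticket, ticket_len);
	*payload_len = len;
	return p;
}
```

The returned buffer and length would then be handed to add_key() as the
payload of an "rxrpc" type key, and the buffer freed afterwards.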
				462
				463
				464	For the server, keys of type "rxrpc_s" must be made available to the server.
				465	They have a description of "<serviceID>:<securityIndex>" (eg: "52:2" for an
				466	rxkad key for the AFS VL service). When such a key is created, it should be
				467	given the server's secret key as the instantiation data (see the example
				468	below).
				469
				470	add_key("rxrpc_s", "52:2", secret_key, 8, keyring);
				471
				472	A keyring is passed to the server socket by naming it in a sockopt. The server
				473	socket then looks the server secret keys up in this keyring when secure
				474	incoming connections are made. This can be seen in an example program that can
				475	be found at:
				476
				477	http://people.redhat.com/~dhowells/rxrpc/listen.c
				478
				479
				480	====================
				481	EXAMPLE CLIENT USAGE
				482	====================
				483
				484	A client would issue an operation by:
				485
				486	(1) An RxRPC socket is set up by:
				487
				488	client = socket(AF_RXRPC, SOCK_DGRAM, PF_INET);
				489
				490	Where the third parameter indicates the protocol family of the transport
				491	socket used - usually IPv4 but it can also be IPv6 [TODO].
				492
				493	(2) A local address can optionally be bound:
				494
	struct sockaddr_rxrpc srx = {
		.srx_family = AF_RXRPC,
		.srx_service = 0,  /* we're a client */
		.transport_type = SOCK_DGRAM,	/* type of transport socket */
		.transport_len = sizeof(struct sockaddr_in),
		.transport.sin.sin_family = AF_INET,
		.transport.sin.sin_port = htons(7000), /* AFS callback */
		.transport.sin.sin_addr.s_addr = htonl(INADDR_ANY), /* all local interfaces */
	};
	bind(client, (struct sockaddr *)&srx, sizeof(srx));
				504
				505	This specifies the local UDP port to be used. If not given, a random
				506	non-privileged port will be used. A UDP port may be shared between
				507	several unrelated RxRPC sockets. Security is handled on a basis of
				508	per-RxRPC virtual connection.
				509
				510	(3) The security is set:
				511
				512	const char *key = "AFS:cambridge.redhat.com";
				513	setsockopt(client, SOL_RXRPC, RXRPC_SECURITY_KEY, key, strlen(key));
				514
				515	This issues a request_key() to get the key representing the security
				516	context. The minimum security level can be set:
				517
				518	unsigned int sec = RXRPC_SECURITY_ENCRYPTED;
				519	setsockopt(client, SOL_RXRPC, RXRPC_MIN_SECURITY_LEVEL,
				520	&sec, sizeof(sec));
				521
				522	(4) The server to be contacted can then be specified (alternatively this can
				523	be done through sendmsg):
				524
	struct sockaddr_rxrpc srx = {
		.srx_family = AF_RXRPC,
		.srx_service = VL_SERVICE_ID,
		.transport_type = SOCK_DGRAM,	/* type of transport socket */
		.transport_len = sizeof(struct sockaddr_in),
		.transport.sin.sin_family = AF_INET,
		.transport.sin.sin_port = htons(7005), /* AFS volume manager */
		.transport.sin.sin_addr = ...,
	};
	connect(client, (struct sockaddr *)&srx, sizeof(srx));
				534
 (5) The request data should then be posted to the client socket using a series
     of sendmsg() calls, each with the following control message attached:
				537
				538	RXRPC_USER_CALL_ID - specifies the user ID for this call
				539
				540	MSG_MORE should be set in msghdr::msg_flags on all but the last part of
				541	the request. Multiple requests may be made simultaneously.
				542
     If a call is intended to go to a destination other than the default
     specified through connect(), then msghdr::msg_name should be set on the
     first request message of that call.
				546
 (6) The reply data will then be posted to the client socket for recvmsg() to
     pick up. MSG_MORE will be flagged by recvmsg() if there's more reply data
     for a particular call to be read. MSG_EOR will be set on the terminal
     read for a call.
				551
				552	All data will be delivered with the following control message attached:
				553
				554	RXRPC_USER_CALL_ID - specifies the user ID for this call
				555
				556	If an abort or error occurred, this will be returned in the control data
				557	buffer instead, and MSG_EOR will be flagged to indicate the end of that
				558	call.
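
Step (5) of the client sequence can be sketched as below. This is an
illustration under assumptions rather than a reference implementation: the
values of SOL_RXRPC and RXRPC_USER_CALL_ID are written out by hand and assumed
to match the kernel's rxrpc header, and struct call_msg, prepare_call_msg()
and send_request_part() are hypothetical names.

```c
#include <stddef.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <sys/uio.h>

/* Assumed values; take them from the kernel's rxrpc header if possible. */
#ifndef SOL_RXRPC
#define SOL_RXRPC		272
#endif
#define RXRPC_USER_CALL_ID	1

/* Everything sendmsg() needs for one piece of data, kept in one place so
 * that the pointers inside msg stay valid. Do not copy this struct. */
struct call_msg {
	struct msghdr	msg;
	struct iovec	iov;
	union {
		struct cmsghdr	align;	/* forces cmsghdr alignment */
		char		buf[CMSG_SPACE(sizeof(unsigned long))];
	} ctrl;
};

/* Fill in m to carry one piece of data for the call tagged call_id. */
static void prepare_call_msg(struct call_msg *m, unsigned long call_id,
			     const void *data, size_t len)
{
	struct cmsghdr *cmsg;

	memset(m, 0, sizeof(*m));
	m->iov.iov_base = (void *)data;
	m->iov.iov_len = len;
	m->msg.msg_iov = &m->iov;
	m->msg.msg_iovlen = 1;
	m->msg.msg_control = m->ctrl.buf;
	m->msg.msg_controllen = sizeof(m->ctrl.buf);

	cmsg = CMSG_FIRSTHDR(&m->msg);
	cmsg->cmsg_level = SOL_RXRPC;
	cmsg->cmsg_type = RXRPC_USER_CALL_ID;
	cmsg->cmsg_len = CMSG_LEN(sizeof(call_id));
	memcpy(CMSG_DATA(cmsg), &call_id, sizeof(call_id));
}

/* Send one piece of the request; more != 0 means further data follows.
 * From userspace MSG_MORE is passed in the flags argument of sendmsg(). */
static ssize_t send_request_part(int fd, unsigned long call_id,
				 const void *data, size_t len, int more)
{
	struct call_msg m;

	prepare_call_msg(&m, call_id, data, len);
	return sendmsg(fd, &m.msg, more ? MSG_MORE : 0);
}
```

A request split into two pieces would be sent as send_request_part(client, 1,
hdr, hdr_len, 1) followed by send_request_part(client, 1, body, body_len, 0),
with the same call ID on both.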
				559
				560
				561	====================
				562	EXAMPLE SERVER USAGE
				563	====================
				564
				565	A server would be set up to accept operations in the following manner:
				566
				567	(1) An RxRPC socket is created by:
				568
				569	server = socket(AF_RXRPC, SOCK_DGRAM, PF_INET);
				570
				571	Where the third parameter indicates the address type of the transport
				572	socket used - usually IPv4.
				573
				574	(2) Security is set up if desired by giving the socket a keyring with server
				575	secret keys in it:
				576
	keyring = add_key("keyring", "AFSkeys", NULL, 0,
			  KEY_SPEC_PROCESS_KEYRING);

	const unsigned char secret_key[8] = {
		0xa7, 0x83, 0x8a, 0xcb, 0xc7, 0x83, 0xec, 0x94 };
	add_key("rxrpc_s", "52:2", secret_key, 8, keyring);

	setsockopt(server, SOL_RXRPC, RXRPC_SECURITY_KEYRING, "AFSkeys", 7);
				585
				586	The keyring can be manipulated after it has been given to the socket. This
				587	permits the server to add more keys, replace keys, etc. whilst it is live.
				588
 (3) A local address must then be bound:

	struct sockaddr_rxrpc srx = {
		.srx_family = AF_RXRPC,
		.srx_service = VL_SERVICE_ID, /* RxRPC service ID */
		.transport_type = SOCK_DGRAM,	/* type of transport socket */
		.transport_len = sizeof(struct sockaddr_in),
		.transport.sin.sin_family = AF_INET,
		.transport.sin.sin_port = htons(7000), /* AFS callback */
		.transport.sin.sin_addr.s_addr = htonl(INADDR_ANY), /* all local interfaces */
	};
	bind(server, (struct sockaddr *)&srx, sizeof(srx));

 (4) The server is then set to listen out for incoming calls:

	listen(server, 100);

 (5) The kernel notifies the server of pending incoming connections by sending
     it a message for each. This is received with recvmsg() on the server
     socket. It has no data, and has a single dataless control message
     attached:

	RXRPC_NEW_CALL

     The address that can be passed back by recvmsg() at this point should be
     ignored since the call for which the message was posted may have gone by
     the time it is accepted - in which case the first call still on the queue
     will be accepted.

 (6) The server then accepts the new call by issuing a sendmsg() with two
     pieces of control data and no actual data:

	RXRPC_ACCEPT		- indicate connection acceptance
	RXRPC_USER_CALL_ID	- specify user ID for this call

 (7) The first request data packet will then be posted to the server socket for
     recvmsg() to pick up. At that point, the RxRPC address for the call can
     be read from the address fields in the msghdr struct.
				626
				627	Subsequent request data will be posted to the server socket for recvmsg()
				628	to collect as it arrives. All but the last piece of the request data will
				629	be delivered with MSG_MORE flagged.
				630
				631	All data will be delivered with the following control message attached:
				632
				633	RXRPC_USER_CALL_ID - specifies the user ID for this call
				634
				635	(8) The reply data should then be posted to the server socket using a series
				636	of sendmsg() calls, each with the following control messages attached:
				637
				638	RXRPC_USER_CALL_ID - specifies the user ID for this call
				639
				640	MSG_MORE should be set in msghdr::msg_flags on all but the last message
				641	for a particular call.
				642
				643	(9) The final ACK from the client will be posted for retrieval by recvmsg()
				644	when it is received. It will take the form of a dataless message with two
				645	control messages attached:
				646
				647	RXRPC_USER_CALL_ID - specifies the user ID for this call
				648	RXRPC_ACK - indicates final ACK (no data)
				649
				650	MSG_EOR will be flagged to indicate that this is the final message for
				651	this call.
				652
				653	(10) Up to the point the final packet of reply data is sent, the call can be
				654	aborted by calling sendmsg() with a dataless message with the following
				655	control messages attached:
				656
				657	RXRPC_USER_CALL_ID - specifies the user ID for this call
				658	RXRPC_ABORT - indicates abort code (4 byte data)
				659
				660	Any packets waiting in the socket's receive queue will be discarded if
				661	this is issued.
				662
				663	Note that all the communications for a particular service take place through
				664	the one server socket, using control messages on sendmsg() and recvmsg() to
				665	determine the call affected.
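
The dataless accept and abort messages used by a server can be built with the
standard CMSG_*() macros. The sketch below shows one way to do so; the numeric
values of SOL_RXRPC and the RXRPC_* message IDs are assumed to match the
kernel's rxrpc header, and struct ctrl_msg, append_cmsg(), build_accept() and
build_abort() are hypothetical names.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>
#include <sys/socket.h>

/* Assumed values; take them from the kernel's rxrpc header if possible. */
#ifndef SOL_RXRPC
#define SOL_RXRPC		272
#endif
#define RXRPC_USER_CALL_ID	1
#define RXRPC_ABORT		2
#define RXRPC_ACCEPT		9

/* A dataless message with room for a call ID plus one small control item.
 * msg points into ctrl, so do not copy this struct after building it. */
struct ctrl_msg {
	struct msghdr	msg;
	union {
		struct cmsghdr	align;	/* forces cmsghdr alignment */
		char		buf[CMSG_SPACE(sizeof(unsigned long)) +
				    CMSG_SPACE(sizeof(uint32_t))];
	} ctrl;
};

/* Append one control message; data may be NULL when len is 0. */
static void append_cmsg(struct ctrl_msg *m, int type,
			const void *data, size_t len)
{
	struct cmsghdr *cmsg =
		(struct cmsghdr *)(m->ctrl.buf + m->msg.msg_controllen);

	cmsg->cmsg_level = SOL_RXRPC;
	cmsg->cmsg_type = type;
	cmsg->cmsg_len = CMSG_LEN(len);
	if (len)
		memcpy(CMSG_DATA(cmsg), data, len);
	m->msg.msg_controllen += CMSG_SPACE(len);
}

/* Accept the call at the head of the queue and tag it with call_id. */
static void build_accept(struct ctrl_msg *m, unsigned long call_id)
{
	memset(m, 0, sizeof(*m));
	m->msg.msg_control = m->ctrl.buf;
	append_cmsg(m, RXRPC_ACCEPT, NULL, 0);
	append_cmsg(m, RXRPC_USER_CALL_ID, &call_id, sizeof(call_id));
}

/* Abort the call tagged call_id with a 4-byte abort code. */
static void build_abort(struct ctrl_msg *m, unsigned long call_id,
			uint32_t abort_code)
{
	memset(m, 0, sizeof(*m));
	m->msg.msg_control = m->ctrl.buf;
	append_cmsg(m, RXRPC_USER_CALL_ID, &call_id, sizeof(call_id));
	append_cmsg(m, RXRPC_ABORT, &abort_code, sizeof(abort_code));
}
```

Either message would then be issued with sendmsg(server, &m.msg, 0). No iovec
is attached because both messages carry control data only.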
				667
				668	=========================
				669	AF_RXRPC KERNEL INTERFACE
				670	=========================
				671
				672	The AF_RXRPC module also provides an interface for use by in-kernel utilities
				673	such as the AFS filesystem. This permits such a utility to:
				674
				675	(1) Use different keys directly on individual client calls on one socket
				676	rather than having to open a whole slew of sockets, one for each key it
				677	might want to use.
				678
				679	(2) Avoid having RxRPC call request_key() at the point of issue of a call or
				680	opening of a socket. Instead the utility is responsible for requesting a
				681	key at the appropriate point. AFS, for instance, would do this during VFS
				682	operations such as open() or unlink(). The key is then handed through
				683	when the call is initiated.
				684
				685	(3) Request the use of something other than GFP_KERNEL to allocate memory.
				686
				687	(4) Avoid the overhead of using the recvmsg() call. RxRPC messages can be
				688	intercepted before they get put into the socket Rx queue and the socket
				689	buffers manipulated directly.
				690
To use the RxRPC facility, a kernel utility must still open an AF_RXRPC socket,
bind an address as appropriate and listen if it's to be a server socket, but
then it passes this to the kernel interface functions.
				694
				695	The kernel interface functions are as follows:
				696
				697	(*) Begin a new client call.
				698
	struct rxrpc_call *
	rxrpc_kernel_begin_call(struct socket *sock,
				struct sockaddr_rxrpc *srx,
				struct key *key,
				unsigned long user_call_ID,
				gfp_t gfp);
				705
				706	This allocates the infrastructure to make a new RxRPC call and assigns
				707	call and connection numbers. The call will be made on the UDP port that
				708	the socket is bound to. The call will go to the destination address of a
				709	connected client socket unless an alternative is supplied (srx is
				710	non-NULL).
				711
				712	If a key is supplied then this will be used to secure the call instead of
				713	the key bound to the socket with the RXRPC_SECURITY_KEY sockopt. Calls
				714	secured in this way will still share connections if at all possible.
				715
				716	The user_call_ID is equivalent to that supplied to sendmsg() in the
				717	control data buffer. It is entirely feasible to use this to point to a
				718	kernel data structure.
				719
				720	If this function is successful, an opaque reference to the RxRPC call is
				721	returned. The caller now holds a reference on this and it must be
				722	properly ended.
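The two override rules described above (an explicit srx wins over the connected address, and an explicit key wins over the socket's bound key) can be sketched as a small userspace model. This is not kernel code: the model_* types and functions are hypothetical stand-ins invented for illustration, and the behaviour when neither address is available is an assumption, since the text does not cover it.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for struct sockaddr_rxrpc and struct key. */
struct model_addr { unsigned short port; };
struct model_key { int id; };

/* Hypothetical stand-in for the state held by an AF_RXRPC socket. */
struct model_socket {
	const struct model_addr *connected_to;	/* NULL if not connected */
	const struct model_key *bound_key;	/* RXRPC_SECURITY_KEY, or NULL */
};

/*
 * Pick the call's destination: a non-NULL srx overrides the address of a
 * connected socket.  Returning NULL when neither exists is an assumption
 * of this model only.
 */
static const struct model_addr *
model_pick_target(const struct model_socket *sock,
		  const struct model_addr *srx)
{
	assert(sock != NULL);
	return srx ? srx : sock->connected_to;
}

/* Pick the key: a key passed with the call overrides the socket's key. */
static const struct model_key *
model_pick_key(const struct model_socket *sock,
	       const struct model_key *key)
{
	assert(sock != NULL);
	return key ? key : sock->bound_key;
}
```

In the real interface both choices are made inside rxrpc_kernel_begin_call(); the caller only decides whether to pass NULL for srx and key.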
				723
				724	(*) End a client call.
				725
				726	void rxrpc_kernel_end_call(struct rxrpc_call *call);
				727
				728	This is used to end a previously begun call. The user_call_ID is expunged
				729	from AF_RXRPC's knowledge and will not be seen again in association with
				730	the specified call.
				731
				732	(*) Send data through a call.
				733
				734	int rxrpc_kernel_send_data(struct rxrpc_call *call, struct msghdr *msg,
				735	size_t len);
				736
				737	This is used to supply either the request part of a client call or the
				738	reply part of a server call. msg.msg_iovlen and msg.msg_iov specify the
				739	data buffers to be used. msg_iov may not be NULL and must point
				740	exclusively to in-kernel virtual addresses. msg.msg_flags may be given
				741	MSG_MORE if there will be subsequent data sends for this call.
				742
				743	The msg must not specify a destination address, control data or any flags
				744	other than MSG_MORE. len is the total amount of data to transmit.
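The restrictions on msg listed above can be expressed as a checklist. The following is a userspace model only: model_check_send_msg() is a hypothetical helper, the use of -EINVAL is an assumption, and the userspace struct msghdr merely stands in for the in-kernel one.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <sys/socket.h>
#include <sys/uio.h>

#ifndef MSG_MORE
#define MSG_MORE 0x8000	/* Linux value; fallback for other systems */
#endif

/*
 * Return 0 if msg obeys the rules stated in the text, or -EINVAL (an
 * assumed error code) if it does not.
 */
static int model_check_send_msg(const struct msghdr *msg)
{
	assert(msg != NULL);

	/* msg_iov may not be NULL. */
	if (msg->msg_iov == NULL)
		return -EINVAL;

	/* No destination address may be specified. */
	if (msg->msg_name != NULL || msg->msg_namelen != 0)
		return -EINVAL;

	/* No control data may be specified. */
	if (msg->msg_control != NULL || msg->msg_controllen != 0)
		return -EINVAL;

	/* MSG_MORE is the only flag permitted. */
	if (msg->msg_flags & ~MSG_MORE)
		return -EINVAL;

	return 0;
}
```

A caller sending a request in several pieces would set MSG_MORE on every piece except the last.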
				745
				746	(*) Abort a call.
				747
				748	void rxrpc_kernel_abort_call(struct rxrpc_call *call, u32 abort_code);
				749
				750	This is used to abort a call if it's still in an abortable state. The
				751	abort code specified will be placed in the ABORT message sent.
				752
				753	(*) Intercept received RxRPC messages.
				754
				755	typedef void (*rxrpc_interceptor_t)(struct sock *sk,
				756	unsigned long user_call_ID,
				757	struct sk_buff *skb);
				758
				759	void
				760	rxrpc_kernel_intercept_rx_messages(struct socket *sock,
				761	rxrpc_interceptor_t interceptor);
				762
				763	This installs an interceptor function on the specified AF_RXRPC socket.
				764	All messages that would otherwise wind up in the socket's Rx queue are
				765	then diverted to this function. Note that care must be taken to process
				766	the messages in the right order to maintain DATA message sequentiality.
				767
				768	The interceptor function itself is provided with the address of the socket
				769	handling the incoming message, the ID assigned by the kernel utility to the
				770	call, and the socket buffer containing the message.
				771
				772	The skb->mark field indicates the type of message:
				773
				774	MARK                            MEANING
				775	=============================== =======================================
				776	RXRPC_SKB_MARK_DATA             Data message
				777	RXRPC_SKB_MARK_FINAL_ACK        Final ACK received for an incoming call
				778	RXRPC_SKB_MARK_BUSY             Client call rejected as server busy
				779	RXRPC_SKB_MARK_REMOTE_ABORT     Call aborted by peer
				780	RXRPC_SKB_MARK_NET_ERROR        Network error detected
				781	RXRPC_SKB_MARK_LOCAL_ERROR      Local error encountered
				782	RXRPC_SKB_MARK_NEW_CALL         New incoming call awaiting acceptance
				783
				784	The remote abort message can be probed with rxrpc_kernel_get_abort_code().
				785	The two error messages can be probed with rxrpc_kernel_get_error_number().
				786	A new call can be accepted with rxrpc_kernel_accept_call().
				787
				788	Data messages can have their contents extracted with the usual bunch of
				789	socket buffer manipulation functions. A data message can be determined to
				790	be the last one in a sequence with rxrpc_kernel_is_data_last(). When a
				791	data message has been used up, rxrpc_kernel_data_delivered() should be
				792	called on it.
				793
				794	Non-data messages should be handed to rxrpc_kernel_free_skb() to dispose
				795	of. It is possible to get extra refs on all types of message for later
				796	freeing, but this may pin the state of a call until the message is finally
				797	freed.
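The handling rules for each message type can be summarised as a dispatch on the mark. The sketch below is a userspace model, not an interceptor that could be installed: the model_* names and the enum numbering are hypothetical, and each returned action only names the real call that the text says to make.

```c
#include <assert.h>
#include <stddef.h>

/* Stand-ins for the RXRPC_SKB_MARK_* values; numbering is hypothetical. */
enum model_mark {
	MODEL_MARK_DATA,
	MODEL_MARK_FINAL_ACK,
	MODEL_MARK_BUSY,
	MODEL_MARK_REMOTE_ABORT,
	MODEL_MARK_NET_ERROR,
	MODEL_MARK_LOCAL_ERROR,
	MODEL_MARK_NEW_CALL
};

/* What the text says to do with each kind of message. */
enum model_action {
	MODEL_USE_THEN_DATA_DELIVERED,	  /* rxrpc_kernel_data_delivered() */
	MODEL_GET_ABORT_CODE_THEN_FREE,	  /* rxrpc_kernel_get_abort_code() */
	MODEL_GET_ERROR_NUMBER_THEN_FREE, /* rxrpc_kernel_get_error_number() */
	MODEL_ACCEPT_OR_REJECT_THEN_FREE, /* rxrpc_kernel_accept_call() etc. */
	MODEL_FREE_ONLY,		  /* rxrpc_kernel_free_skb() */
	MODEL_UNKNOWN_MARK
};

/* Hypothetical stand-in for struct sk_buff; only the mark is modelled. */
struct model_skb { unsigned int mark; };

static enum model_action model_dispatch(const struct model_skb *skb)
{
	assert(skb != NULL);

	switch (skb->mark) {
	case MODEL_MARK_DATA:
		/* Data is consumed, then recorded as delivered (and freed). */
		return MODEL_USE_THEN_DATA_DELIVERED;
	case MODEL_MARK_REMOTE_ABORT:
		return MODEL_GET_ABORT_CODE_THEN_FREE;
	case MODEL_MARK_NET_ERROR:
	case MODEL_MARK_LOCAL_ERROR:
		return MODEL_GET_ERROR_NUMBER_THEN_FREE;
	case MODEL_MARK_NEW_CALL:
		return MODEL_ACCEPT_OR_REJECT_THEN_FREE;
	case MODEL_MARK_FINAL_ACK:
	case MODEL_MARK_BUSY:
		return MODEL_FREE_ONLY;
	default:
		return MODEL_UNKNOWN_MARK;
	}
}
```

Only data messages go through rxrpc_kernel_data_delivered(); every other type ends with rxrpc_kernel_free_skb().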
				798
				799	(*) Accept an incoming call.
				800
				801	struct rxrpc_call *
				802	rxrpc_kernel_accept_call(struct socket *sock,
				803	unsigned long user_call_ID);
				804
				805	This is used to accept an incoming call and to assign it a call ID. This
				806	function is similar to rxrpc_kernel_begin_call() and calls accepted must
				807	be ended in the same way.
				808
				809	If this function is successful, an opaque reference to the RxRPC call is
				810	returned. The caller now holds a reference on this and it must be
				811	properly ended.
				812
				813	(*) Reject an incoming call.
				814
				815	int rxrpc_kernel_reject_call(struct socket *sock);
				816
				817	This is used to reject the first incoming call on the socket's queue with
				818	a BUSY message. -ENODATA is returned if there were no incoming calls.
				819	Other errors may be returned if the call had been aborted (-ECONNABORTED)
				820	or had timed out (-ETIME).
				821
				822	(*) Record the delivery of a data message and free it.
				823
				824	void rxrpc_kernel_data_delivered(struct sk_buff *skb);
				825
				826	This is used to record a data message as having been delivered and to
				827	update the ACK state for the call. The socket buffer will be freed.
				828
				829	(*) Free a message.
				830
				831	void rxrpc_kernel_free_skb(struct sk_buff *skb);
				832
				833	This is used to free a non-DATA socket buffer intercepted from an AF_RXRPC
				834	socket.
				835
				836	(*) Determine if a data message is the last one on a call.
				837
				838	bool rxrpc_kernel_is_data_last(struct sk_buff *skb);
				839
				840	This is used to determine if a socket buffer holds the last data message
				841	to be received for a call (true will be returned if it does, false
				842	if not).
				843
				844	The data message will be part of the reply on a client call and the
				845	request on an incoming call. In the latter case further messages will
				846	follow the last data message, but in the former case none will.
				847
				848	(*) Get the abort code from an abort message.
				849
				850	u32 rxrpc_kernel_get_abort_code(struct sk_buff *skb);
				851
				852	This is used to extract the abort code from a remote abort message.
				853
				854	(*) Get the error number from a local or network error message.
				855
				856	int rxrpc_kernel_get_error_number(struct sk_buff *skb);
				857
				858	This is used to extract the error number from a message indicating either
				859	a local error occurred or a network error occurred.
				860
				861	(*) Allocate a null key for doing anonymous security.
				862
				863	struct key *rxrpc_get_null_key(const char *keyname);
				864
				865	This is used to allocate a null RxRPC key that can be used to indicate
				866	anonymous security for a particular domain.