/*****************************************/
Kernel Connector.
/*****************************************/

Kernel connector is a new, easy-to-use, netlink based module for
communication between userspace and kernel space.

The Connector driver makes it easy to connect various agents using a
netlink based network.  One must register a callback and an identifier.
When the driver receives a special netlink message with the appropriate
identifier, the matching callback will be called.

From the userspace point of view it's quite straightforward:

	socket();
	bind();
	send();
	recv();

But if kernelspace wants to use the full power of such connections, the
driver writer must create special sockets, must know about struct sk_buff
handling, etc...  The Connector driver allows any kernelspace agent to use
netlink based networking for inter-process communication in a significantly
easier way:

int cn_add_callback(struct cb_id *id, char *name, void (*callback) (void *));
int cn_netlink_send(struct cn_msg *msg, u32 __group, int gfp_mask);

struct cb_id
{
	__u32			idx;
	__u32			val;
};

idx and val are unique identifiers which must be registered in the
connector.h header for in-kernel usage.  void (*callback) (void *) is a
callback function which will be called when a message with the above
idx.val is received by the connector core.  The argument to that function
must be cast to struct cn_msg *.

struct cn_msg
{
	struct cb_id		id;

	__u32			seq;
	__u32			ack;

	__u32			len;	/* Length of the following data */
	__u8			data[0];
};
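As an illustration of the layout above, here is a minimal userspace sketch
that allocates a message with its payload placed directly after the header.
The demo_ names are hypothetical mirrors of the kernel structures, written
with <stdint.h> types so the sketch builds outside the kernel; they are not
the definitions from connector.h.

```c
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical userspace mirrors of struct cb_id and struct cn_msg. */
struct demo_cb_id {
	uint32_t idx;
	uint32_t val;
};

struct demo_cn_msg {
	struct demo_cb_id id;
	uint32_t seq;
	uint32_t ack;
	uint32_t len;		/* Length of the following data */
	uint8_t data[];		/* C99 flexible array member */
};

/*
 * Allocate a message with room for len payload bytes and fill in the
 * header.  Returns NULL on allocation failure; the caller frees it.
 */
struct demo_cn_msg *demo_cn_msg_alloc(uint32_t idx, uint32_t val,
				      uint32_t seq, uint32_t ack,
				      const void *payload, uint32_t len)
{
	struct demo_cn_msg *msg = malloc(sizeof(*msg) + len);

	if (!msg)
		return NULL;

	msg->id.idx = idx;
	msg->id.val = val;
	msg->seq = seq;
	msg->ack = ack;
	msg->len = len;
	if (len)
		memcpy(msg->data, payload, len);
	return msg;
}
```

A caller would use something like demo_cn_msg_alloc(idx, val, seq, ack,
"hello", 6) and free() the result once the message has been sent.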

/*****************************************/
Connector interfaces.
/*****************************************/

int cn_add_callback(struct cb_id *id, char *name, void (*callback) (void *));

 Registers a new callback with the connector core.

 struct cb_id *id              - unique connector's user identifier.
                                 It must be registered in connector.h for
                                 legal in-kernel users.
 char *name                    - connector's callback symbolic name.
 void (*callback) (void *)     - connector's callback.
                                 Argument must be cast to struct cn_msg *.


void cn_del_callback(struct cb_id *id);

 Unregisters a callback from the connector core.

 struct cb_id *id              - unique connector's user identifier.
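The registration and dispatch idea can be modelled in plain userspace C.
The sketch below is only a model under stated assumptions, not the connector
core: the demo_ names, the fixed-size table and the rule that a duplicate ID
is rejected are all choices made for this illustration.

```c
#include <stddef.h>
#include <stdint.h>

#define DEMO_MAX_CALLBACKS 8	/* arbitrary limit for this sketch */

struct demo_cb_id { uint32_t idx; uint32_t val; };

struct demo_cn_msg {
	struct demo_cb_id id;
	uint32_t seq, ack, len;
	uint8_t data[];
};

struct demo_entry {
	struct demo_cb_id id;
	void (*callback)(void *);
	int used;
};

static struct demo_entry demo_table[DEMO_MAX_CALLBACKS];
uint32_t demo_last_seq;		/* set by the example callback below */

static struct demo_entry *demo_find(const struct demo_cb_id *id)
{
	for (size_t i = 0; i < DEMO_MAX_CALLBACKS; i++)
		if (demo_table[i].used &&
		    demo_table[i].id.idx == id->idx &&
		    demo_table[i].id.val == id->val)
			return &demo_table[i];
	return NULL;
}

/* Register callback for id; 0 on success, -1 if id is taken or table full. */
int demo_add_callback(const struct demo_cb_id *id, void (*callback)(void *))
{
	if (demo_find(id))
		return -1;
	for (size_t i = 0; i < DEMO_MAX_CALLBACKS; i++) {
		if (!demo_table[i].used) {
			demo_table[i].id = *id;
			demo_table[i].callback = callback;
			demo_table[i].used = 1;
			return 0;
		}
	}
	return -1;
}

/* Remove the callback registered for id, if any. */
void demo_del_callback(const struct demo_cb_id *id)
{
	struct demo_entry *e = demo_find(id);

	if (e)
		e->used = 0;
}

/* Call the callback registered for id with msg; -1 if none is registered. */
int demo_dispatch(const struct demo_cb_id *id, void *msg)
{
	struct demo_entry *e = demo_find(id);

	if (!e)
		return -1;
	e->callback(msg);
	return 0;
}

/* Example callback: casts its argument to the message type, as described. */
void demo_callback(void *data)
{
	struct demo_cn_msg *msg = data;

	demo_last_seq = msg->seq;
}
```

Here demo_dispatch() plays the role of the connector core receiving a
message with a given idx.val and calling whatever was registered for it.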


int cn_netlink_send(struct cn_msg *msg, u32 __group, int gfp_mask);

 Sends a message to the specified group.  It can be safely called from
 softirq context, but may silently fail under strong memory pressure.
 If there are no listeners for the given group, -ESRCH can be returned.

 struct cn_msg *               - message header (with attached data).
 u32 __group                   - destination group.
                                 If __group is zero, then the appropriate
                                 group will be searched for among all
                                 registered connector users, and the message
                                 will be delivered to the group which was
                                 created for the user with the same ID as
                                 in msg.
                                 If __group is not zero, then the message
                                 will be delivered to the specified group.
 int gfp_mask                  - GFP mask.

 Note: When registering a new callback user, the connector core assigns
 a netlink group to the user which is equal to its id.idx.
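The group selection rule described for __group can be sketched as a small
lookup.  This is a userspace model, not the kernel code: the demo_ names are
hypothetical, and returning -1 when no user matches is an assumption of the
sketch rather than a documented error code.

```c
#include <stddef.h>
#include <stdint.h>

struct demo_cb_id { uint32_t idx; uint32_t val; };

/* One registered connector user; group is assigned at registration. */
struct demo_user {
	struct demo_cb_id id;
	uint32_t group;
};

/* Registration assigns a group equal to id.idx, as in the note above. */
struct demo_user demo_register(struct demo_cb_id id)
{
	struct demo_user u = { .id = id, .group = id.idx };

	return u;
}

/*
 * Pick the destination group for a message carrying msg_id.
 * Returns 0 and stores the group in *out, or -1 if group is zero and
 * no registered user has the same ID.
 */
int demo_resolve_group(const struct demo_user *users, size_t n,
		       struct demo_cb_id msg_id, uint32_t group,
		       uint32_t *out)
{
	if (group != 0) {
		*out = group;	/* an explicit group is used as given */
		return 0;
	}
	for (size_t i = 0; i < n; i++) {
		if (users[i].id.idx == msg_id.idx &&
		    users[i].id.val == msg_id.val) {
			*out = users[i].group;
			return 0;
		}
	}
	return -1;
}
```

With users registered as {3, 1} and {9, 2}, a message with ID {9, 2} and
a zero group resolves to group 9, while a non-zero group is passed through
unchanged.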

/*****************************************/
Protocol description.
/*****************************************/

The current framework offers a transport layer with fixed headers.  The
recommended protocol which uses such a header is as follows:

msg->seq and msg->ack are used to determine message genealogy.  When
someone sends a message, they use a locally unique sequence and random
acknowledge number.  The sequence number may be copied into
nlmsghdr->nlmsg_seq too.

The sequence number is incremented with each message sent.

If you expect a reply to the message, then the sequence number in the
received message MUST be the same as in the original message, and the
acknowledge number MUST be the original acknowledge number + 1.

If we receive a message and its sequence number is not equal to the one
we are expecting, then it is a new message.  If we receive a message and
its sequence number is the same as the one we are expecting, but its
acknowledge number is not equal to the acknowledge number in the original
message + 1, then it is a new message.
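The reply rule above comes down to two comparisons.  A minimal sketch,
assuming a hypothetical demo_seq_ack pair that holds just the seq and ack
fields of a message:

```c
#include <stdbool.h>
#include <stdint.h>

/* Only the two header fields the rule looks at (hypothetical type). */
struct demo_seq_ack {
	uint32_t seq;
	uint32_t ack;
};

/*
 * Returns true if received is a reply to sent: same sequence number and
 * an acknowledge number equal to the original one plus 1 (with u32
 * wrap-around).  Anything else is treated as a new message.
 */
bool demo_is_reply(struct demo_seq_ack sent, struct demo_seq_ack received)
{
	return received.seq == sent.seq &&
	       received.ack == (uint32_t)(sent.ack + 1);
}

/* Build the seq/ack pair that a reply to request is expected to carry. */
struct demo_seq_ack demo_make_reply(struct demo_seq_ack request)
{
	struct demo_seq_ack reply = {
		.seq = request.seq,
		.ack = (uint32_t)(request.ack + 1),
	};

	return reply;
}
```

The sender keeps the seq/ack pair it used, and passes each incoming
message through demo_is_reply() to decide whether it answers that request.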

Obviously, the protocol header contains the above id.

The connector allows event notification in the following form: a kernel
driver or userspace process can ask the connector to notify it when
selected ids are turned on or off (that is, when their callbacks are
registered or unregistered).  This is done by sending a special command
to the connector driver (it also registers itself with id={-1, -1}).

An example of this usage can be found in the cn_test.c module, which
uses the connector to request notification and to send messages.

/*****************************************/
Reliability.
/*****************************************/

Netlink itself is not a reliable protocol.  That means that messages can
be lost due to memory pressure or because a process' receive queue has
overflowed, so the caller is warned that it must be prepared for this.
That is why struct cn_msg [the main connector message header] contains
the u32 seq and u32 ack fields.

/*****************************************/
Userspace usage.
/*****************************************/

2.6.14 has a new netlink socket implementation, which by default does not
allow people to send data to netlink groups other than 1.
So, if you wish to use a netlink socket (for example using connector)
with a different group number, the userspace application must subscribe to
that group first.  It can be achieved by the following pseudocode:

	struct sockaddr_nl l_local;
	int s, on;

	s = socket(PF_NETLINK, SOCK_DGRAM, NETLINK_CONNECTOR);
	if (s == -1) {
		perror("socket");
		return -1;
	}

	memset(&l_local, 0, sizeof(l_local));
	l_local.nl_family = AF_NETLINK;
	l_local.nl_groups = 12345;
	l_local.nl_pid = 0;

	if (bind(s, (struct sockaddr *)&l_local, sizeof(struct sockaddr_nl)) == -1) {
		perror("bind");
		close(s);
		return -1;
	}

	/* Subscribe to the group; 270 and 1 are explained below. */
	on = l_local.nl_groups;
	if (setsockopt(s, 270, 1, &on, sizeof(on)) == -1)
		perror("setsockopt");

Where 270 above is SOL_NETLINK, and 1 is the NETLINK_ADD_MEMBERSHIP socket
option.  To drop a multicast subscription, one should make the same
setsockopt() call with the NETLINK_DROP_MEMBERSHIP option, which is defined
as 2 in <linux/netlink.h>.

2.6.14 netlink code only allows selecting a group which is less than or
equal to the maximum group number, which is set at netlink_kernel_create()
time.  In the case of connector it is CN_NETLINK_USERS + 0xf, so if you want
to use group number 12345, you must increase CN_NETLINK_USERS to that number.
The additional 0xf numbers are allocated for use by non-in-kernel users.
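The limit above can be written as a one-line check.  This is a sketch, not
kernel code: CN_NETLINK_USERS varies between kernel versions, so it is
passed in as a parameter, and treating group 0 as invalid is an assumption
of the sketch.

```c
#include <stdbool.h>
#include <stdint.h>

/* The 0xf extra group numbers set aside for non-in-kernel users. */
#define DEMO_EXTRA_GROUPS 0xf

/*
 * cn_netlink_users stands in for the kernel's CN_NETLINK_USERS.
 * Returns true if group does not exceed CN_NETLINK_USERS + 0xf.
 */
bool demo_group_allowed(uint32_t group, uint32_t cn_netlink_users)
{
	uint32_t max_group = cn_netlink_users + DEMO_EXTRA_GROUPS;

	return group >= 1 && group <= max_group;
}
```

For example, with a hypothetical CN_NETLINK_USERS of 1 the largest usable
group is 16, so group 12345 is rejected until CN_NETLINK_USERS is raised.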

Due to this limitation, group 0xffffffff does not work now, so one cannot
use the connector's add/remove group notifications, but as far as I know,
only the cn_test.c test module used them.

Some work in the netlink area is still being done, so things may change in
the 2.6.15 timeframe.  If that happens, the documentation will be updated
for that kernel.