/*****************************************/
Kernel Connector.
/*****************************************/

Kernel connector is an easy-to-use, netlink-based module for
communication between userspace and kernel space.

The Connector driver makes it easy to connect various agents using a
netlink based network. One must register a callback and an identifier.
When the driver receives a special netlink message with that
identifier, the registered callback will be called.

From the userspace point of view it's quite straightforward:

        socket();
        bind();
        send();
        recv();

But if kernelspace wants to use the full power of such connections, the
driver writer must create special sockets, must know about struct sk_buff
handling, etc. The Connector driver allows any kernelspace agent to use
netlink based networking for inter-process communication in a significantly
easier way:

int cn_add_callback(struct cb_id *id, char *name, void (*callback) (struct cn_msg *, struct netlink_skb_parms *));
int cn_netlink_send_multi(struct cn_msg *msg, u16 len, u32 portid, u32 __group, int gfp_mask);
int cn_netlink_send(struct cn_msg *msg, u32 portid, u32 __group, int gfp_mask);

struct cb_id
{
        __u32 idx;
        __u32 val;
};

idx and val are unique identifiers which must be registered in the
connector.h header for in-kernel usage. void (*callback) (struct cn_msg *,
struct netlink_skb_parms *) is a callback function which will be called
when a message with the above idx.val is received by the connector core.
It is passed the received message as a struct cn_msg * together with the
sender's credentials.

struct cn_msg
{
        struct cb_id id;

        __u32 seq;
        __u32 ack;

        __u16 len;      /* Length of the following data */
        __u16 flags;
        __u8 data[0];
};
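As a rough illustration of how the header and its payload fit together,
the userspace sketch below allocates a struct cn_msg with the data stored
directly after the header. It assumes the uapi header <linux/connector.h>
is installed (there the length field is 16 bits wide, hence the uint16_t
parameter). cn_msg_alloc() is a hypothetical helper written for this
example and is not part of the connector API.

```c
#include <assert.h>
#include <linux/connector.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/*
 * Hypothetical helper: allocate a cn_msg addressed to idx.val and copy
 * 'len' payload bytes in right behind the header.
 * Returns NULL if the allocation fails; the caller frees the result.
 */
static struct cn_msg *cn_msg_alloc(uint32_t idx, uint32_t val, uint32_t seq,
                                   const void *payload, uint16_t len)
{
    struct cn_msg *msg;

    assert(payload != NULL || len == 0);

    /* One allocation covers the fixed header plus the variable data. */
    msg = calloc(1, sizeof(*msg) + len);
    if (!msg)
        return NULL;

    msg->id.idx = idx;
    msg->id.val = val;
    msg->seq = seq;
    msg->ack = 0;
    msg->len = len;             /* length of the data that follows */
    if (len)
        memcpy(msg->data, payload, len);

    return msg;
}
```

The idx and val passed in would be whatever identifier the kernel side
registered; the total size to transmit is sizeof(struct cn_msg) + len.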

/*****************************************/
Connector interfaces.
/*****************************************/

int cn_add_callback(struct cb_id *id, char *name, void (*callback) (struct cn_msg *, struct netlink_skb_parms *));

 Registers a new callback with the connector core.

 struct cb_id *id               - unique connector's user identifier.
                                  It must be registered in connector.h for
                                  legal in-kernel users.
 char *name                     - connector's callback symbolic name.
 void (*callback) (struct cn..) - connector's callback. It receives the
                                  cn_msg and the sender's credentials.


void cn_del_callback(struct cb_id *id);

 Unregisters a callback from the connector core.

 struct cb_id *id               - unique connector's user identifier.


int cn_netlink_send_multi(struct cn_msg *msg, u16 len, u32 portid, u32 __group, int gfp_mask);
int cn_netlink_send(struct cn_msg *msg, u32 portid, u32 __group, int gfp_mask);

 Sends a message to the specified groups. It can be safely called from
 softirq context, but may silently fail under strong memory pressure.
 If there are no listeners for the given group, -ESRCH can be returned.

 struct cn_msg *    - message header (with attached data).
 u16 len            - for *_multi, multiple cn_msg messages can be sent.
 u32 portid         - destination port.
                      If non-zero, the message will be sent to the
                      given port, which should be set to the
                      original sender.
 u32 __group        - destination group.
                      If portid and __group are zero, then the appropriate
                      group will be searched for among all registered
                      connector users, and the message will be delivered
                      to the group which was created for the user with
                      the same ID as in msg.
                      If __group is not zero, then the message will be
                      delivered to the specified group.
 int gfp_mask       - GFP mask.

 Note: When registering a new callback user, the connector core assigns
 a netlink group to the user which is equal to its id.idx.
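To make the addressing rules above concrete, here is a toy userspace
model of them. It is only a sketch of the behaviour as described in this
section, not the kernel's implementation, and every name in it (toy_id,
toy_registry, toy_register, toy_resolve_group) is hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define TOY_MAX_USERS 8

struct toy_id {
    uint32_t idx;
    uint32_t val;
};

struct toy_user {
    struct toy_id id;
    uint32_t group;     /* assigned at registration, equal to id.idx */
};

struct toy_registry {
    struct toy_user users[TOY_MAX_USERS];
    size_t count;
};

/* Register a user; its group is set to id.idx, as in the note above. */
static int toy_register(struct toy_registry *r, struct toy_id id)
{
    assert(r != NULL);
    if (r->count == TOY_MAX_USERS)
        return -1;
    r->users[r->count].id = id;
    r->users[r->count].group = id.idx;
    r->count++;
    return 0;
}

/*
 * Choose the destination group for a message carrying 'id'.
 * An explicit portid or group is used as given (group 0 then means
 * "deliver to portid only"). If both are zero, the group of the
 * registered user with the same id is looked up.
 * Returns 0 and stores the group in *out, or -1 if nothing matches.
 */
static int toy_resolve_group(const struct toy_registry *r, struct toy_id id,
                             uint32_t portid, uint32_t group, uint32_t *out)
{
    size_t i;

    assert(r != NULL && out != NULL);
    if (portid || group) {
        *out = group;
        return 0;
    }
    for (i = 0; i < r->count; i++) {
        if (r->users[i].id.idx == id.idx && r->users[i].id.val == id.val) {
            *out = r->users[i].group;
            return 0;
        }
    }
    return -1;
}
```

In this model a user registered with id {5, 1} is reached through group 5
whenever both portid and __group are left at zero.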

/*****************************************/
Protocol description.
/*****************************************/

The current framework offers a transport layer with fixed headers. The
recommended protocol which uses such a header is as follows:

msg->seq and msg->ack are used to determine message genealogy. When
someone sends a message, they use a locally unique sequence and random
acknowledge number. The sequence number may be copied into
nlmsghdr->nlmsg_seq too.

The sequence number is incremented with each message sent.

If you expect a reply to the message, then the sequence number in the
received message MUST be the same as in the original message, and the
acknowledge number MUST be the original sequence number + 1.

If we receive a message and its sequence number is not equal to the one
we are expecting, then it is a new message. If we receive a message and
its sequence number is the same as the one we are expecting, but its
acknowledge number is not equal to the sequence number in the original
message + 1, then it is a new message.

Obviously, the protocol header contains the above id.

The connector allows event notification in the following form: a kernel
driver or userspace process can ask the connector to notify it when
selected ids are turned on or off (that is, when their callbacks are
registered or unregistered). This is done by sending a special command
to the connector driver (it also registers itself with id={-1, -1}).

An example of this usage can be found in the cn_test.c module, which
uses the connector to request notification and to send messages.

/*****************************************/
Reliability.
/*****************************************/

Netlink itself is not a reliable protocol. That means that messages can
be lost due to memory pressure or because the receiving process's queue
overflowed, so the caller is warned that it must be prepared for this.
That is why the struct cn_msg [main connector's message header] contains
the u32 seq and u32 ack fields.

/*****************************************/
Userspace usage.
/*****************************************/

2.6.14 has a new netlink socket implementation, which by default does not
allow people to send data to netlink groups other than 1.
So, if you wish to use a netlink socket (for example using connector)
with a different group number, the userspace application must subscribe to
that group first. It can be achieved by the following pseudocode:

        /* Needs <sys/socket.h>, <linux/netlink.h>, <stdio.h>, <string.h>, <unistd.h>. */
        struct sockaddr_nl l_local;
        int group = 12345;      /* number of the group to subscribe to */
        int s;

        s = socket(PF_NETLINK, SOCK_DGRAM, NETLINK_CONNECTOR);
        if (s == -1) {
                perror("socket");
                return -1;
        }

        memset(&l_local, 0, sizeof(l_local));
        l_local.nl_family = AF_NETLINK;
        l_local.nl_groups = 0;  /* a bitmask, so it only covers groups 1..32 */
        l_local.nl_pid = 0;     /* let the kernel pick the port ID */

        if (bind(s, (struct sockaddr *)&l_local, sizeof(struct sockaddr_nl)) == -1) {
                perror("bind");
                close(s);
                return -1;
        }

        /* Subscribe by group number rather than through the bind() bitmask. */
        if (setsockopt(s, SOL_NETLINK, NETLINK_ADD_MEMBERSHIP, &group, sizeof(group)) == -1) {
                perror("setsockopt");
                close(s);
                return -1;
        }

SOL_NETLINK above has the value 270, and NETLINK_ADD_MEMBERSHIP is socket
option 1. To drop a multicast subscription, one should call setsockopt()
in the same way with the NETLINK_DROP_MEMBERSHIP option, which is defined
as 2.
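Once the socket is subscribed, a userspace sender still has to wrap each
struct cn_msg in a netlink header before calling send(). The sketch below
shows one plausible layout using the standard NLMSG_* macros from
<linux/netlink.h>. The helper name cn_build_packet() is hypothetical, and
the use of NLMSG_DONE as the message type is an assumption modelled on
common connector sample code rather than something this document
specifies.

```c
#include <assert.h>
#include <sys/socket.h>
#include <linux/netlink.h>
#include <linux/connector.h>
#include <stdint.h>
#include <string.h>
#include <unistd.h>

/*
 * Hypothetical helper: lay out nlmsghdr + cn_msg + payload in 'buf',
 * which must be suitably aligned for struct nlmsghdr.
 * Returns the number of bytes to hand to send(), or 0 if 'buf' is
 * too small to hold the whole packet.
 */
static size_t cn_build_packet(void *buf, size_t buflen, uint32_t idx,
                              uint32_t val, uint32_t seq,
                              const void *payload, uint16_t len)
{
    size_t total = NLMSG_SPACE(sizeof(struct cn_msg) + len);
    struct nlmsghdr *nlh = buf;
    struct cn_msg *msg;

    assert(buf != NULL);
    assert(payload != NULL || len == 0);
    if (buflen < total)
        return 0;

    memset(buf, 0, total);
    nlh->nlmsg_len = NLMSG_LENGTH(sizeof(struct cn_msg) + len);
    nlh->nlmsg_type = NLMSG_DONE;   /* assumption, see the text above */
    nlh->nlmsg_seq = seq;           /* mirrors msg->seq */
    nlh->nlmsg_pid = getpid();

    /* The connector header starts right after the netlink header. */
    msg = NLMSG_DATA(nlh);
    msg->id.idx = idx;
    msg->id.val = val;
    msg->seq = seq;
    msg->ack = 0;
    msg->len = len;
    if (len)
        memcpy(msg->data, payload, len);

    return nlh->nlmsg_len;
}
```

The returned length would then be passed to send() on the socket created
above, for example send(s, buf, n, 0).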

2.6.14 netlink code only allows selecting a group which is less than or
equal to the maximum group number, which is set at
netlink_kernel_create() time. In the case of connector it is
CN_NETLINK_USERS + 0xf, so if you want to use group number 12345, you
must increase CN_NETLINK_USERS to that number. The additional 0xf
numbers are allocated for use by non-in-kernel users.

Due to this limitation, group 0xffffffff does not work now, so one
cannot use add/remove connector's group notifications, but as far as I
know, only the cn_test.c test module used it.

Some work in the netlink area is still being done, so things may change
in the 2.6.15 timeframe. If that happens, the documentation will be
updated for that kernel.