------------------------------------------------------------------------------------
                             IPv4 Queuing Documentation
------------------------------------------------------------------------------------

Note: this file is temporary until the documentation is complete.

Upgrade information:
  * If upgrading from the queue device (v0.90.4 or below), you will need to
    delete the old shared library, usually found in
    /usr/local/lib/iptables/libipt_QUEUE.so

TODO List:
  * Non-blocking I/O for userspace API
  * Buffered verdicts
  * Reschedule processing if userspace busy
  * Better session reliability
  * Testsuite scripts, fix/improve tools
  * Documentation
  * Multiple queues per protocol?
  * Performance analysis
  * Userspace language bindings


Overview:
The following diagram is a conceptual view of how the queue operates:

  +---------+
  |  QUEUE  |
  +---------+
  |         |
  |  +---+  |  --> dequeue() --> nf_reinject() [stack]
  |  | V |  |
  |  +---+  |
  |         |
  |  +---+  |
  |  | W |  |
  |  +---+  |
  |         |
  |  +---+  |
  |  | V |  |
  |  +---+  |
  |         |
  |  +---+  |
  |  | V |  |  <-- set_verdict() [user]
  |  +---+  |
  |         |
  |  +---+  |
  |  | W |  |
  |  +---+  |
  |         |
  |  +---+  |
  |  | N |  |  --> notify_user() [user]
  |  +---+  |
  |         |
  +---------+  <-- set_mode() [user]
       ^
       |
   enqueue()
       ^
       |
   nf_queue() [stack]


The queue is processed by a kernel thread, which is woken up on enqueue(),
set_mode() and set_verdict().

Because the queue is modal and netlink is connectionless, a reasonable amount
of state needs to be maintained.

Packet states:
  N = new packet (default initial state)
  W = user notified, waiting for verdict
  V = verdict set (usually by user)

Queue states (settable by user):
  * HOLD (default initial state)
      enqueue packets
      do not notify user
      do not accept verdicts
      do not dequeue packets

  * NORMAL
      enqueue packets
      notify user of new packets (may copy entire packet)
      accept verdicts from user (may include modified packet)
      dequeue packets

  * FLUSH (returns to HOLD when queue is empty, unless terminating)
      do not enqueue packets
      do not notify user
      set verdicts on all packets to NF_DROP
      dequeue all packets for dropping

Note that in the HOLD and NORMAL queue states, new packets are dropped if the
queue is full.
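
The queue and packet states above can be read as a small table-driven state
machine. The following is a minimal sketch of that reading, not the kernel
code: every name in it (queue_mode, pkt_state, mode_policy, policy_for,
can_enqueue, next_mode, on_notify, on_user_verdict) is hypothetical and was
chosen only for illustration.

```c
#include <stdbool.h>

/* Hypothetical names throughout; they only mirror the text above. */
enum queue_mode { MODE_HOLD, MODE_NORMAL, MODE_FLUSH };
enum pkt_state  { PKT_NEW, PKT_WAITING, PKT_VERDICT };   /* N, W, V */

struct mode_policy {
    bool enqueue;         /* accept new packets from the stack */
    bool notify_user;     /* send new packets to userspace */
    bool accept_verdict;  /* take verdicts from userspace */
    bool dequeue;         /* hand packets with a verdict back or drop them */
};

/* What the queue does in each mode, as listed under "Queue states". */
static struct mode_policy policy_for(enum queue_mode mode)
{
    switch (mode) {
    case MODE_NORMAL:
        return (struct mode_policy){ true, true, true, true };
    case MODE_FLUSH:
        /* Verdicts are forced to NF_DROP by the queue, not taken from the user. */
        return (struct mode_policy){ false, false, false, true };
    case MODE_HOLD:
    default:
        return (struct mode_policy){ true, false, false, false };
    }
}

/* New packets are dropped when the mode refuses them or the queue is full. */
static bool can_enqueue(enum queue_mode mode, int queued, int capacity)
{
    return policy_for(mode).enqueue && queued < capacity;
}

/* FLUSH falls back to HOLD once the queue is empty, unless terminating. */
static enum queue_mode next_mode(enum queue_mode mode, int queued, bool terminating)
{
    if (mode == MODE_FLUSH && queued == 0 && !terminating)
        return MODE_HOLD;
    return mode;
}

/* N -> W when the user has been notified. */
static enum pkt_state on_notify(enum pkt_state s)
{
    return s == PKT_NEW ? PKT_WAITING : s;
}

/* W -> V when the user sets a verdict; other states are left unchanged. */
static enum pkt_state on_user_verdict(enum pkt_state s)
{
    return s == PKT_WAITING ? PKT_VERDICT : s;
}
```

For example, policy_for(MODE_HOLD) accepts packets but neither notifies the
user nor dequeues, which is why a held queue eventually fills up.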

Known bugs:
  - Userspace app gets unknown message from kernel if it sends an invalid
    message type (should get an NLMSG_ERROR).

Documentation notes:
  libipq:
    - The queue is held after a flush completes; the user must either start
      copying or shut down, or the queue will fill up.

    - If you get an IPQ_ERR_RTRUNC message, your local receive
      buffer is probably too small. Netlink has no way of detecting
      this, and thinks the message was delivered (technically, it was,
      to your *socket* receive buffer though). Thus you need to respond
      with an NF_DROP for the packet and use a bigger buffer.

    - If you modify a packet, you must recalculate checksums as
      appropriate before sending it back.

    - The code won't stop you from doing this, but try not to set an NF_QUEUE
      verdict on packets.
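
The receive-truncation note above describes a general property of datagram
sockets: the sender sees a successful delivery, and only the receiver learns
that its buffer was too small. The sketch below demonstrates that mechanism
with an AF_UNIX datagram socket pair standing in for netlink. It does not use
libipq, the helper name datagram_truncated is hypothetical, and whether libipq
detects the condition in exactly this way is an assumption.

```c
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <sys/uio.h>
#include <unistd.h>

/*
 * Send a msg_len-byte datagram over a local socket pair and read it back
 * into a buf_len-byte buffer.
 * Returns 1 if the kernel flagged the read as truncated, 0 if the whole
 * message fit, and -1 on error or invalid sizes.
 */
static int datagram_truncated(size_t msg_len, size_t buf_len)
{
    char msg[4096];
    char buf[4096];
    int sv[2];
    int result = -1;

    if (msg_len == 0 || msg_len > sizeof(msg) ||
        buf_len == 0 || buf_len > sizeof(buf))
        return -1;
    if (socketpair(AF_UNIX, SOCK_DGRAM, 0, sv) < 0)
        return -1;

    memset(msg, 'x', msg_len);

    /* The send succeeds in full: the message reaches the socket buffer. */
    if (send(sv[0], msg, msg_len, 0) == (ssize_t)msg_len) {
        struct iovec iov = { .iov_base = buf, .iov_len = buf_len };
        struct msghdr hdr;

        memset(&hdr, 0, sizeof(hdr));
        hdr.msg_iov = &iov;
        hdr.msg_iovlen = 1;

        /* Only the receiver can see that its own buffer was too small. */
        if (recvmsg(sv[1], &hdr, 0) >= 0)
            result = (hdr.msg_flags & MSG_TRUNC) ? 1 : 0;
    }

    close(sv[0]);
    close(sv[1]);
    return result;
}
```

A 100-byte message read into a 16-byte buffer is reported as truncated, while
the same message read into a 256-byte buffer is not. The rest of a truncated
datagram is discarded, which is why the note asks for an NF_DROP verdict and a
larger buffer rather than a retry.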
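
As an illustration of the checksum note above, the sketch below recomputes the
IPv4 header checksum after a header has been modified, using the standard
one's-complement Internet checksum (RFC 1071). The function names
inet_checksum and ipv4_fix_checksum are hypothetical and not part of libipq.
Only the IP header is covered; TCP and UDP checksums also include a
pseudo-header and the payload and would need separate handling.

```c
#include <stddef.h>
#include <stdint.h>

/* One's-complement Internet checksum over len bytes, in host byte order. */
static uint16_t inet_checksum(const uint8_t *data, size_t len)
{
    uint32_t sum = 0;
    size_t i;

    /* Add the data as big-endian 16-bit words. */
    for (i = 0; i + 1 < len; i += 2)
        sum += ((uint32_t)data[i] << 8) | data[i + 1];
    /* A trailing odd byte is padded with a zero low byte. */
    if (len & 1)
        sum += (uint32_t)data[len - 1] << 8;
    /* Fold the carries back into the low 16 bits. */
    while (sum >> 16)
        sum = (sum & 0xffff) + (sum >> 16);
    return (uint16_t)~sum;
}

/*
 * Recompute the header checksum of the IPv4 header at hdr in place.
 * avail is the number of valid bytes at hdr.
 * Returns 0 on success, -1 if the buffer does not hold a valid IPv4 header.
 */
static int ipv4_fix_checksum(uint8_t *hdr, size_t avail)
{
    size_t ihl;
    uint16_t csum;

    if (avail < 20 || (hdr[0] >> 4) != 4)
        return -1;
    ihl = (size_t)(hdr[0] & 0x0f) * 4;   /* header length in bytes */
    if (ihl < 20 || ihl > avail)
        return -1;

    /* The checksum field (bytes 10 and 11) must be zero while summing. */
    hdr[10] = 0;
    hdr[11] = 0;
    csum = inet_checksum(hdr, ihl);
    hdr[10] = (uint8_t)(csum >> 8);
    hdr[11] = (uint8_t)(csum & 0xff);
    return 0;
}
```

For the 20-byte header 45 00 00 73 00 00 40 00 40 11 00 00 c0 a8 00 01
c0 a8 00 c7, ipv4_fix_checksum() stores b8 61 in the checksum field, and
running inet_checksum() over the fixed header then yields 0.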