mac80211: defer tailroom counter manipulation when roaming

During roaming, the crypto_tx_tailroom_needed_cnt counter
will often take values 2,1,0,1,2 because first keys are
removed and then new keys are added. This is inefficient
because during the 0->1 transition, synchronize_net must
be called to avoid packet races, although typically no
packets would be flowing during that time.

To avoid that, defer the decrement (2->1, 1->0) when keys
are removed (by half a second). This means the counter
will really have the values 2,2,2,3,4 ... 2, thus never
reaching 0 and having to do the 0->1 transition.

Note that this patch entirely disregards the drivers for
which this optimisation was done to start with, for them
the key removal itself will be expensive because it has
to synchronize_net() after the counter is incremented to
remove the key from HW crypto. For them the sequence will
look like this: 0,1,0,1,0,1,0,1,0 (*) which is clearly a
lot more inefficient. This could be addressed separately,
during key removal the 0->1->0 sequence isn't necessary.

(*) it starts at 0 because HW crypto is on, then goes to
    1 when HW crypto is disabled for a key, then back to
    0 because the key is deleted; this happens for both
    keys in the example. When new keys are added, it goes
    to 1 first because they're added in software; when a
    key is moved to hardware it goes back to 0

Signed-off-by: Johannes Berg <johannes.berg@intel.com>
diff --git a/net/mac80211/ieee80211_i.h b/net/mac80211/ieee80211_i.h
index 0acc07b..54d09ec 100644
--- a/net/mac80211/ieee80211_i.h
+++ b/net/mac80211/ieee80211_i.h
@@ -680,6 +680,8 @@
 
 	/* count for keys needing tailroom space allocation */
 	int crypto_tx_tailroom_needed_cnt;
+	int crypto_tx_tailroom_pending_dec;
+	struct delayed_work dec_tailroom_needed_wk;
 
 	struct net_device *dev;
 	struct ieee80211_local *local;