tunnels: Optimize tx path

We currently dirty a cache line to update the tunnel device stats
(tx_packets/tx_bytes). It is better to use the txq->tx_bytes and
txq->tx_packets counters, which are already in the CPU cache, in the
cache line shared with txq->_xmit_lock.

This patch extends the IPTUNNEL_XMIT() macro to use the txq pointer
provided by the caller.

Also, &tunnel->dev->stats can be replaced by &dev->stats.
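
For illustration only, the accounting split described above can be
sketched in plain userspace C. This is not the kernel code: the
sketch_* names and struct layouts are hypothetical stand-ins, and the
success test is simplified to err == 0 rather than going through
net_xmit_eval(). The point is only that the success path touches the
per-queue counters, while the error path still uses the device stats.

```c
#include <stdint.h>

/* Hypothetical stand-in for the per-queue structure. In the kernel the
 * counters sit next to txq->_xmit_lock, which the transmit path has
 * already touched; here the layout is only illustrative. */
struct sketch_txq {
	int xmit_lock;		/* placeholder for txq->_xmit_lock */
	uint64_t tx_packets;
	uint64_t tx_bytes;
};

/* Hypothetical stand-in for the device-wide stats block. */
struct sketch_dev_stats {
	uint64_t tx_errors;
	uint64_t tx_aborted_errors;
};

/* Account for one transmit attempt.
 * Assumption: err == 0 means the packet was sent successfully. */
static void sketch_account_tx(struct sketch_txq *txq,
			      struct sketch_dev_stats *stats,
			      int err, unsigned int pkt_len)
{
	if (err == 0) {
		/* Success: update only the per-queue counters. */
		txq->tx_bytes += pkt_len;
		txq->tx_packets++;
	} else {
		/* Failure: the rare path still updates device stats. */
		stats->tx_errors++;
		stats->tx_aborted_errors++;
	}
}
```

In this sketch the common (successful) path never writes to the
device-wide stats structure, which mirrors the intent of the change.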

Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
diff --git a/include/net/ipip.h b/include/net/ipip.h
index 87acf8f..0159221 100644
--- a/include/net/ipip.h
+++ b/include/net/ipip.h
@@ -42,9 +42,9 @@
 	ip_select_ident(iph, &rt->u.dst, NULL);				\
 									\
 	err = ip_local_out(skb);					\
-	if (net_xmit_eval(err) == 0) {					\
-		stats->tx_bytes += pkt_len;				\
-		stats->tx_packets++;					\
+	if (likely(net_xmit_eval(err) == 0)) {				\
+		txq->tx_bytes += pkt_len;				\
+		txq->tx_packets++;					\
 	} else {							\
 		stats->tx_errors++;					\
 		stats->tx_aborted_errors++;				\