net: bcmgenet: fix dev->stats.tx_bytes accounting

1. Add bytes_compl local variable to __bcmgenet_tx_reclaim() to collect
   transmitted bytes. dev->stats updates can then be moved outside the
   while-loop. bytes_compl is also needed for future BQL support.
2. When bcmgenet device uses Tx checksum offload, each transmitted skb
   gets an extra 64-byte header prepended to it. Before this header is
   prepended to the skb, we need to save the skb "wire" length in
   GENET_CB(skb)->bytes_sent, so that proper Tx bytes accounting can
   be done in __bcmgenet_tx_reclaim().
3. skb->len covers the entire length of skb, whether it is linear or
   fragmented. Thus, when we clean the fragments, do not increase
   transmitted bytes.

Fixes: 1c1008c793fa ("net: bcmgenet: add main driver file")
Signed-off-by: Petri Gynther <pgynther@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
diff --git a/drivers/net/ethernet/broadcom/genet/bcmgenet.h b/drivers/net/ethernet/broadcom/genet/bcmgenet.h
index 9673675..1e2dc34 100644
--- a/drivers/net/ethernet/broadcom/genet/bcmgenet.h
+++ b/drivers/net/ethernet/broadcom/genet/bcmgenet.h
@@ -531,6 +531,12 @@
 	u32		flags;
 };
 
+struct bcmgenet_skb_cb {
+	unsigned int bytes_sent;	/* bytes on the wire (no TSB) */
+};
+
+#define GENET_CB(skb)	((struct bcmgenet_skb_cb *)((skb)->cb))
+
 struct bcmgenet_tx_ring {
 	spinlock_t	lock;		/* ring lock */
 	struct napi_struct napi;	/* NAPI per tx queue */