net/mlx5e: Use prefetchw when a write is to follow

"prefetchw()" prefetches a cacheline in anticipation of a write.
Use it for skb->data, as we'll soon be copying the packet header there.
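
For illustration only, a minimal userspace sketch of the pattern, not the
driver code itself. It assumes GCC or Clang, whose __builtin_prefetch()
takes a read/write flag as its second argument; the kernel's generic
prefetchw() fallback is built on the same builtin. The names
prefetch_for_write() and copy_header() are hypothetical.

```c
#include <stddef.h>
#include <string.h>

/* Userspace stand-in for the kernel's prefetchw() (assumption: GCC/Clang). */
static inline void prefetch_for_write(const void *addr)
{
	/* Second argument 1 = the cacheline is about to be written. */
	__builtin_prefetch(addr, 1);
}

/* Hypothetical helper: copy a packet header into a destination buffer. */
static void copy_header(void *dst, const void *hdr, size_t hdr_len)
{
	/* Issue the hint as early as possible... */
	prefetch_for_write(dst);

	/* ...so that other per-packet work here can overlap the fetch... */

	/* ...and the write below is more likely to hit a cacheline already held. */
	memcpy(dst, hdr, hdr_len);
}
```

The hint does not change the result of the copy; it only affects timing,
which is why the benefit has to be measured rather than assumed.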

Performance:
Single-stream packet rate tested with pktgen.
Packets are dropped at the tc level to zoom in on the driver data path.
A larger gain is expected for smaller packets: less time is spent
handling SKB fragments, so the path is shorter and the improvement
is more significant.

packet size | before    | after     | gain
------------+-----------+-----------+------
64B         | 4,113,306 | 4,778,720 | 16%
1024B       | 3,633,819 | 3,950,593 | 8.7%

Signed-off-by: Tariq Toukan <tariqt@mellanox.com>
Cc: kernel-team@fb.com
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>