net/mlx4: use one page fragment per incoming frame

mlx4 driver has a suboptimal memory allocation strategy for regular
MTU=1500 frames, as it uses two page fragments :

One of 512 bytes and one of 1024 bytes.

This makes GRO less effective, as each GSO packet contains 8 MSS instead
of 16 MSS.

Performance of a single TCP flow gains 25 % increase with the following
patch.

Before patch :

A:~# netperf -H 192.168.0.2 -Cc
MIGRATED TCP STREAM TEST ...
Recv   Send    Send                          Utilization       Service Demand
Socket Socket  Message  Elapsed              Send     Recv     Send    Recv
Size   Size    Size     Time     Throughput  local    remote   local   remote
bytes  bytes   bytes    secs.    10^6bits/s  % S      % S      us/KB   us/KB

 87380  16384  16384    10.00      13798.47   3.06     4.20     0.436   0.598

After patch :

A:~# netperf -H 192.68.0.2 -Cc
MIGRATED TCP STREAM TEST ...
Recv   Send    Send                          Utilization       Service Demand
Socket Socket  Message  Elapsed              Send     Recv     Send    Recv
Size   Size    Size     Time     Throughput  local    remote   local   remote
bytes  bytes   bytes    secs.    10^6bits/s  % S      % S      us/KB   us/KB

 87380  16384  16384    10.00      17273.80   3.44     4.19     0.391   0.477

Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Amir Vadai <amirv@mellanox.com>
Acked-By: Amir Vadai <amirv@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
1 file changed