Merge branch 'ila-cached-route'

Tom Herbert says:

====================
ila: Cache a route in ILA lwt structure

Add a dst_cache to ila_lwt structure. This holds a cached route for the
translated address. In ila_output we now perform a route lookup after
translation and if possible (destination in original route is full 128
bits) we set the dst_cache. Subsequent calls to ila_output can then use
the cache to avoid the route lookup.

This eliminates the need to set the gateway on ILA routes as previously
was being done. Now we can do somthing like:

./ip route add 3333::2000:0:0:2/128 encap ila 2222:0:0:2 \
    csum-mode neutral-map dev eth0  ## No via needed!

Also, add destroy_state to lwt ops. We need this do destroy the
dst_cache.

- v2
  - Fixed comparisons to fc_dst_len to make comparison against number
    of bits in data structure not bytes.
  - Move destroy_state under build_state (requested by Jiri)
  - Other minor cleanup

Tested:

Running 200 TCP_RR streams:

  Baseline, no ILA

    1730716 tps
    102/170/313 50/90/99% latencies
    88.11 CPU utilization

  Using ILA in both directions

    1680428 tps
    105/176/325 50/90/99% latencies
    88.16 CPU utilization
====================

Signed-off-by: David S. Miller <davem@davemloft.net>