sfc: Workaround flush failures on Falcon B0

Under certain conditions a PHY may backpressure Falcon B0
in such a way that flushes timeout. In normal circumstances
the phy poller would fix the PHY, and the flush could complete.

But efx_nic_flush_queues() is always called after efx_stop_all(),
so the poller has been stopped. Even if this weren't the case,
how long would we have to wait for the poller to fix this? And
several callers of efx_nic_flush_queues() are about to reset
the device anyway - so we don't need to do anything.

Work around this bug by scheduling a reset. Ensure that the
MAC is never rewired back into the datapath before the reset
runs (we already ignore all rx events anyway).

Signed-off-by: Ben Hutchings <bhutchings@solarflare.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
diff --git a/drivers/net/sfc/efx.c b/drivers/net/sfc/efx.c
index 0319000..d1a1d32 100644
--- a/drivers/net/sfc/efx.c
+++ b/drivers/net/sfc/efx.c
@@ -27,6 +27,7 @@
 #include "nic.h"
 
 #include "mcdi.h"
+#include "workarounds.h"
 
 /**************************************************************************
  *
@@ -556,10 +557,18 @@
 	BUG_ON(efx->port_enabled);
 
 	rc = efx_nic_flush_queues(efx);
-	if (rc)
+	if (rc && EFX_WORKAROUND_7803(efx)) {
+		/* Schedule a reset to recover from the flush failure. The
+		 * descriptor caches reference memory we're about to free,
+		 * but falcon_reconfigure_mac_wrapper() won't reconnect
+		 * the MACs because of the pending reset. */
+		EFX_ERR(efx, "Resetting to recover from flush failure\n");
+		efx_schedule_reset(efx, RESET_TYPE_ALL);
+	} else if (rc) {
 		EFX_ERR(efx, "failed to flush queues\n");
-	else
+	} else {
 		EFX_LOG(efx, "successfully flushed all queues\n");
+	}
 
 	efx_for_each_channel(channel, efx) {
 		EFX_LOG(channel->efx, "shut down chan %d\n", channel->channel);