shill: accelerate link monitoring after a suspend/resume event

When going into suspend, devices using the ath9k chipset do not disassociate
from the AP. Consequently, on networks like GoogleGuest, we are often able
to re-use the prior-to-suspend association following resume.

For some APs, however, this leads to the AP ending up in a confused state.
In particular, the Airport Extreme 802.11ac (A1521) sends DEAUTH frames
while we're sleeping, but it acknowledges our Nullfunc frames when we resume.
In fact, the AP even answers our ADDBA request with a successful ADDBA
response. When in this state, however, the AP does not deliver our frames to
the IP stack. For example, ARPs fail.

Improve our post-suspend-resume behavior on these APs by running LinkMonitor
immediately on resume, and with a lower timeout. We use a per-probe timeout
of 200ms, which is at the 98.48%-ile of ARP response time observed in the
field last week. The total timeout is 1 second, and I believe we accept
late replies, so if the reply to the first ARP probe arrives late, we'll
still accept it. A one-second ARP reply is at the 99.24%-ile.
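
As a rough sketch of the timing described above (this is not the code in
this change): the class below tracks only the probe period and the
success/failure counters. ProbeSchedule and WorstCaseDetectionMilliseconds
are hypothetical names. The 5000 ms default period, the threshold of 5
(picked so that five 200 ms probes add up to the 1 second total), and the
rule that broadcast and unicast misses are summed are all assumptions.

```cpp
#include <cassert>

// 200 ms is the per-probe timeout quoted in this message.
constexpr int kFastTestPeriodMilliseconds = 200;
// Assumed values: neither is stated in this message.
constexpr int kDefaultTestPeriodMilliseconds = 5000;
constexpr int kFailureThreshold = 5;

// Worst-case time to declare a link failure when every probe times out.
constexpr int WorstCaseDetectionMilliseconds(int period_ms, int threshold) {
  return period_ms * threshold;
}

// Hypothetical stand-in for the probe bookkeeping; not shill's LinkMonitor.
class ProbeSchedule {
 public:
  explicit ProbeSchedule(int probe_period_milliseconds)
      : test_period_milliseconds_(probe_period_milliseconds) {
    assert(probe_period_milliseconds > 0);
  }

  // Records the outcome of one probe. Returns true once the summed
  // consecutive misses reach the (assumed) failure threshold.
  bool OnProbeResult(bool is_unicast, bool got_reply) {
    int &failures =
        is_unicast ? unicast_failure_count_ : broadcast_failure_count_;
    int &successes =
        is_unicast ? unicast_success_count_ : broadcast_success_count_;
    if (got_reply) {
      failures = 0;
      ++successes;
      // After at least one broadcast and one unicast success, fall back
      // to the default probe period.
      if (broadcast_success_count_ > 0 && unicast_success_count_ > 0) {
        test_period_milliseconds_ = kDefaultTestPeriodMilliseconds;
      }
    } else {
      successes = 0;
      ++failures;
    }
    return broadcast_failure_count_ + unicast_failure_count_ >=
           kFailureThreshold;
  }

  int test_period_milliseconds() const { return test_period_milliseconds_; }

 private:
  int test_period_milliseconds_;
  int broadcast_failure_count_ = 0;
  int unicast_failure_count_ = 0;
  int broadcast_success_count_ = 0;
  int unicast_success_count_ = 0;
};
```

Under these assumptions a dead link is reported after 5 x 200 ms = 1 second
on resume, while a healthy link drops back to the default period after two
successful probes (one broadcast, one unicast).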

Note that after LinkMonitor fails, it still takes ~10 seconds to establish
connectivity to this AP. That's because the Reassociate attempt times out.

While there:
- clarify a log message in Device::StopPortalDetection
- fix bad indent in LinkMonitor::StartInternal
- add some diagnostic information in cases where the IsArpRequest
  matcher fails to match expected values

BUG=chromium:244920
TEST=unit tests, manual (see below)

Manual testing
--------------
 1. grab a device which has an atheros wifi chip (e.g. lumpy, link)
 2. connect to "cros airport extreme wpa2", with password "chromeos"
 3. suspend device (e.g. close lid)
 4. wait >5 minutes.
 5. resume device (e.g. open lid)
 6. verify that the device detects a link failure in < 5 seconds
    "Link monitor has reached the failure threshold" should occur
    less than 5 seconds after "OnAfterResume" in /var/log/net.log
 7. wait 15-20 seconds
 8. verify that the device reconnects to the AP
 9. wait 45 seconds
10. verify that the device does not detect another link failure
11. switch to GoogleGuest.
12. suspend device
13. wait >5 minutes
14. resume device
15. check that device does not detect a link failure

Change-Id: I900ba45714875f5785bba3d47c73f37b863553fb
Reviewed-on: https://gerrit.chromium.org/gerrit/63126
Reviewed-by: Paul Stewart <pstew@chromium.org>
Commit-Queue: mukesh agrawal <quiche@chromium.org>
Tested-by: mukesh agrawal <quiche@chromium.org>
diff --git a/link_monitor.h b/link_monitor.h
index 074e5a3..ff6785b 100644
--- a/link_monitor.h
+++ b/link_monitor.h
@@ -37,10 +37,11 @@
   // are reset, and the link monitoring quiesces.  Needed by Metrics.
   static const int kFailureThreshold;
 
-  // The number of milliseconds between ARP requests.  Needed by Metrics.
-  static const int kTestPeriodMilliseconds;
+  // The default number of milliseconds between ARP requests. Needed by Metrics.
+  static const int kDefaultTestPeriodMilliseconds;
 
   // The default list of technologies for which link monitoring is enabled.
+  // Needed by DefaultProfile.
   static const char kDefaultLinkMonitorTechnologies[];
 
   LinkMonitor(const ConnectionRefPtr &connection,
@@ -53,8 +54,15 @@
   // Starts link-monitoring on the selected connection.  Returns
   // true if successful, false otherwise.
   virtual bool Start();
+  // Stop link-monitoring on the selected connection. Clears any
+  // accumulated statistics.
   virtual void Stop();
 
+  // Inform LinkMonitor that the system is resuming from sleep.
+  // LinkMonitor will immediately probe the gateway, using a lower
+  // timeout than normal.
+  virtual void OnAfterResume();
+
   // Return modified cumulative average of the gateway ARP response
   // time.  Returns zero if no samples are available.  For each
   // missed ARP response, the sample is assumed to be the full
@@ -69,11 +77,20 @@
   friend class LinkMonitorForTest;
   friend class LinkMonitorTest;
 
+  // The number of milliseconds between ARP requests when running a quick test.
+  // Needed by unit tests.
+  static const int kFastTestPeriodMilliseconds;
+
   // The number of samples to compute a "strict" average over.  When
   // more samples than this number arrive, this determines how "slow"
   // our simple low-pass filter works.
   static const int kMaxResponseSampleFilterDepth;
 
+  // Similar to Start, except that the initial probes use
+  // |probe_period_milliseconds|. After successfully probing with both
+  // broadcast and unicast ARPs (at least one of each), LinkMonitor
+  // switches itself to kDefaultTestPeriodMilliseconds.
+  virtual bool StartInternal(int probe_period_milliseconds);
   // Add a response time sample to the buffer.
   void AddResponseTimeSample(int response_time_milliseconds);
   // Create an ArpClient instance so we can receive and transmit ARP
@@ -109,12 +126,21 @@
   // ArpClient instance used for performing link tests.
   scoped_ptr<ArpClient> arp_client_;
 
+  // How frequently we send an ARP request. This is also the timeout
+  // for a pending request.
+  int test_period_milliseconds_;
   // The number of consecutive times we have failed in receiving
   // responses to broadcast ARP requests.
   int broadcast_failure_count_;
   // The number of consecutive times we have failed in receiving
   // responses to unicast ARP requests.
   int unicast_failure_count_;
+  // The number of consecutive times we have succeeded in receiving
+  // responses to broadcast ARP requests.
+  int broadcast_success_count_;
+  // The number of consecutive times we have succeeded in receiving
+  // responses to unicast ARP requests.
+  int unicast_success_count_;
 
   // Whether this iteration of the test was a unicast request
   // to the gateway instead of broadcast.  The link monitor