shill: accelerate link monitoring after a suspend/resume event
When going into suspend, devices using the ath9k chipset do not disassociate
from the AP. Consequently, on networks like GoogleGuest, we are often able
to re-use the prior-to-suspend association following resume.
For some APs, however, this leads to the AP ending up in a confused state.
In particular, the Airport Extreme 802.11ac (A1521) sends DEAUTH frames
while we're sleeping, but it acknowledges our Nullfunc frames when we resume.
In fact, the AP even answers our ADDBA request with a Successful ADDBA response.
When in this state, however, the AP does not deliver our frames to the IP
stack. For example, our ARP requests go unanswered.
Improve our post-suspend-resume behavior on these APs by running LinkMonitor
immediately on resume, and with a shorter timeout. We use a per-probe timeout
of 200ms, which is at the 98.48%-ile of ARP response time observed in the
field last week. The total timeout is 1 second, and I believe we accept
late replies, so if the reply to the first ARP probe arrives after its
200ms timeout, we'll still accept it. A one-second ARP reply is at the
99.24%-ile.
Note that after LinkMonitor fails, it still takes ~10 seconds to establish
connectivity to this AP. That's because the Reassociate attempt times out.
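The timeout figures above can be sketched as follows. This is only an
illustration, not the code in this change: the constant and function names
are hypothetical, the probe count is derived from the two quoted timeouts
rather than stated anywhere, and the late-reply behavior is the assumption
noted above. It assumes a C++17 compiler.

```cpp
#include <chrono>
#include <optional>

using std::chrono::milliseconds;

// Values quoted in this commit message.
constexpr milliseconds kFastProbeTimeout{200};
constexpr milliseconds kFastTotalTimeout{1000};

// Derived, not quoted: how many probe periods fit in the total window.
constexpr int kProbePeriodsInWindow =
    static_cast<int>(kFastTotalTimeout / kFastProbeTimeout);

// Hypothetical helper (not a shill function). |first_reply_delay| is the
// time from the first probe to the first ARP reply, or nullopt if no reply
// was ever seen. Assumes late replies are accepted within the total window.
constexpr bool DeclaresLinkFailure(
    std::optional<milliseconds> first_reply_delay) {
  return !first_reply_delay.has_value() ||
         *first_reply_delay > kFastTotalTimeout;
}

static_assert(kProbePeriodsInWindow == 5, "1000ms / 200ms");
```

Under these assumptions, a reply that arrives 350ms after the first probe
misses the per-probe timeout but does not count as a link failure; only
silence for the full second does.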
While there:
- clarify a log message in Device::StopPortalDetection
- fix bad indent in LinkMonitor::StartInternal
- add some diagnostic information in cases where the IsArpRequest
  matcher fails to match expected values
BUG=chromium:244920
TEST=unit tests, manual (see below)
Manual testing
--------------
1. grab a device which has an Atheros WiFi chip (e.g. lumpy, link)
2. connect to "cros airport extreme wpa2", with password "chromeos"
3. suspend device (e.g. close lid)
4. wait >5 minutes.
5. resume device (e.g. open lid)
6. verify that the device detects a link failure in < 5 seconds
   ("Link monitor has reached the failure threshold" should occur
   less than 5 seconds after "OnAfterResume" in /var/log/net.log)
7. wait 15-20 seconds
8. verify that the device reconnects to the AP
9. wait 45 seconds
10. verify that the device does not detect another link failure
11. switch to GoogleGuest.
12. suspend device
13. wait >5 minutes
14. resume device
15. check that device does not detect a link failure
Change-Id: I900ba45714875f5785bba3d47c73f37b863553fb
Reviewed-on: https://gerrit.chromium.org/gerrit/63126
Reviewed-by: Paul Stewart <pstew@chromium.org>
Commit-Queue: mukesh agrawal <quiche@chromium.org>
Tested-by: mukesh agrawal <quiche@chromium.org>
7 files changed