adb: fix adb client running out of sockets on Windows

Background
==========

On Windows, if you run "adb shell exit" in a loop in two windows,
eventually the adb client will be unable to connect to the adb server. I
think connect() is returning WSAEADDRINUSE: "Only one usage of each
socket address (protocol/network address/port) is normally permitted.
(10048)". The Windows System Event Log may also show Event 4227, Tcpip.
Netstat output is filled with:

  # for the adb server
  TCP    127.0.0.1:5037         127.0.0.1:65523        TIME_WAIT
  # for the adb client
  TCP    127.0.0.1:65523        127.0.0.1:5037         TIME_WAIT

The error probably means that the client is running out of free
address:port pairs.

The first netstat line is unavoidable, but the second line exists
because the adb client is not waiting for orderly/graceful shutdown of
the socket, and that is apparently required on Windows to get rid of the
second line. For more info, see
https://github.com/CompareAndSwap/SocketCloseTest .

This is exacerbated by the fact that "adb shell exit" makes 4 socket
connections to the adb server: 1) host:version, 2) host:features, 3)
host:version (again), 4) shell:exit. Also exacerbating is the fact that
the adb protocol is length-prefixed so the client typically does not
have to 'read() until zero' which effectively waits for orderly/graceful
shutdown.

The Fix
=======

Introduce a function, ReadOrderlyShutdown(), that should be called in
the adb client to wait for the server to close its socket, before
closing the client socket.

I reviewed all code where the adb client makes a connection to the adb
server and added ReadOrderlyShutdown() when it made sense. I wasn't able
to add it to the following:

* interactive_shell: this doesn't matter because this is interactive and
  thus can't be run fast enough to use up ports.
* adb sideload: I couldn't get enough test coverage and I don't think
  this is being called frequently enough to be a problem.
* send_shell_command, backup, adb_connect_command, adb shell, adb
  exec-out, install_multiple_app, adb_send_emulator_command: These
  already wait for server socket shutdown since they already call
  recv() until zero.
* restore, adb exec-in: protocol design can't have the server close
  first.
* adb start-server: no fd is actually returned
* create_local_service_socket, local_connect_arbitrary_ports,
  connect_device: probably called rarely enough not to be a problem.

Also in this change
===================

* Clarify comments in when adb_shutdown() is called before exit().
* add some missing adb_close() in adb sideload.
* Fixup error handling and comments in adb_send_emulator_command().
* Make SyncConnection::SendQuit return a success boolean.
* Add unittest for adb emu kill command. This gets code coverage over
  this very careful piece of code.

Change-Id: Iad0b1336f5b74186af2cd35f7ea827d0fa77a17c
Signed-off-by: Spencer Low <CompareAndSwap@gmail.com>
diff --git a/adb/adb.cpp b/adb/adb.cpp
index 1eb3a3c..c39c178 100644
--- a/adb/adb.cpp
+++ b/adb/adb.cpp
@@ -963,9 +963,11 @@
         fflush(stdout);
         SendOkay(reply_fd);
 
-        // At least on Windows, if we exit() without shutdown(SD_SEND) or
-        // closesocket(), the client's next recv() will error-out with
-        // WSAECONNRESET and they'll never read the OKAY.
+        // On Windows, if the process exits with open sockets that
+        // shutdown(SD_SEND) has not been called on, TCP RST segments will be
+        // sent to the peers which will cause their next recv() to error-out
+        // with WSAECONNRESET. In the case of this code, that means the client
+        // may not read the OKAY sent above.
         adb_shutdown(reply_fd);
 
         exit(0);
diff --git a/adb/adb_client.cpp b/adb/adb_client.cpp
index fa8ce9e..ddeb5f1 100644
--- a/adb/adb_client.cpp
+++ b/adb/adb_client.cpp
@@ -206,6 +206,7 @@
                 return -1;
             }
 
+            ReadOrderlyShutdown(fd);
             adb_close(fd);
 
             if (sscanf(&version_string[0], "%04x", &version) != 1) {
@@ -227,6 +228,7 @@
                    version, ADB_SERVER_VERSION);
             fd = _adb_connect("host:kill", error);
             if (fd >= 0) {
+                ReadOrderlyShutdown(fd);
                 adb_close(fd);
             } else {
                 // If we couldn't connect to the server or had some other error,
@@ -271,6 +273,8 @@
         return false;
     }
 
+    ReadOrderlyShutdown(fd);
+    adb_close(fd);
     return true;
 }
 
@@ -286,5 +290,8 @@
         adb_close(fd);
         return false;
     }
+
+    ReadOrderlyShutdown(fd);
+    adb_close(fd);
     return true;
 }
diff --git a/adb/adb_io.cpp b/adb/adb_io.cpp
index 7788996..a37fbc0 100644
--- a/adb/adb_io.cpp
+++ b/adb/adb_io.cpp
@@ -137,3 +137,43 @@
 
     return WriteFdExactly(fd, str);
 }
+
+bool ReadOrderlyShutdown(int fd) {
+    char buf[16];
+
+    // Only call this function if you're sure that the peer does
+    // orderly/graceful shutdown of the socket, closing the socket so that
+    // adb_read() will return 0. If the peer keeps the socket open, adb_read()
+    // will never return.
+    int result = adb_read(fd, buf, sizeof(buf));
+    if (result == -1) {
+        // If errno is EAGAIN, that means this function was called on a
+        // nonblocking socket and it would have blocked (which would be bad
+        // because we'd probably block the main thread where nonblocking IO is
+        // done). Don't do that. If you have a nonblocking socket, use the
+        // fdevent APIs to get called on FDE_READ, and then call this function
+        // if you really need to, but it shouldn't be needed for server sockets.
+        CHECK_NE(errno, EAGAIN);
+
+        // Note that on Windows, orderly shutdown sometimes causes
+        // recv() == SOCKET_ERROR && WSAGetLastError() == WSAECONNRESET. That
+        // can be ignored.
+        return false;
+    } else if (result == 0) {
+        // Peer has performed an orderly/graceful shutdown.
+        return true;
+    } else {
+        // Unexpectedly received data. This is essentially a protocol error
+        // because you should not call this function unless you expect no more
+        // data. We don't repeatedly call adb_read() until we get zero because
+        // we don't know how long that would take, but we do know that the
+        // caller wants to close the socket soon.
+        VLOG(RWX) << "ReadOrderlyShutdown(" << fd << ") unexpectedly read "
+                  << dump_hex(buf, result);
+        // Shutdown the socket to prevent the caller from reading or writing to
+        // it which doesn't make sense if we just read and discarded some data.
+        adb_shutdown(fd);
+        errno = EINVAL;
+        return false;
+    }
+}
diff --git a/adb/adb_io.h b/adb/adb_io.h
index 9c3b2a5..aa550af 100644
--- a/adb/adb_io.h
+++ b/adb/adb_io.h
@@ -41,6 +41,24 @@
 // If this function fails, the contents of buf are undefined.
 bool ReadFdExactly(int fd, void* buf, size_t len);
 
+// Given a client socket, wait for orderly/graceful shutdown. Call this:
+//
+// * Before closing a client socket.
+// * Only when no more data is expected to come in.
+// * Only when the server is not waiting for data from the client (because then
+//   the client and server will deadlock waiting for each other).
+// * Only when the server is expected to close its socket right now.
+// * Don't call shutdown(SHUT_WR) before calling this because that will shutdown
+//   the client socket early, defeating the purpose of calling this.
+//
+// Waiting for orderly/graceful shutdown of the server socket will cause the
+// server socket to close before the client socket. That prevents the client
+// socket from staying in TIME_WAIT which eventually causes subsequent
+// connect()s from the client to fail with WSAEADDRINUSE on Windows.
+// Returns true if it is sure that orderly/graceful shutdown has occurred with
+// no additional data read from the server.
+bool ReadOrderlyShutdown(int fd);
+
 // Writes exactly len bytes from buf to fd.
 //
 // Returns false if there is an error or if the fd was closed before the write
diff --git a/adb/commandline.cpp b/adb/commandline.cpp
index a720859..eff9876 100644
--- a/adb/commandline.cpp
+++ b/adb/commandline.cpp
@@ -649,6 +649,7 @@
             std::string error;
             adb_status(fd, &error);
             fprintf(stderr,"* failed to write data '%s' *\n", error.c_str());
+            adb_close(fd);
             return -1;
         }
         sz -= xfer;
@@ -664,6 +665,7 @@
 
     if (!adb_status(fd, &error)) {
         fprintf(stderr,"* error response '%s' *\n", error.c_str());
+        adb_close(fd);
         return -1;
     }
 
@@ -1468,6 +1470,7 @@
         } else {
             // Successfully connected, kill command sent, okay status came back.
             // Server should exit() in a moment, if not already.
+            ReadOrderlyShutdown(fd);
             adb_close(fd);
             return 0;
         }
diff --git a/adb/console.cpp b/adb/console.cpp
index ba5a72b..5a9c6ab 100644
--- a/adb/console.cpp
+++ b/adb/console.cpp
@@ -25,6 +25,7 @@
 
 #include "adb.h"
 #include "adb_client.h"
+#include "adb_io.h"
 
 // Return the console port of the currently connected emulator (if any) or -1 if
 // there is no emulator, and -2 if there is more than one.
@@ -87,14 +88,20 @@
         return 1;
     }
 
+    std::string commands;
+
     for (int i = 1; i < argc; i++) {
-        adb_write(fd, argv[i], strlen(argv[i]));
-        adb_write(fd, i == argc - 1 ? "\n" : " ", 1);
+        commands.append(argv[i]);
+        commands.append(i == argc - 1 ? "\n" : " ");
     }
 
-    const char disconnect_command[] = "quit\n";
-    if (adb_write(fd, disconnect_command, sizeof(disconnect_command) - 1) == -1) {
-        LOG(FATAL) << "Could not finalize emulator command";
+    commands.append("quit\n");
+
+    if (!WriteFdExactly(fd, commands)) {
+        fprintf(stderr, "error: cannot write to emulator: %s\n",
+                strerror(errno));
+        adb_close(fd);
+        return 1;
     }
 
     // Drain output that the emulator console has sent us to prevent a problem
@@ -106,11 +113,14 @@
     do {
         char buf[BUFSIZ];
         result = adb_read(fd, buf, sizeof(buf));
-        // Keep reading until zero bytes (EOF) or an error. If 'adb emu kill'
-        // is executed, the emulator calls exit() which causes adb to get
-        // ECONNRESET. Any other emu command is followed by the quit command
-        // that we sent above, and that causes the emulator to close the socket
-        // which should cause zero bytes (EOF) to be returned.
+        // Keep reading until zero bytes (orderly/graceful shutdown) or an
+        // error. If 'adb emu kill' is executed, the emulator calls exit() with
+        // the socket open (and shutdown(SD_SEND) was not called), which causes
+        // Windows to send a TCP RST segment which causes adb to get ECONNRESET.
+        // Any other emu command is followed by the quit command that we
+        // appended above, and that causes the emulator to close the socket
+        // which should cause zero bytes (orderly/graceful shutdown) to be
+        // returned.
     } while (result > 0);
 
     adb_close(fd);
diff --git a/adb/file_sync_client.cpp b/adb/file_sync_client.cpp
index b7e835b..8c9f7a6 100644
--- a/adb/file_sync_client.cpp
+++ b/adb/file_sync_client.cpp
@@ -65,7 +65,15 @@
     ~SyncConnection() {
         if (!IsValid()) return;
 
-        SendQuit();
+        if (SendQuit()) {
+            // We sent a quit command, so the server should be doing orderly
+            // shutdown soon. But if we encountered an error while we were using
+            // the connection, the server might still be sending data (before
+            // doing orderly shutdown), in which case we won't wait for all of
+            // the data nor the coming orderly shutdown. In the common success
+            // case, this will wait for the server to do orderly shutdown.
+            ReadOrderlyShutdown(fd);
+        }
         adb_close(fd);
     }
 
@@ -201,8 +209,8 @@
 
     LinePrinter line_printer_;
 
-    void SendQuit() {
-        SendRequest(ID_QUIT, ""); // TODO: add a SendResponse?
+    bool SendQuit() {
+        return SendRequest(ID_QUIT, ""); // TODO: add a SendResponse?
     }
 
     static uint64_t CurrentTimeMs() {
diff --git a/adb/test_adb.py b/adb/test_adb.py
index 5bda22f..0f1b034 100644
--- a/adb/test_adb.py
+++ b/adb/test_adb.py
@@ -21,8 +21,11 @@
 """
 from __future__ import print_function
 
+import contextlib
 import os
 import random
+import socket
+import struct
 import subprocess
 import threading
 import unittest
@@ -140,6 +143,71 @@
             subprocess.check_output(['adb', '-P', str(port), 'kill-server'],
                                     stderr=subprocess.STDOUT)
 
+    # Use SO_LINGER to cause TCP RST segment to be sent on socket close.
+    def _reset_socket_on_close(self, sock):
+        # The linger structure is two shorts on Windows, but two ints on Unix.
+        linger_format = 'hh' if os.name == 'nt' else 'ii'
+        l_onoff = 1
+        l_linger = 0
+
+        sock.setsockopt(socket.SOL_SOCKET, socket.SO_LINGER,
+                        struct.pack(linger_format, l_onoff, l_linger))
+        # Verify that we set the linger structure properly by retrieving it.
+        linger = sock.getsockopt(socket.SOL_SOCKET, socket.SO_LINGER, 16)
+        self.assertEqual((l_onoff, l_linger),
+                         struct.unpack_from(linger_format, linger))
+
+    def test_emu_kill(self):
+        """Ensure that adb emu kill works.
+
+        Bug: https://code.google.com/p/android/issues/detail?id=21021
+        """
+        port = 12345
+
+        with contextlib.closing(
+                socket.socket(socket.AF_INET, socket.SOCK_STREAM)) as listener:
+            # Use SO_REUSEADDR so subsequent runs of the test can grab the port
+            # even if it is in TIME_WAIT.
+            listener.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
+            listener.bind(('127.0.0.1', port))
+            listener.listen(4)
+
+            # Now that listening has started, start adb emu kill, telling it to
+            # connect to our mock emulator.
+            p = subprocess.Popen(
+                ['adb', '-s', 'emulator-' + str(port), 'emu', 'kill'],
+                stderr=subprocess.STDOUT)
+
+            accepted_connection, addr = listener.accept()
+            with contextlib.closing(accepted_connection) as conn:
+                # If WSAECONNABORTED (10053) is raised by any socket calls,
+                # then adb probably isn't reading the data that we sent it.
+                conn.sendall('Android Console: type \'help\' for a list ' +
+                                'of commands\r\n')
+                conn.sendall('OK\r\n')
+
+                with contextlib.closing(conn.makefile()) as f:
+                    self.assertEqual('kill\n', f.readline())
+                    self.assertEqual('quit\n', f.readline())
+
+                conn.sendall('OK: killing emulator, bye bye\r\n')
+
+                # Use SO_LINGER to send TCP RST segment to test whether adb
+                # ignores WSAECONNRESET on Windows. This happens with the
+                # real emulator because it just calls exit() without closing
+                # the socket or calling shutdown(SD_SEND). At process
+                # termination, Windows sends a TCP RST segment for every
+                # open socket that shutdown(SD_SEND) wasn't used on.
+                self._reset_socket_on_close(conn)
+
+            # Wait for adb to finish, so we can check return code.
+            p.communicate()
+
+            # If this fails, adb probably isn't ignoring WSAECONNRESET when
+            # reading the response from the adb emu kill command (on Windows).
+            self.assertEqual(0, p.returncode)
+
+
 def main():
     random.seed(0)
     if len(adb.get_devices()) > 0: