Log RTM_NEWLINK RTNL messages and possible problems in shill (temporary)

Several recent bug reports document a problem where attached USB
Ethernet adapters are not correctly detected by shill. The issue
persists until reboot once it has happened, but is not generally
reproducable. Logs show that the kernel correctly loads the appropriate
driver, so the issue is believed to be caused by communication problems
between the kernel and shill (via NETLINK_ROUTE sockets).

This patch adds more detailed logging upon receiving netlink messages
with the RTM_NEWLINK type in shill. The output is intentionally hacked
in at the earliest possible point to be able to catch decoding errors.
Also added additional logging to a possible netlink attribute decoding
failure code path and to the later device creation (which might
misidentify the link's "technology" parameter).

This patch is intended to be temporary. It shall run on production
systems until more reports of the problem (with now more detailed info)
show up, which will hopefully help us nail it down. It should be rolled
back as soon as enough data is available. The additional log burden is
expected to be small (compared to shill's existing amount of log
messages).

BUG=chrome-os-partner:14952,chromium-os:35479
TEST=Add new network interfaces to the system (e.g. by plugging in a USB
Ethernet adapter). Watch RTM_NEWLINK(...) and device creation messages
appear in /var/log/messages.

Change-Id: Ib8c0c423de0060d97b45da5a7bde6d03be4daadf
Signed-off-by: Julius Werner <jwerner@chromium.org>
Reviewed-on: https://gerrit.chromium.org/gerrit/35983
Reviewed-by: Paul Stewart <pstew@chromium.org>
diff --git a/device_info.cc b/device_info.cc
index 975d44d..81dad1f 100644
--- a/device_info.cc
+++ b/device_info.cc
@@ -368,6 +368,11 @@
   DeviceRefPtr device;
   delayed_devices_.erase(interface_index);
 
+  // temporary debug output
+  // TODO(jwerner@chromium.org): Remove after gathering data (crosbug 35479)
+  LOG(INFO) << "Creating new " << Technology::NameFromIdentifier(technology)
+      << " device(" << interface_index << "): " << link_name;
+
   switch (technology) {
     case Technology::kCellular:
       // Cellular devices are managed by ModemInfo.
diff --git a/rtnl_handler.cc b/rtnl_handler.cc
index 7190822..19eb460 100644
--- a/rtnl_handler.cc
+++ b/rtnl_handler.cc
@@ -224,6 +224,24 @@
 
   while (buf < end) {
     struct nlmsghdr *hdr = reinterpret_cast<struct nlmsghdr *>(buf);
+
+    // temporary debug output
+    // TODO(jwerner@chromium.org): Remove after gathering data (crosbug 35479)
+    if (static_cast<unsigned int>(end - buf) >= sizeof(struct nlmsghdr) &&
+        hdr->nlmsg_type == RTM_NEWLINK) {
+      if (NLMSG_OK(hdr, static_cast<unsigned int>(end - buf)) &&
+          hdr->nlmsg_len <= static_cast<unsigned int>(end - buf) &&
+          hdr->nlmsg_len >= NLMSG_LENGTH(sizeof(struct ifinfomsg))) {
+        struct ifinfomsg *info =
+            reinterpret_cast<struct ifinfomsg *>(NLMSG_DATA(hdr));
+        LOG(INFO) << "RTM_NEWLINK(idx=" << info->ifi_index << ", sender=" <<
+            hdr->nlmsg_pid << ", flags=" << std::hex << info->ifi_flags << ")";
+      } else {
+        LOG(INFO) << "Broken RTM_NEWLINK packet: " <<
+            ByteString(buf, end - buf).HexEncode();
+      }
+    }
+
     if (!NLMSG_OK(hdr, static_cast<unsigned int>(end - buf)))
       break;
 
diff --git a/rtnl_message.cc b/rtnl_message.cc
index 319fc6f..a2d85c1 100644
--- a/rtnl_message.cc
+++ b/rtnl_message.cc
@@ -122,6 +122,12 @@
   }
 
   if (attr_length) {
+    // temporary debug output
+    // TODO(jwerner@chromium.org): Remove after gathering data (crosbug 35479)
+    if (hdr->hdr.nlmsg_type == RTM_NEWLINK)
+      LOG(INFO) << "Broken RTM_NEWLINK attributes: " << ByteString((char *)
+          IFLA_RTA(NLMSG_DATA(&hdr->hdr)), IFLA_PAYLOAD(&hdr->hdr)).HexEncode();
+
     // We hit a parse error while going through the attributes
     attributes_.clear();
     return false;