Improve performance by using malloc() over calloc() in critical places

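As pointed out by Regis Hanna, a considerable performance gain can be
achieved by using malloc() instead of calloc() when allocating netlink
message buffers. This is likely because each message buffer spans a
complete page, so calloc() clears the whole page on every allocation.
Only the netlink message header needs to start out zeroed, so it is now
cleared explicitly with memset() after the allocation.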
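
For illustration, the allocation pattern can be sketched in isolation as
follows. This is a minimal sketch, not the library code: struct demo_hdr
and demo_alloc() are hypothetical names standing in for struct nlmsghdr
and the real allocation path, and the header layout is assumed purely
for the example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for struct nlmsghdr; not the real definition. */
struct demo_hdr {
	uint32_t len;
	uint16_t type;
	uint16_t flags;
};

/*
 * Allocate a len-byte buffer and zero only the header at its start.
 * The payload bytes following the header are left uninitialized.
 */
static struct demo_hdr *demo_alloc(size_t len)
{
	struct demo_hdr *hdr;

	/* The buffer must at least hold the header itself. */
	assert(len >= sizeof(*hdr));

	hdr = malloc(len);
	if (!hdr)
		return NULL;

	/* Clear the header only, instead of the whole buffer. */
	memset(hdr, 0, sizeof(*hdr));
	hdr->len = sizeof(*hdr);

	return hdr;
}
```

Note that with this pattern everything past the header holds
indeterminate data, so any code appending to the buffer has to
initialize the bytes it adds, including padding.
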
diff --git a/lib/msg.c b/lib/msg.c
index 3d4fbc6..c5cb7b4 100644
--- a/lib/msg.c
+++ b/lib/msg.c
@@ -372,10 +372,12 @@
 	if (!nm)
 		goto errout;
 
-	nm->nm_nlh = calloc(1, len);
+	nm->nm_nlh = malloc(len);
 	if (!nm->nm_nlh)
 		goto errout;
 
+	memset(nm->nm_nlh, 0, sizeof(struct nlmsghdr));
+
 	nm->nm_protocol = -1;
 	nm->nm_size = len;
 	nm->nm_nlh->nlmsg_len = nlmsg_total_size(0);