trace, argdist: Treat small USDT arguments correctly

trace and argdist currently work correctly only for USDT arguments
whose size is exactly 8 bytes. Smaller types, such as chars, shorts,
and ints (signed or unsigned), are not handled correctly. The reason
is that the generated program invokes the `bpf_usdt_readarg` helper
with the address of a u64 local variable, and then casts that variable
to the user-specified type derived from the format string. However,
the `bpf_usdt_readarg` rewriting passes `sizeof(u64)` to the
generated `bpf_..._readarg` macro, which then fails to read anything
because the provided size doesn't match the argument size it knows
about.

The fix is straightforward: instead of unconditionally declaring a
u64 and reading into it with `bpf_usdt_readarg`, declare a variable
whose type matches what we know about that argument from the USDT
probe.
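
As a rough illustration of the text the tool emits after this change,
here is a minimal standalone sketch. The function name
`usdt_arg_prologue` is hypothetical and exists only for this example;
in the patch the type comes from `self.usdt.get_probe_arg_ctype`,
whereas here it is passed in directly.

```python
def usdt_arg_prologue(expr, arg_ctype):
    """Return the C text that declares and reads one USDT argument.

    expr      -- an argument expression such as "arg2"
    arg_ctype -- the C type of that argument (an assumed input here;
                 the patch looks it up from the USDT probe)
    """
    # The patch takes the argument number from the fourth character,
    # so this sketch likewise assumes a single-digit argument index.
    arg_number = expr[3]
    return ("        %s %s = 0;\n" +
            "        bpf_usdt_readarg(%s, ctx, &%s);\n") \
            % (arg_ctype, expr, arg_number, expr)


# Before the change the declared type was always u64; with the change
# the declaration follows the probe argument's own type.
print(usdt_arg_prologue("arg2", "int"), end="")
```

With the declaration typed this way, the size passed along by the
`bpf_usdt_readarg` rewriting should match the size the generated
macro expects for that argument.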
diff --git a/tools/trace.py b/tools/trace.py
index 3a957e9..f2a8741 100755
--- a/tools/trace.py
+++ b/tools/trace.py
@@ -321,9 +321,12 @@
                 expr = self.values[idx].strip()
                 text = ""
                 if self.probe_type == "u" and expr[0:3] == "arg":
-                        text = ("        u64 %s = 0;\n" +
+                        arg_index = int(expr[3])
+                        arg_ctype = self.usdt.get_probe_arg_ctype(
+                                self.usdt_name, arg_index - 1)
+                        text = ("        %s %s = 0;\n" +
                                 "        bpf_usdt_readarg(%s, ctx, &%s);\n") \
-                                % (expr, expr[3], expr)
+                                % (arg_ctype, expr, expr[3], expr)
 
                 if field_type == "s":
                         return text + """