ext4slower
diff --git a/man/man8/ext4slower.8 b/man/man8/ext4slower.8
new file mode 100644
index 0000000..7591f28
--- /dev/null
+++ b/man/man8/ext4slower.8
@@ -0,0 +1,113 @@
+.TH ext4slower 8  "2016-02-11" "USER COMMANDS"
+.SH NAME
+ext4slower \- Trace slow ext4 file operations, with per-event details.
+.SH SYNOPSIS
+.B ext4slower [\-h] [\-j] [\-p PID] [min_ms]
+.SH DESCRIPTION
+This tool traces common ext4 file operations: reads, writes, opens, and
+syncs. It measures the time spent in these operations, and prints details
+for each that exceeded a threshold.
+
+WARNING: See the OVERHEAD section.
+
+By default, a minimum millisecond threshold of 10 is used. If a threshold of 0
+is used, all events are printed (warning: verbose).
+
+Since this works by tracing the ext4_file_operations interface functions, it
+will need updating to match any changes to these functions.
+
+Since this uses BPF, only the root user can use this tool.
+.SH REQUIREMENTS
+CONFIG_BPF and bcc.
+.SH OPTIONS
+\-p PID
+Trace this PID only.
+.TP
+min_ms
+Minimum I/O latency (duration) to trace, in milliseconds. Default is 10 ms.
+.SH EXAMPLES
+.TP
+Trace synchronous file reads and writes slower than 10 ms:
+#
+.B ext4slower
+.TP
+Trace slower than 1 ms:
+#
+.B ext4slower 1
+.TP
+Trace slower than 1 ms, and output just the fields in parsable format (csv):
+#
+.B ext4slower \-j 1
+.TP
+Trace all file reads and writes (warning: the output will be verbose):
+#
+.B ext4slower 0
+.TP
+Trace slower than 1 ms, for PID 181 only:
+#
+.B ext4slower \-p 181 1
+.SH FIELDS
+.TP
+TIME(s)
+Time of I/O completion since the first I/O seen, in seconds.
+.TP
+COMM
+Process name.
+.TP
+PID
+Process ID.
+.TP
+T
+Type of operation. R == read, W == write, O == open, S == fsync.
+.TP
+OFF_KB
+File offset for the I/O, in Kbytes.
+.TP
+BYTES
+Size of I/O, in bytes.
+.TP
+LAT(ms)
+Latency (duration) of I/O, measured from when it was issued by VFS to the
+filesystem, to when it completed. This time is inclusive of block device I/O,
+file system CPU cycles, file system locks, run queue latency, etc. It's a more
+accurate measure of the latency suffered by applications performing file
+system I/O, than to measure this down at the block device interface.
+.TP
+FILENAME
+A cached kernel file name (comes from dentry->d_iname).
+.TP
+ENDTIME_us
+Completion timestamp, microseconds (\-j only).
+.TP
+OFFSET_b
+File offset, bytes (\-j only).
+.TP
+LATENCY_us
+Latency (duration) of the I/O, in microseconds (\-j only).
+.SH OVERHEAD
+This adds low-overhead instrumentation to these ext4 operations,
+including reads and writes from the file system cache. Such reads and writes
+can be very frequent (depending on the workload; eg, 1M/sec), at which
+point the overhead of this tool (even if it prints no "slower" events) can
+begin to become significant. Measure and quantify before use. If this
+continues to be a problem, consider switching to a tool that prints in-kernel
+summaries only.
+.PP
+Note that the overhead of this tool should be less than fileslower(8), as
+this tool targets ext4 functions only, and not all file read/write paths
+(which can include socket I/O).
+.SH SOURCE
+This is from bcc.
+.IP
+https://github.com/iovisor/bcc
+.PP
+Also look in the bcc distribution for a companion _examples.txt file containing
+example usage, output, and commentary for this tool.
+.SH OS
+Linux
+.SH STABILITY
+Unstable - in development.
+.SH AUTHOR
+Brendan Gregg
+.SH SEE ALSO
+biosnoop(8), funccount(8), fileslower(8)