Retry: [llvm-profdata] Speed up merging by using a thread pool
Add a "-j" option to llvm-profdata to control the number of threads used.
Auto-detect NumThreads when it isn't specified, and avoid spawning threads when
they wouldn't be beneficial.
I tested this patch using a raw profile produced by clang (147MB). Here is the
time taken to merge 4 copies together on my laptop:
No thread pool: 112.87s user 5.92s system 97% cpu 2:01.08 total
With 2 threads: 134.99s user 26.54s system 164% cpu 1:33.31 total
Changes since the initial commit:
- When handling odd-length inputs, call ThreadPool::wait() before merging the
last profile. Should fix a race/off-by-one (see r275937).
Differential Revision: https://reviews.llvm.org/D22438
llvm-svn: 275938
diff --git a/llvm/docs/CommandGuide/llvm-profdata.rst b/llvm/docs/CommandGuide/llvm-profdata.rst
index f5508b5..bae0ff7 100644
--- a/llvm/docs/CommandGuide/llvm-profdata.rst
+++ b/llvm/docs/CommandGuide/llvm-profdata.rst
@@ -106,6 +106,11 @@
conjunction with -instr. Defaults to false, since it can inhibit compiler
optimization during PGO.
+.. option:: -num-threads=N, -j=N
+
+ Use N threads to perform profile merging. When N=0, llvm-profdata auto-detects
+ an appropriate number of threads to use. This is the default.
+
EXAMPLES
^^^^^^^^
Basic Usage