Improve callgrind performance by 4 to 8% using UNLIKELY
Performance improvements of 4 to 8% were measured on amd64 on the perf tests by:
1. using UNLIKELY inside tracing macros
2. avoiding the call to CLG_(switch_thread)(tid) on the hot path in setup_bbcc
unless tid differs from CLG_(current_tid).
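
The guard described in item 2 can be sketched in isolation as follows. This is
only an illustration, not callgrind code: current_tid, switch_thread,
switch_calls and setup_hot_path are hypothetical stand-ins, and UNLIKELY is
assumed to expand to GCC's __builtin_expect branch hint.

```c
/* Hypothetical stand-ins for illustration; not callgrind's real definitions. */
#if defined(__GNUC__)
#  define UNLIKELY(x) __builtin_expect(!!(x), 0)  /* hint: condition is rarely true */
#else
#  define UNLIKELY(x) (x)                         /* no hint on other compilers */
#endif

static unsigned int current_tid = 1;   /* stands in for CLG_(current_tid) */
static unsigned int switch_calls = 0;  /* counts how often the slow path is entered */

/* Stands in for CLG_(switch_thread): does nothing useful when tid is current. */
static void switch_thread(unsigned int tid)
{
    switch_calls++;
    if (tid == current_tid)
        return;
    current_tid = tid;
}

/* Hot-path caller: the function call is skipped in the common case. */
static void setup_hot_path(unsigned int tid)
{
    if (UNLIKELY(tid != current_tid))
        switch_thread(tid);
}
```

Calling setup_hot_path repeatedly with the same tid leaves switch_calls
unchanged; only a change of thread pays for the call.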
git-svn-id: svn://svn.valgrind.org/valgrind/trunk@12939 a5019735-40e9-0310-863c-91ae7b9d1cf9
diff --git a/callgrind/bbcc.c b/callgrind/bbcc.c
index 22dc16f..ad8a76d 100644
--- a/callgrind/bbcc.c
+++ b/callgrind/bbcc.c
@@ -571,7 +571,12 @@
*/
tid = VG_(get_running_tid)();
#if 1
- CLG_(switch_thread)(tid);
+ /* CLG_(switch_thread) is a no-op when tid is equal to CLG_(current_tid).
+ * As this is on the hot path, we only call CLG_(switch_thread)(tid)
+ * if tid differs from the CLG_(current_tid).
+ */
+ if (UNLIKELY(tid != CLG_(current_tid)))
+ CLG_(switch_thread)(tid);
#else
CLG_ASSERT(VG_(get_running_tid)() == CLG_(current_tid));
#endif