Merge branch 'perf-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip
* 'perf-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: (162 commits)
tracing/kprobes: unregister_trace_probe needs to be called under mutex
perf: expose event__process function
perf events: Fix mmap offset determination
perf, powerpc: fsl_emb: Restore setting perf_sample_data.period
perf, powerpc: Convert the FSL driver to use local64_t
perf tools: Don't keep unreferenced maps when unmaps are detected
perf session: Invalidate last_match when removing threads from rb_tree
perf session: Free the ref_reloc_sym memory at the right place
x86,mmiotrace: Add support for tracing STOS instruction
perf, sched migration: Librarize task states and event headers helpers
perf, sched migration: Librarize the GUI class
perf, sched migration: Make the GUI class client agnostic
perf, sched migration: Make it vertically scrollable
perf, sched migration: Parameterize cpu height and spacing
perf, sched migration: Fix key bindings
perf, sched migration: Ignore unhandled task states
perf, sched migration: Handle ignored migrate out events
perf: New migration tool overview
tracing: Drop cpparg() macro
perf: Use tracepoint_synchronize_unregister() to flush any pending tracepoint call
...
Fix up trivial conflicts in Makefile and drivers/cpufreq/cpufreq.c
diff --git a/Documentation/ABI/testing/debugfs-kmemtrace b/Documentation/ABI/testing/debugfs-kmemtrace
deleted file mode 100644
index 5e6a92a..0000000
--- a/Documentation/ABI/testing/debugfs-kmemtrace
+++ /dev/null
@@ -1,71 +0,0 @@
-What: /sys/kernel/debug/kmemtrace/
-Date: July 2008
-Contact: Eduard - Gabriel Munteanu <eduard.munteanu@linux360.ro>
-Description:
-
-In kmemtrace-enabled kernels, the following files are created:
-
-/sys/kernel/debug/kmemtrace/
- cpu<n> (0400) Per-CPU tracing data, see below. (binary)
- total_overruns (0400) Total number of bytes which were dropped from
- cpu<n> files because of full buffer condition,
- non-binary. (text)
- abi_version (0400) Kernel's kmemtrace ABI version. (text)
-
-Each per-CPU file should be read according to the relay interface. That is,
-the reader should set affinity to that specific CPU and, as currently done by
-the userspace application (though there are other methods), use poll() with
-an infinite timeout before every read(). Otherwise, erroneous data may be
-read. The binary data has the following _core_ format:
-
- Event ID (1 byte) Unsigned integer, one of:
- 0 - represents an allocation (KMEMTRACE_EVENT_ALLOC)
- 1 - represents a freeing of previously allocated memory
- (KMEMTRACE_EVENT_FREE)
- Type ID (1 byte) Unsigned integer, one of:
- 0 - this is a kmalloc() / kfree()
- 1 - this is a kmem_cache_alloc() / kmem_cache_free()
- 2 - this is a __get_free_pages() et al.
- Event size (2 bytes) Unsigned integer representing the
- size of this event. Used to extend
- kmemtrace. Discard the bytes you
- don't know about.
- Sequence number (4 bytes) Signed integer used to reorder data
- logged on SMP machines. Wraparound
- must be taken into account, although
- it is unlikely.
- Caller address (8 bytes) Return address to the caller.
- Pointer to mem (8 bytes) Pointer to target memory area. Can be
- NULL, but not all such calls might be
- recorded.
-
-In case of KMEMTRACE_EVENT_ALLOC events, the next fields follow:
-
- Requested bytes (8 bytes) Total number of requested bytes,
- unsigned, must not be zero.
- Allocated bytes (8 bytes) Total number of actually allocated
- bytes, unsigned, must not be lower
- than requested bytes.
- Requested flags (4 bytes) GFP flags supplied by the caller.
- Target CPU (4 bytes) Signed integer, valid for event id 1.
- If equal to -1, target CPU is the same
- as origin CPU, but the reverse might
- not be true.
-
-The data is made available in the same endianness the machine has.
-
-Other event ids and type ids may be defined and added. Other fields may be
-added by increasing event size, but see below for details.
-Every modification to the ABI, including new id definitions, are followed
-by bumping the ABI version by one.
-
-Adding new data to the packet (features) is done at the end of the mandatory
-data:
- Feature size (2 byte)
- Feature ID (1 byte)
- Feature data (Feature size - 3 bytes)
-
-
-Users:
- kmemtrace-user - git://repo.or.cz/kmemtrace-user.git
-
diff --git a/Documentation/kernel-parameters.txt b/Documentation/kernel-parameters.txt
index f72ba72..f20c7ab 100644
--- a/Documentation/kernel-parameters.txt
+++ b/Documentation/kernel-parameters.txt
@@ -1816,6 +1816,8 @@
nousb [USB] Disable the USB subsystem
+ nowatchdog [KNL] Disable the lockup detector.
+
nowb [ARM]
nox2apic [X86-64,APIC] Do not enable x2APIC mode.
diff --git a/Documentation/trace/ftrace-design.txt b/Documentation/trace/ftrace-design.txt
index f1f81af..dc52bd4 100644
--- a/Documentation/trace/ftrace-design.txt
+++ b/Documentation/trace/ftrace-design.txt
@@ -13,6 +13,9 @@
want more explanation of a feature in terms of common code, review the common
ftrace.txt file.
+Ideally, everyone who wishes to retain performance while supporting tracing in
+their kernel should make it all the way to dynamic ftrace support.
+
Prerequisites
-------------
@@ -215,7 +218,7 @@
exiting of a function. On exit, the value is compared and if it does not
match, then it will panic the kernel. This is largely a sanity check for bad
code generation with gcc. If gcc for your port sanely updates the frame
-pointer under different opitmization levels, then ignore this option.
+pointer under different optimization levels, then ignore this option.
However, adding support for it isn't terribly difficult. In your assembly code
that calls prepare_ftrace_return(), pass the frame pointer as the 3rd argument.
@@ -234,7 +237,7 @@
HAVE_SYSCALL_TRACEPOINTS
----------------------
+------------------------
You need very few things to get the syscalls tracing in an arch.
@@ -250,12 +253,152 @@
HAVE_FTRACE_MCOUNT_RECORD
-------------------------
-See scripts/recordmcount.pl for more info.
-
-<details to be filled>
+See scripts/recordmcount.pl for more info. Just fill in the arch-specific
+details for how to locate the addresses of mcount call sites via objdump.
+This option doesn't make much sense without also implementing dynamic ftrace.
HAVE_DYNAMIC_FTRACE
----------------------
+-------------------
+
+You will first need HAVE_FTRACE_MCOUNT_RECORD and HAVE_FUNCTION_TRACER, so
+scroll your reader back up if you got over eager.
+
+Once those are out of the way, you will need to implement:
+ - asm/ftrace.h:
+ - MCOUNT_ADDR
+ - ftrace_call_adjust()
+ - struct dyn_arch_ftrace{}
+ - asm code:
+ - mcount() (new stub)
+ - ftrace_caller()
+ - ftrace_call()
+ - ftrace_stub()
+ - C code:
+ - ftrace_dyn_arch_init()
+ - ftrace_make_nop()
+ - ftrace_make_call()
+ - ftrace_update_ftrace_func()
+
+First you will need to fill out some arch details in your asm/ftrace.h.
+
+Define MCOUNT_ADDR as the address of your mcount symbol similar to:
+ #define MCOUNT_ADDR ((unsigned long)mcount)
+Since no one else will have a decl for that function, you will need to:
+ extern void mcount(void);
+
+You will also need the helper function ftrace_call_adjust(). Most people
+will be able to stub it out like so:
+ static inline unsigned long ftrace_call_adjust(unsigned long addr)
+ {
+ return addr;
+ }
+<details to be filled>
+
+Lastly you will need the custom dyn_arch_ftrace structure. If you need
+some extra state when runtime patching arbitrary call sites, this is the
+place. For now though, create an empty struct:
+ struct dyn_arch_ftrace {
+ /* No extra data needed */
+ };
+
+With the header out of the way, we can fill out the assembly code. While we
+did already create a mcount() function earlier, dynamic ftrace only wants a
+stub function. This is because the mcount() will only be used during boot
+and then all references to it will be patched out never to return. Instead,
+the guts of the old mcount() will be used to create a new ftrace_caller()
+function. Because the two are hard to merge, it will most likely be a lot
+easier to have two separate definitions split up by #ifdefs. Same goes for
+the ftrace_stub() as that will now be inlined in ftrace_caller().
+
+Before we get confused anymore, let's check out some pseudo code so you can
+implement your own stuff in assembly:
+
+void mcount(void)
+{
+ return;
+}
+
+void ftrace_caller(void)
+{
+ /* implement HAVE_FUNCTION_TRACE_MCOUNT_TEST if you desire */
+
+ /* save all state needed by the ABI (see paragraph above) */
+
+ unsigned long frompc = ...;
+ unsigned long selfpc = <return address> - MCOUNT_INSN_SIZE;
+
+ftrace_call:
+ ftrace_stub(frompc, selfpc);
+
+ /* restore all state needed by the ABI */
+
+ftrace_stub:
+ return;
+}
+
+This might look a little odd at first, but keep in mind that we will be runtime
+patching multiple things. First, only functions that we actually want to trace
+will be patched to call ftrace_caller(). Second, since we only have one tracer
+active at a time, we will patch the ftrace_caller() function itself to call the
+specific tracer in question. That is the point of the ftrace_call label.
+
+With that in mind, let's move on to the C code that will actually be doing the
+runtime patching. You'll need a little knowledge of your arch's opcodes in
+order to make it through the next section.
+
+Every arch has an init callback function. If you need to do something early on
+to initialize some state, this is the time to do that. Otherwise, this simple
+function below should be sufficient for most people:
+
+int __init ftrace_dyn_arch_init(void *data)
+{
+ /* return value is done indirectly via data */
+ *(unsigned long *)data = 0;
+
+ return 0;
+}
+
+There are two functions that are used to do runtime patching of arbitrary
+functions. The first is used to turn the mcount call site into a nop (which
+is what helps us retain runtime performance when not tracing). The second is
+used to turn the mcount call site into a call to an arbitrary location (but
+typically that is ftracer_caller()). See the general function definition in
+linux/ftrace.h for the functions:
+ ftrace_make_nop()
+ ftrace_make_call()
+The rec->ip value is the address of the mcount call site that was collected
+by the scripts/recordmcount.pl during build time.
+
+The last function is used to do runtime patching of the active tracer. This
+will be modifying the assembly code at the location of the ftrace_call symbol
+inside of the ftrace_caller() function. So you should have sufficient padding
+at that location to support the new function calls you'll be inserting. Some
+people will be using a "call" type instruction while others will be using a
+"branch" type instruction. Specifically, the function is:
+ ftrace_update_ftrace_func()
+
+
+HAVE_DYNAMIC_FTRACE + HAVE_FUNCTION_GRAPH_TRACER
+------------------------------------------------
+
+The function grapher needs a few tweaks in order to work with dynamic ftrace.
+Basically, you will need to:
+ - update:
+ - ftrace_caller()
+ - ftrace_graph_call()
+ - ftrace_graph_caller()
+ - implement:
+ - ftrace_enable_ftrace_graph_caller()
+ - ftrace_disable_ftrace_graph_caller()
<details to be filled>
+Quick notes:
+ - add a nop stub after the ftrace_call location named ftrace_graph_call;
+ stub needs to be large enough to support a call to ftrace_graph_caller()
+ - update ftrace_graph_caller() to work with being called by the new
+ ftrace_caller() since some semantics may have changed
+ - ftrace_enable_ftrace_graph_caller() will runtime patch the
+ ftrace_graph_call location with a call to ftrace_graph_caller()
+ - ftrace_disable_ftrace_graph_caller() will runtime patch the
+ ftrace_graph_call location with nops
diff --git a/Documentation/trace/kmemtrace.txt b/Documentation/trace/kmemtrace.txt
deleted file mode 100644
index 6308735..0000000
--- a/Documentation/trace/kmemtrace.txt
+++ /dev/null
@@ -1,126 +0,0 @@
- kmemtrace - Kernel Memory Tracer
-
- by Eduard - Gabriel Munteanu
- <eduard.munteanu@linux360.ro>
-
-I. Introduction
-===============
-
-kmemtrace helps kernel developers figure out two things:
-1) how different allocators (SLAB, SLUB etc.) perform
-2) how kernel code allocates memory and how much
-
-To do this, we trace every allocation and export information to the userspace
-through the relay interface. We export things such as the number of requested
-bytes, the number of bytes actually allocated (i.e. including internal
-fragmentation), whether this is a slab allocation or a plain kmalloc() and so
-on.
-
-The actual analysis is performed by a userspace tool (see section III for
-details on where to get it from). It logs the data exported by the kernel,
-processes it and (as of writing this) can provide the following information:
-- the total amount of memory allocated and fragmentation per call-site
-- the amount of memory allocated and fragmentation per allocation
-- total memory allocated and fragmentation in the collected dataset
-- number of cross-CPU allocation and frees (makes sense in NUMA environments)
-
-Moreover, it can potentially find inconsistent and erroneous behavior in
-kernel code, such as using slab free functions on kmalloc'ed memory or
-allocating less memory than requested (but not truly failed allocations).
-
-kmemtrace also makes provisions for tracing on some arch and analysing the
-data on another.
-
-II. Design and goals
-====================
-
-kmemtrace was designed to handle rather large amounts of data. Thus, it uses
-the relay interface to export whatever is logged to userspace, which then
-stores it. Analysis and reporting is done asynchronously, that is, after the
-data is collected and stored. By design, it allows one to log and analyse
-on different machines and different arches.
-
-As of writing this, the ABI is not considered stable, though it might not
-change much. However, no guarantees are made about compatibility yet. When
-deemed stable, the ABI should still allow easy extension while maintaining
-backward compatibility. This is described further in Documentation/ABI.
-
-Summary of design goals:
- - allow logging and analysis to be done across different machines
- - be fast and anticipate usage in high-load environments (*)
- - be reasonably extensible
- - make it possible for GNU/Linux distributions to have kmemtrace
- included in their repositories
-
-(*) - one of the reasons Pekka Enberg's original userspace data analysis
- tool's code was rewritten from Perl to C (although this is more than a
- simple conversion)
-
-
-III. Quick usage guide
-======================
-
-1) Get a kernel that supports kmemtrace and build it accordingly (i.e. enable
-CONFIG_KMEMTRACE).
-
-2) Get the userspace tool and build it:
-$ git clone git://repo.or.cz/kmemtrace-user.git # current repository
-$ cd kmemtrace-user/
-$ ./autogen.sh
-$ ./configure
-$ make
-
-3) Boot the kmemtrace-enabled kernel if you haven't, preferably in the
-'single' runlevel (so that relay buffers don't fill up easily), and run
-kmemtrace:
-# '$' does not mean user, but root here.
-$ mount -t debugfs none /sys/kernel/debug
-$ mount -t proc none /proc
-$ cd path/to/kmemtrace-user/
-$ ./kmemtraced
-Wait a bit, then stop it with CTRL+C.
-$ cat /sys/kernel/debug/kmemtrace/total_overruns # Check if we didn't
- # overrun, should
- # be zero.
-$ (Optionally) [Run kmemtrace_check separately on each cpu[0-9]*.out file to
- check its correctness]
-$ ./kmemtrace-report
-
-Now you should have a nice and short summary of how the allocator performs.
-
-IV. FAQ and known issues
-========================
-
-Q: 'cat /sys/kernel/debug/kmemtrace/total_overruns' is non-zero, how do I fix
-this? Should I worry?
-A: If it's non-zero, this affects kmemtrace's accuracy, depending on how
-large the number is. You can fix it by supplying a higher
-'kmemtrace.subbufs=N' kernel parameter.
----
-
-Q: kmemtrace_check reports errors, how do I fix this? Should I worry?
-A: This is a bug and should be reported. It can occur for a variety of
-reasons:
- - possible bugs in relay code
- - possible misuse of relay by kmemtrace
- - timestamps being collected unorderly
-Or you may fix it yourself and send us a patch.
----
-
-Q: kmemtrace_report shows many errors, how do I fix this? Should I worry?
-A: This is a known issue and I'm working on it. These might be true errors
-in kernel code, which may have inconsistent behavior (e.g. allocating memory
-with kmem_cache_alloc() and freeing it with kfree()). Pekka Enberg pointed
-out this behavior may work with SLAB, but may fail with other allocators.
-
-It may also be due to lack of tracing in some unusual allocator functions.
-
-We don't want bug reports regarding this issue yet.
----
-
-V. See also
-===========
-
-Documentation/kernel-parameters.txt
-Documentation/ABI/testing/debugfs-kmemtrace
-
diff --git a/Documentation/trace/kprobetrace.txt b/Documentation/trace/kprobetrace.txt
index ec94748..5f77d94 100644
--- a/Documentation/trace/kprobetrace.txt
+++ b/Documentation/trace/kprobetrace.txt
@@ -42,7 +42,7 @@
+|-offs(FETCHARG) : Fetch memory at FETCHARG +|- offs address.(**)
NAME=FETCHARG : Set NAME as the argument name of FETCHARG.
FETCHARG:TYPE : Set TYPE as the type of FETCHARG. Currently, basic types
- (u8/u16/u32/u64/s8/s16/s32/s64) are supported.
+ (u8/u16/u32/u64/s8/s16/s32/s64) and string are supported.
(*) only for return probe.
(**) this is useful for fetching a field of data structures.
diff --git a/MAINTAINERS b/MAINTAINERS
index 11e34d5..100a3f5 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -3403,13 +3403,6 @@
F: mm/kmemleak.c
F: mm/kmemleak-test.c
-KMEMTRACE
-M: Eduard - Gabriel Munteanu <eduard.munteanu@linux360.ro>
-S: Maintained
-F: Documentation/trace/kmemtrace.txt
-F: include/linux/kmemtrace.h
-F: kernel/trace/kmemtrace.c
-
KPROBES
M: Ananth N Mavinakayanahalli <ananth@in.ibm.com>
M: Anil S Keshavamurthy <anil.s.keshavamurthy@intel.com>
@@ -5685,7 +5678,7 @@
M: Steven Rostedt <rostedt@goodmis.org>
M: Frederic Weisbecker <fweisbec@gmail.com>
M: Ingo Molnar <mingo@redhat.com>
-T: git git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip.git tracing/core
+T: git git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip.git perf/core
S: Maintained
F: Documentation/trace/ftrace.txt
F: arch/*/*/*/ftrace.h
diff --git a/Makefile b/Makefile
index 66c94aa..7431c28 100644
--- a/Makefile
+++ b/Makefile
@@ -420,7 +420,7 @@
no-dot-config-targets := clean mrproper distclean \
cscope TAGS tags help %docs check% coccicheck \
include/linux/version.h headers_% \
- kernelversion
+ kernelversion %src-pkg
config-targets := 0
mixed-targets := 0
@@ -1168,6 +1168,8 @@
# rpm target kept for backward compatibility
package-dir := $(srctree)/scripts/package
+%src-pkg: FORCE
+ $(Q)$(MAKE) $(build)=$(package-dir) $@
%pkg: include/config/kernel.release FORCE
$(Q)$(MAKE) $(build)=$(package-dir) $@
rpm: include/config/kernel.release FORCE
diff --git a/arch/Kconfig b/arch/Kconfig
index acda512..4877a8c 100644
--- a/arch/Kconfig
+++ b/arch/Kconfig
@@ -151,4 +151,11 @@
config HAVE_USER_RETURN_NOTIFIER
bool
+config HAVE_PERF_EVENTS_NMI
+ bool
+ help
+ System hardware can generate an NMI using the perf event
+ subsystem. Also has support for calculating CPU cycle events
+ to determine how many clock cycles in a given period.
+
source "kernel/gcov/Kconfig"
diff --git a/arch/alpha/include/asm/local64.h b/arch/alpha/include/asm/local64.h
new file mode 100644
index 0000000..36c93b5
--- /dev/null
+++ b/arch/alpha/include/asm/local64.h
@@ -0,0 +1 @@
+#include <asm-generic/local64.h>
diff --git a/arch/arm/include/asm/local64.h b/arch/arm/include/asm/local64.h
new file mode 100644
index 0000000..36c93b5
--- /dev/null
+++ b/arch/arm/include/asm/local64.h
@@ -0,0 +1 @@
+#include <asm-generic/local64.h>
diff --git a/arch/arm/kernel/perf_event.c b/arch/arm/kernel/perf_event.c
index de12536..417c392 100644
--- a/arch/arm/kernel/perf_event.c
+++ b/arch/arm/kernel/perf_event.c
@@ -164,20 +164,20 @@
struct hw_perf_event *hwc,
int idx)
{
- s64 left = atomic64_read(&hwc->period_left);
+ s64 left = local64_read(&hwc->period_left);
s64 period = hwc->sample_period;
int ret = 0;
if (unlikely(left <= -period)) {
left = period;
- atomic64_set(&hwc->period_left, left);
+ local64_set(&hwc->period_left, left);
hwc->last_period = period;
ret = 1;
}
if (unlikely(left <= 0)) {
left += period;
- atomic64_set(&hwc->period_left, left);
+ local64_set(&hwc->period_left, left);
hwc->last_period = period;
ret = 1;
}
@@ -185,7 +185,7 @@
if (left > (s64)armpmu->max_period)
left = armpmu->max_period;
- atomic64_set(&hwc->prev_count, (u64)-left);
+ local64_set(&hwc->prev_count, (u64)-left);
armpmu->write_counter(idx, (u64)(-left) & 0xffffffff);
@@ -204,18 +204,18 @@
u64 delta;
again:
- prev_raw_count = atomic64_read(&hwc->prev_count);
+ prev_raw_count = local64_read(&hwc->prev_count);
new_raw_count = armpmu->read_counter(idx);
- if (atomic64_cmpxchg(&hwc->prev_count, prev_raw_count,
+ if (local64_cmpxchg(&hwc->prev_count, prev_raw_count,
new_raw_count) != prev_raw_count)
goto again;
delta = (new_raw_count << shift) - (prev_raw_count << shift);
delta >>= shift;
- atomic64_add(delta, &event->count);
- atomic64_sub(delta, &hwc->period_left);
+ local64_add(delta, &event->count);
+ local64_sub(delta, &hwc->period_left);
return new_raw_count;
}
@@ -478,7 +478,7 @@
if (!hwc->sample_period) {
hwc->sample_period = armpmu->max_period;
hwc->last_period = hwc->sample_period;
- atomic64_set(&hwc->period_left, hwc->sample_period);
+ local64_set(&hwc->period_left, hwc->sample_period);
}
err = 0;
diff --git a/arch/avr32/include/asm/local64.h b/arch/avr32/include/asm/local64.h
new file mode 100644
index 0000000..36c93b5
--- /dev/null
+++ b/arch/avr32/include/asm/local64.h
@@ -0,0 +1 @@
+#include <asm-generic/local64.h>
diff --git a/arch/blackfin/include/asm/local64.h b/arch/blackfin/include/asm/local64.h
new file mode 100644
index 0000000..36c93b5
--- /dev/null
+++ b/arch/blackfin/include/asm/local64.h
@@ -0,0 +1 @@
+#include <asm-generic/local64.h>
diff --git a/arch/cris/include/asm/local64.h b/arch/cris/include/asm/local64.h
new file mode 100644
index 0000000..36c93b5
--- /dev/null
+++ b/arch/cris/include/asm/local64.h
@@ -0,0 +1 @@
+#include <asm-generic/local64.h>
diff --git a/arch/frv/include/asm/local64.h b/arch/frv/include/asm/local64.h
new file mode 100644
index 0000000..36c93b5
--- /dev/null
+++ b/arch/frv/include/asm/local64.h
@@ -0,0 +1 @@
+#include <asm-generic/local64.h>
diff --git a/arch/frv/kernel/local64.h b/arch/frv/kernel/local64.h
new file mode 100644
index 0000000..36c93b5
--- /dev/null
+++ b/arch/frv/kernel/local64.h
@@ -0,0 +1 @@
+#include <asm-generic/local64.h>
diff --git a/arch/h8300/include/asm/local64.h b/arch/h8300/include/asm/local64.h
new file mode 100644
index 0000000..36c93b5
--- /dev/null
+++ b/arch/h8300/include/asm/local64.h
@@ -0,0 +1 @@
+#include <asm-generic/local64.h>
diff --git a/arch/ia64/include/asm/local64.h b/arch/ia64/include/asm/local64.h
new file mode 100644
index 0000000..36c93b5
--- /dev/null
+++ b/arch/ia64/include/asm/local64.h
@@ -0,0 +1 @@
+#include <asm-generic/local64.h>
diff --git a/arch/m32r/include/asm/local64.h b/arch/m32r/include/asm/local64.h
new file mode 100644
index 0000000..36c93b5
--- /dev/null
+++ b/arch/m32r/include/asm/local64.h
@@ -0,0 +1 @@
+#include <asm-generic/local64.h>
diff --git a/arch/m68k/include/asm/local64.h b/arch/m68k/include/asm/local64.h
new file mode 100644
index 0000000..36c93b5
--- /dev/null
+++ b/arch/m68k/include/asm/local64.h
@@ -0,0 +1 @@
+#include <asm-generic/local64.h>
diff --git a/arch/microblaze/include/asm/local64.h b/arch/microblaze/include/asm/local64.h
new file mode 100644
index 0000000..36c93b5
--- /dev/null
+++ b/arch/microblaze/include/asm/local64.h
@@ -0,0 +1 @@
+#include <asm-generic/local64.h>
diff --git a/arch/mips/include/asm/local64.h b/arch/mips/include/asm/local64.h
new file mode 100644
index 0000000..36c93b5
--- /dev/null
+++ b/arch/mips/include/asm/local64.h
@@ -0,0 +1 @@
+#include <asm-generic/local64.h>
diff --git a/arch/mn10300/include/asm/local64.h b/arch/mn10300/include/asm/local64.h
new file mode 100644
index 0000000..36c93b5
--- /dev/null
+++ b/arch/mn10300/include/asm/local64.h
@@ -0,0 +1 @@
+#include <asm-generic/local64.h>
diff --git a/arch/parisc/include/asm/local64.h b/arch/parisc/include/asm/local64.h
new file mode 100644
index 0000000..36c93b5
--- /dev/null
+++ b/arch/parisc/include/asm/local64.h
@@ -0,0 +1 @@
+#include <asm-generic/local64.h>
diff --git a/arch/powerpc/include/asm/local64.h b/arch/powerpc/include/asm/local64.h
new file mode 100644
index 0000000..36c93b5
--- /dev/null
+++ b/arch/powerpc/include/asm/local64.h
@@ -0,0 +1 @@
+#include <asm-generic/local64.h>
diff --git a/arch/powerpc/include/asm/perf_event.h b/arch/powerpc/include/asm/perf_event.h
index e6d4ce6..5c16b89 100644
--- a/arch/powerpc/include/asm/perf_event.h
+++ b/arch/powerpc/include/asm/perf_event.h
@@ -21,3 +21,15 @@
#ifdef CONFIG_FSL_EMB_PERF_EVENT
#include <asm/perf_event_fsl_emb.h>
#endif
+
+#ifdef CONFIG_PERF_EVENTS
+#include <asm/ptrace.h>
+#include <asm/reg.h>
+
+#define perf_arch_fetch_caller_regs(regs, __ip) \
+ do { \
+ (regs)->nip = __ip; \
+ (regs)->gpr[1] = *(unsigned long *)__get_SP(); \
+ asm volatile("mfmsr %0" : "=r" ((regs)->msr)); \
+ } while (0)
+#endif
diff --git a/arch/powerpc/kernel/misc.S b/arch/powerpc/kernel/misc.S
index 22e507c..2d29752 100644
--- a/arch/powerpc/kernel/misc.S
+++ b/arch/powerpc/kernel/misc.S
@@ -127,29 +127,3 @@
_GLOBAL(__restore_cpu_power7)
/* place holder */
blr
-
-/*
- * Get a minimal set of registers for our caller's nth caller.
- * r3 = regs pointer, r5 = n.
- *
- * We only get R1 (stack pointer), NIP (next instruction pointer)
- * and LR (link register). These are all we can get in the
- * general case without doing complicated stack unwinding, but
- * fortunately they are enough to do a stack backtrace, which
- * is all we need them for.
- */
-_GLOBAL(perf_arch_fetch_caller_regs)
- mr r6,r1
- cmpwi r5,0
- mflr r4
- ble 2f
- mtctr r5
-1: PPC_LL r6,0(r6)
- bdnz 1b
- PPC_LL r4,PPC_LR_STKOFF(r6)
-2: PPC_LL r7,0(r6)
- PPC_LL r7,PPC_LR_STKOFF(r7)
- PPC_STL r6,GPR1-STACK_FRAME_OVERHEAD(r3)
- PPC_STL r4,_NIP-STACK_FRAME_OVERHEAD(r3)
- PPC_STL r7,_LINK-STACK_FRAME_OVERHEAD(r3)
- blr
diff --git a/arch/powerpc/kernel/perf_event.c b/arch/powerpc/kernel/perf_event.c
index 5c14ffe..d301a30 100644
--- a/arch/powerpc/kernel/perf_event.c
+++ b/arch/powerpc/kernel/perf_event.c
@@ -410,15 +410,15 @@
* Therefore we treat them like NMIs.
*/
do {
- prev = atomic64_read(&event->hw.prev_count);
+ prev = local64_read(&event->hw.prev_count);
barrier();
val = read_pmc(event->hw.idx);
- } while (atomic64_cmpxchg(&event->hw.prev_count, prev, val) != prev);
+ } while (local64_cmpxchg(&event->hw.prev_count, prev, val) != prev);
/* The counters are only 32 bits wide */
delta = (val - prev) & 0xfffffffful;
- atomic64_add(delta, &event->count);
- atomic64_sub(delta, &event->hw.period_left);
+ local64_add(delta, &event->count);
+ local64_sub(delta, &event->hw.period_left);
}
/*
@@ -444,10 +444,10 @@
if (!event->hw.idx)
continue;
val = (event->hw.idx == 5) ? pmc5 : pmc6;
- prev = atomic64_read(&event->hw.prev_count);
+ prev = local64_read(&event->hw.prev_count);
event->hw.idx = 0;
delta = (val - prev) & 0xfffffffful;
- atomic64_add(delta, &event->count);
+ local64_add(delta, &event->count);
}
}
@@ -462,7 +462,7 @@
event = cpuhw->limited_counter[i];
event->hw.idx = cpuhw->limited_hwidx[i];
val = (event->hw.idx == 5) ? pmc5 : pmc6;
- atomic64_set(&event->hw.prev_count, val);
+ local64_set(&event->hw.prev_count, val);
perf_event_update_userpage(event);
}
}
@@ -666,11 +666,11 @@
}
val = 0;
if (event->hw.sample_period) {
- left = atomic64_read(&event->hw.period_left);
+ left = local64_read(&event->hw.period_left);
if (left < 0x80000000L)
val = 0x80000000L - left;
}
- atomic64_set(&event->hw.prev_count, val);
+ local64_set(&event->hw.prev_count, val);
event->hw.idx = idx;
write_pmc(idx, val);
perf_event_update_userpage(event);
@@ -754,7 +754,7 @@
* skip the schedulability test here, it will be peformed
* at commit time(->commit_txn) as a whole
*/
- if (cpuhw->group_flag & PERF_EVENT_TXN_STARTED)
+ if (cpuhw->group_flag & PERF_EVENT_TXN)
goto nocheck;
if (check_excludes(cpuhw->event, cpuhw->flags, n0, 1))
@@ -845,8 +845,8 @@
if (left < 0x80000000L)
val = 0x80000000L - left;
write_pmc(event->hw.idx, val);
- atomic64_set(&event->hw.prev_count, val);
- atomic64_set(&event->hw.period_left, left);
+ local64_set(&event->hw.prev_count, val);
+ local64_set(&event->hw.period_left, left);
perf_event_update_userpage(event);
perf_enable();
local_irq_restore(flags);
@@ -861,7 +861,7 @@
{
struct cpu_hw_events *cpuhw = &__get_cpu_var(cpu_hw_events);
- cpuhw->group_flag |= PERF_EVENT_TXN_STARTED;
+ cpuhw->group_flag |= PERF_EVENT_TXN;
cpuhw->n_txn_start = cpuhw->n_events;
}
@@ -874,7 +874,7 @@
{
struct cpu_hw_events *cpuhw = &__get_cpu_var(cpu_hw_events);
- cpuhw->group_flag &= ~PERF_EVENT_TXN_STARTED;
+ cpuhw->group_flag &= ~PERF_EVENT_TXN;
}
/*
@@ -900,6 +900,7 @@
for (i = cpuhw->n_txn_start; i < n; ++i)
cpuhw->event[i]->hw.config = cpuhw->events[i];
+ cpuhw->group_flag &= ~PERF_EVENT_TXN;
return 0;
}
@@ -1111,7 +1112,7 @@
event->hw.config = events[n];
event->hw.event_base = cflags[n];
event->hw.last_period = event->hw.sample_period;
- atomic64_set(&event->hw.period_left, event->hw.last_period);
+ local64_set(&event->hw.period_left, event->hw.last_period);
/*
* See if we need to reserve the PMU.
@@ -1149,16 +1150,16 @@
int record = 0;
/* we don't have to worry about interrupts here */
- prev = atomic64_read(&event->hw.prev_count);
+ prev = local64_read(&event->hw.prev_count);
delta = (val - prev) & 0xfffffffful;
- atomic64_add(delta, &event->count);
+ local64_add(delta, &event->count);
/*
* See if the total period for this event has expired,
* and update for the next period.
*/
val = 0;
- left = atomic64_read(&event->hw.period_left) - delta;
+ left = local64_read(&event->hw.period_left) - delta;
if (period) {
if (left <= 0) {
left += period;
@@ -1196,8 +1197,8 @@
}
write_pmc(event->hw.idx, val);
- atomic64_set(&event->hw.prev_count, val);
- atomic64_set(&event->hw.period_left, left);
+ local64_set(&event->hw.prev_count, val);
+ local64_set(&event->hw.period_left, left);
perf_event_update_userpage(event);
}
diff --git a/arch/powerpc/kernel/perf_event_fsl_emb.c b/arch/powerpc/kernel/perf_event_fsl_emb.c
index babccee..1ba4547 100644
--- a/arch/powerpc/kernel/perf_event_fsl_emb.c
+++ b/arch/powerpc/kernel/perf_event_fsl_emb.c
@@ -162,15 +162,15 @@
* Therefore we treat them like NMIs.
*/
do {
- prev = atomic64_read(&event->hw.prev_count);
+ prev = local64_read(&event->hw.prev_count);
barrier();
val = read_pmc(event->hw.idx);
- } while (atomic64_cmpxchg(&event->hw.prev_count, prev, val) != prev);
+ } while (local64_cmpxchg(&event->hw.prev_count, prev, val) != prev);
/* The counters are only 32 bits wide */
delta = (val - prev) & 0xfffffffful;
- atomic64_add(delta, &event->count);
- atomic64_sub(delta, &event->hw.period_left);
+ local64_add(delta, &event->count);
+ local64_sub(delta, &event->hw.period_left);
}
/*
@@ -296,11 +296,11 @@
val = 0;
if (event->hw.sample_period) {
- s64 left = atomic64_read(&event->hw.period_left);
+ s64 left = local64_read(&event->hw.period_left);
if (left < 0x80000000L)
val = 0x80000000L - left;
}
- atomic64_set(&event->hw.prev_count, val);
+ local64_set(&event->hw.prev_count, val);
write_pmc(i, val);
perf_event_update_userpage(event);
@@ -371,8 +371,8 @@
if (left < 0x80000000L)
val = 0x80000000L - left;
write_pmc(event->hw.idx, val);
- atomic64_set(&event->hw.prev_count, val);
- atomic64_set(&event->hw.period_left, left);
+ local64_set(&event->hw.prev_count, val);
+ local64_set(&event->hw.period_left, left);
perf_event_update_userpage(event);
perf_enable();
local_irq_restore(flags);
@@ -500,7 +500,7 @@
return ERR_PTR(-ENOTSUPP);
event->hw.last_period = event->hw.sample_period;
- atomic64_set(&event->hw.period_left, event->hw.last_period);
+ local64_set(&event->hw.period_left, event->hw.last_period);
/*
* See if we need to reserve the PMU.
@@ -541,16 +541,16 @@
int record = 0;
/* we don't have to worry about interrupts here */
- prev = atomic64_read(&event->hw.prev_count);
+ prev = local64_read(&event->hw.prev_count);
delta = (val - prev) & 0xfffffffful;
- atomic64_add(delta, &event->count);
+ local64_add(delta, &event->count);
/*
* See if the total period for this event has expired,
* and update for the next period.
*/
val = 0;
- left = atomic64_read(&event->hw.period_left) - delta;
+ left = local64_read(&event->hw.period_left) - delta;
if (period) {
if (left <= 0) {
left += period;
@@ -569,6 +569,7 @@
struct perf_sample_data data;
perf_sample_data_init(&data, 0);
+ data.period = event->hw.last_period;
if (perf_event_overflow(event, nmi, &data, regs)) {
/*
@@ -584,8 +585,8 @@
}
write_pmc(event->hw.idx, val);
- atomic64_set(&event->hw.prev_count, val);
- atomic64_set(&event->hw.period_left, left);
+ local64_set(&event->hw.prev_count, val);
+ local64_set(&event->hw.period_left, left);
perf_event_update_userpage(event);
}
diff --git a/arch/s390/include/asm/local64.h b/arch/s390/include/asm/local64.h
new file mode 100644
index 0000000..36c93b5
--- /dev/null
+++ b/arch/s390/include/asm/local64.h
@@ -0,0 +1 @@
+#include <asm-generic/local64.h>
diff --git a/arch/score/include/asm/local64.h b/arch/score/include/asm/local64.h
new file mode 100644
index 0000000..36c93b5
--- /dev/null
+++ b/arch/score/include/asm/local64.h
@@ -0,0 +1 @@
+#include <asm-generic/local64.h>
diff --git a/arch/sh/include/asm/local64.h b/arch/sh/include/asm/local64.h
new file mode 100644
index 0000000..36c93b5
--- /dev/null
+++ b/arch/sh/include/asm/local64.h
@@ -0,0 +1 @@
+#include <asm-generic/local64.h>
diff --git a/arch/sh/kernel/perf_event.c b/arch/sh/kernel/perf_event.c
index 81b6de4..7a3dc35 100644
--- a/arch/sh/kernel/perf_event.c
+++ b/arch/sh/kernel/perf_event.c
@@ -185,10 +185,10 @@
* this is the simplest approach for maintaining consistency.
*/
again:
- prev_raw_count = atomic64_read(&hwc->prev_count);
+ prev_raw_count = local64_read(&hwc->prev_count);
new_raw_count = sh_pmu->read(idx);
- if (atomic64_cmpxchg(&hwc->prev_count, prev_raw_count,
+ if (local64_cmpxchg(&hwc->prev_count, prev_raw_count,
new_raw_count) != prev_raw_count)
goto again;
@@ -203,7 +203,7 @@
delta = (new_raw_count << shift) - (prev_raw_count << shift);
delta >>= shift;
- atomic64_add(delta, &event->count);
+ local64_add(delta, &event->count);
}
static void sh_pmu_disable(struct perf_event *event)
diff --git a/arch/sparc/include/asm/local64.h b/arch/sparc/include/asm/local64.h
new file mode 100644
index 0000000..36c93b5
--- /dev/null
+++ b/arch/sparc/include/asm/local64.h
@@ -0,0 +1 @@
+#include <asm-generic/local64.h>
diff --git a/arch/sparc/include/asm/perf_event.h b/arch/sparc/include/asm/perf_event.h
index 7e26698..74c4e0c 100644
--- a/arch/sparc/include/asm/perf_event.h
+++ b/arch/sparc/include/asm/perf_event.h
@@ -6,7 +6,15 @@
#define PERF_EVENT_INDEX_OFFSET 0
#ifdef CONFIG_PERF_EVENTS
+#include <asm/ptrace.h>
+
extern void init_hw_perf_events(void);
+
+extern void
+__perf_arch_fetch_caller_regs(struct pt_regs *regs, unsigned long ip, int skip);
+
+#define perf_arch_fetch_caller_regs(pt_regs, ip) \
+ __perf_arch_fetch_caller_regs(pt_regs, ip, 1);
#else
static inline void init_hw_perf_events(void) { }
#endif
diff --git a/arch/sparc/kernel/helpers.S b/arch/sparc/kernel/helpers.S
index 92090cc..682fee0 100644
--- a/arch/sparc/kernel/helpers.S
+++ b/arch/sparc/kernel/helpers.S
@@ -47,9 +47,9 @@
.size stack_trace_flush,.-stack_trace_flush
#ifdef CONFIG_PERF_EVENTS
- .globl perf_arch_fetch_caller_regs
- .type perf_arch_fetch_caller_regs,#function
-perf_arch_fetch_caller_regs:
+ .globl __perf_arch_fetch_caller_regs
+ .type __perf_arch_fetch_caller_regs,#function
+__perf_arch_fetch_caller_regs:
/* We always read the %pstate into %o5 since we will use
* that to construct a fake %tstate to store into the regs.
*/
diff --git a/arch/sparc/kernel/perf_event.c b/arch/sparc/kernel/perf_event.c
index 44faabc..357ced3 100644
--- a/arch/sparc/kernel/perf_event.c
+++ b/arch/sparc/kernel/perf_event.c
@@ -572,18 +572,18 @@
s64 delta;
again:
- prev_raw_count = atomic64_read(&hwc->prev_count);
+ prev_raw_count = local64_read(&hwc->prev_count);
new_raw_count = read_pmc(idx);
- if (atomic64_cmpxchg(&hwc->prev_count, prev_raw_count,
+ if (local64_cmpxchg(&hwc->prev_count, prev_raw_count,
new_raw_count) != prev_raw_count)
goto again;
delta = (new_raw_count << shift) - (prev_raw_count << shift);
delta >>= shift;
- atomic64_add(delta, &event->count);
- atomic64_sub(delta, &hwc->period_left);
+ local64_add(delta, &event->count);
+ local64_sub(delta, &hwc->period_left);
return new_raw_count;
}
@@ -591,27 +591,27 @@
static int sparc_perf_event_set_period(struct perf_event *event,
struct hw_perf_event *hwc, int idx)
{
- s64 left = atomic64_read(&hwc->period_left);
+ s64 left = local64_read(&hwc->period_left);
s64 period = hwc->sample_period;
int ret = 0;
if (unlikely(left <= -period)) {
left = period;
- atomic64_set(&hwc->period_left, left);
+ local64_set(&hwc->period_left, left);
hwc->last_period = period;
ret = 1;
}
if (unlikely(left <= 0)) {
left += period;
- atomic64_set(&hwc->period_left, left);
+ local64_set(&hwc->period_left, left);
hwc->last_period = period;
ret = 1;
}
if (left > MAX_PERIOD)
left = MAX_PERIOD;
- atomic64_set(&hwc->prev_count, (u64)-left);
+ local64_set(&hwc->prev_count, (u64)-left);
write_pmc(idx, (u64)(-left) & 0xffffffff);
@@ -1006,7 +1006,7 @@
* skip the schedulability test here, it will be peformed
* at commit time(->commit_txn) as a whole
*/
- if (cpuc->group_flag & PERF_EVENT_TXN_STARTED)
+ if (cpuc->group_flag & PERF_EVENT_TXN)
goto nocheck;
if (check_excludes(cpuc->event, n0, 1))
@@ -1088,7 +1088,7 @@
if (!hwc->sample_period) {
hwc->sample_period = MAX_PERIOD;
hwc->last_period = hwc->sample_period;
- atomic64_set(&hwc->period_left, hwc->sample_period);
+ local64_set(&hwc->period_left, hwc->sample_period);
}
return 0;
@@ -1103,7 +1103,7 @@
{
struct cpu_hw_events *cpuhw = &__get_cpu_var(cpu_hw_events);
- cpuhw->group_flag |= PERF_EVENT_TXN_STARTED;
+ cpuhw->group_flag |= PERF_EVENT_TXN;
}
/*
@@ -1115,7 +1115,7 @@
{
struct cpu_hw_events *cpuhw = &__get_cpu_var(cpu_hw_events);
- cpuhw->group_flag &= ~PERF_EVENT_TXN_STARTED;
+ cpuhw->group_flag &= ~PERF_EVENT_TXN;
}
/*
@@ -1138,6 +1138,7 @@
if (sparc_check_constraints(cpuc->event, cpuc->events, n))
return -EAGAIN;
+ cpuc->group_flag &= ~PERF_EVENT_TXN;
return 0;
}
diff --git a/arch/x86/Kconfig b/arch/x86/Kconfig
index dcb0593..6f77afa 100644
--- a/arch/x86/Kconfig
+++ b/arch/x86/Kconfig
@@ -55,6 +55,7 @@
select HAVE_HW_BREAKPOINT
select HAVE_MIXED_BREAKPOINTS_REGS
select PERF_EVENTS
+ select HAVE_PERF_EVENTS_NMI
select ANON_INODES
select HAVE_ARCH_KMEMCHECK
select HAVE_USER_RETURN_NOTIFIER
diff --git a/arch/x86/include/asm/hw_breakpoint.h b/arch/x86/include/asm/hw_breakpoint.h
index 9422553..528a11e 100644
--- a/arch/x86/include/asm/hw_breakpoint.h
+++ b/arch/x86/include/asm/hw_breakpoint.h
@@ -20,10 +20,10 @@
#include <linux/list.h>
/* Available HW breakpoint length encodings */
+#define X86_BREAKPOINT_LEN_X 0x00
#define X86_BREAKPOINT_LEN_1 0x40
#define X86_BREAKPOINT_LEN_2 0x44
#define X86_BREAKPOINT_LEN_4 0x4c
-#define X86_BREAKPOINT_LEN_EXECUTE 0x40
#ifdef CONFIG_X86_64
#define X86_BREAKPOINT_LEN_8 0x48
diff --git a/arch/x86/include/asm/local64.h b/arch/x86/include/asm/local64.h
new file mode 100644
index 0000000..36c93b5
--- /dev/null
+++ b/arch/x86/include/asm/local64.h
@@ -0,0 +1 @@
+#include <asm-generic/local64.h>
diff --git a/arch/x86/include/asm/nmi.h b/arch/x86/include/asm/nmi.h
index 93da9c3..932f0f8 100644
--- a/arch/x86/include/asm/nmi.h
+++ b/arch/x86/include/asm/nmi.h
@@ -17,7 +17,9 @@
extern void die_nmi(char *str, struct pt_regs *regs, int do_panic);
extern int check_nmi_watchdog(void);
+#if !defined(CONFIG_LOCKUP_DETECTOR)
extern int nmi_watchdog_enabled;
+#endif
extern int avail_to_resrv_perfctr_nmi_bit(unsigned int);
extern int reserve_perfctr_nmi(unsigned int);
extern void release_perfctr_nmi(unsigned int);
diff --git a/arch/x86/include/asm/perf_event.h b/arch/x86/include/asm/perf_event.h
index 254883d..6e742cc 100644
--- a/arch/x86/include/asm/perf_event.h
+++ b/arch/x86/include/asm/perf_event.h
@@ -68,8 +68,9 @@
union cpuid10_edx {
struct {
- unsigned int num_counters_fixed:4;
- unsigned int reserved:28;
+ unsigned int num_counters_fixed:5;
+ unsigned int bit_width_fixed:8;
+ unsigned int reserved:19;
} split;
unsigned int full;
};
@@ -140,6 +141,19 @@
extern unsigned long perf_misc_flags(struct pt_regs *regs);
#define perf_misc_flags(regs) perf_misc_flags(regs)
+#include <asm/stacktrace.h>
+
+/*
+ * We abuse bit 3 from flags to pass exact information, see perf_misc_flags
+ * and the comment with PERF_EFLAGS_EXACT.
+ */
+#define perf_arch_fetch_caller_regs(regs, __ip) { \
+ (regs)->ip = (__ip); \
+ (regs)->bp = caller_frame_pointer(); \
+ (regs)->cs = __KERNEL_CS; \
+ regs->flags = 0; \
+}
+
#else
static inline void init_hw_perf_events(void) { }
static inline void perf_events_lapic_init(void) { }
diff --git a/arch/x86/include/asm/perf_event_p4.h b/arch/x86/include/asm/perf_event_p4.h
index 64a8ebf..def5007 100644
--- a/arch/x86/include/asm/perf_event_p4.h
+++ b/arch/x86/include/asm/perf_event_p4.h
@@ -19,7 +19,6 @@
#define ARCH_P4_RESERVED_ESCR (2) /* IQ_ESCR(0,1) not always present */
#define ARCH_P4_MAX_ESCR (ARCH_P4_TOTAL_ESCR - ARCH_P4_RESERVED_ESCR)
#define ARCH_P4_MAX_CCCR (18)
-#define ARCH_P4_MAX_COUNTER (ARCH_P4_MAX_CCCR / 2)
#define P4_ESCR_EVENT_MASK 0x7e000000U
#define P4_ESCR_EVENT_SHIFT 25
@@ -71,10 +70,6 @@
#define P4_CCCR_THRESHOLD(v) ((v) << P4_CCCR_THRESHOLD_SHIFT)
#define P4_CCCR_ESEL(v) ((v) << P4_CCCR_ESCR_SELECT_SHIFT)
-/* Custom bits in reerved CCCR area */
-#define P4_CCCR_CACHE_OPS_MASK 0x0000003fU
-
-
/* Non HT mask */
#define P4_CCCR_MASK \
(P4_CCCR_OVF | \
@@ -106,8 +101,7 @@
* ESCR and CCCR but rather an only packed value should
* be unpacked and written to a proper addresses
*
- * the base idea is to pack as much info as
- * possible
+ * the base idea is to pack as much info as possible
*/
#define p4_config_pack_escr(v) (((u64)(v)) << 32)
#define p4_config_pack_cccr(v) (((u64)(v)) & 0xffffffffULL)
@@ -130,8 +124,6 @@
t; \
})
-#define p4_config_unpack_cache_event(v) (((u64)(v)) & P4_CCCR_CACHE_OPS_MASK)
-
#define P4_CONFIG_HT_SHIFT 63
#define P4_CONFIG_HT (1ULL << P4_CONFIG_HT_SHIFT)
@@ -214,6 +206,12 @@
return escr;
}
+/*
+ * This are the events which should be used in "Event Select"
+ * field of ESCR register, they are like unique keys which allow
+ * the kernel to determinate which CCCR and COUNTER should be
+ * used to track an event
+ */
enum P4_EVENTS {
P4_EVENT_TC_DELIVER_MODE,
P4_EVENT_BPU_FETCH_REQUEST,
@@ -561,7 +559,7 @@
* a caller should use P4_ESCR_EMASK_NAME helper to
* pick the EventMask needed, for example
*
- * P4_ESCR_EMASK_NAME(P4_EVENT_TC_DELIVER_MODE, DD)
+ * P4_ESCR_EMASK_BIT(P4_EVENT_TC_DELIVER_MODE, DD)
*/
enum P4_ESCR_EMASKS {
P4_GEN_ESCR_EMASK(P4_EVENT_TC_DELIVER_MODE, DD, 0),
@@ -753,43 +751,50 @@
P4_GEN_ESCR_EMASK(P4_EVENT_INSTR_COMPLETED, BOGUS, 1),
};
-/* P4 PEBS: stale for a while */
-#define P4_PEBS_METRIC_MASK 0x00001fffU
-#define P4_PEBS_UOB_TAG 0x01000000U
-#define P4_PEBS_ENABLE 0x02000000U
+/*
+ * P4 PEBS specifics (Replay Event only)
+ *
+ * Format (bits):
+ * 0-6: metric from P4_PEBS_METRIC enum
+ * 7 : reserved
+ * 8 : reserved
+ * 9-11 : reserved
+ *
+ * Note we have UOP and PEBS bits reserved for now
+ * just in case if we will need them once
+ */
+#define P4_PEBS_CONFIG_ENABLE (1 << 7)
+#define P4_PEBS_CONFIG_UOP_TAG (1 << 8)
+#define P4_PEBS_CONFIG_METRIC_MASK 0x3f
+#define P4_PEBS_CONFIG_MASK 0xff
-/* Replay metrics for MSR_IA32_PEBS_ENABLE and MSR_P4_PEBS_MATRIX_VERT */
-#define P4_PEBS__1stl_cache_load_miss_retired 0x3000001
-#define P4_PEBS__2ndl_cache_load_miss_retired 0x3000002
-#define P4_PEBS__dtlb_load_miss_retired 0x3000004
-#define P4_PEBS__dtlb_store_miss_retired 0x3000004
-#define P4_PEBS__dtlb_all_miss_retired 0x3000004
-#define P4_PEBS__tagged_mispred_branch 0x3018000
-#define P4_PEBS__mob_load_replay_retired 0x3000200
-#define P4_PEBS__split_load_retired 0x3000400
-#define P4_PEBS__split_store_retired 0x3000400
+/*
+ * mem: Only counters MSR_IQ_COUNTER4 (16) and
+ * MSR_IQ_COUNTER5 (17) are allowed for PEBS sampling
+ */
+#define P4_PEBS_ENABLE 0x02000000U
+#define P4_PEBS_ENABLE_UOP_TAG 0x01000000U
-#define P4_VERT__1stl_cache_load_miss_retired 0x0000001
-#define P4_VERT__2ndl_cache_load_miss_retired 0x0000001
-#define P4_VERT__dtlb_load_miss_retired 0x0000001
-#define P4_VERT__dtlb_store_miss_retired 0x0000002
-#define P4_VERT__dtlb_all_miss_retired 0x0000003
-#define P4_VERT__tagged_mispred_branch 0x0000010
-#define P4_VERT__mob_load_replay_retired 0x0000001
-#define P4_VERT__split_load_retired 0x0000001
-#define P4_VERT__split_store_retired 0x0000002
+#define p4_config_unpack_metric(v) (((u64)(v)) & P4_PEBS_CONFIG_METRIC_MASK)
+#define p4_config_unpack_pebs(v) (((u64)(v)) & P4_PEBS_CONFIG_MASK)
-enum P4_CACHE_EVENTS {
- P4_CACHE__NONE,
+#define p4_config_pebs_has(v, mask) (p4_config_unpack_pebs(v) & (mask))
- P4_CACHE__1stl_cache_load_miss_retired,
- P4_CACHE__2ndl_cache_load_miss_retired,
- P4_CACHE__dtlb_load_miss_retired,
- P4_CACHE__dtlb_store_miss_retired,
- P4_CACHE__itlb_reference_hit,
- P4_CACHE__itlb_reference_miss,
+enum P4_PEBS_METRIC {
+ P4_PEBS_METRIC__none,
- P4_CACHE__MAX
+ P4_PEBS_METRIC__1stl_cache_load_miss_retired,
+ P4_PEBS_METRIC__2ndl_cache_load_miss_retired,
+ P4_PEBS_METRIC__dtlb_load_miss_retired,
+ P4_PEBS_METRIC__dtlb_store_miss_retired,
+ P4_PEBS_METRIC__dtlb_all_miss_retired,
+ P4_PEBS_METRIC__tagged_mispred_branch,
+ P4_PEBS_METRIC__mob_load_replay_retired,
+ P4_PEBS_METRIC__split_load_retired,
+ P4_PEBS_METRIC__split_store_retired,
+
+ P4_PEBS_METRIC__max
};
#endif /* PERF_EVENT_P4_H */
+
diff --git a/arch/x86/include/asm/stacktrace.h b/arch/x86/include/asm/stacktrace.h
index 4dab78e..2b16a2a 100644
--- a/arch/x86/include/asm/stacktrace.h
+++ b/arch/x86/include/asm/stacktrace.h
@@ -1,6 +1,13 @@
+/*
+ * Copyright (C) 1991, 1992 Linus Torvalds
+ * Copyright (C) 2000, 2001, 2002 Andi Kleen, SuSE Labs
+ */
+
#ifndef _ASM_X86_STACKTRACE_H
#define _ASM_X86_STACKTRACE_H
+#include <linux/uaccess.h>
+
extern int kstack_depth_to_print;
struct thread_info;
@@ -42,4 +49,46 @@
unsigned long *stack, unsigned long bp,
const struct stacktrace_ops *ops, void *data);
+#ifdef CONFIG_X86_32
+#define STACKSLOTS_PER_LINE 8
+#define get_bp(bp) asm("movl %%ebp, %0" : "=r" (bp) :)
+#else
+#define STACKSLOTS_PER_LINE 4
+#define get_bp(bp) asm("movq %%rbp, %0" : "=r" (bp) :)
+#endif
+
+extern void
+show_trace_log_lvl(struct task_struct *task, struct pt_regs *regs,
+ unsigned long *stack, unsigned long bp, char *log_lvl);
+
+extern void
+show_stack_log_lvl(struct task_struct *task, struct pt_regs *regs,
+ unsigned long *sp, unsigned long bp, char *log_lvl);
+
+extern unsigned int code_bytes;
+
+/* The form of the top of the frame on the stack */
+struct stack_frame {
+ struct stack_frame *next_frame;
+ unsigned long return_address;
+};
+
+struct stack_frame_ia32 {
+ u32 next_frame;
+ u32 return_address;
+};
+
+static inline unsigned long caller_frame_pointer(void)
+{
+ struct stack_frame *frame;
+
+ get_bp(frame);
+
+#ifdef CONFIG_FRAME_POINTER
+ frame = frame->next_frame;
+#endif
+
+ return (unsigned long)frame;
+}
+
#endif /* _ASM_X86_STACKTRACE_H */
diff --git a/arch/x86/kernel/apic/Makefile b/arch/x86/kernel/apic/Makefile
index 565c1bf..910f20b4 100644
--- a/arch/x86/kernel/apic/Makefile
+++ b/arch/x86/kernel/apic/Makefile
@@ -2,7 +2,12 @@
# Makefile for local APIC drivers and for the IO-APIC code
#
-obj-$(CONFIG_X86_LOCAL_APIC) += apic.o apic_noop.o probe_$(BITS).o ipi.o nmi.o
+obj-$(CONFIG_X86_LOCAL_APIC) += apic.o apic_noop.o probe_$(BITS).o ipi.o
+ifneq ($(CONFIG_HARDLOCKUP_DETECTOR),y)
+obj-$(CONFIG_X86_LOCAL_APIC) += nmi.o
+endif
+obj-$(CONFIG_HARDLOCKUP_DETECTOR) += hw_nmi.o
+
obj-$(CONFIG_X86_IO_APIC) += io_apic.o
obj-$(CONFIG_SMP) += ipi.o
diff --git a/arch/x86/kernel/apic/hw_nmi.c b/arch/x86/kernel/apic/hw_nmi.c
new file mode 100644
index 0000000..cefd694
--- /dev/null
+++ b/arch/x86/kernel/apic/hw_nmi.c
@@ -0,0 +1,107 @@
+/*
+ * HW NMI watchdog support
+ *
+ * started by Don Zickus, Copyright (C) 2010 Red Hat, Inc.
+ *
+ * Arch specific calls to support NMI watchdog
+ *
+ * Bits copied from original nmi.c file
+ *
+ */
+#include <asm/apic.h>
+
+#include <linux/cpumask.h>
+#include <linux/kdebug.h>
+#include <linux/notifier.h>
+#include <linux/kprobes.h>
+#include <linux/nmi.h>
+#include <linux/module.h>
+
+/* For reliability, we're prepared to waste bits here. */
+static DECLARE_BITMAP(backtrace_mask, NR_CPUS) __read_mostly;
+
+u64 hw_nmi_get_sample_period(void)
+{
+ return (u64)(cpu_khz) * 1000 * 60;
+}
+
+#ifdef ARCH_HAS_NMI_WATCHDOG
+void arch_trigger_all_cpu_backtrace(void)
+{
+ int i;
+
+ cpumask_copy(to_cpumask(backtrace_mask), cpu_online_mask);
+
+ printk(KERN_INFO "sending NMI to all CPUs:\n");
+ apic->send_IPI_all(NMI_VECTOR);
+
+ /* Wait for up to 10 seconds for all CPUs to do the backtrace */
+ for (i = 0; i < 10 * 1000; i++) {
+ if (cpumask_empty(to_cpumask(backtrace_mask)))
+ break;
+ mdelay(1);
+ }
+}
+
+static int __kprobes
+arch_trigger_all_cpu_backtrace_handler(struct notifier_block *self,
+ unsigned long cmd, void *__args)
+{
+ struct die_args *args = __args;
+ struct pt_regs *regs;
+ int cpu = smp_processor_id();
+
+ switch (cmd) {
+ case DIE_NMI:
+ case DIE_NMI_IPI:
+ break;
+
+ default:
+ return NOTIFY_DONE;
+ }
+
+ regs = args->regs;
+
+ if (cpumask_test_cpu(cpu, to_cpumask(backtrace_mask))) {
+ static arch_spinlock_t lock = __ARCH_SPIN_LOCK_UNLOCKED;
+
+ arch_spin_lock(&lock);
+ printk(KERN_WARNING "NMI backtrace for cpu %d\n", cpu);
+ show_regs(regs);
+ dump_stack();
+ arch_spin_unlock(&lock);
+ cpumask_clear_cpu(cpu, to_cpumask(backtrace_mask));
+ return NOTIFY_STOP;
+ }
+
+ return NOTIFY_DONE;
+}
+
+static __read_mostly struct notifier_block backtrace_notifier = {
+ .notifier_call = arch_trigger_all_cpu_backtrace_handler,
+ .next = NULL,
+ .priority = 1
+};
+
+static int __init register_trigger_all_cpu_backtrace(void)
+{
+ register_die_notifier(&backtrace_notifier);
+ return 0;
+}
+early_initcall(register_trigger_all_cpu_backtrace);
+#endif
+
+/* STUB calls to mimic old nmi_watchdog behaviour */
+#if defined(CONFIG_X86_LOCAL_APIC)
+unsigned int nmi_watchdog = NMI_NONE;
+EXPORT_SYMBOL(nmi_watchdog);
+void acpi_nmi_enable(void) { return; }
+void acpi_nmi_disable(void) { return; }
+#endif
+atomic_t nmi_active = ATOMIC_INIT(0); /* oprofile uses this */
+EXPORT_SYMBOL(nmi_active);
+int unknown_nmi_panic;
+void cpu_nmi_set_wd_enabled(void) { return; }
+void stop_apic_nmi_watchdog(void *unused) { return; }
+void setup_apic_nmi_watchdog(void *unused) { return; }
+int __init check_nmi_watchdog(void) { return 0; }
diff --git a/arch/x86/kernel/apic/nmi.c b/arch/x86/kernel/apic/nmi.c
index 1edaf15..a43f71c 100644
--- a/arch/x86/kernel/apic/nmi.c
+++ b/arch/x86/kernel/apic/nmi.c
@@ -401,13 +401,6 @@
int cpu = smp_processor_id();
int rc = 0;
- /* check for other users first */
- if (notify_die(DIE_NMI, "nmi", regs, reason, 2, SIGINT)
- == NOTIFY_STOP) {
- rc = 1;
- touched = 1;
- }
-
sum = get_timer_irqs(cpu);
if (__get_cpu_var(nmi_touch)) {
diff --git a/arch/x86/kernel/cpu/perf_event.c b/arch/x86/kernel/cpu/perf_event.c
index 5db5b7d..f2da20f 100644
--- a/arch/x86/kernel/cpu/perf_event.c
+++ b/arch/x86/kernel/cpu/perf_event.c
@@ -220,6 +220,7 @@
struct perf_event *event);
struct event_constraint *event_constraints;
void (*quirks)(void);
+ int perfctr_second_write;
int (*cpu_prepare)(int cpu);
void (*cpu_starting)(int cpu);
@@ -295,10 +296,10 @@
* count to the generic event atomically:
*/
again:
- prev_raw_count = atomic64_read(&hwc->prev_count);
+ prev_raw_count = local64_read(&hwc->prev_count);
rdmsrl(hwc->event_base + idx, new_raw_count);
- if (atomic64_cmpxchg(&hwc->prev_count, prev_raw_count,
+ if (local64_cmpxchg(&hwc->prev_count, prev_raw_count,
new_raw_count) != prev_raw_count)
goto again;
@@ -313,8 +314,8 @@
delta = (new_raw_count << shift) - (prev_raw_count << shift);
delta >>= shift;
- atomic64_add(delta, &event->count);
- atomic64_sub(delta, &hwc->period_left);
+ local64_add(delta, &event->count);
+ local64_sub(delta, &hwc->period_left);
return new_raw_count;
}
@@ -438,7 +439,7 @@
if (!hwc->sample_period) {
hwc->sample_period = x86_pmu.max_period;
hwc->last_period = hwc->sample_period;
- atomic64_set(&hwc->period_left, hwc->sample_period);
+ local64_set(&hwc->period_left, hwc->sample_period);
} else {
/*
* If we have a PMU initialized but no APIC
@@ -885,7 +886,7 @@
x86_perf_event_set_period(struct perf_event *event)
{
struct hw_perf_event *hwc = &event->hw;
- s64 left = atomic64_read(&hwc->period_left);
+ s64 left = local64_read(&hwc->period_left);
s64 period = hwc->sample_period;
int ret = 0, idx = hwc->idx;
@@ -897,14 +898,14 @@
*/
if (unlikely(left <= -period)) {
left = period;
- atomic64_set(&hwc->period_left, left);
+ local64_set(&hwc->period_left, left);
hwc->last_period = period;
ret = 1;
}
if (unlikely(left <= 0)) {
left += period;
- atomic64_set(&hwc->period_left, left);
+ local64_set(&hwc->period_left, left);
hwc->last_period = period;
ret = 1;
}
@@ -923,10 +924,19 @@
* The hw event starts counting from this event offset,
* mark it to be able to extra future deltas:
*/
- atomic64_set(&hwc->prev_count, (u64)-left);
+ local64_set(&hwc->prev_count, (u64)-left);
- wrmsrl(hwc->event_base + idx,
+ wrmsrl(hwc->event_base + idx, (u64)(-left) & x86_pmu.cntval_mask);
+
+ /*
+ * Due to erratum on certan cpu we need
+ * a second write to be sure the register
+ * is updated properly
+ */
+ if (x86_pmu.perfctr_second_write) {
+ wrmsrl(hwc->event_base + idx,
(u64)(-left) & x86_pmu.cntval_mask);
+ }
perf_event_update_userpage(event);
@@ -969,7 +979,7 @@
* skip the schedulability test here, it will be peformed
* at commit time(->commit_txn) as a whole
*/
- if (cpuc->group_flag & PERF_EVENT_TXN_STARTED)
+ if (cpuc->group_flag & PERF_EVENT_TXN)
goto out;
ret = x86_pmu.schedule_events(cpuc, n, assign);
@@ -1096,7 +1106,7 @@
* The events never got scheduled and ->cancel_txn will truncate
* the event_list.
*/
- if (cpuc->group_flag & PERF_EVENT_TXN_STARTED)
+ if (cpuc->group_flag & PERF_EVENT_TXN)
return;
x86_pmu_stop(event);
@@ -1388,7 +1398,7 @@
{
struct cpu_hw_events *cpuc = &__get_cpu_var(cpu_hw_events);
- cpuc->group_flag |= PERF_EVENT_TXN_STARTED;
+ cpuc->group_flag |= PERF_EVENT_TXN;
cpuc->n_txn = 0;
}
@@ -1401,7 +1411,7 @@
{
struct cpu_hw_events *cpuc = &__get_cpu_var(cpu_hw_events);
- cpuc->group_flag &= ~PERF_EVENT_TXN_STARTED;
+ cpuc->group_flag &= ~PERF_EVENT_TXN;
/*
* Truncate the collected events.
*/
@@ -1435,11 +1445,7 @@
*/
memcpy(cpuc->assign, assign, n*sizeof(int));
- /*
- * Clear out the txn count so that ->cancel_txn() which gets
- * run after ->commit_txn() doesn't undo things.
- */
- cpuc->n_txn = 0;
+ cpuc->group_flag &= ~PERF_EVENT_TXN;
return 0;
}
@@ -1607,8 +1613,6 @@
.walk_stack = print_context_stack_bp,
};
-#include "../dumpstack.h"
-
static void
perf_callchain_kernel(struct pt_regs *regs, struct perf_callchain_entry *entry)
{
@@ -1730,22 +1734,6 @@
return entry;
}
-void perf_arch_fetch_caller_regs(struct pt_regs *regs, unsigned long ip, int skip)
-{
- regs->ip = ip;
- /*
- * perf_arch_fetch_caller_regs adds another call, we need to increment
- * the skip level
- */
- regs->bp = rewind_frame_pointer(skip + 1);
- regs->cs = __KERNEL_CS;
- /*
- * We abuse bit 3 to pass exact information, see perf_misc_flags
- * and the comment with PERF_EFLAGS_EXACT.
- */
- regs->flags = 0;
-}
-
unsigned long perf_instruction_pointer(struct pt_regs *regs)
{
unsigned long ip;
diff --git a/arch/x86/kernel/cpu/perf_event_p4.c b/arch/x86/kernel/cpu/perf_event_p4.c
index ae85d69..107711b 100644
--- a/arch/x86/kernel/cpu/perf_event_p4.c
+++ b/arch/x86/kernel/cpu/perf_event_p4.c
@@ -21,22 +21,36 @@
char cntr[2][P4_CNTR_LIMIT]; /* counter index (offset), -1 on abscence */
};
-struct p4_cache_event_bind {
+struct p4_pebs_bind {
unsigned int metric_pebs;
unsigned int metric_vert;
};
-#define P4_GEN_CACHE_EVENT_BIND(name) \
- [P4_CACHE__##name] = { \
- .metric_pebs = P4_PEBS__##name, \
- .metric_vert = P4_VERT__##name, \
+/* it sets P4_PEBS_ENABLE_UOP_TAG as well */
+#define P4_GEN_PEBS_BIND(name, pebs, vert) \
+ [P4_PEBS_METRIC__##name] = { \
+ .metric_pebs = pebs | P4_PEBS_ENABLE_UOP_TAG, \
+ .metric_vert = vert, \
}
-static struct p4_cache_event_bind p4_cache_event_bind_map[] = {
- P4_GEN_CACHE_EVENT_BIND(1stl_cache_load_miss_retired),
- P4_GEN_CACHE_EVENT_BIND(2ndl_cache_load_miss_retired),
- P4_GEN_CACHE_EVENT_BIND(dtlb_load_miss_retired),
- P4_GEN_CACHE_EVENT_BIND(dtlb_store_miss_retired),
+/*
+ * note we have P4_PEBS_ENABLE_UOP_TAG always set here
+ *
+ * it's needed for mapping P4_PEBS_CONFIG_METRIC_MASK bits of
+ * event configuration to find out which values are to be
+ * written into MSR_IA32_PEBS_ENABLE and MSR_P4_PEBS_MATRIX_VERT
+ * resgisters
+ */
+static struct p4_pebs_bind p4_pebs_bind_map[] = {
+ P4_GEN_PEBS_BIND(1stl_cache_load_miss_retired, 0x0000001, 0x0000001),
+ P4_GEN_PEBS_BIND(2ndl_cache_load_miss_retired, 0x0000002, 0x0000001),
+ P4_GEN_PEBS_BIND(dtlb_load_miss_retired, 0x0000004, 0x0000001),
+ P4_GEN_PEBS_BIND(dtlb_store_miss_retired, 0x0000004, 0x0000002),
+ P4_GEN_PEBS_BIND(dtlb_all_miss_retired, 0x0000004, 0x0000003),
+ P4_GEN_PEBS_BIND(tagged_mispred_branch, 0x0018000, 0x0000010),
+ P4_GEN_PEBS_BIND(mob_load_replay_retired, 0x0000200, 0x0000001),
+ P4_GEN_PEBS_BIND(split_load_retired, 0x0000400, 0x0000001),
+ P4_GEN_PEBS_BIND(split_store_retired, 0x0000400, 0x0000002),
};
/*
@@ -281,10 +295,10 @@
},
};
-#define P4_GEN_CACHE_EVENT(event, bit, cache_event) \
+#define P4_GEN_CACHE_EVENT(event, bit, metric) \
p4_config_pack_escr(P4_ESCR_EVENT(event) | \
P4_ESCR_EMASK_BIT(event, bit)) | \
- p4_config_pack_cccr(cache_event | \
+ p4_config_pack_cccr(metric | \
P4_CCCR_ESEL(P4_OPCODE_ESEL(P4_OPCODE(event))))
static __initconst const u64 p4_hw_cache_event_ids
@@ -296,34 +310,34 @@
[ C(OP_READ) ] = {
[ C(RESULT_ACCESS) ] = 0x0,
[ C(RESULT_MISS) ] = P4_GEN_CACHE_EVENT(P4_EVENT_REPLAY_EVENT, NBOGUS,
- P4_CACHE__1stl_cache_load_miss_retired),
+ P4_PEBS_METRIC__1stl_cache_load_miss_retired),
},
},
[ C(LL ) ] = {
[ C(OP_READ) ] = {
[ C(RESULT_ACCESS) ] = 0x0,
[ C(RESULT_MISS) ] = P4_GEN_CACHE_EVENT(P4_EVENT_REPLAY_EVENT, NBOGUS,
- P4_CACHE__2ndl_cache_load_miss_retired),
+ P4_PEBS_METRIC__2ndl_cache_load_miss_retired),
},
},
[ C(DTLB) ] = {
[ C(OP_READ) ] = {
[ C(RESULT_ACCESS) ] = 0x0,
[ C(RESULT_MISS) ] = P4_GEN_CACHE_EVENT(P4_EVENT_REPLAY_EVENT, NBOGUS,
- P4_CACHE__dtlb_load_miss_retired),
+ P4_PEBS_METRIC__dtlb_load_miss_retired),
},
[ C(OP_WRITE) ] = {
[ C(RESULT_ACCESS) ] = 0x0,
[ C(RESULT_MISS) ] = P4_GEN_CACHE_EVENT(P4_EVENT_REPLAY_EVENT, NBOGUS,
- P4_CACHE__dtlb_store_miss_retired),
+ P4_PEBS_METRIC__dtlb_store_miss_retired),
},
},
[ C(ITLB) ] = {
[ C(OP_READ) ] = {
[ C(RESULT_ACCESS) ] = P4_GEN_CACHE_EVENT(P4_EVENT_ITLB_REFERENCE, HIT,
- P4_CACHE__itlb_reference_hit),
+ P4_PEBS_METRIC__none),
[ C(RESULT_MISS) ] = P4_GEN_CACHE_EVENT(P4_EVENT_ITLB_REFERENCE, MISS,
- P4_CACHE__itlb_reference_miss),
+ P4_PEBS_METRIC__none),
},
[ C(OP_WRITE) ] = {
[ C(RESULT_ACCESS) ] = -1,
@@ -414,11 +428,37 @@
return config;
}
+static int p4_validate_raw_event(struct perf_event *event)
+{
+ unsigned int v;
+
+ /* user data may have out-of-bound event index */
+ v = p4_config_unpack_event(event->attr.config);
+ if (v >= ARRAY_SIZE(p4_event_bind_map)) {
+ pr_warning("P4 PMU: Unknown event code: %d\n", v);
+ return -EINVAL;
+ }
+
+ /*
+ * it may have some screwed PEBS bits
+ */
+ if (p4_config_pebs_has(event->attr.config, P4_PEBS_CONFIG_ENABLE)) {
+ pr_warning("P4 PMU: PEBS are not supported yet\n");
+ return -EINVAL;
+ }
+ v = p4_config_unpack_metric(event->attr.config);
+ if (v >= ARRAY_SIZE(p4_pebs_bind_map)) {
+ pr_warning("P4 PMU: Unknown metric code: %d\n", v);
+ return -EINVAL;
+ }
+
+ return 0;
+}
+
static int p4_hw_config(struct perf_event *event)
{
int cpu = get_cpu();
int rc = 0;
- unsigned int evnt;
u32 escr, cccr;
/*
@@ -438,12 +478,9 @@
if (event->attr.type == PERF_TYPE_RAW) {
- /* user data may have out-of-bound event index */
- evnt = p4_config_unpack_event(event->attr.config);
- if (evnt >= ARRAY_SIZE(p4_event_bind_map)) {
- rc = -EINVAL;
+ rc = p4_validate_raw_event(event);
+ if (rc)
goto out;
- }
/*
* We don't control raw events so it's up to the caller
@@ -451,12 +488,15 @@
* on HT machine but allow HT-compatible specifics to be
* passed on)
*
+ * Note that for RAW events we allow user to use P4_CCCR_RESERVED
+ * bits since we keep additional info here (for cache events and etc)
+ *
* XXX: HT wide things should check perf_paranoid_cpu() &&
* CAP_SYS_ADMIN
*/
event->hw.config |= event->attr.config &
(p4_config_pack_escr(P4_ESCR_MASK_HT) |
- p4_config_pack_cccr(P4_CCCR_MASK_HT));
+ p4_config_pack_cccr(P4_CCCR_MASK_HT | P4_CCCR_RESERVED));
}
rc = x86_setup_perfctr(event);
@@ -482,6 +522,29 @@
return overflow;
}
+static void p4_pmu_disable_pebs(void)
+{
+ /*
+ * FIXME
+ *
+ * It's still allowed that two threads setup same cache
+ * events so we can't simply clear metrics until we knew
+ * noone is depending on us, so we need kind of counter
+ * for "ReplayEvent" users.
+ *
+ * What is more complex -- RAW events, if user (for some
+ * reason) will pass some cache event metric with improper
+ * event opcode -- it's fine from hardware point of view
+ * but completely nonsence from "meaning" of such action.
+ *
+ * So at moment let leave metrics turned on forever -- it's
+ * ok for now but need to be revisited!
+ *
+ * (void)checking_wrmsrl(MSR_IA32_PEBS_ENABLE, (u64)0);
+ * (void)checking_wrmsrl(MSR_P4_PEBS_MATRIX_VERT, (u64)0);
+ */
+}
+
static inline void p4_pmu_disable_event(struct perf_event *event)
{
struct hw_perf_event *hwc = &event->hw;
@@ -507,6 +570,26 @@
continue;
p4_pmu_disable_event(event);
}
+
+ p4_pmu_disable_pebs();
+}
+
+/* configuration must be valid */
+static void p4_pmu_enable_pebs(u64 config)
+{
+ struct p4_pebs_bind *bind;
+ unsigned int idx;
+
+ BUILD_BUG_ON(P4_PEBS_METRIC__max > P4_PEBS_CONFIG_METRIC_MASK);
+
+ idx = p4_config_unpack_metric(config);
+ if (idx == P4_PEBS_METRIC__none)
+ return;
+
+ bind = &p4_pebs_bind_map[idx];
+
+ (void)checking_wrmsrl(MSR_IA32_PEBS_ENABLE, (u64)bind->metric_pebs);
+ (void)checking_wrmsrl(MSR_P4_PEBS_MATRIX_VERT, (u64)bind->metric_vert);
}
static void p4_pmu_enable_event(struct perf_event *event)
@@ -515,9 +598,7 @@
int thread = p4_ht_config_thread(hwc->config);
u64 escr_conf = p4_config_unpack_escr(p4_clear_ht_bit(hwc->config));
unsigned int idx = p4_config_unpack_event(hwc->config);
- unsigned int idx_cache = p4_config_unpack_cache_event(hwc->config);
struct p4_event_bind *bind;
- struct p4_cache_event_bind *bind_cache;
u64 escr_addr, cccr;
bind = &p4_event_bind_map[idx];
@@ -537,16 +618,10 @@
cccr = p4_config_unpack_cccr(hwc->config);
/*
- * it could be Cache event so that we need to
- * set metrics into additional MSRs
+ * it could be Cache event so we need to write metrics
+ * into additional MSRs
*/
- BUILD_BUG_ON(P4_CACHE__MAX > P4_CCCR_CACHE_OPS_MASK);
- if (idx_cache > P4_CACHE__NONE &&
- idx_cache < ARRAY_SIZE(p4_cache_event_bind_map)) {
- bind_cache = &p4_cache_event_bind_map[idx_cache];
- (void)checking_wrmsrl(MSR_IA32_PEBS_ENABLE, (u64)bind_cache->metric_pebs);
- (void)checking_wrmsrl(MSR_P4_PEBS_MATRIX_VERT, (u64)bind_cache->metric_vert);
- }
+ p4_pmu_enable_pebs(hwc->config);
(void)checking_wrmsrl(escr_addr, escr_conf);
(void)checking_wrmsrl(hwc->config_base + hwc->idx,
@@ -829,6 +904,15 @@
.max_period = (1ULL << 39) - 1,
.hw_config = p4_hw_config,
.schedule_events = p4_pmu_schedule_events,
+ /*
+ * This handles erratum N15 in intel doc 249199-029,
+ * the counter may not be updated correctly on write
+ * so we need a second write operation to do the trick
+ * (the official workaround didn't work)
+ *
+ * the former idea is taken from OProfile code
+ */
+ .perfctr_second_write = 1,
};
static __init int p4_pmu_init(void)
diff --git a/arch/x86/kernel/dumpstack.c b/arch/x86/kernel/dumpstack.c
index c89a386..6e8752c 100644
--- a/arch/x86/kernel/dumpstack.c
+++ b/arch/x86/kernel/dumpstack.c
@@ -18,7 +18,6 @@
#include <asm/stacktrace.h>
-#include "dumpstack.h"
int panic_on_unrecovered_nmi;
int panic_on_io_nmi;
diff --git a/arch/x86/kernel/dumpstack.h b/arch/x86/kernel/dumpstack.h
deleted file mode 100644
index e1a93be..0000000
--- a/arch/x86/kernel/dumpstack.h
+++ /dev/null
@@ -1,56 +0,0 @@
-/*
- * Copyright (C) 1991, 1992 Linus Torvalds
- * Copyright (C) 2000, 2001, 2002 Andi Kleen, SuSE Labs
- */
-
-#ifndef DUMPSTACK_H
-#define DUMPSTACK_H
-
-#ifdef CONFIG_X86_32
-#define STACKSLOTS_PER_LINE 8
-#define get_bp(bp) asm("movl %%ebp, %0" : "=r" (bp) :)
-#else
-#define STACKSLOTS_PER_LINE 4
-#define get_bp(bp) asm("movq %%rbp, %0" : "=r" (bp) :)
-#endif
-
-#include <linux/uaccess.h>
-
-extern void
-show_trace_log_lvl(struct task_struct *task, struct pt_regs *regs,
- unsigned long *stack, unsigned long bp, char *log_lvl);
-
-extern void
-show_stack_log_lvl(struct task_struct *task, struct pt_regs *regs,
- unsigned long *sp, unsigned long bp, char *log_lvl);
-
-extern unsigned int code_bytes;
-
-/* The form of the top of the frame on the stack */
-struct stack_frame {
- struct stack_frame *next_frame;
- unsigned long return_address;
-};
-
-struct stack_frame_ia32 {
- u32 next_frame;
- u32 return_address;
-};
-
-static inline unsigned long rewind_frame_pointer(int n)
-{
- struct stack_frame *frame;
-
- get_bp(frame);
-
-#ifdef CONFIG_FRAME_POINTER
- while (n--) {
- if (probe_kernel_address(&frame->next_frame, frame))
- break;
- }
-#endif
-
- return (unsigned long)frame;
-}
-
-#endif /* DUMPSTACK_H */
diff --git a/arch/x86/kernel/dumpstack_32.c b/arch/x86/kernel/dumpstack_32.c
index 11540a1..0f6376f 100644
--- a/arch/x86/kernel/dumpstack_32.c
+++ b/arch/x86/kernel/dumpstack_32.c
@@ -16,8 +16,6 @@
#include <asm/stacktrace.h>
-#include "dumpstack.h"
-
void dump_trace(struct task_struct *task, struct pt_regs *regs,
unsigned long *stack, unsigned long bp,
diff --git a/arch/x86/kernel/dumpstack_64.c b/arch/x86/kernel/dumpstack_64.c
index 272c9f1..57a21f1 100644
--- a/arch/x86/kernel/dumpstack_64.c
+++ b/arch/x86/kernel/dumpstack_64.c
@@ -16,7 +16,6 @@
#include <asm/stacktrace.h>
-#include "dumpstack.h"
#define N_EXCEPTION_STACKS_END \
(N_EXCEPTION_STACKS + DEBUG_STKSZ/EXCEPTION_STKSZ - 2)
diff --git a/arch/x86/kernel/hw_breakpoint.c b/arch/x86/kernel/hw_breakpoint.c
index a8f1b80..a474ec3 100644
--- a/arch/x86/kernel/hw_breakpoint.c
+++ b/arch/x86/kernel/hw_breakpoint.c
@@ -208,6 +208,9 @@
{
/* Len */
switch (x86_len) {
+ case X86_BREAKPOINT_LEN_X:
+ *gen_len = sizeof(long);
+ break;
case X86_BREAKPOINT_LEN_1:
*gen_len = HW_BREAKPOINT_LEN_1;
break;
@@ -251,6 +254,29 @@
info->address = bp->attr.bp_addr;
+ /* Type */
+ switch (bp->attr.bp_type) {
+ case HW_BREAKPOINT_W:
+ info->type = X86_BREAKPOINT_WRITE;
+ break;
+ case HW_BREAKPOINT_W | HW_BREAKPOINT_R:
+ info->type = X86_BREAKPOINT_RW;
+ break;
+ case HW_BREAKPOINT_X:
+ info->type = X86_BREAKPOINT_EXECUTE;
+ /*
+ * x86 inst breakpoints need to have a specific undefined len.
+ * But we still need to check userspace is not trying to setup
+ * an unsupported length, to get a range breakpoint for example.
+ */
+ if (bp->attr.bp_len == sizeof(long)) {
+ info->len = X86_BREAKPOINT_LEN_X;
+ return 0;
+ }
+ default:
+ return -EINVAL;
+ }
+
/* Len */
switch (bp->attr.bp_len) {
case HW_BREAKPOINT_LEN_1:
@@ -271,21 +297,6 @@
return -EINVAL;
}
- /* Type */
- switch (bp->attr.bp_type) {
- case HW_BREAKPOINT_W:
- info->type = X86_BREAKPOINT_WRITE;
- break;
- case HW_BREAKPOINT_W | HW_BREAKPOINT_R:
- info->type = X86_BREAKPOINT_RW;
- break;
- case HW_BREAKPOINT_X:
- info->type = X86_BREAKPOINT_EXECUTE;
- break;
- default:
- return -EINVAL;
- }
-
return 0;
}
/*
@@ -305,6 +316,9 @@
ret = -EINVAL;
switch (info->len) {
+ case X86_BREAKPOINT_LEN_X:
+ align = sizeof(long) -1;
+ break;
case X86_BREAKPOINT_LEN_1:
align = 0;
break;
@@ -466,6 +480,13 @@
perf_bp_event(bp, args->regs);
+ /*
+ * Set up resume flag to avoid breakpoint recursion when
+ * returning back to origin.
+ */
+ if (bp->hw.info.type == X86_BREAKPOINT_EXECUTE)
+ args->regs->flags |= X86_EFLAGS_RF;
+
rcu_read_unlock();
}
/*
diff --git a/arch/x86/kernel/kprobes.c b/arch/x86/kernel/kprobes.c
index 675879b..1bfb6cf 100644
--- a/arch/x86/kernel/kprobes.c
+++ b/arch/x86/kernel/kprobes.c
@@ -126,16 +126,22 @@
}
/*
- * Check for the REX prefix which can only exist on X86_64
- * X86_32 always returns 0
+ * Skip the prefixes of the instruction.
*/
-static int __kprobes is_REX_prefix(kprobe_opcode_t *insn)
+static kprobe_opcode_t *__kprobes skip_prefixes(kprobe_opcode_t *insn)
{
+ insn_attr_t attr;
+
+ attr = inat_get_opcode_attribute((insn_byte_t)*insn);
+ while (inat_is_legacy_prefix(attr)) {
+ insn++;
+ attr = inat_get_opcode_attribute((insn_byte_t)*insn);
+ }
#ifdef CONFIG_X86_64
- if ((*insn & 0xf0) == 0x40)
- return 1;
+ if (inat_is_rex_prefix(attr))
+ insn++;
#endif
- return 0;
+ return insn;
}
/*
@@ -272,6 +278,9 @@
*/
static int __kprobes is_IF_modifier(kprobe_opcode_t *insn)
{
+ /* Skip prefixes */
+ insn = skip_prefixes(insn);
+
switch (*insn) {
case 0xfa: /* cli */
case 0xfb: /* sti */
@@ -280,13 +289,6 @@
return 1;
}
- /*
- * on X86_64, 0x40-0x4f are REX prefixes so we need to look
- * at the next byte instead.. but of course not recurse infinitely
- */
- if (is_REX_prefix(insn))
- return is_IF_modifier(++insn);
-
return 0;
}
@@ -803,9 +805,8 @@
unsigned long orig_ip = (unsigned long)p->addr;
kprobe_opcode_t *insn = p->ainsn.insn;
- /*skip the REX prefix*/
- if (is_REX_prefix(insn))
- insn++;
+ /* Skip prefixes */
+ insn = skip_prefixes(insn);
regs->flags &= ~X86_EFLAGS_TF;
switch (*insn) {
diff --git a/arch/x86/kernel/process_32.c b/arch/x86/kernel/process_32.c
index 8d12878..96586c3 100644
--- a/arch/x86/kernel/process_32.c
+++ b/arch/x86/kernel/process_32.c
@@ -57,6 +57,8 @@
#include <asm/syscalls.h>
#include <asm/debugreg.h>
+#include <trace/events/power.h>
+
asmlinkage void ret_from_fork(void) __asm__("ret_from_fork");
/*
@@ -111,6 +113,8 @@
stop_critical_timings();
pm_idle();
start_critical_timings();
+
+ trace_power_end(smp_processor_id());
}
tick_nohz_restart_sched_tick();
preempt_enable_no_resched();
diff --git a/arch/x86/kernel/process_64.c b/arch/x86/kernel/process_64.c
index 3c2422a..3d9ea53 100644
--- a/arch/x86/kernel/process_64.c
+++ b/arch/x86/kernel/process_64.c
@@ -51,6 +51,8 @@
#include <asm/syscalls.h>
#include <asm/debugreg.h>
+#include <trace/events/power.h>
+
asmlinkage extern void ret_from_fork(void);
DEFINE_PER_CPU(unsigned long, old_rsp);
@@ -138,6 +140,9 @@
stop_critical_timings();
pm_idle();
start_critical_timings();
+
+ trace_power_end(smp_processor_id());
+
/* In many cases the interrupt that ended idle
has already called exit_idle. But some idle
loops can be woken up without interrupt. */
diff --git a/arch/x86/kernel/stacktrace.c b/arch/x86/kernel/stacktrace.c
index 922eefb..b53c525 100644
--- a/arch/x86/kernel/stacktrace.c
+++ b/arch/x86/kernel/stacktrace.c
@@ -23,11 +23,16 @@
return 0;
}
-static void save_stack_address(void *data, unsigned long addr, int reliable)
+static void
+__save_stack_address(void *data, unsigned long addr, bool reliable, bool nosched)
{
struct stack_trace *trace = data;
+#ifdef CONFIG_FRAME_POINTER
if (!reliable)
return;
+#endif
+ if (nosched && in_sched_functions(addr))
+ return;
if (trace->skip > 0) {
trace->skip--;
return;
@@ -36,20 +41,15 @@
trace->entries[trace->nr_entries++] = addr;
}
+static void save_stack_address(void *data, unsigned long addr, int reliable)
+{
+ return __save_stack_address(data, addr, reliable, false);
+}
+
static void
save_stack_address_nosched(void *data, unsigned long addr, int reliable)
{
- struct stack_trace *trace = (struct stack_trace *)data;
- if (!reliable)
- return;
- if (in_sched_functions(addr))
- return;
- if (trace->skip > 0) {
- trace->skip--;
- return;
- }
- if (trace->nr_entries < trace->max_entries)
- trace->entries[trace->nr_entries++] = addr;
+ return __save_stack_address(data, addr, reliable, true);
}
static const struct stacktrace_ops save_stack_ops = {
@@ -96,12 +96,13 @@
/* Userspace stacktrace - based on kernel/trace/trace_sysprof.c */
-struct stack_frame {
+struct stack_frame_user {
const void __user *next_fp;
unsigned long ret_addr;
};
-static int copy_stack_frame(const void __user *fp, struct stack_frame *frame)
+static int
+copy_stack_frame(const void __user *fp, struct stack_frame_user *frame)
{
int ret;
@@ -126,7 +127,7 @@
trace->entries[trace->nr_entries++] = regs->ip;
while (trace->nr_entries < trace->max_entries) {
- struct stack_frame frame;
+ struct stack_frame_user frame;
frame.next_fp = NULL;
frame.ret_addr = 0;
diff --git a/arch/x86/kernel/traps.c b/arch/x86/kernel/traps.c
index 725ef4d..60788de 100644
--- a/arch/x86/kernel/traps.c
+++ b/arch/x86/kernel/traps.c
@@ -392,7 +392,13 @@
if (notify_die(DIE_NMI_IPI, "nmi_ipi", regs, reason, 2, SIGINT)
== NOTIFY_STOP)
return;
+
#ifdef CONFIG_X86_LOCAL_APIC
+ if (notify_die(DIE_NMI, "nmi", regs, reason, 2, SIGINT)
+ == NOTIFY_STOP)
+ return;
+
+#ifndef CONFIG_LOCKUP_DETECTOR
/*
* Ok, so this is none of the documented NMI sources,
* so it must be the NMI watchdog.
@@ -400,6 +406,7 @@
if (nmi_watchdog_tick(regs, reason))
return;
if (!do_nmi_callback(regs, cpu))
+#endif /* !CONFIG_LOCKUP_DETECTOR */
unknown_nmi_error(reason, regs);
#else
unknown_nmi_error(reason, regs);
diff --git a/arch/x86/mm/pf_in.c b/arch/x86/mm/pf_in.c
index 308e325..38e6d17 100644
--- a/arch/x86/mm/pf_in.c
+++ b/arch/x86/mm/pf_in.c
@@ -40,16 +40,16 @@
static unsigned int reg_rop[] = {
0x8A, 0x8B, 0xB60F, 0xB70F, 0xBE0F, 0xBF0F
};
-static unsigned int reg_wop[] = { 0x88, 0x89 };
+static unsigned int reg_wop[] = { 0x88, 0x89, 0xAA, 0xAB };
static unsigned int imm_wop[] = { 0xC6, 0xC7 };
/* IA32 Manual 3, 3-432*/
-static unsigned int rw8[] = { 0x88, 0x8A, 0xC6 };
+static unsigned int rw8[] = { 0x88, 0x8A, 0xC6, 0xAA };
static unsigned int rw32[] = {
- 0x89, 0x8B, 0xC7, 0xB60F, 0xB70F, 0xBE0F, 0xBF0F
+ 0x89, 0x8B, 0xC7, 0xB60F, 0xB70F, 0xBE0F, 0xBF0F, 0xAB
};
-static unsigned int mw8[] = { 0x88, 0x8A, 0xC6, 0xB60F, 0xBE0F };
+static unsigned int mw8[] = { 0x88, 0x8A, 0xC6, 0xB60F, 0xBE0F, 0xAA };
static unsigned int mw16[] = { 0xB70F, 0xBF0F };
-static unsigned int mw32[] = { 0x89, 0x8B, 0xC7 };
+static unsigned int mw32[] = { 0x89, 0x8B, 0xC7, 0xAB };
static unsigned int mw64[] = {};
#else /* not __i386__ */
static unsigned char prefix_codes[] = {
@@ -63,20 +63,20 @@
static unsigned int reg_rop[] = {
0x8A, 0x8B, 0xB60F, 0xB70F, 0xBE0F, 0xBF0F
};
-static unsigned int reg_wop[] = { 0x88, 0x89 };
+static unsigned int reg_wop[] = { 0x88, 0x89, 0xAA, 0xAB };
static unsigned int imm_wop[] = { 0xC6, 0xC7 };
-static unsigned int rw8[] = { 0xC6, 0x88, 0x8A };
+static unsigned int rw8[] = { 0xC6, 0x88, 0x8A, 0xAA };
static unsigned int rw32[] = {
- 0xC7, 0x89, 0x8B, 0xB60F, 0xB70F, 0xBE0F, 0xBF0F
+ 0xC7, 0x89, 0x8B, 0xB60F, 0xB70F, 0xBE0F, 0xBF0F, 0xAB
};
/* 8 bit only */
-static unsigned int mw8[] = { 0xC6, 0x88, 0x8A, 0xB60F, 0xBE0F };
+static unsigned int mw8[] = { 0xC6, 0x88, 0x8A, 0xB60F, 0xBE0F, 0xAA };
/* 16 bit only */
static unsigned int mw16[] = { 0xB70F, 0xBF0F };
/* 16 or 32 bit */
static unsigned int mw32[] = { 0xC7 };
/* 16, 32 or 64 bit */
-static unsigned int mw64[] = { 0x89, 0x8B };
+static unsigned int mw64[] = { 0x89, 0x8B, 0xAB };
#endif /* not __i386__ */
struct prefix_bits {
@@ -410,7 +410,6 @@
unsigned long get_ins_reg_val(unsigned long ins_addr, struct pt_regs *regs)
{
unsigned int opcode;
- unsigned char mod_rm;
int reg;
unsigned char *p;
struct prefix_bits prf;
@@ -437,8 +436,13 @@
goto err;
do_work:
- mod_rm = *p;
- reg = ((mod_rm >> 3) & 0x7) | (prf.rexr << 3);
+ /* for STOS, source register is fixed */
+ if (opcode == 0xAA || opcode == 0xAB) {
+ reg = arg_AX;
+ } else {
+ unsigned char mod_rm = *p;
+ reg = ((mod_rm >> 3) & 0x7) | (prf.rexr << 3);
+ }
switch (get_ins_reg_width(ins_addr)) {
case 1:
return *get_reg_w8(reg, prf.rex, regs);
diff --git a/arch/x86/oprofile/nmi_int.c b/arch/x86/oprofile/nmi_int.c
index b28d2f1..1ba67dc 100644
--- a/arch/x86/oprofile/nmi_int.c
+++ b/arch/x86/oprofile/nmi_int.c
@@ -634,6 +634,18 @@
if (force_arch_perfmon && cpu_has_arch_perfmon)
return 0;
+ /*
+ * Documentation on identifying Intel processors by CPU family
+ * and model can be found in the Intel Software Developer's
+ * Manuals (SDM):
+ *
+ * http://www.intel.com/products/processor/manuals/
+ *
+ * As of May 2010 the documentation for this was in the:
+ * "Intel 64 and IA-32 Architectures Software Developer's
+ * Manual Volume 3B: System Programming Guide", "Table B-1
+ * CPUID Signature Values of DisplayFamily_DisplayModel".
+ */
switch (cpu_model) {
case 0 ... 2:
*cpu_type = "i386/ppro";
@@ -655,12 +667,12 @@
case 15: case 23:
*cpu_type = "i386/core_2";
break;
+ case 0x1a:
case 0x2e:
- case 26:
spec = &op_arch_perfmon_spec;
*cpu_type = "i386/core_i7";
break;
- case 28:
+ case 0x1c:
*cpu_type = "i386/atom";
break;
default:
diff --git a/arch/xtensa/include/asm/local64.h b/arch/xtensa/include/asm/local64.h
new file mode 100644
index 0000000..36c93b5
--- /dev/null
+++ b/arch/xtensa/include/asm/local64.h
@@ -0,0 +1 @@
+#include <asm-generic/local64.h>
diff --git a/drivers/oprofile/event_buffer.c b/drivers/oprofile/event_buffer.c
index 5df60a6..dd87e86 100644
--- a/drivers/oprofile/event_buffer.c
+++ b/drivers/oprofile/event_buffer.c
@@ -135,7 +135,7 @@
* echo 1 >/dev/oprofile/enable
*/
- return 0;
+ return nonseekable_open(inode, file);
fail:
dcookie_unregister(file->private_data);
@@ -205,4 +205,5 @@
.open = event_buffer_open,
.release = event_buffer_release,
.read = event_buffer_read,
+ .llseek = no_llseek,
};
diff --git a/fs/exec.c b/fs/exec.c
index e19de6a..97d91a0 100644
--- a/fs/exec.c
+++ b/fs/exec.c
@@ -653,6 +653,7 @@
else
stack_base = vma->vm_start - stack_expand;
#endif
+ current->mm->start_stack = bprm->p;
ret = expand_stack(vma, stack_base);
if (ret)
ret = -EFAULT;
diff --git a/include/asm-generic/local64.h b/include/asm-generic/local64.h
new file mode 100644
index 0000000..02ac760
--- /dev/null
+++ b/include/asm-generic/local64.h
@@ -0,0 +1,96 @@
+#ifndef _ASM_GENERIC_LOCAL64_H
+#define _ASM_GENERIC_LOCAL64_H
+
+#include <linux/percpu.h>
+#include <asm/types.h>
+
+/*
+ * A signed long type for operations which are atomic for a single CPU.
+ * Usually used in combination with per-cpu variables.
+ *
+ * This is the default implementation, which uses atomic64_t. Which is
+ * rather pointless. The whole point behind local64_t is that some processors
+ * can perform atomic adds and subtracts in a manner which is atomic wrt IRQs
+ * running on this CPU. local64_t allows exploitation of such capabilities.
+ */
+
+/* Implement in terms of atomics. */
+
+#if BITS_PER_LONG == 64
+
+#include <asm/local.h>
+
+typedef struct {
+ local_t a;
+} local64_t;
+
+#define LOCAL64_INIT(i) { LOCAL_INIT(i) }
+
+#define local64_read(l) local_read(&(l)->a)
+#define local64_set(l,i) local_set((&(l)->a),(i))
+#define local64_inc(l) local_inc(&(l)->a)
+#define local64_dec(l) local_dec(&(l)->a)
+#define local64_add(i,l) local_add((i),(&(l)->a))
+#define local64_sub(i,l) local_sub((i),(&(l)->a))
+
+#define local64_sub_and_test(i, l) local_sub_and_test((i), (&(l)->a))
+#define local64_dec_and_test(l) local_dec_and_test(&(l)->a)
+#define local64_inc_and_test(l) local_inc_and_test(&(l)->a)
+#define local64_add_negative(i, l) local_add_negative((i), (&(l)->a))
+#define local64_add_return(i, l) local_add_return((i), (&(l)->a))
+#define local64_sub_return(i, l) local_sub_return((i), (&(l)->a))
+#define local64_inc_return(l) local_inc_return(&(l)->a)
+
+#define local64_cmpxchg(l, o, n) local_cmpxchg((&(l)->a), (o), (n))
+#define local64_xchg(l, n) local_xchg((&(l)->a), (n))
+#define local64_add_unless(l, _a, u) local_add_unless((&(l)->a), (_a), (u))
+#define local64_inc_not_zero(l) local_inc_not_zero(&(l)->a)
+
+/* Non-atomic variants, ie. preemption disabled and won't be touched
+ * in interrupt, etc. Some archs can optimize this case well. */
+#define __local64_inc(l) local64_set((l), local64_read(l) + 1)
+#define __local64_dec(l) local64_set((l), local64_read(l) - 1)
+#define __local64_add(i,l) local64_set((l), local64_read(l) + (i))
+#define __local64_sub(i,l) local64_set((l), local64_read(l) - (i))
+
+#else /* BITS_PER_LONG != 64 */
+
+#include <asm/atomic.h>
+
+/* Don't use typedef: don't want them to be mixed with atomic_t's. */
+typedef struct {
+ atomic64_t a;
+} local64_t;
+
+#define LOCAL64_INIT(i) { ATOMIC_LONG_INIT(i) }
+
+#define local64_read(l) atomic64_read(&(l)->a)
+#define local64_set(l,i) atomic64_set((&(l)->a),(i))
+#define local64_inc(l) atomic64_inc(&(l)->a)
+#define local64_dec(l) atomic64_dec(&(l)->a)
+#define local64_add(i,l) atomic64_add((i),(&(l)->a))
+#define local64_sub(i,l) atomic64_sub((i),(&(l)->a))
+
+#define local64_sub_and_test(i, l) atomic64_sub_and_test((i), (&(l)->a))
+#define local64_dec_and_test(l) atomic64_dec_and_test(&(l)->a)
+#define local64_inc_and_test(l) atomic64_inc_and_test(&(l)->a)
+#define local64_add_negative(i, l) atomic64_add_negative((i), (&(l)->a))
+#define local64_add_return(i, l) atomic64_add_return((i), (&(l)->a))
+#define local64_sub_return(i, l) atomic64_sub_return((i), (&(l)->a))
+#define local64_inc_return(l) atomic64_inc_return(&(l)->a)
+
+#define local64_cmpxchg(l, o, n) atomic64_cmpxchg((&(l)->a), (o), (n))
+#define local64_xchg(l, n) atomic64_xchg((&(l)->a), (n))
+#define local64_add_unless(l, _a, u) atomic64_add_unless((&(l)->a), (_a), (u))
+#define local64_inc_not_zero(l) atomic64_inc_not_zero(&(l)->a)
+
+/* Non-atomic variants, ie. preemption disabled and won't be touched
+ * in interrupt, etc. Some archs can optimize this case well. */
+#define __local64_inc(l) local64_set((l), local64_read(l) + 1)
+#define __local64_dec(l) local64_set((l), local64_read(l) - 1)
+#define __local64_add(i,l) local64_set((l), local64_read(l) + (i))
+#define __local64_sub(i,l) local64_set((l), local64_read(l) - (i))
+
+#endif /* BITS_PER_LONG != 64 */
+
+#endif /* _ASM_GENERIC_LOCAL64_H */
diff --git a/include/asm-generic/vmlinux.lds.h b/include/asm-generic/vmlinux.lds.h
index 4e7ae60..8a92a17 100644
--- a/include/asm-generic/vmlinux.lds.h
+++ b/include/asm-generic/vmlinux.lds.h
@@ -156,10 +156,6 @@
CPU_KEEP(exit.data) \
MEM_KEEP(init.data) \
MEM_KEEP(exit.data) \
- . = ALIGN(8); \
- VMLINUX_SYMBOL(__start___markers) = .; \
- *(__markers) \
- VMLINUX_SYMBOL(__stop___markers) = .; \
. = ALIGN(32); \
VMLINUX_SYMBOL(__start___tracepoints) = .; \
*(__tracepoints) \
diff --git a/include/linux/ftrace.h b/include/linux/ftrace.h
index 41e4633..dcd6a7c 100644
--- a/include/linux/ftrace.h
+++ b/include/linux/ftrace.h
@@ -1,3 +1,8 @@
+/*
+ * Ftrace header. For implementation details beyond the random comments
+ * scattered below, see: Documentation/trace/ftrace-design.txt
+ */
+
#ifndef _LINUX_FTRACE_H
#define _LINUX_FTRACE_H
diff --git a/include/linux/ftrace_event.h b/include/linux/ftrace_event.h
index 3167f2d..02b8b24 100644
--- a/include/linux/ftrace_event.h
+++ b/include/linux/ftrace_event.h
@@ -11,8 +11,6 @@
struct tracer;
struct dentry;
-DECLARE_PER_CPU(struct trace_seq, ftrace_event_seq);
-
struct trace_print_flags {
unsigned long mask;
const char *name;
@@ -58,6 +56,9 @@
struct ring_buffer_iter *buffer_iter[NR_CPUS];
unsigned long iter_flags;
+ /* trace_seq for __print_flags() and __print_symbolic() etc. */
+ struct trace_seq tmp_seq;
+
/* The below is zeroed out in pipe_read */
struct trace_seq seq;
struct trace_entry *ent;
@@ -146,14 +147,19 @@
int (*raw_init)(struct ftrace_event_call *);
};
+extern int ftrace_event_reg(struct ftrace_event_call *event,
+ enum trace_reg type);
+
enum {
TRACE_EVENT_FL_ENABLED_BIT,
TRACE_EVENT_FL_FILTERED_BIT,
+ TRACE_EVENT_FL_RECORDED_CMD_BIT,
};
enum {
- TRACE_EVENT_FL_ENABLED = (1 << TRACE_EVENT_FL_ENABLED_BIT),
- TRACE_EVENT_FL_FILTERED = (1 << TRACE_EVENT_FL_FILTERED_BIT),
+ TRACE_EVENT_FL_ENABLED = (1 << TRACE_EVENT_FL_ENABLED_BIT),
+ TRACE_EVENT_FL_FILTERED = (1 << TRACE_EVENT_FL_FILTERED_BIT),
+ TRACE_EVENT_FL_RECORDED_CMD = (1 << TRACE_EVENT_FL_RECORDED_CMD_BIT),
};
struct ftrace_event_call {
@@ -171,6 +177,7 @@
* 32 bit flags:
* bit 1: enabled
* bit 2: filter_active
+ * bit 3: enabled cmd record
*
* Changes to flags must hold the event_mutex.
*
@@ -257,8 +264,7 @@
perf_trace_buf_submit(void *raw_data, int size, int rctx, u64 addr,
u64 count, struct pt_regs *regs, void *head)
{
- perf_tp_event(addr, count, raw_data, size, regs, head);
- perf_swevent_put_recursion_context(rctx);
+ perf_tp_event(addr, count, raw_data, size, regs, head, rctx);
}
#endif
diff --git a/include/linux/kernel.h b/include/linux/kernel.h
index 5de838b..38e462e 100644
--- a/include/linux/kernel.h
+++ b/include/linux/kernel.h
@@ -513,9 +513,6 @@
extern void tracing_stop(void);
extern void ftrace_off_permanent(void);
-extern void
-ftrace_special(unsigned long arg1, unsigned long arg2, unsigned long arg3);
-
static inline void __attribute__ ((format (printf, 1, 2)))
____trace_printk_check_format(const char *fmt, ...)
{
@@ -591,8 +588,6 @@
extern void ftrace_dump(enum ftrace_dump_mode oops_dump_mode);
#else
-static inline void
-ftrace_special(unsigned long arg1, unsigned long arg2, unsigned long arg3) { }
static inline int
trace_printk(const char *fmt, ...) __attribute__ ((format (printf, 1, 2)));
diff --git a/include/linux/kmemtrace.h b/include/linux/kmemtrace.h
deleted file mode 100644
index b616d39..0000000
--- a/include/linux/kmemtrace.h
+++ /dev/null
@@ -1,25 +0,0 @@
-/*
- * Copyright (C) 2008 Eduard - Gabriel Munteanu
- *
- * This file is released under GPL version 2.
- */
-
-#ifndef _LINUX_KMEMTRACE_H
-#define _LINUX_KMEMTRACE_H
-
-#ifdef __KERNEL__
-
-#include <trace/events/kmem.h>
-
-#ifdef CONFIG_KMEMTRACE
-extern void kmemtrace_init(void);
-#else
-static inline void kmemtrace_init(void)
-{
-}
-#endif
-
-#endif /* __KERNEL__ */
-
-#endif /* _LINUX_KMEMTRACE_H */
-
diff --git a/include/linux/nmi.h b/include/linux/nmi.h
index b752e80..06aab5e 100644
--- a/include/linux/nmi.h
+++ b/include/linux/nmi.h
@@ -20,10 +20,14 @@
extern void acpi_nmi_disable(void);
extern void acpi_nmi_enable(void);
#else
+#ifndef CONFIG_HARDLOCKUP_DETECTOR
static inline void touch_nmi_watchdog(void)
{
touch_softlockup_watchdog();
}
+#else
+extern void touch_nmi_watchdog(void);
+#endif
static inline void acpi_nmi_disable(void) { }
static inline void acpi_nmi_enable(void) { }
#endif
@@ -47,4 +51,13 @@
}
#endif
+#ifdef CONFIG_LOCKUP_DETECTOR
+int hw_nmi_is_cpu_stuck(struct pt_regs *);
+u64 hw_nmi_get_sample_period(void);
+extern int watchdog_enabled;
+struct ctl_table;
+extern int proc_dowatchdog_enabled(struct ctl_table *, int ,
+ void __user *, size_t *, loff_t *);
+#endif
+
#endif
diff --git a/include/linux/perf_event.h b/include/linux/perf_event.h
index 5d0266d..937495c 100644
--- a/include/linux/perf_event.h
+++ b/include/linux/perf_event.h
@@ -214,8 +214,9 @@
* See also PERF_RECORD_MISC_EXACT_IP
*/
precise_ip : 2, /* skid constraint */
+ mmap_data : 1, /* non-exec mmap data */
- __reserved_1 : 47;
+ __reserved_1 : 46;
union {
__u32 wakeup_events; /* wakeup every n events */
@@ -461,6 +462,7 @@
#ifdef CONFIG_PERF_EVENTS
# include <asm/perf_event.h>
+# include <asm/local64.h>
#endif
struct perf_guest_info_callbacks {
@@ -531,14 +533,16 @@
struct hrtimer hrtimer;
};
#ifdef CONFIG_HAVE_HW_BREAKPOINT
- /* breakpoint */
- struct arch_hw_breakpoint info;
+ struct { /* breakpoint */
+ struct arch_hw_breakpoint info;
+ struct list_head bp_list;
+ };
#endif
};
- atomic64_t prev_count;
+ local64_t prev_count;
u64 sample_period;
u64 last_period;
- atomic64_t period_left;
+ local64_t period_left;
u64 interrupts;
u64 freq_time_stamp;
@@ -548,7 +552,10 @@
struct perf_event;
-#define PERF_EVENT_TXN_STARTED 1
+/*
+ * Common implementation detail of pmu::{start,commit,cancel}_txn
+ */
+#define PERF_EVENT_TXN 0x1
/**
* struct pmu - generic performance monitoring unit
@@ -562,14 +569,28 @@
void (*unthrottle) (struct perf_event *event);
/*
- * group events scheduling is treated as a transaction,
- * add group events as a whole and perform one schedulability test.
- * If test fails, roll back the whole group
+ * Group events scheduling is treated as a transaction, add group
+ * events as a whole and perform one schedulability test. If the test
+ * fails, roll back the whole group
*/
+ /*
+ * Start the transaction, after this ->enable() doesn't need
+ * to do schedulability tests.
+ */
void (*start_txn) (const struct pmu *pmu);
- void (*cancel_txn) (const struct pmu *pmu);
+ /*
+ * If ->start_txn() disabled the ->enable() schedulability test
+ * then ->commit_txn() is required to perform one. On success
+ * the transaction is closed. On error the transaction is kept
+ * open until ->cancel_txn() is called.
+ */
int (*commit_txn) (const struct pmu *pmu);
+ /*
+ * Will cancel the transaction, assumes ->disable() is called for
+ * each successfull ->enable() during the transaction.
+ */
+ void (*cancel_txn) (const struct pmu *pmu);
};
/**
@@ -584,7 +605,9 @@
struct file;
-struct perf_mmap_data {
+#define PERF_BUFFER_WRITABLE 0x01
+
+struct perf_buffer {
atomic_t refcount;
struct rcu_head rcu_head;
#ifdef CONFIG_PERF_USE_VMALLOC
@@ -650,7 +673,8 @@
enum perf_event_active_state state;
unsigned int attach_state;
- atomic64_t count;
+ local64_t count;
+ atomic64_t child_count;
/*
* These are the total time in nanoseconds that the event
@@ -709,7 +733,7 @@
atomic_t mmap_count;
int mmap_locked;
struct user_struct *mmap_user;
- struct perf_mmap_data *data;
+ struct perf_buffer *buffer;
/* poll related */
wait_queue_head_t waitq;
@@ -807,7 +831,7 @@
struct perf_output_handle {
struct perf_event *event;
- struct perf_mmap_data *data;
+ struct perf_buffer *buffer;
unsigned long wakeup;
unsigned long size;
void *addr;
@@ -910,8 +934,10 @@
extern void __perf_sw_event(u32, u64, int, struct pt_regs *, u64);
-extern void
-perf_arch_fetch_caller_regs(struct pt_regs *regs, unsigned long ip, int skip);
+#ifndef perf_arch_fetch_caller_regs
+static inline void
+perf_arch_fetch_caller_regs(struct pt_regs *regs, unsigned long ip) { }
+#endif
/*
* Take a snapshot of the regs. Skip ip and frame pointer to
@@ -921,31 +947,11 @@
* - bp for callchains
* - eflags, for future purposes, just in case
*/
-static inline void perf_fetch_caller_regs(struct pt_regs *regs, int skip)
+static inline void perf_fetch_caller_regs(struct pt_regs *regs)
{
- unsigned long ip;
-
memset(regs, 0, sizeof(*regs));
- switch (skip) {
- case 1 :
- ip = CALLER_ADDR0;
- break;
- case 2 :
- ip = CALLER_ADDR1;
- break;
- case 3 :
- ip = CALLER_ADDR2;
- break;
- case 4:
- ip = CALLER_ADDR3;
- break;
- /* No need to support further for now */
- default:
- ip = 0;
- }
-
- return perf_arch_fetch_caller_regs(regs, ip, skip);
+ perf_arch_fetch_caller_regs(regs, CALLER_ADDR0);
}
static inline void
@@ -955,21 +961,14 @@
struct pt_regs hot_regs;
if (!regs) {
- perf_fetch_caller_regs(&hot_regs, 1);
+ perf_fetch_caller_regs(&hot_regs);
regs = &hot_regs;
}
__perf_sw_event(event_id, nr, nmi, regs, addr);
}
}
-extern void __perf_event_mmap(struct vm_area_struct *vma);
-
-static inline void perf_event_mmap(struct vm_area_struct *vma)
-{
- if (vma->vm_flags & VM_EXEC)
- __perf_event_mmap(vma);
-}
-
+extern void perf_event_mmap(struct vm_area_struct *vma);
extern struct perf_guest_info_callbacks *perf_guest_cbs;
extern int perf_register_guest_info_callbacks(struct perf_guest_info_callbacks *callbacks);
extern int perf_unregister_guest_info_callbacks(struct perf_guest_info_callbacks *callbacks);
@@ -1001,7 +1000,7 @@
extern void perf_event_init(void);
extern void perf_tp_event(u64 addr, u64 count, void *record,
int entry_size, struct pt_regs *regs,
- struct hlist_head *head);
+ struct hlist_head *head, int rctx);
extern void perf_bp_event(struct perf_event *event, void *data);
#ifndef perf_misc_flags
diff --git a/include/linux/sched.h b/include/linux/sched.h
index 0478888..3992f50 100644
--- a/include/linux/sched.h
+++ b/include/linux/sched.h
@@ -316,20 +316,16 @@
extern void sched_show_task(struct task_struct *p);
-#ifdef CONFIG_DETECT_SOFTLOCKUP
-extern void softlockup_tick(void);
+#ifdef CONFIG_LOCKUP_DETECTOR
extern void touch_softlockup_watchdog(void);
extern void touch_softlockup_watchdog_sync(void);
extern void touch_all_softlockup_watchdogs(void);
-extern int proc_dosoftlockup_thresh(struct ctl_table *table, int write,
- void __user *buffer,
- size_t *lenp, loff_t *ppos);
+extern int proc_dowatchdog_thresh(struct ctl_table *table, int write,
+ void __user *buffer,
+ size_t *lenp, loff_t *ppos);
extern unsigned int softlockup_panic;
extern int softlockup_thresh;
#else
-static inline void softlockup_tick(void)
-{
-}
static inline void touch_softlockup_watchdog(void)
{
}
@@ -2435,18 +2431,6 @@
#endif /* CONFIG_SMP */
-#ifdef CONFIG_TRACING
-extern void
-__trace_special(void *__tr, void *__data,
- unsigned long arg1, unsigned long arg2, unsigned long arg3);
-#else
-static inline void
-__trace_special(void *__tr, void *__data,
- unsigned long arg1, unsigned long arg2, unsigned long arg3)
-{
-}
-#endif
-
extern long sched_setaffinity(pid_t pid, const struct cpumask *new_mask);
extern long sched_getaffinity(pid_t pid, struct cpumask *mask);
diff --git a/include/linux/slab_def.h b/include/linux/slab_def.h
index 1812dac..1acfa73 100644
--- a/include/linux/slab_def.h
+++ b/include/linux/slab_def.h
@@ -14,7 +14,8 @@
#include <asm/page.h> /* kmalloc_sizes.h needs PAGE_SIZE */
#include <asm/cache.h> /* kmalloc_sizes.h needs L1_CACHE_BYTES */
#include <linux/compiler.h>
-#include <linux/kmemtrace.h>
+
+#include <trace/events/kmem.h>
#ifndef ARCH_KMALLOC_MINALIGN
/*
diff --git a/include/linux/slub_def.h b/include/linux/slub_def.h
index 4ba59cf..6447a72 100644
--- a/include/linux/slub_def.h
+++ b/include/linux/slub_def.h
@@ -10,9 +10,10 @@
#include <linux/gfp.h>
#include <linux/workqueue.h>
#include <linux/kobject.h>
-#include <linux/kmemtrace.h>
#include <linux/kmemleak.h>
+#include <trace/events/kmem.h>
+
enum stat_item {
ALLOC_FASTPATH, /* Allocation from cpu slab */
ALLOC_SLOWPATH, /* Allocation by getting a new cpu slab */
diff --git a/include/linux/syscalls.h b/include/linux/syscalls.h
index 13ebb54..a6bfd13 100644
--- a/include/linux/syscalls.h
+++ b/include/linux/syscalls.h
@@ -167,7 +167,6 @@
.enter_event = &event_enter_##sname, \
.exit_event = &event_exit_##sname, \
.enter_fields = LIST_HEAD_INIT(__syscall_meta_##sname.enter_fields), \
- .exit_fields = LIST_HEAD_INIT(__syscall_meta_##sname.exit_fields), \
};
#define SYSCALL_DEFINE0(sname) \
@@ -182,7 +181,6 @@
.enter_event = &event_enter__##sname, \
.exit_event = &event_exit__##sname, \
.enter_fields = LIST_HEAD_INIT(__syscall_meta__##sname.enter_fields), \
- .exit_fields = LIST_HEAD_INIT(__syscall_meta__##sname.exit_fields), \
}; \
asmlinkage long sys_##sname(void)
#else
diff --git a/include/trace/boot.h b/include/trace/boot.h
deleted file mode 100644
index 088ea08..0000000
--- a/include/trace/boot.h
+++ /dev/null
@@ -1,60 +0,0 @@
-#ifndef _LINUX_TRACE_BOOT_H
-#define _LINUX_TRACE_BOOT_H
-
-#include <linux/module.h>
-#include <linux/kallsyms.h>
-#include <linux/init.h>
-
-/*
- * Structure which defines the trace of an initcall
- * while it is called.
- * You don't have to fill the func field since it is
- * only used internally by the tracer.
- */
-struct boot_trace_call {
- pid_t caller;
- char func[KSYM_SYMBOL_LEN];
-};
-
-/*
- * Structure which defines the trace of an initcall
- * while it returns.
- */
-struct boot_trace_ret {
- char func[KSYM_SYMBOL_LEN];
- int result;
- unsigned long long duration; /* nsecs */
-};
-
-#ifdef CONFIG_BOOT_TRACER
-/* Append the traces on the ring-buffer */
-extern void trace_boot_call(struct boot_trace_call *bt, initcall_t fn);
-extern void trace_boot_ret(struct boot_trace_ret *bt, initcall_t fn);
-
-/* Tells the tracer that smp_pre_initcall is finished.
- * So we can start the tracing
- */
-extern void start_boot_trace(void);
-
-/* Resume the tracing of other necessary events
- * such as sched switches
- */
-extern void enable_boot_trace(void);
-
-/* Suspend this tracing. Actually, only sched_switches tracing have
- * to be suspended. Initcalls doesn't need it.)
- */
-extern void disable_boot_trace(void);
-#else
-static inline
-void trace_boot_call(struct boot_trace_call *bt, initcall_t fn) { }
-
-static inline
-void trace_boot_ret(struct boot_trace_ret *bt, initcall_t fn) { }
-
-static inline void start_boot_trace(void) { }
-static inline void enable_boot_trace(void) { }
-static inline void disable_boot_trace(void) { }
-#endif /* CONFIG_BOOT_TRACER */
-
-#endif /* __LINUX_TRACE_BOOT_H */
diff --git a/include/trace/events/sched.h b/include/trace/events/sched.h
index b9e1dd6..9208c92 100644
--- a/include/trace/events/sched.h
+++ b/include/trace/events/sched.h
@@ -50,31 +50,6 @@
);
/*
- * Tracepoint for waiting on task to unschedule:
- */
-TRACE_EVENT(sched_wait_task,
-
- TP_PROTO(struct task_struct *p),
-
- TP_ARGS(p),
-
- TP_STRUCT__entry(
- __array( char, comm, TASK_COMM_LEN )
- __field( pid_t, pid )
- __field( int, prio )
- ),
-
- TP_fast_assign(
- memcpy(__entry->comm, p->comm, TASK_COMM_LEN);
- __entry->pid = p->pid;
- __entry->prio = p->prio;
- ),
-
- TP_printk("comm=%s pid=%d prio=%d",
- __entry->comm, __entry->pid, __entry->prio)
-);
-
-/*
* Tracepoint for waking up a task:
*/
DECLARE_EVENT_CLASS(sched_wakeup_template,
@@ -240,6 +215,13 @@
TP_ARGS(p));
/*
+ * Tracepoint for waiting on task to unschedule:
+ */
+DEFINE_EVENT(sched_process_template, sched_wait_task,
+ TP_PROTO(struct task_struct *p),
+ TP_ARGS(p));
+
+/*
* Tracepoint for a waiting task:
*/
TRACE_EVENT(sched_process_wait,
diff --git a/include/trace/events/timer.h b/include/trace/events/timer.h
index 9496b96..c624126 100644
--- a/include/trace/events/timer.h
+++ b/include/trace/events/timer.h
@@ -8,11 +8,7 @@
#include <linux/hrtimer.h>
#include <linux/timer.h>
-/**
- * timer_init - called when the timer is initialized
- * @timer: pointer to struct timer_list
- */
-TRACE_EVENT(timer_init,
+DECLARE_EVENT_CLASS(timer_class,
TP_PROTO(struct timer_list *timer),
@@ -30,6 +26,17 @@
);
/**
+ * timer_init - called when the timer is initialized
+ * @timer: pointer to struct timer_list
+ */
+DEFINE_EVENT(timer_class, timer_init,
+
+ TP_PROTO(struct timer_list *timer),
+
+ TP_ARGS(timer)
+);
+
+/**
* timer_start - called when the timer is started
* @timer: pointer to struct timer_list
* @expires: the timers expiry time
@@ -94,42 +101,22 @@
* NOTE: Do NOT derefernce timer in TP_fast_assign. The pointer might
* be invalid. We solely track the pointer.
*/
-TRACE_EVENT(timer_expire_exit,
+DEFINE_EVENT(timer_class, timer_expire_exit,
TP_PROTO(struct timer_list *timer),
- TP_ARGS(timer),
-
- TP_STRUCT__entry(
- __field(void *, timer )
- ),
-
- TP_fast_assign(
- __entry->timer = timer;
- ),
-
- TP_printk("timer=%p", __entry->timer)
+ TP_ARGS(timer)
);
/**
* timer_cancel - called when the timer is canceled
* @timer: pointer to struct timer_list
*/
-TRACE_EVENT(timer_cancel,
+DEFINE_EVENT(timer_class, timer_cancel,
TP_PROTO(struct timer_list *timer),
- TP_ARGS(timer),
-
- TP_STRUCT__entry(
- __field( void *, timer )
- ),
-
- TP_fast_assign(
- __entry->timer = timer;
- ),
-
- TP_printk("timer=%p", __entry->timer)
+ TP_ARGS(timer)
);
/**
@@ -224,14 +211,7 @@
(unsigned long long)ktime_to_ns((ktime_t) { .tv64 = __entry->now }))
);
-/**
- * hrtimer_expire_exit - called immediately after the hrtimer callback returns
- * @timer: pointer to struct hrtimer
- *
- * When used in combination with the hrtimer_expire_entry tracepoint we can
- * determine the runtime of the callback function.
- */
-TRACE_EVENT(hrtimer_expire_exit,
+DECLARE_EVENT_CLASS(hrtimer_class,
TP_PROTO(struct hrtimer *hrtimer),
@@ -249,24 +229,28 @@
);
/**
- * hrtimer_cancel - called when the hrtimer is canceled
- * @hrtimer: pointer to struct hrtimer
+ * hrtimer_expire_exit - called immediately after the hrtimer callback returns
+ * @timer: pointer to struct hrtimer
+ *
+ * When used in combination with the hrtimer_expire_entry tracepoint we can
+ * determine the runtime of the callback function.
*/
-TRACE_EVENT(hrtimer_cancel,
+DEFINE_EVENT(hrtimer_class, hrtimer_expire_exit,
TP_PROTO(struct hrtimer *hrtimer),
- TP_ARGS(hrtimer),
+ TP_ARGS(hrtimer)
+);
- TP_STRUCT__entry(
- __field( void *, hrtimer )
- ),
+/**
+ * hrtimer_cancel - called when the hrtimer is canceled
+ * @hrtimer: pointer to struct hrtimer
+ */
+DEFINE_EVENT(hrtimer_class, hrtimer_cancel,
- TP_fast_assign(
- __entry->hrtimer = hrtimer;
- ),
+ TP_PROTO(struct hrtimer *hrtimer),
- TP_printk("hrtimer=%p", __entry->hrtimer)
+ TP_ARGS(hrtimer)
);
/**
diff --git a/include/trace/ftrace.h b/include/trace/ftrace.h
index 5a64905..a9377c0 100644
--- a/include/trace/ftrace.h
+++ b/include/trace/ftrace.h
@@ -75,15 +75,12 @@
#define DEFINE_EVENT_PRINT(template, name, proto, args, print) \
DEFINE_EVENT(template, name, PARAMS(proto), PARAMS(args))
-#undef __cpparg
-#define __cpparg(arg...) arg
-
/* Callbacks are meaningless to ftrace. */
#undef TRACE_EVENT_FN
#define TRACE_EVENT_FN(name, proto, args, tstruct, \
assign, print, reg, unreg) \
- TRACE_EVENT(name, __cpparg(proto), __cpparg(args), \
- __cpparg(tstruct), __cpparg(assign), __cpparg(print)) \
+ TRACE_EVENT(name, PARAMS(proto), PARAMS(args), \
+ PARAMS(tstruct), PARAMS(assign), PARAMS(print)) \
#include TRACE_INCLUDE(TRACE_INCLUDE_FILE)
@@ -145,7 +142,7 @@
* struct trace_seq *s = &iter->seq;
* struct ftrace_raw_<call> *field; <-- defined in stage 1
* struct trace_entry *entry;
- * struct trace_seq *p;
+ * struct trace_seq *p = &iter->tmp_seq;
* int ret;
*
* entry = iter->ent;
@@ -157,12 +154,10 @@
*
* field = (typeof(field))entry;
*
- * p = &get_cpu_var(ftrace_event_seq);
* trace_seq_init(p);
* ret = trace_seq_printf(s, "%s: ", <call>);
* if (ret)
* ret = trace_seq_printf(s, <TP_printk> "\n");
- * put_cpu();
* if (!ret)
* return TRACE_TYPE_PARTIAL_LINE;
*
@@ -216,7 +211,7 @@
struct trace_seq *s = &iter->seq; \
struct ftrace_raw_##call *field; \
struct trace_entry *entry; \
- struct trace_seq *p; \
+ struct trace_seq *p = &iter->tmp_seq; \
int ret; \
\
event = container_of(trace_event, struct ftrace_event_call, \
@@ -231,12 +226,10 @@
\
field = (typeof(field))entry; \
\
- p = &get_cpu_var(ftrace_event_seq); \
trace_seq_init(p); \
ret = trace_seq_printf(s, "%s: ", event->name); \
if (ret) \
ret = trace_seq_printf(s, print); \
- put_cpu(); \
if (!ret) \
return TRACE_TYPE_PARTIAL_LINE; \
\
@@ -255,7 +248,7 @@
struct trace_seq *s = &iter->seq; \
struct ftrace_raw_##template *field; \
struct trace_entry *entry; \
- struct trace_seq *p; \
+ struct trace_seq *p = &iter->tmp_seq; \
int ret; \
\
entry = iter->ent; \
@@ -267,12 +260,10 @@
\
field = (typeof(field))entry; \
\
- p = &get_cpu_var(ftrace_event_seq); \
trace_seq_init(p); \
ret = trace_seq_printf(s, "%s: ", #call); \
if (ret) \
ret = trace_seq_printf(s, print); \
- put_cpu(); \
if (!ret) \
return TRACE_TYPE_PARTIAL_LINE; \
\
@@ -439,6 +430,7 @@
* .fields = LIST_HEAD_INIT(event_class_##call.fields),
* .raw_init = trace_event_raw_init,
* .probe = ftrace_raw_event_##call,
+ * .reg = ftrace_event_reg,
* };
*
* static struct ftrace_event_call __used
@@ -567,6 +559,7 @@
.fields = LIST_HEAD_INIT(event_class_##call.fields),\
.raw_init = trace_event_raw_init, \
.probe = ftrace_raw_event_##call, \
+ .reg = ftrace_event_reg, \
_TRACE_PERF_INIT(call) \
};
@@ -705,7 +698,7 @@
int __data_size; \
int rctx; \
\
- perf_fetch_caller_regs(&__regs, 1); \
+ perf_fetch_caller_regs(&__regs); \
\
__data_size = ftrace_get_offsets_##call(&__data_offsets, args); \
__entry_size = ALIGN(__data_size + sizeof(*entry) + sizeof(u32),\
diff --git a/include/trace/syscall.h b/include/trace/syscall.h
index 257e089..31966a4 100644
--- a/include/trace/syscall.h
+++ b/include/trace/syscall.h
@@ -26,7 +26,6 @@
const char **types;
const char **args;
struct list_head enter_fields;
- struct list_head exit_fields;
struct ftrace_event_call *enter_event;
struct ftrace_event_call *exit_event;
diff --git a/init/main.c b/init/main.c
index 4ddb53f..b03a4c1 100644
--- a/init/main.c
+++ b/init/main.c
@@ -66,11 +66,9 @@
#include <linux/ftrace.h>
#include <linux/async.h>
#include <linux/kmemcheck.h>
-#include <linux/kmemtrace.h>
#include <linux/sfi.h>
#include <linux/shmem_fs.h>
#include <linux/slab.h>
-#include <trace/boot.h>
#include <asm/io.h>
#include <asm/bugs.h>
@@ -664,7 +662,6 @@
#endif
page_cgroup_init();
enable_debug_pagealloc();
- kmemtrace_init();
kmemleak_init();
debug_objects_mem_init();
idr_init_cache();
@@ -726,38 +723,33 @@
core_param(initcall_debug, initcall_debug, bool, 0644);
static char msgbuf[64];
-static struct boot_trace_call call;
-static struct boot_trace_ret ret;
int do_one_initcall(initcall_t fn)
{
int count = preempt_count();
ktime_t calltime, delta, rettime;
+ unsigned long long duration;
+ int ret;
if (initcall_debug) {
- call.caller = task_pid_nr(current);
- printk("calling %pF @ %i\n", fn, call.caller);
+ printk("calling %pF @ %i\n", fn, task_pid_nr(current));
calltime = ktime_get();
- trace_boot_call(&call, fn);
- enable_boot_trace();
}
- ret.result = fn();
+ ret = fn();
if (initcall_debug) {
- disable_boot_trace();
rettime = ktime_get();
delta = ktime_sub(rettime, calltime);
- ret.duration = (unsigned long long) ktime_to_ns(delta) >> 10;
- trace_boot_ret(&ret, fn);
- printk("initcall %pF returned %d after %Ld usecs\n", fn,
- ret.result, ret.duration);
+ duration = (unsigned long long) ktime_to_ns(delta) >> 10;
+ printk("initcall %pF returned %d after %lld usecs\n", fn,
+ ret, duration);
}
msgbuf[0] = 0;
- if (ret.result && ret.result != -ENODEV && initcall_debug)
- sprintf(msgbuf, "error code %d ", ret.result);
+ if (ret && ret != -ENODEV && initcall_debug)
+ sprintf(msgbuf, "error code %d ", ret);
if (preempt_count() != count) {
strlcat(msgbuf, "preemption imbalance ", sizeof(msgbuf));
@@ -771,7 +763,7 @@
printk("initcall %pF returned with %s\n", fn, msgbuf);
}
- return ret.result;
+ return ret;
}
@@ -895,7 +887,6 @@
smp_prepare_cpus(setup_max_cpus);
do_pre_smp_initcalls();
- start_boot_trace();
smp_init();
sched_init_smp();
diff --git a/kernel/Makefile b/kernel/Makefile
index 057472f..ce53fb2 100644
--- a/kernel/Makefile
+++ b/kernel/Makefile
@@ -76,8 +76,8 @@
obj-$(CONFIG_AUDIT_TREE) += audit_tree.o
obj-$(CONFIG_KPROBES) += kprobes.o
obj-$(CONFIG_KGDB) += debug/
-obj-$(CONFIG_DETECT_SOFTLOCKUP) += softlockup.o
obj-$(CONFIG_DETECT_HUNG_TASK) += hung_task.o
+obj-$(CONFIG_LOCKUP_DETECTOR) += watchdog.o
obj-$(CONFIG_GENERIC_HARDIRQS) += irq/
obj-$(CONFIG_SECCOMP) += seccomp.o
obj-$(CONFIG_RCU_TORTURE_TEST) += rcutorture.o
diff --git a/kernel/hw_breakpoint.c b/kernel/hw_breakpoint.c
index 71ed3ce..d71a987 100644
--- a/kernel/hw_breakpoint.c
+++ b/kernel/hw_breakpoint.c
@@ -41,6 +41,7 @@
#include <linux/sched.h>
#include <linux/init.h>
#include <linux/slab.h>
+#include <linux/list.h>
#include <linux/cpu.h>
#include <linux/smp.h>
@@ -62,6 +63,9 @@
static int nr_slots[TYPE_MAX];
+/* Keep track of the breakpoints attached to tasks */
+static LIST_HEAD(bp_task_head);
+
static int constraints_initialized;
/* Gather the number of total pinned and un-pinned bp in a cpuset */
@@ -103,33 +107,21 @@
return 0;
}
-static int task_bp_pinned(struct task_struct *tsk, enum bp_type_idx type)
+/*
+ * Count the number of breakpoints of the same type and same task.
+ * The given event must be not on the list.
+ */
+static int task_bp_pinned(struct perf_event *bp, enum bp_type_idx type)
{
- struct perf_event_context *ctx = tsk->perf_event_ctxp;
- struct list_head *list;
- struct perf_event *bp;
- unsigned long flags;
+ struct perf_event_context *ctx = bp->ctx;
+ struct perf_event *iter;
int count = 0;
- if (WARN_ONCE(!ctx, "No perf context for this task"))
- return 0;
-
- list = &ctx->event_list;
-
- raw_spin_lock_irqsave(&ctx->lock, flags);
-
- /*
- * The current breakpoint counter is not included in the list
- * at the open() callback time
- */
- list_for_each_entry(bp, list, event_entry) {
- if (bp->attr.type == PERF_TYPE_BREAKPOINT)
- if (find_slot_idx(bp) == type)
- count += hw_breakpoint_weight(bp);
+ list_for_each_entry(iter, &bp_task_head, hw.bp_list) {
+ if (iter->ctx == ctx && find_slot_idx(iter) == type)
+ count += hw_breakpoint_weight(iter);
}
- raw_spin_unlock_irqrestore(&ctx->lock, flags);
-
return count;
}
@@ -149,7 +141,7 @@
if (!tsk)
slots->pinned += max_task_bp_pinned(cpu, type);
else
- slots->pinned += task_bp_pinned(tsk, type);
+ slots->pinned += task_bp_pinned(bp, type);
slots->flexible = per_cpu(nr_bp_flexible[type], cpu);
return;
@@ -162,7 +154,7 @@
if (!tsk)
nr += max_task_bp_pinned(cpu, type);
else
- nr += task_bp_pinned(tsk, type);
+ nr += task_bp_pinned(bp, type);
if (nr > slots->pinned)
slots->pinned = nr;
@@ -188,7 +180,7 @@
/*
* Add a pinned breakpoint for the given task in our constraint table
*/
-static void toggle_bp_task_slot(struct task_struct *tsk, int cpu, bool enable,
+static void toggle_bp_task_slot(struct perf_event *bp, int cpu, bool enable,
enum bp_type_idx type, int weight)
{
unsigned int *tsk_pinned;
@@ -196,10 +188,11 @@
int old_idx = 0;
int idx = 0;
- old_count = task_bp_pinned(tsk, type);
+ old_count = task_bp_pinned(bp, type);
old_idx = old_count - 1;
idx = old_idx + weight;
+ /* tsk_pinned[n] is the number of tasks having n breakpoints */
tsk_pinned = per_cpu(nr_task_bp_pinned[type], cpu);
if (enable) {
tsk_pinned[idx]++;
@@ -222,23 +215,30 @@
int cpu = bp->cpu;
struct task_struct *tsk = bp->ctx->task;
- /* Pinned counter task profiling */
- if (tsk) {
- if (cpu >= 0) {
- toggle_bp_task_slot(tsk, cpu, enable, type, weight);
- return;
- }
+ /* Pinned counter cpu profiling */
+ if (!tsk) {
- for_each_online_cpu(cpu)
- toggle_bp_task_slot(tsk, cpu, enable, type, weight);
+ if (enable)
+ per_cpu(nr_cpu_bp_pinned[type], bp->cpu) += weight;
+ else
+ per_cpu(nr_cpu_bp_pinned[type], bp->cpu) -= weight;
return;
}
- /* Pinned counter cpu profiling */
+ /* Pinned counter task profiling */
+
+ if (!enable)
+ list_del(&bp->hw.bp_list);
+
+ if (cpu >= 0) {
+ toggle_bp_task_slot(bp, cpu, enable, type, weight);
+ } else {
+ for_each_online_cpu(cpu)
+ toggle_bp_task_slot(bp, cpu, enable, type, weight);
+ }
+
if (enable)
- per_cpu(nr_cpu_bp_pinned[type], bp->cpu) += weight;
- else
- per_cpu(nr_cpu_bp_pinned[type], bp->cpu) -= weight;
+ list_add_tail(&bp->hw.bp_list, &bp_task_head);
}
/*
@@ -312,6 +312,10 @@
weight = hw_breakpoint_weight(bp);
fetch_bp_busy_slots(&slots, bp, type);
+ /*
+ * Simulate the addition of this breakpoint to the constraints
+ * and see the result.
+ */
fetch_this_slot(&slots, weight);
/* Flexible counters need to keep at least one slot */
diff --git a/kernel/perf_event.c b/kernel/perf_event.c
index ff86c55..c772a3d4 100644
--- a/kernel/perf_event.c
+++ b/kernel/perf_event.c
@@ -675,7 +675,6 @@
struct perf_event *event, *partial_group = NULL;
const struct pmu *pmu = group_event->pmu;
bool txn = false;
- int ret;
if (group_event->state == PERF_EVENT_STATE_OFF)
return 0;
@@ -703,15 +702,9 @@
}
}
- if (!txn)
+ if (!txn || !pmu->commit_txn(pmu))
return 0;
- ret = pmu->commit_txn(pmu);
- if (!ret) {
- pmu->cancel_txn(pmu);
- return 0;
- }
-
group_error:
/*
* Groups can be scheduled in as one unit only, so undo any
@@ -1155,9 +1148,9 @@
* In order to keep per-task stats reliable we need to flip the event
* values when we flip the contexts.
*/
- value = atomic64_read(&next_event->count);
- value = atomic64_xchg(&event->count, value);
- atomic64_set(&next_event->count, value);
+ value = local64_read(&next_event->count);
+ value = local64_xchg(&event->count, value);
+ local64_set(&next_event->count, value);
swap(event->total_time_enabled, next_event->total_time_enabled);
swap(event->total_time_running, next_event->total_time_running);
@@ -1547,10 +1540,10 @@
hwc->sample_period = sample_period;
- if (atomic64_read(&hwc->period_left) > 8*sample_period) {
+ if (local64_read(&hwc->period_left) > 8*sample_period) {
perf_disable();
perf_event_stop(event);
- atomic64_set(&hwc->period_left, 0);
+ local64_set(&hwc->period_left, 0);
perf_event_start(event);
perf_enable();
}
@@ -1591,7 +1584,7 @@
perf_disable();
event->pmu->read(event);
- now = atomic64_read(&event->count);
+ now = local64_read(&event->count);
delta = now - hwc->freq_count_stamp;
hwc->freq_count_stamp = now;
@@ -1743,6 +1736,11 @@
event->pmu->read(event);
}
+static inline u64 perf_event_count(struct perf_event *event)
+{
+ return local64_read(&event->count) + atomic64_read(&event->child_count);
+}
+
static u64 perf_event_read(struct perf_event *event)
{
/*
@@ -1762,7 +1760,7 @@
raw_spin_unlock_irqrestore(&ctx->lock, flags);
}
- return atomic64_read(&event->count);
+ return perf_event_count(event);
}
/*
@@ -1883,7 +1881,7 @@
}
static void perf_pending_sync(struct perf_event *event);
-static void perf_mmap_data_put(struct perf_mmap_data *data);
+static void perf_buffer_put(struct perf_buffer *buffer);
static void free_event(struct perf_event *event)
{
@@ -1891,7 +1889,7 @@
if (!event->parent) {
atomic_dec(&nr_events);
- if (event->attr.mmap)
+ if (event->attr.mmap || event->attr.mmap_data)
atomic_dec(&nr_mmap_events);
if (event->attr.comm)
atomic_dec(&nr_comm_events);
@@ -1899,9 +1897,9 @@
atomic_dec(&nr_task_events);
}
- if (event->data) {
- perf_mmap_data_put(event->data);
- event->data = NULL;
+ if (event->buffer) {
+ perf_buffer_put(event->buffer);
+ event->buffer = NULL;
}
if (event->destroy)
@@ -2126,13 +2124,13 @@
static unsigned int perf_poll(struct file *file, poll_table *wait)
{
struct perf_event *event = file->private_data;
- struct perf_mmap_data *data;
+ struct perf_buffer *buffer;
unsigned int events = POLL_HUP;
rcu_read_lock();
- data = rcu_dereference(event->data);
- if (data)
- events = atomic_xchg(&data->poll, 0);
+ buffer = rcu_dereference(event->buffer);
+ if (buffer)
+ events = atomic_xchg(&buffer->poll, 0);
rcu_read_unlock();
poll_wait(file, &event->waitq, wait);
@@ -2143,7 +2141,7 @@
static void perf_event_reset(struct perf_event *event)
{
(void)perf_event_read(event);
- atomic64_set(&event->count, 0);
+ local64_set(&event->count, 0);
perf_event_update_userpage(event);
}
@@ -2342,14 +2340,14 @@
void perf_event_update_userpage(struct perf_event *event)
{
struct perf_event_mmap_page *userpg;
- struct perf_mmap_data *data;
+ struct perf_buffer *buffer;
rcu_read_lock();
- data = rcu_dereference(event->data);
- if (!data)
+ buffer = rcu_dereference(event->buffer);
+ if (!buffer)
goto unlock;
- userpg = data->user_page;
+ userpg = buffer->user_page;
/*
* Disable preemption so as to not let the corresponding user-space
@@ -2359,9 +2357,9 @@
++userpg->lock;
barrier();
userpg->index = perf_event_index(event);
- userpg->offset = atomic64_read(&event->count);
+ userpg->offset = perf_event_count(event);
if (event->state == PERF_EVENT_STATE_ACTIVE)
- userpg->offset -= atomic64_read(&event->hw.prev_count);
+ userpg->offset -= local64_read(&event->hw.prev_count);
userpg->time_enabled = event->total_time_enabled +
atomic64_read(&event->child_total_time_enabled);
@@ -2376,6 +2374,25 @@
rcu_read_unlock();
}
+static unsigned long perf_data_size(struct perf_buffer *buffer);
+
+static void
+perf_buffer_init(struct perf_buffer *buffer, long watermark, int flags)
+{
+ long max_size = perf_data_size(buffer);
+
+ if (watermark)
+ buffer->watermark = min(max_size, watermark);
+
+ if (!buffer->watermark)
+ buffer->watermark = max_size / 2;
+
+ if (flags & PERF_BUFFER_WRITABLE)
+ buffer->writable = 1;
+
+ atomic_set(&buffer->refcount, 1);
+}
+
#ifndef CONFIG_PERF_USE_VMALLOC
/*
@@ -2383,15 +2400,15 @@
*/
static struct page *
-perf_mmap_to_page(struct perf_mmap_data *data, unsigned long pgoff)
+perf_mmap_to_page(struct perf_buffer *buffer, unsigned long pgoff)
{
- if (pgoff > data->nr_pages)
+ if (pgoff > buffer->nr_pages)
return NULL;
if (pgoff == 0)
- return virt_to_page(data->user_page);
+ return virt_to_page(buffer->user_page);
- return virt_to_page(data->data_pages[pgoff - 1]);
+ return virt_to_page(buffer->data_pages[pgoff - 1]);
}
static void *perf_mmap_alloc_page(int cpu)
@@ -2407,42 +2424,44 @@
return page_address(page);
}
-static struct perf_mmap_data *
-perf_mmap_data_alloc(struct perf_event *event, int nr_pages)
+static struct perf_buffer *
+perf_buffer_alloc(int nr_pages, long watermark, int cpu, int flags)
{
- struct perf_mmap_data *data;
+ struct perf_buffer *buffer;
unsigned long size;
int i;
- size = sizeof(struct perf_mmap_data);
+ size = sizeof(struct perf_buffer);
size += nr_pages * sizeof(void *);
- data = kzalloc(size, GFP_KERNEL);
- if (!data)
+ buffer = kzalloc(size, GFP_KERNEL);
+ if (!buffer)
goto fail;
- data->user_page = perf_mmap_alloc_page(event->cpu);
- if (!data->user_page)
+ buffer->user_page = perf_mmap_alloc_page(cpu);
+ if (!buffer->user_page)
goto fail_user_page;
for (i = 0; i < nr_pages; i++) {
- data->data_pages[i] = perf_mmap_alloc_page(event->cpu);
- if (!data->data_pages[i])
+ buffer->data_pages[i] = perf_mmap_alloc_page(cpu);
+ if (!buffer->data_pages[i])
goto fail_data_pages;
}
- data->nr_pages = nr_pages;
+ buffer->nr_pages = nr_pages;
- return data;
+ perf_buffer_init(buffer, watermark, flags);
+
+ return buffer;
fail_data_pages:
for (i--; i >= 0; i--)
- free_page((unsigned long)data->data_pages[i]);
+ free_page((unsigned long)buffer->data_pages[i]);
- free_page((unsigned long)data->user_page);
+ free_page((unsigned long)buffer->user_page);
fail_user_page:
- kfree(data);
+ kfree(buffer);
fail:
return NULL;
@@ -2456,17 +2475,17 @@
__free_page(page);
}
-static void perf_mmap_data_free(struct perf_mmap_data *data)
+static void perf_buffer_free(struct perf_buffer *buffer)
{
int i;
- perf_mmap_free_page((unsigned long)data->user_page);
- for (i = 0; i < data->nr_pages; i++)
- perf_mmap_free_page((unsigned long)data->data_pages[i]);
- kfree(data);
+ perf_mmap_free_page((unsigned long)buffer->user_page);
+ for (i = 0; i < buffer->nr_pages; i++)
+ perf_mmap_free_page((unsigned long)buffer->data_pages[i]);
+ kfree(buffer);
}
-static inline int page_order(struct perf_mmap_data *data)
+static inline int page_order(struct perf_buffer *buffer)
{
return 0;
}
@@ -2479,18 +2498,18 @@
* Required for architectures that have d-cache aliasing issues.
*/
-static inline int page_order(struct perf_mmap_data *data)
+static inline int page_order(struct perf_buffer *buffer)
{
- return data->page_order;
+ return buffer->page_order;
}
static struct page *
-perf_mmap_to_page(struct perf_mmap_data *data, unsigned long pgoff)
+perf_mmap_to_page(struct perf_buffer *buffer, unsigned long pgoff)
{
- if (pgoff > (1UL << page_order(data)))
+ if (pgoff > (1UL << page_order(buffer)))
return NULL;
- return vmalloc_to_page((void *)data->user_page + pgoff * PAGE_SIZE);
+ return vmalloc_to_page((void *)buffer->user_page + pgoff * PAGE_SIZE);
}
static void perf_mmap_unmark_page(void *addr)
@@ -2500,57 +2519,59 @@
page->mapping = NULL;
}
-static void perf_mmap_data_free_work(struct work_struct *work)
+static void perf_buffer_free_work(struct work_struct *work)
{
- struct perf_mmap_data *data;
+ struct perf_buffer *buffer;
void *base;
int i, nr;
- data = container_of(work, struct perf_mmap_data, work);
- nr = 1 << page_order(data);
+ buffer = container_of(work, struct perf_buffer, work);
+ nr = 1 << page_order(buffer);
- base = data->user_page;
+ base = buffer->user_page;
for (i = 0; i < nr + 1; i++)
perf_mmap_unmark_page(base + (i * PAGE_SIZE));
vfree(base);
- kfree(data);
+ kfree(buffer);
}
-static void perf_mmap_data_free(struct perf_mmap_data *data)
+static void perf_buffer_free(struct perf_buffer *buffer)
{
- schedule_work(&data->work);
+ schedule_work(&buffer->work);
}
-static struct perf_mmap_data *
-perf_mmap_data_alloc(struct perf_event *event, int nr_pages)
+static struct perf_buffer *
+perf_buffer_alloc(int nr_pages, long watermark, int cpu, int flags)
{
- struct perf_mmap_data *data;
+ struct perf_buffer *buffer;
unsigned long size;
void *all_buf;
- size = sizeof(struct perf_mmap_data);
+ size = sizeof(struct perf_buffer);
size += sizeof(void *);
- data = kzalloc(size, GFP_KERNEL);
- if (!data)
+ buffer = kzalloc(size, GFP_KERNEL);
+ if (!buffer)
goto fail;
- INIT_WORK(&data->work, perf_mmap_data_free_work);
+ INIT_WORK(&buffer->work, perf_buffer_free_work);
all_buf = vmalloc_user((nr_pages + 1) * PAGE_SIZE);
if (!all_buf)
goto fail_all_buf;
- data->user_page = all_buf;
- data->data_pages[0] = all_buf + PAGE_SIZE;
- data->page_order = ilog2(nr_pages);
- data->nr_pages = 1;
+ buffer->user_page = all_buf;
+ buffer->data_pages[0] = all_buf + PAGE_SIZE;
+ buffer->page_order = ilog2(nr_pages);
+ buffer->nr_pages = 1;
- return data;
+ perf_buffer_init(buffer, watermark, flags);
+
+ return buffer;
fail_all_buf:
- kfree(data);
+ kfree(buffer);
fail:
return NULL;
@@ -2558,15 +2579,15 @@
#endif
-static unsigned long perf_data_size(struct perf_mmap_data *data)
+static unsigned long perf_data_size(struct perf_buffer *buffer)
{
- return data->nr_pages << (PAGE_SHIFT + page_order(data));
+ return buffer->nr_pages << (PAGE_SHIFT + page_order(buffer));
}
static int perf_mmap_fault(struct vm_area_struct *vma, struct vm_fault *vmf)
{
struct perf_event *event = vma->vm_file->private_data;
- struct perf_mmap_data *data;
+ struct perf_buffer *buffer;
int ret = VM_FAULT_SIGBUS;
if (vmf->flags & FAULT_FLAG_MKWRITE) {
@@ -2576,14 +2597,14 @@
}
rcu_read_lock();
- data = rcu_dereference(event->data);
- if (!data)
+ buffer = rcu_dereference(event->buffer);
+ if (!buffer)
goto unlock;
if (vmf->pgoff && (vmf->flags & FAULT_FLAG_WRITE))
goto unlock;
- vmf->page = perf_mmap_to_page(data, vmf->pgoff);
+ vmf->page = perf_mmap_to_page(buffer, vmf->pgoff);
if (!vmf->page)
goto unlock;
@@ -2598,52 +2619,35 @@
return ret;
}
-static void
-perf_mmap_data_init(struct perf_event *event, struct perf_mmap_data *data)
+static void perf_buffer_free_rcu(struct rcu_head *rcu_head)
{
- long max_size = perf_data_size(data);
+ struct perf_buffer *buffer;
- if (event->attr.watermark) {
- data->watermark = min_t(long, max_size,
- event->attr.wakeup_watermark);
- }
-
- if (!data->watermark)
- data->watermark = max_size / 2;
-
- atomic_set(&data->refcount, 1);
- rcu_assign_pointer(event->data, data);
+ buffer = container_of(rcu_head, struct perf_buffer, rcu_head);
+ perf_buffer_free(buffer);
}
-static void perf_mmap_data_free_rcu(struct rcu_head *rcu_head)
+static struct perf_buffer *perf_buffer_get(struct perf_event *event)
{
- struct perf_mmap_data *data;
-
- data = container_of(rcu_head, struct perf_mmap_data, rcu_head);
- perf_mmap_data_free(data);
-}
-
-static struct perf_mmap_data *perf_mmap_data_get(struct perf_event *event)
-{
- struct perf_mmap_data *data;
+ struct perf_buffer *buffer;
rcu_read_lock();
- data = rcu_dereference(event->data);
- if (data) {
- if (!atomic_inc_not_zero(&data->refcount))
- data = NULL;
+ buffer = rcu_dereference(event->buffer);
+ if (buffer) {
+ if (!atomic_inc_not_zero(&buffer->refcount))
+ buffer = NULL;
}
rcu_read_unlock();
- return data;
+ return buffer;
}
-static void perf_mmap_data_put(struct perf_mmap_data *data)
+static void perf_buffer_put(struct perf_buffer *buffer)
{
- if (!atomic_dec_and_test(&data->refcount))
+ if (!atomic_dec_and_test(&buffer->refcount))
return;
- call_rcu(&data->rcu_head, perf_mmap_data_free_rcu);
+ call_rcu(&buffer->rcu_head, perf_buffer_free_rcu);
}
static void perf_mmap_open(struct vm_area_struct *vma)
@@ -2658,16 +2662,16 @@
struct perf_event *event = vma->vm_file->private_data;
if (atomic_dec_and_mutex_lock(&event->mmap_count, &event->mmap_mutex)) {
- unsigned long size = perf_data_size(event->data);
+ unsigned long size = perf_data_size(event->buffer);
struct user_struct *user = event->mmap_user;
- struct perf_mmap_data *data = event->data;
+ struct perf_buffer *buffer = event->buffer;
atomic_long_sub((size >> PAGE_SHIFT) + 1, &user->locked_vm);
vma->vm_mm->locked_vm -= event->mmap_locked;
- rcu_assign_pointer(event->data, NULL);
+ rcu_assign_pointer(event->buffer, NULL);
mutex_unlock(&event->mmap_mutex);
- perf_mmap_data_put(data);
+ perf_buffer_put(buffer);
free_uid(user);
}
}
@@ -2685,11 +2689,11 @@
unsigned long user_locked, user_lock_limit;
struct user_struct *user = current_user();
unsigned long locked, lock_limit;
- struct perf_mmap_data *data;
+ struct perf_buffer *buffer;
unsigned long vma_size;
unsigned long nr_pages;
long user_extra, extra;
- int ret = 0;
+ int ret = 0, flags = 0;
/*
* Don't allow mmap() of inherited per-task counters. This would
@@ -2706,7 +2710,7 @@
nr_pages = (vma_size / PAGE_SIZE) - 1;
/*
- * If we have data pages ensure they're a power-of-two number, so we
+ * If we have buffer pages ensure they're a power-of-two number, so we
* can do bitmasks instead of modulo.
*/
if (nr_pages != 0 && !is_power_of_2(nr_pages))
@@ -2720,9 +2724,9 @@
WARN_ON_ONCE(event->ctx->parent_ctx);
mutex_lock(&event->mmap_mutex);
- if (event->data) {
- if (event->data->nr_pages == nr_pages)
- atomic_inc(&event->data->refcount);
+ if (event->buffer) {
+ if (event->buffer->nr_pages == nr_pages)
+ atomic_inc(&event->buffer->refcount);
else
ret = -EINVAL;
goto unlock;
@@ -2752,17 +2756,18 @@
goto unlock;
}
- WARN_ON(event->data);
+ WARN_ON(event->buffer);
- data = perf_mmap_data_alloc(event, nr_pages);
- if (!data) {
+ if (vma->vm_flags & VM_WRITE)
+ flags |= PERF_BUFFER_WRITABLE;
+
+ buffer = perf_buffer_alloc(nr_pages, event->attr.wakeup_watermark,
+ event->cpu, flags);
+ if (!buffer) {
ret = -ENOMEM;
goto unlock;
}
-
- perf_mmap_data_init(event, data);
- if (vma->vm_flags & VM_WRITE)
- event->data->writable = 1;
+ rcu_assign_pointer(event->buffer, buffer);
atomic_long_add(user_extra, &user->locked_vm);
event->mmap_locked = extra;
@@ -2941,11 +2946,6 @@
return NULL;
}
-__weak
-void perf_arch_fetch_caller_regs(struct pt_regs *regs, unsigned long ip, int skip)
-{
-}
-
/*
* We assume there is only KVM supporting the callbacks.
@@ -2971,15 +2971,15 @@
/*
* Output
*/
-static bool perf_output_space(struct perf_mmap_data *data, unsigned long tail,
+static bool perf_output_space(struct perf_buffer *buffer, unsigned long tail,
unsigned long offset, unsigned long head)
{
unsigned long mask;
- if (!data->writable)
+ if (!buffer->writable)
return true;
- mask = perf_data_size(data) - 1;
+ mask = perf_data_size(buffer) - 1;
offset = (offset - tail) & mask;
head = (head - tail) & mask;
@@ -2992,7 +2992,7 @@
static void perf_output_wakeup(struct perf_output_handle *handle)
{
- atomic_set(&handle->data->poll, POLL_IN);
+ atomic_set(&handle->buffer->poll, POLL_IN);
if (handle->nmi) {
handle->event->pending_wakeup = 1;
@@ -3012,45 +3012,45 @@
*/
static void perf_output_get_handle(struct perf_output_handle *handle)
{
- struct perf_mmap_data *data = handle->data;
+ struct perf_buffer *buffer = handle->buffer;
preempt_disable();
- local_inc(&data->nest);
- handle->wakeup = local_read(&data->wakeup);
+ local_inc(&buffer->nest);
+ handle->wakeup = local_read(&buffer->wakeup);
}
static void perf_output_put_handle(struct perf_output_handle *handle)
{
- struct perf_mmap_data *data = handle->data;
+ struct perf_buffer *buffer = handle->buffer;
unsigned long head;
again:
- head = local_read(&data->head);
+ head = local_read(&buffer->head);
/*
* IRQ/NMI can happen here, which means we can miss a head update.
*/
- if (!local_dec_and_test(&data->nest))
+ if (!local_dec_and_test(&buffer->nest))
goto out;
/*
* Publish the known good head. Rely on the full barrier implied
- * by atomic_dec_and_test() order the data->head read and this
+ * by atomic_dec_and_test() order the buffer->head read and this
* write.
*/
- data->user_page->data_head = head;
+ buffer->user_page->data_head = head;
/*
* Now check if we missed an update, rely on the (compiler)
- * barrier in atomic_dec_and_test() to re-read data->head.
+ * barrier in atomic_dec_and_test() to re-read buffer->head.
*/
- if (unlikely(head != local_read(&data->head))) {
- local_inc(&data->nest);
+ if (unlikely(head != local_read(&buffer->head))) {
+ local_inc(&buffer->nest);
goto again;
}
- if (handle->wakeup != local_read(&data->wakeup))
+ if (handle->wakeup != local_read(&buffer->wakeup))
perf_output_wakeup(handle);
out:
@@ -3070,12 +3070,12 @@
buf += size;
handle->size -= size;
if (!handle->size) {
- struct perf_mmap_data *data = handle->data;
+ struct perf_buffer *buffer = handle->buffer;
handle->page++;
- handle->page &= data->nr_pages - 1;
- handle->addr = data->data_pages[handle->page];
- handle->size = PAGE_SIZE << page_order(data);
+ handle->page &= buffer->nr_pages - 1;
+ handle->addr = buffer->data_pages[handle->page];
+ handle->size = PAGE_SIZE << page_order(buffer);
}
} while (len);
}
@@ -3084,7 +3084,7 @@
struct perf_event *event, unsigned int size,
int nmi, int sample)
{
- struct perf_mmap_data *data;
+ struct perf_buffer *buffer;
unsigned long tail, offset, head;
int have_lost;
struct {
@@ -3100,19 +3100,19 @@
if (event->parent)
event = event->parent;
- data = rcu_dereference(event->data);
- if (!data)
+ buffer = rcu_dereference(event->buffer);
+ if (!buffer)
goto out;
- handle->data = data;
+ handle->buffer = buffer;
handle->event = event;
handle->nmi = nmi;
handle->sample = sample;
- if (!data->nr_pages)
+ if (!buffer->nr_pages)
goto out;
- have_lost = local_read(&data->lost);
+ have_lost = local_read(&buffer->lost);
if (have_lost)
size += sizeof(lost_event);
@@ -3124,30 +3124,30 @@
* tail pointer. So that all reads will be completed before the
* write is issued.
*/
- tail = ACCESS_ONCE(data->user_page->data_tail);
+ tail = ACCESS_ONCE(buffer->user_page->data_tail);
smp_rmb();
- offset = head = local_read(&data->head);
+ offset = head = local_read(&buffer->head);
head += size;
- if (unlikely(!perf_output_space(data, tail, offset, head)))
+ if (unlikely(!perf_output_space(buffer, tail, offset, head)))
goto fail;
- } while (local_cmpxchg(&data->head, offset, head) != offset);
+ } while (local_cmpxchg(&buffer->head, offset, head) != offset);
- if (head - local_read(&data->wakeup) > data->watermark)
- local_add(data->watermark, &data->wakeup);
+ if (head - local_read(&buffer->wakeup) > buffer->watermark)
+ local_add(buffer->watermark, &buffer->wakeup);
- handle->page = offset >> (PAGE_SHIFT + page_order(data));
- handle->page &= data->nr_pages - 1;
- handle->size = offset & ((PAGE_SIZE << page_order(data)) - 1);
- handle->addr = data->data_pages[handle->page];
+ handle->page = offset >> (PAGE_SHIFT + page_order(buffer));
+ handle->page &= buffer->nr_pages - 1;
+ handle->size = offset & ((PAGE_SIZE << page_order(buffer)) - 1);
+ handle->addr = buffer->data_pages[handle->page];
handle->addr += handle->size;
- handle->size = (PAGE_SIZE << page_order(data)) - handle->size;
+ handle->size = (PAGE_SIZE << page_order(buffer)) - handle->size;
if (have_lost) {
lost_event.header.type = PERF_RECORD_LOST;
lost_event.header.misc = 0;
lost_event.header.size = sizeof(lost_event);
lost_event.id = event->id;
- lost_event.lost = local_xchg(&data->lost, 0);
+ lost_event.lost = local_xchg(&buffer->lost, 0);
perf_output_put(handle, lost_event);
}
@@ -3155,7 +3155,7 @@
return 0;
fail:
- local_inc(&data->lost);
+ local_inc(&buffer->lost);
perf_output_put_handle(handle);
out:
rcu_read_unlock();
@@ -3166,15 +3166,15 @@
void perf_output_end(struct perf_output_handle *handle)
{
struct perf_event *event = handle->event;
- struct perf_mmap_data *data = handle->data;
+ struct perf_buffer *buffer = handle->buffer;
int wakeup_events = event->attr.wakeup_events;
if (handle->sample && wakeup_events) {
- int events = local_inc_return(&data->events);
+ int events = local_inc_return(&buffer->events);
if (events >= wakeup_events) {
- local_sub(wakeup_events, &data->events);
- local_inc(&data->wakeup);
+ local_sub(wakeup_events, &buffer->events);
+ local_inc(&buffer->wakeup);
}
}
@@ -3211,7 +3211,7 @@
u64 values[4];
int n = 0;
- values[n++] = atomic64_read(&event->count);
+ values[n++] = perf_event_count(event);
if (read_format & PERF_FORMAT_TOTAL_TIME_ENABLED) {
values[n++] = event->total_time_enabled +
atomic64_read(&event->child_total_time_enabled);
@@ -3248,7 +3248,7 @@
if (leader != event)
leader->pmu->read(leader);
- values[n++] = atomic64_read(&leader->count);
+ values[n++] = perf_event_count(leader);
if (read_format & PERF_FORMAT_ID)
values[n++] = primary_event_id(leader);
@@ -3260,7 +3260,7 @@
if (sub != event)
sub->pmu->read(sub);
- values[n++] = atomic64_read(&sub->count);
+ values[n++] = perf_event_count(sub);
if (read_format & PERF_FORMAT_ID)
values[n++] = primary_event_id(sub);
@@ -3491,7 +3491,7 @@
/*
* task tracking -- fork/exit
*
- * enabled by: attr.comm | attr.mmap | attr.task
+ * enabled by: attr.comm | attr.mmap | attr.mmap_data | attr.task
*/
struct perf_task_event {
@@ -3541,7 +3541,8 @@
if (event->cpu != -1 && event->cpu != smp_processor_id())
return 0;
- if (event->attr.comm || event->attr.mmap || event->attr.task)
+ if (event->attr.comm || event->attr.mmap ||
+ event->attr.mmap_data || event->attr.task)
return 1;
return 0;
@@ -3766,7 +3767,8 @@
}
static int perf_event_mmap_match(struct perf_event *event,
- struct perf_mmap_event *mmap_event)
+ struct perf_mmap_event *mmap_event,
+ int executable)
{
if (event->state < PERF_EVENT_STATE_INACTIVE)
return 0;
@@ -3774,19 +3776,21 @@
if (event->cpu != -1 && event->cpu != smp_processor_id())
return 0;
- if (event->attr.mmap)
+ if ((!executable && event->attr.mmap_data) ||
+ (executable && event->attr.mmap))
return 1;
return 0;
}
static void perf_event_mmap_ctx(struct perf_event_context *ctx,
- struct perf_mmap_event *mmap_event)
+ struct perf_mmap_event *mmap_event,
+ int executable)
{
struct perf_event *event;
list_for_each_entry_rcu(event, &ctx->event_list, event_entry) {
- if (perf_event_mmap_match(event, mmap_event))
+ if (perf_event_mmap_match(event, mmap_event, executable))
perf_event_mmap_output(event, mmap_event);
}
}
@@ -3830,6 +3834,14 @@
if (!vma->vm_mm) {
name = strncpy(tmp, "[vdso]", sizeof(tmp));
goto got_name;
+ } else if (vma->vm_start <= vma->vm_mm->start_brk &&
+ vma->vm_end >= vma->vm_mm->brk) {
+ name = strncpy(tmp, "[heap]", sizeof(tmp));
+ goto got_name;
+ } else if (vma->vm_start <= vma->vm_mm->start_stack &&
+ vma->vm_end >= vma->vm_mm->start_stack) {
+ name = strncpy(tmp, "[stack]", sizeof(tmp));
+ goto got_name;
}
name = strncpy(tmp, "//anon", sizeof(tmp));
@@ -3846,17 +3858,17 @@
rcu_read_lock();
cpuctx = &get_cpu_var(perf_cpu_context);
- perf_event_mmap_ctx(&cpuctx->ctx, mmap_event);
+ perf_event_mmap_ctx(&cpuctx->ctx, mmap_event, vma->vm_flags & VM_EXEC);
ctx = rcu_dereference(current->perf_event_ctxp);
if (ctx)
- perf_event_mmap_ctx(ctx, mmap_event);
+ perf_event_mmap_ctx(ctx, mmap_event, vma->vm_flags & VM_EXEC);
put_cpu_var(perf_cpu_context);
rcu_read_unlock();
kfree(buf);
}
-void __perf_event_mmap(struct vm_area_struct *vma)
+void perf_event_mmap(struct vm_area_struct *vma)
{
struct perf_mmap_event mmap_event;
@@ -4018,14 +4030,14 @@
hwc->last_period = hwc->sample_period;
again:
- old = val = atomic64_read(&hwc->period_left);
+ old = val = local64_read(&hwc->period_left);
if (val < 0)
return 0;
nr = div64_u64(period + val, period);
offset = nr * period;
val -= offset;
- if (atomic64_cmpxchg(&hwc->period_left, old, val) != old)
+ if (local64_cmpxchg(&hwc->period_left, old, val) != old)
goto again;
return nr;
@@ -4064,7 +4076,7 @@
{
struct hw_perf_event *hwc = &event->hw;
- atomic64_add(nr, &event->count);
+ local64_add(nr, &event->count);
if (!regs)
return;
@@ -4075,7 +4087,7 @@
if (nr == 1 && hwc->sample_period == 1 && !event->attr.freq)
return perf_swevent_overflow(event, 1, nmi, data, regs);
- if (atomic64_add_negative(nr, &hwc->period_left))
+ if (local64_add_negative(nr, &hwc->period_left))
return;
perf_swevent_overflow(event, 0, nmi, data, regs);
@@ -4213,14 +4225,12 @@
}
EXPORT_SYMBOL_GPL(perf_swevent_get_recursion_context);
-void perf_swevent_put_recursion_context(int rctx)
+void inline perf_swevent_put_recursion_context(int rctx)
{
struct perf_cpu_context *cpuctx = &__get_cpu_var(perf_cpu_context);
barrier();
cpuctx->recursion[rctx]--;
}
-EXPORT_SYMBOL_GPL(perf_swevent_put_recursion_context);
-
void __perf_sw_event(u32 event_id, u64 nr, int nmi,
struct pt_regs *regs, u64 addr)
@@ -4368,8 +4378,8 @@
u64 now;
now = cpu_clock(cpu);
- prev = atomic64_xchg(&event->hw.prev_count, now);
- atomic64_add(now - prev, &event->count);
+ prev = local64_xchg(&event->hw.prev_count, now);
+ local64_add(now - prev, &event->count);
}
static int cpu_clock_perf_event_enable(struct perf_event *event)
@@ -4377,7 +4387,7 @@
struct hw_perf_event *hwc = &event->hw;
int cpu = raw_smp_processor_id();
- atomic64_set(&hwc->prev_count, cpu_clock(cpu));
+ local64_set(&hwc->prev_count, cpu_clock(cpu));
perf_swevent_start_hrtimer(event);
return 0;
@@ -4409,9 +4419,9 @@
u64 prev;
s64 delta;
- prev = atomic64_xchg(&event->hw.prev_count, now);
+ prev = local64_xchg(&event->hw.prev_count, now);
delta = now - prev;
- atomic64_add(delta, &event->count);
+ local64_add(delta, &event->count);
}
static int task_clock_perf_event_enable(struct perf_event *event)
@@ -4421,7 +4431,7 @@
now = event->ctx->time;
- atomic64_set(&hwc->prev_count, now);
+ local64_set(&hwc->prev_count, now);
perf_swevent_start_hrtimer(event);
@@ -4601,7 +4611,7 @@
}
void perf_tp_event(u64 addr, u64 count, void *record, int entry_size,
- struct pt_regs *regs, struct hlist_head *head)
+ struct pt_regs *regs, struct hlist_head *head, int rctx)
{
struct perf_sample_data data;
struct perf_event *event;
@@ -4615,12 +4625,12 @@
perf_sample_data_init(&data, addr);
data.raw = &raw;
- rcu_read_lock();
hlist_for_each_entry_rcu(event, node, head, hlist_entry) {
if (perf_tp_event_match(event, &data, regs))
perf_swevent_add(event, count, 1, &data, regs);
}
- rcu_read_unlock();
+
+ perf_swevent_put_recursion_context(rctx);
}
EXPORT_SYMBOL_GPL(perf_tp_event);
@@ -4864,7 +4874,7 @@
hwc->sample_period = 1;
hwc->last_period = hwc->sample_period;
- atomic64_set(&hwc->period_left, hwc->sample_period);
+ local64_set(&hwc->period_left, hwc->sample_period);
/*
* we currently do not support PERF_FORMAT_GROUP on inherited events
@@ -4913,7 +4923,7 @@
if (!event->parent) {
atomic_inc(&nr_events);
- if (event->attr.mmap)
+ if (event->attr.mmap || event->attr.mmap_data)
atomic_inc(&nr_mmap_events);
if (event->attr.comm)
atomic_inc(&nr_comm_events);
@@ -5007,7 +5017,7 @@
static int
perf_event_set_output(struct perf_event *event, struct perf_event *output_event)
{
- struct perf_mmap_data *data = NULL, *old_data = NULL;
+ struct perf_buffer *buffer = NULL, *old_buffer = NULL;
int ret = -EINVAL;
if (!output_event)
@@ -5037,19 +5047,19 @@
if (output_event) {
/* get the buffer we want to redirect to */
- data = perf_mmap_data_get(output_event);
- if (!data)
+ buffer = perf_buffer_get(output_event);
+ if (!buffer)
goto unlock;
}
- old_data = event->data;
- rcu_assign_pointer(event->data, data);
+ old_buffer = event->buffer;
+ rcu_assign_pointer(event->buffer, buffer);
ret = 0;
unlock:
mutex_unlock(&event->mmap_mutex);
- if (old_data)
- perf_mmap_data_put(old_data);
+ if (old_buffer)
+ perf_buffer_put(old_buffer);
out:
return ret;
}
@@ -5298,7 +5308,7 @@
hwc->sample_period = sample_period;
hwc->last_period = sample_period;
- atomic64_set(&hwc->period_left, sample_period);
+ local64_set(&hwc->period_left, sample_period);
}
child_event->overflow_handler = parent_event->overflow_handler;
@@ -5359,12 +5369,12 @@
if (child_event->attr.inherit_stat)
perf_event_read_event(child_event, child);
- child_val = atomic64_read(&child_event->count);
+ child_val = perf_event_count(child_event);
/*
* Add back the child's count to the parent's count:
*/
- atomic64_add(child_val, &parent_event->count);
+ atomic64_add(child_val, &parent_event->child_count);
atomic64_add(child_event->total_time_enabled,
&parent_event->child_total_time_enabled);
atomic64_add(child_event->total_time_running,
diff --git a/kernel/sched.c b/kernel/sched.c
index f52a880..265cf3a 100644
--- a/kernel/sched.c
+++ b/kernel/sched.c
@@ -3726,7 +3726,7 @@
* off of preempt_enable. Kernel preemptions off return from interrupt
* occur there and call schedule directly.
*/
-asmlinkage void __sched preempt_schedule(void)
+asmlinkage void __sched notrace preempt_schedule(void)
{
struct thread_info *ti = current_thread_info();
@@ -3738,9 +3738,9 @@
return;
do {
- add_preempt_count(PREEMPT_ACTIVE);
+ add_preempt_count_notrace(PREEMPT_ACTIVE);
schedule();
- sub_preempt_count(PREEMPT_ACTIVE);
+ sub_preempt_count_notrace(PREEMPT_ACTIVE);
/*
* Check again in case we missed a preemption opportunity
diff --git a/kernel/softlockup.c b/kernel/softlockup.c
deleted file mode 100644
index 4b493f6..0000000
--- a/kernel/softlockup.c
+++ /dev/null
@@ -1,293 +0,0 @@
-/*
- * Detect Soft Lockups
- *
- * started by Ingo Molnar, Copyright (C) 2005, 2006 Red Hat, Inc.
- *
- * this code detects soft lockups: incidents in where on a CPU
- * the kernel does not reschedule for 10 seconds or more.
- */
-#include <linux/mm.h>
-#include <linux/cpu.h>
-#include <linux/nmi.h>
-#include <linux/init.h>
-#include <linux/delay.h>
-#include <linux/freezer.h>
-#include <linux/kthread.h>
-#include <linux/lockdep.h>
-#include <linux/notifier.h>
-#include <linux/module.h>
-#include <linux/sysctl.h>
-
-#include <asm/irq_regs.h>
-
-static DEFINE_SPINLOCK(print_lock);
-
-static DEFINE_PER_CPU(unsigned long, softlockup_touch_ts); /* touch timestamp */
-static DEFINE_PER_CPU(unsigned long, softlockup_print_ts); /* print timestamp */
-static DEFINE_PER_CPU(struct task_struct *, softlockup_watchdog);
-static DEFINE_PER_CPU(bool, softlock_touch_sync);
-
-static int __read_mostly did_panic;
-int __read_mostly softlockup_thresh = 60;
-
-/*
- * Should we panic (and reboot, if panic_timeout= is set) when a
- * soft-lockup occurs:
- */
-unsigned int __read_mostly softlockup_panic =
- CONFIG_BOOTPARAM_SOFTLOCKUP_PANIC_VALUE;
-
-static int __init softlockup_panic_setup(char *str)
-{
- softlockup_panic = simple_strtoul(str, NULL, 0);
-
- return 1;
-}
-__setup("softlockup_panic=", softlockup_panic_setup);
-
-static int
-softlock_panic(struct notifier_block *this, unsigned long event, void *ptr)
-{
- did_panic = 1;
-
- return NOTIFY_DONE;
-}
-
-static struct notifier_block panic_block = {
- .notifier_call = softlock_panic,
-};
-
-/*
- * Returns seconds, approximately. We don't need nanosecond
- * resolution, and we don't need to waste time with a big divide when
- * 2^30ns == 1.074s.
- */
-static unsigned long get_timestamp(int this_cpu)
-{
- return cpu_clock(this_cpu) >> 30LL; /* 2^30 ~= 10^9 */
-}
-
-static void __touch_softlockup_watchdog(void)
-{
- int this_cpu = raw_smp_processor_id();
-
- __raw_get_cpu_var(softlockup_touch_ts) = get_timestamp(this_cpu);
-}
-
-void touch_softlockup_watchdog(void)
-{
- __raw_get_cpu_var(softlockup_touch_ts) = 0;
-}
-EXPORT_SYMBOL(touch_softlockup_watchdog);
-
-void touch_softlockup_watchdog_sync(void)
-{
- __raw_get_cpu_var(softlock_touch_sync) = true;
- __raw_get_cpu_var(softlockup_touch_ts) = 0;
-}
-
-void touch_all_softlockup_watchdogs(void)
-{
- int cpu;
-
- /* Cause each CPU to re-update its timestamp rather than complain */
- for_each_online_cpu(cpu)
- per_cpu(softlockup_touch_ts, cpu) = 0;
-}
-EXPORT_SYMBOL(touch_all_softlockup_watchdogs);
-
-int proc_dosoftlockup_thresh(struct ctl_table *table, int write,
- void __user *buffer,
- size_t *lenp, loff_t *ppos)
-{
- touch_all_softlockup_watchdogs();
- return proc_dointvec_minmax(table, write, buffer, lenp, ppos);
-}
-
-/*
- * This callback runs from the timer interrupt, and checks
- * whether the watchdog thread has hung or not:
- */
-void softlockup_tick(void)
-{
- int this_cpu = smp_processor_id();
- unsigned long touch_ts = per_cpu(softlockup_touch_ts, this_cpu);
- unsigned long print_ts;
- struct pt_regs *regs = get_irq_regs();
- unsigned long now;
-
- /* Is detection switched off? */
- if (!per_cpu(softlockup_watchdog, this_cpu) || softlockup_thresh <= 0) {
- /* Be sure we don't false trigger if switched back on */
- if (touch_ts)
- per_cpu(softlockup_touch_ts, this_cpu) = 0;
- return;
- }
-
- if (touch_ts == 0) {
- if (unlikely(per_cpu(softlock_touch_sync, this_cpu))) {
- /*
- * If the time stamp was touched atomically
- * make sure the scheduler tick is up to date.
- */
- per_cpu(softlock_touch_sync, this_cpu) = false;
- sched_clock_tick();
- }
- __touch_softlockup_watchdog();
- return;
- }
-
- print_ts = per_cpu(softlockup_print_ts, this_cpu);
-
- /* report at most once a second */
- if (print_ts == touch_ts || did_panic)
- return;
-
- /* do not print during early bootup: */
- if (unlikely(system_state != SYSTEM_RUNNING)) {
- __touch_softlockup_watchdog();
- return;
- }
-
- now = get_timestamp(this_cpu);
-
- /*
- * Wake up the high-prio watchdog task twice per
- * threshold timespan.
- */
- if (time_after(now - softlockup_thresh/2, touch_ts))
- wake_up_process(per_cpu(softlockup_watchdog, this_cpu));
-
- /* Warn about unreasonable delays: */
- if (time_before_eq(now - softlockup_thresh, touch_ts))
- return;
-
- per_cpu(softlockup_print_ts, this_cpu) = touch_ts;
-
- spin_lock(&print_lock);
- printk(KERN_ERR "BUG: soft lockup - CPU#%d stuck for %lus! [%s:%d]\n",
- this_cpu, now - touch_ts,
- current->comm, task_pid_nr(current));
- print_modules();
- print_irqtrace_events(current);
- if (regs)
- show_regs(regs);
- else
- dump_stack();
- spin_unlock(&print_lock);
-
- if (softlockup_panic)
- panic("softlockup: hung tasks");
-}
-
-/*
- * The watchdog thread - runs every second and touches the timestamp.
- */
-static int watchdog(void *__bind_cpu)
-{
- struct sched_param param = { .sched_priority = MAX_RT_PRIO-1 };
-
- sched_setscheduler(current, SCHED_FIFO, ¶m);
-
- /* initialize timestamp */
- __touch_softlockup_watchdog();
-
- set_current_state(TASK_INTERRUPTIBLE);
- /*
- * Run briefly once per second to reset the softlockup timestamp.
- * If this gets delayed for more than 60 seconds then the
- * debug-printout triggers in softlockup_tick().
- */
- while (!kthread_should_stop()) {
- __touch_softlockup_watchdog();
- schedule();
-
- if (kthread_should_stop())
- break;
-
- set_current_state(TASK_INTERRUPTIBLE);
- }
- __set_current_state(TASK_RUNNING);
-
- return 0;
-}
-
-/*
- * Create/destroy watchdog threads as CPUs come and go:
- */
-static int __cpuinit
-cpu_callback(struct notifier_block *nfb, unsigned long action, void *hcpu)
-{
- int hotcpu = (unsigned long)hcpu;
- struct task_struct *p;
-
- switch (action) {
- case CPU_UP_PREPARE:
- case CPU_UP_PREPARE_FROZEN:
- BUG_ON(per_cpu(softlockup_watchdog, hotcpu));
- p = kthread_create(watchdog, hcpu, "watchdog/%d", hotcpu);
- if (IS_ERR(p)) {
- printk(KERN_ERR "watchdog for %i failed\n", hotcpu);
- return NOTIFY_BAD;
- }
- per_cpu(softlockup_touch_ts, hotcpu) = 0;
- per_cpu(softlockup_watchdog, hotcpu) = p;
- kthread_bind(p, hotcpu);
- break;
- case CPU_ONLINE:
- case CPU_ONLINE_FROZEN:
- wake_up_process(per_cpu(softlockup_watchdog, hotcpu));
- break;
-#ifdef CONFIG_HOTPLUG_CPU
- case CPU_UP_CANCELED:
- case CPU_UP_CANCELED_FROZEN:
- if (!per_cpu(softlockup_watchdog, hotcpu))
- break;
- /* Unbind so it can run. Fall thru. */
- kthread_bind(per_cpu(softlockup_watchdog, hotcpu),
- cpumask_any(cpu_online_mask));
- case CPU_DEAD:
- case CPU_DEAD_FROZEN:
- p = per_cpu(softlockup_watchdog, hotcpu);
- per_cpu(softlockup_watchdog, hotcpu) = NULL;
- kthread_stop(p);
- break;
-#endif /* CONFIG_HOTPLUG_CPU */
- }
- return NOTIFY_OK;
-}
-
-static struct notifier_block __cpuinitdata cpu_nfb = {
- .notifier_call = cpu_callback
-};
-
-static int __initdata nosoftlockup;
-
-static int __init nosoftlockup_setup(char *str)
-{
- nosoftlockup = 1;
- return 1;
-}
-__setup("nosoftlockup", nosoftlockup_setup);
-
-static int __init spawn_softlockup_task(void)
-{
- void *cpu = (void *)(long)smp_processor_id();
- int err;
-
- if (nosoftlockup)
- return 0;
-
- err = cpu_callback(&cpu_nfb, CPU_UP_PREPARE, cpu);
- if (err == NOTIFY_BAD) {
- BUG();
- return 1;
- }
- cpu_callback(&cpu_nfb, CPU_ONLINE, cpu);
- register_cpu_notifier(&cpu_nfb);
-
- atomic_notifier_chain_register(&panic_notifier_list, &panic_block);
-
- return 0;
-}
-early_initcall(spawn_softlockup_task);
diff --git a/kernel/sysctl.c b/kernel/sysctl.c
index d24f761..6f79c7f 100644
--- a/kernel/sysctl.c
+++ b/kernel/sysctl.c
@@ -76,6 +76,10 @@
#include <scsi/sg.h>
#endif
+#ifdef CONFIG_LOCKUP_DETECTOR
+#include <linux/nmi.h>
+#endif
+
#if defined(CONFIG_SYSCTL)
@@ -106,7 +110,7 @@
#endif
/* Constants used for minimum and maximum */
-#ifdef CONFIG_DETECT_SOFTLOCKUP
+#ifdef CONFIG_LOCKUP_DETECTOR
static int sixty = 60;
static int neg_one = -1;
#endif
@@ -710,7 +714,34 @@
.mode = 0444,
.proc_handler = proc_dointvec,
},
-#if defined(CONFIG_X86_LOCAL_APIC) && defined(CONFIG_X86)
+#if defined(CONFIG_LOCKUP_DETECTOR)
+ {
+ .procname = "watchdog",
+ .data = &watchdog_enabled,
+ .maxlen = sizeof (int),
+ .mode = 0644,
+ .proc_handler = proc_dowatchdog_enabled,
+ },
+ {
+ .procname = "watchdog_thresh",
+ .data = &softlockup_thresh,
+ .maxlen = sizeof(int),
+ .mode = 0644,
+ .proc_handler = proc_dowatchdog_thresh,
+ .extra1 = &neg_one,
+ .extra2 = &sixty,
+ },
+ {
+ .procname = "softlockup_panic",
+ .data = &softlockup_panic,
+ .maxlen = sizeof(int),
+ .mode = 0644,
+ .proc_handler = proc_dointvec_minmax,
+ .extra1 = &zero,
+ .extra2 = &one,
+ },
+#endif
+#if defined(CONFIG_X86_LOCAL_APIC) && defined(CONFIG_X86) && !defined(CONFIG_LOCKUP_DETECTOR)
{
.procname = "unknown_nmi_panic",
.data = &unknown_nmi_panic,
@@ -813,26 +844,6 @@
.proc_handler = proc_dointvec,
},
#endif
-#ifdef CONFIG_DETECT_SOFTLOCKUP
- {
- .procname = "softlockup_panic",
- .data = &softlockup_panic,
- .maxlen = sizeof(int),
- .mode = 0644,
- .proc_handler = proc_dointvec_minmax,
- .extra1 = &zero,
- .extra2 = &one,
- },
- {
- .procname = "softlockup_thresh",
- .data = &softlockup_thresh,
- .maxlen = sizeof(int),
- .mode = 0644,
- .proc_handler = proc_dosoftlockup_thresh,
- .extra1 = &neg_one,
- .extra2 = &sixty,
- },
-#endif
#ifdef CONFIG_DETECT_HUNG_TASK
{
.procname = "hung_task_panic",
diff --git a/kernel/timer.c b/kernel/timer.c
index efde11e..6aa6f7e 100644
--- a/kernel/timer.c
+++ b/kernel/timer.c
@@ -1302,7 +1302,6 @@
{
hrtimer_run_queues();
raise_softirq(TIMER_SOFTIRQ);
- softlockup_tick();
}
/*
diff --git a/kernel/trace/Kconfig b/kernel/trace/Kconfig
index 8b1797c..c7683fd 100644
--- a/kernel/trace/Kconfig
+++ b/kernel/trace/Kconfig
@@ -194,15 +194,6 @@
enabled. This option and the irqs-off timing option can be
used together or separately.)
-config SYSPROF_TRACER
- bool "Sysprof Tracer"
- depends on X86
- select GENERIC_TRACER
- select CONTEXT_SWITCH_TRACER
- help
- This tracer provides the trace needed by the 'Sysprof' userspace
- tool.
-
config SCHED_TRACER
bool "Scheduling Latency Tracer"
select GENERIC_TRACER
@@ -229,23 +220,6 @@
help
Basic tracer to catch the syscall entry and exit events.
-config BOOT_TRACER
- bool "Trace boot initcalls"
- select GENERIC_TRACER
- select CONTEXT_SWITCH_TRACER
- help
- This tracer helps developers to optimize boot times: it records
- the timings of the initcalls and traces key events and the identity
- of tasks that can cause boot delays, such as context-switches.
-
- Its aim is to be parsed by the scripts/bootgraph.pl tool to
- produce pretty graphics about boot inefficiencies, giving a visual
- representation of the delays during initcalls - but the raw
- /debug/tracing/trace text output is readable too.
-
- You must pass in initcall_debug and ftrace=initcall to the kernel
- command line to enable this on bootup.
-
config TRACE_BRANCH_PROFILING
bool
select GENERIC_TRACER
@@ -325,28 +299,6 @@
Say N if unsure.
-config KSYM_TRACER
- bool "Trace read and write access on kernel memory locations"
- depends on HAVE_HW_BREAKPOINT
- select TRACING
- help
- This tracer helps find read and write operations on any given kernel
- symbol i.e. /proc/kallsyms.
-
-config PROFILE_KSYM_TRACER
- bool "Profile all kernel memory accesses on 'watched' variables"
- depends on KSYM_TRACER
- help
- This tracer profiles kernel accesses on variables watched through the
- ksym tracer ftrace plugin. Depending upon the hardware, all read
- and write operations on kernel variables can be monitored for
- accesses.
-
- The results will be displayed in:
- /debugfs/tracing/profile_ksym
-
- Say N if unsure.
-
config STACK_TRACER
bool "Trace max stack"
depends on HAVE_FUNCTION_TRACER
@@ -371,26 +323,6 @@
Say N if unsure.
-config KMEMTRACE
- bool "Trace SLAB allocations"
- select GENERIC_TRACER
- help
- kmemtrace provides tracing for slab allocator functions, such as
- kmalloc, kfree, kmem_cache_alloc, kmem_cache_free, etc. Collected
- data is then fed to the userspace application in order to analyse
- allocation hotspots, internal fragmentation and so on, making it
- possible to see how well an allocator performs, as well as debug
- and profile kernel code.
-
- This requires an userspace application to use. See
- Documentation/trace/kmemtrace.txt for more information.
-
- Saying Y will make the kernel somewhat larger and slower. However,
- if you disable kmemtrace at run-time or boot-time, the performance
- impact is minimal (depending on the arch the kernel is built for).
-
- If unsure, say N.
-
config WORKQUEUE_TRACER
bool "Trace workqueues"
select GENERIC_TRACER
diff --git a/kernel/trace/Makefile b/kernel/trace/Makefile
index 4215530..53f3381 100644
--- a/kernel/trace/Makefile
+++ b/kernel/trace/Makefile
@@ -30,7 +30,6 @@
obj-$(CONFIG_TRACING) += trace_stat.o
obj-$(CONFIG_TRACING) += trace_printk.o
obj-$(CONFIG_CONTEXT_SWITCH_TRACER) += trace_sched_switch.o
-obj-$(CONFIG_SYSPROF_TRACER) += trace_sysprof.o
obj-$(CONFIG_FUNCTION_TRACER) += trace_functions.o
obj-$(CONFIG_IRQSOFF_TRACER) += trace_irqsoff.o
obj-$(CONFIG_PREEMPT_TRACER) += trace_irqsoff.o
@@ -38,10 +37,8 @@
obj-$(CONFIG_NOP_TRACER) += trace_nop.o
obj-$(CONFIG_STACK_TRACER) += trace_stack.o
obj-$(CONFIG_MMIOTRACE) += trace_mmiotrace.o
-obj-$(CONFIG_BOOT_TRACER) += trace_boot.o
obj-$(CONFIG_FUNCTION_GRAPH_TRACER) += trace_functions_graph.o
obj-$(CONFIG_TRACE_BRANCH_PROFILING) += trace_branch.o
-obj-$(CONFIG_KMEMTRACE) += kmemtrace.o
obj-$(CONFIG_WORKQUEUE_TRACER) += trace_workqueue.o
obj-$(CONFIG_BLK_DEV_IO_TRACE) += blktrace.o
ifeq ($(CONFIG_BLOCK),y)
@@ -55,7 +52,6 @@
endif
obj-$(CONFIG_EVENT_TRACING) += trace_events_filter.o
obj-$(CONFIG_KPROBE_EVENT) += trace_kprobe.o
-obj-$(CONFIG_KSYM_TRACER) += trace_ksym.o
obj-$(CONFIG_EVENT_TRACING) += power-traces.o
ifeq ($(CONFIG_TRACING),y)
obj-$(CONFIG_KGDB_KDB) += trace_kdb.o
diff --git a/kernel/trace/ftrace.c b/kernel/trace/ftrace.c
index 6d2cb14..0d88ce9 100644
--- a/kernel/trace/ftrace.c
+++ b/kernel/trace/ftrace.c
@@ -1883,7 +1883,6 @@
struct hlist_head *hhd;
struct hlist_node *n;
unsigned long key;
- int resched;
key = hash_long(ip, FTRACE_HASH_BITS);
@@ -1897,12 +1896,12 @@
* period. This syncs the hash iteration and freeing of items
* on the hash. rcu_read_lock is too dangerous here.
*/
- resched = ftrace_preempt_disable();
+ preempt_disable_notrace();
hlist_for_each_entry_rcu(entry, n, hhd, node) {
if (entry->ip == ip)
entry->ops->func(ip, parent_ip, &entry->data);
}
- ftrace_preempt_enable(resched);
+ preempt_enable_notrace();
}
static struct ftrace_ops trace_probe_ops __read_mostly =
diff --git a/kernel/trace/kmemtrace.c b/kernel/trace/kmemtrace.c
deleted file mode 100644
index bbfc1bb..0000000
--- a/kernel/trace/kmemtrace.c
+++ /dev/null
@@ -1,529 +0,0 @@
-/*
- * Memory allocator tracing
- *
- * Copyright (C) 2008 Eduard - Gabriel Munteanu
- * Copyright (C) 2008 Pekka Enberg <penberg@cs.helsinki.fi>
- * Copyright (C) 2008 Frederic Weisbecker <fweisbec@gmail.com>
- */
-
-#include <linux/tracepoint.h>
-#include <linux/seq_file.h>
-#include <linux/debugfs.h>
-#include <linux/dcache.h>
-#include <linux/fs.h>
-
-#include <linux/kmemtrace.h>
-
-#include "trace_output.h"
-#include "trace.h"
-
-/* Select an alternative, minimalistic output than the original one */
-#define TRACE_KMEM_OPT_MINIMAL 0x1
-
-static struct tracer_opt kmem_opts[] = {
- /* Default disable the minimalistic output */
- { TRACER_OPT(kmem_minimalistic, TRACE_KMEM_OPT_MINIMAL) },
- { }
-};
-
-static struct tracer_flags kmem_tracer_flags = {
- .val = 0,
- .opts = kmem_opts
-};
-
-static struct trace_array *kmemtrace_array;
-
-/* Trace allocations */
-static inline void kmemtrace_alloc(enum kmemtrace_type_id type_id,
- unsigned long call_site,
- const void *ptr,
- size_t bytes_req,
- size_t bytes_alloc,
- gfp_t gfp_flags,
- int node)
-{
- struct ftrace_event_call *call = &event_kmem_alloc;
- struct trace_array *tr = kmemtrace_array;
- struct kmemtrace_alloc_entry *entry;
- struct ring_buffer_event *event;
-
- event = ring_buffer_lock_reserve(tr->buffer, sizeof(*entry));
- if (!event)
- return;
-
- entry = ring_buffer_event_data(event);
- tracing_generic_entry_update(&entry->ent, 0, 0);
-
- entry->ent.type = TRACE_KMEM_ALLOC;
- entry->type_id = type_id;
- entry->call_site = call_site;
- entry->ptr = ptr;
- entry->bytes_req = bytes_req;
- entry->bytes_alloc = bytes_alloc;
- entry->gfp_flags = gfp_flags;
- entry->node = node;
-
- if (!filter_check_discard(call, entry, tr->buffer, event))
- ring_buffer_unlock_commit(tr->buffer, event);
-
- trace_wake_up();
-}
-
-static inline void kmemtrace_free(enum kmemtrace_type_id type_id,
- unsigned long call_site,
- const void *ptr)
-{
- struct ftrace_event_call *call = &event_kmem_free;
- struct trace_array *tr = kmemtrace_array;
- struct kmemtrace_free_entry *entry;
- struct ring_buffer_event *event;
-
- event = ring_buffer_lock_reserve(tr->buffer, sizeof(*entry));
- if (!event)
- return;
- entry = ring_buffer_event_data(event);
- tracing_generic_entry_update(&entry->ent, 0, 0);
-
- entry->ent.type = TRACE_KMEM_FREE;
- entry->type_id = type_id;
- entry->call_site = call_site;
- entry->ptr = ptr;
-
- if (!filter_check_discard(call, entry, tr->buffer, event))
- ring_buffer_unlock_commit(tr->buffer, event);
-
- trace_wake_up();
-}
-
-static void kmemtrace_kmalloc(void *ignore,
- unsigned long call_site,
- const void *ptr,
- size_t bytes_req,
- size_t bytes_alloc,
- gfp_t gfp_flags)
-{
- kmemtrace_alloc(KMEMTRACE_TYPE_KMALLOC, call_site, ptr,
- bytes_req, bytes_alloc, gfp_flags, -1);
-}
-
-static void kmemtrace_kmem_cache_alloc(void *ignore,
- unsigned long call_site,
- const void *ptr,
- size_t bytes_req,
- size_t bytes_alloc,
- gfp_t gfp_flags)
-{
- kmemtrace_alloc(KMEMTRACE_TYPE_CACHE, call_site, ptr,
- bytes_req, bytes_alloc, gfp_flags, -1);
-}
-
-static void kmemtrace_kmalloc_node(void *ignore,
- unsigned long call_site,
- const void *ptr,
- size_t bytes_req,
- size_t bytes_alloc,
- gfp_t gfp_flags,
- int node)
-{
- kmemtrace_alloc(KMEMTRACE_TYPE_KMALLOC, call_site, ptr,
- bytes_req, bytes_alloc, gfp_flags, node);
-}
-
-static void kmemtrace_kmem_cache_alloc_node(void *ignore,
- unsigned long call_site,
- const void *ptr,
- size_t bytes_req,
- size_t bytes_alloc,
- gfp_t gfp_flags,
- int node)
-{
- kmemtrace_alloc(KMEMTRACE_TYPE_CACHE, call_site, ptr,
- bytes_req, bytes_alloc, gfp_flags, node);
-}
-
-static void
-kmemtrace_kfree(void *ignore, unsigned long call_site, const void *ptr)
-{
- kmemtrace_free(KMEMTRACE_TYPE_KMALLOC, call_site, ptr);
-}
-
-static void kmemtrace_kmem_cache_free(void *ignore,
- unsigned long call_site, const void *ptr)
-{
- kmemtrace_free(KMEMTRACE_TYPE_CACHE, call_site, ptr);
-}
-
-static int kmemtrace_start_probes(void)
-{
- int err;
-
- err = register_trace_kmalloc(kmemtrace_kmalloc, NULL);
- if (err)
- return err;
- err = register_trace_kmem_cache_alloc(kmemtrace_kmem_cache_alloc, NULL);
- if (err)
- return err;
- err = register_trace_kmalloc_node(kmemtrace_kmalloc_node, NULL);
- if (err)
- return err;
- err = register_trace_kmem_cache_alloc_node(kmemtrace_kmem_cache_alloc_node, NULL);
- if (err)
- return err;
- err = register_trace_kfree(kmemtrace_kfree, NULL);
- if (err)
- return err;
- err = register_trace_kmem_cache_free(kmemtrace_kmem_cache_free, NULL);
-
- return err;
-}
-
-static void kmemtrace_stop_probes(void)
-{
- unregister_trace_kmalloc(kmemtrace_kmalloc, NULL);
- unregister_trace_kmem_cache_alloc(kmemtrace_kmem_cache_alloc, NULL);
- unregister_trace_kmalloc_node(kmemtrace_kmalloc_node, NULL);
- unregister_trace_kmem_cache_alloc_node(kmemtrace_kmem_cache_alloc_node, NULL);
- unregister_trace_kfree(kmemtrace_kfree, NULL);
- unregister_trace_kmem_cache_free(kmemtrace_kmem_cache_free, NULL);
-}
-
-static int kmem_trace_init(struct trace_array *tr)
-{
- kmemtrace_array = tr;
-
- tracing_reset_online_cpus(tr);
-
- kmemtrace_start_probes();
-
- return 0;
-}
-
-static void kmem_trace_reset(struct trace_array *tr)
-{
- kmemtrace_stop_probes();
-}
-
-static void kmemtrace_headers(struct seq_file *s)
-{
- /* Don't need headers for the original kmemtrace output */
- if (!(kmem_tracer_flags.val & TRACE_KMEM_OPT_MINIMAL))
- return;
-
- seq_printf(s, "#\n");
- seq_printf(s, "# ALLOC TYPE REQ GIVEN FLAGS "
- " POINTER NODE CALLER\n");
- seq_printf(s, "# FREE | | | | "
- " | | | |\n");
- seq_printf(s, "# |\n\n");
-}
-
-/*
- * The following functions give the original output from kmemtrace,
- * plus the origin CPU, since reordering occurs in-kernel now.
- */
-
-#define KMEMTRACE_USER_ALLOC 0
-#define KMEMTRACE_USER_FREE 1
-
-struct kmemtrace_user_event {
- u8 event_id;
- u8 type_id;
- u16 event_size;
- u32 cpu;
- u64 timestamp;
- unsigned long call_site;
- unsigned long ptr;
-};
-
-struct kmemtrace_user_event_alloc {
- size_t bytes_req;
- size_t bytes_alloc;
- unsigned gfp_flags;
- int node;
-};
-
-static enum print_line_t
-kmemtrace_print_alloc(struct trace_iterator *iter, int flags,
- struct trace_event *event)
-{
- struct trace_seq *s = &iter->seq;
- struct kmemtrace_alloc_entry *entry;
- int ret;
-
- trace_assign_type(entry, iter->ent);
-
- ret = trace_seq_printf(s, "type_id %d call_site %pF ptr %lu "
- "bytes_req %lu bytes_alloc %lu gfp_flags %lu node %d\n",
- entry->type_id, (void *)entry->call_site, (unsigned long)entry->ptr,
- (unsigned long)entry->bytes_req, (unsigned long)entry->bytes_alloc,
- (unsigned long)entry->gfp_flags, entry->node);
-
- if (!ret)
- return TRACE_TYPE_PARTIAL_LINE;
- return TRACE_TYPE_HANDLED;
-}
-
-static enum print_line_t
-kmemtrace_print_free(struct trace_iterator *iter, int flags,
- struct trace_event *event)
-{
- struct trace_seq *s = &iter->seq;
- struct kmemtrace_free_entry *entry;
- int ret;
-
- trace_assign_type(entry, iter->ent);
-
- ret = trace_seq_printf(s, "type_id %d call_site %pF ptr %lu\n",
- entry->type_id, (void *)entry->call_site,
- (unsigned long)entry->ptr);
-
- if (!ret)
- return TRACE_TYPE_PARTIAL_LINE;
- return TRACE_TYPE_HANDLED;
-}
-
-static enum print_line_t
-kmemtrace_print_alloc_user(struct trace_iterator *iter, int flags,
- struct trace_event *event)
-{
- struct trace_seq *s = &iter->seq;
- struct kmemtrace_alloc_entry *entry;
- struct kmemtrace_user_event *ev;
- struct kmemtrace_user_event_alloc *ev_alloc;
-
- trace_assign_type(entry, iter->ent);
-
- ev = trace_seq_reserve(s, sizeof(*ev));
- if (!ev)
- return TRACE_TYPE_PARTIAL_LINE;
-
- ev->event_id = KMEMTRACE_USER_ALLOC;
- ev->type_id = entry->type_id;
- ev->event_size = sizeof(*ev) + sizeof(*ev_alloc);
- ev->cpu = iter->cpu;
- ev->timestamp = iter->ts;
- ev->call_site = entry->call_site;
- ev->ptr = (unsigned long)entry->ptr;
-
- ev_alloc = trace_seq_reserve(s, sizeof(*ev_alloc));
- if (!ev_alloc)
- return TRACE_TYPE_PARTIAL_LINE;
-
- ev_alloc->bytes_req = entry->bytes_req;
- ev_alloc->bytes_alloc = entry->bytes_alloc;
- ev_alloc->gfp_flags = entry->gfp_flags;
- ev_alloc->node = entry->node;
-
- return TRACE_TYPE_HANDLED;
-}
-
-static enum print_line_t
-kmemtrace_print_free_user(struct trace_iterator *iter, int flags,
- struct trace_event *event)
-{
- struct trace_seq *s = &iter->seq;
- struct kmemtrace_free_entry *entry;
- struct kmemtrace_user_event *ev;
-
- trace_assign_type(entry, iter->ent);
-
- ev = trace_seq_reserve(s, sizeof(*ev));
- if (!ev)
- return TRACE_TYPE_PARTIAL_LINE;
-
- ev->event_id = KMEMTRACE_USER_FREE;
- ev->type_id = entry->type_id;
- ev->event_size = sizeof(*ev);
- ev->cpu = iter->cpu;
- ev->timestamp = iter->ts;
- ev->call_site = entry->call_site;
- ev->ptr = (unsigned long)entry->ptr;
-
- return TRACE_TYPE_HANDLED;
-}
-
-/* The two other following provide a more minimalistic output */
-static enum print_line_t
-kmemtrace_print_alloc_compress(struct trace_iterator *iter)
-{
- struct kmemtrace_alloc_entry *entry;
- struct trace_seq *s = &iter->seq;
- int ret;
-
- trace_assign_type(entry, iter->ent);
-
- /* Alloc entry */
- ret = trace_seq_printf(s, " + ");
- if (!ret)
- return TRACE_TYPE_PARTIAL_LINE;
-
- /* Type */
- switch (entry->type_id) {
- case KMEMTRACE_TYPE_KMALLOC:
- ret = trace_seq_printf(s, "K ");
- break;
- case KMEMTRACE_TYPE_CACHE:
- ret = trace_seq_printf(s, "C ");
- break;
- case KMEMTRACE_TYPE_PAGES:
- ret = trace_seq_printf(s, "P ");
- break;
- default:
- ret = trace_seq_printf(s, "? ");
- }
-
- if (!ret)
- return TRACE_TYPE_PARTIAL_LINE;
-
- /* Requested */
- ret = trace_seq_printf(s, "%4zu ", entry->bytes_req);
- if (!ret)
- return TRACE_TYPE_PARTIAL_LINE;
-
- /* Allocated */
- ret = trace_seq_printf(s, "%4zu ", entry->bytes_alloc);
- if (!ret)
- return TRACE_TYPE_PARTIAL_LINE;
-
- /* Flags
- * TODO: would be better to see the name of the GFP flag names
- */
- ret = trace_seq_printf(s, "%08x ", entry->gfp_flags);
- if (!ret)
- return TRACE_TYPE_PARTIAL_LINE;
-
- /* Pointer to allocated */
- ret = trace_seq_printf(s, "0x%tx ", (ptrdiff_t)entry->ptr);
- if (!ret)
- return TRACE_TYPE_PARTIAL_LINE;
-
- /* Node and call site*/
- ret = trace_seq_printf(s, "%4d %pf\n", entry->node,
- (void *)entry->call_site);
- if (!ret)
- return TRACE_TYPE_PARTIAL_LINE;
-
- return TRACE_TYPE_HANDLED;
-}
-
-static enum print_line_t
-kmemtrace_print_free_compress(struct trace_iterator *iter)
-{
- struct kmemtrace_free_entry *entry;
- struct trace_seq *s = &iter->seq;
- int ret;
-
- trace_assign_type(entry, iter->ent);
-
- /* Free entry */
- ret = trace_seq_printf(s, " - ");
- if (!ret)
- return TRACE_TYPE_PARTIAL_LINE;
-
- /* Type */
- switch (entry->type_id) {
- case KMEMTRACE_TYPE_KMALLOC:
- ret = trace_seq_printf(s, "K ");
- break;
- case KMEMTRACE_TYPE_CACHE:
- ret = trace_seq_printf(s, "C ");
- break;
- case KMEMTRACE_TYPE_PAGES:
- ret = trace_seq_printf(s, "P ");
- break;
- default:
- ret = trace_seq_printf(s, "? ");
- }
-
- if (!ret)
- return TRACE_TYPE_PARTIAL_LINE;
-
- /* Skip requested/allocated/flags */
- ret = trace_seq_printf(s, " ");
- if (!ret)
- return TRACE_TYPE_PARTIAL_LINE;
-
- /* Pointer to allocated */
- ret = trace_seq_printf(s, "0x%tx ", (ptrdiff_t)entry->ptr);
- if (!ret)
- return TRACE_TYPE_PARTIAL_LINE;
-
- /* Skip node and print call site*/
- ret = trace_seq_printf(s, " %pf\n", (void *)entry->call_site);
- if (!ret)
- return TRACE_TYPE_PARTIAL_LINE;
-
- return TRACE_TYPE_HANDLED;
-}
-
-static enum print_line_t kmemtrace_print_line(struct trace_iterator *iter)
-{
- struct trace_entry *entry = iter->ent;
-
- if (!(kmem_tracer_flags.val & TRACE_KMEM_OPT_MINIMAL))
- return TRACE_TYPE_UNHANDLED;
-
- switch (entry->type) {
- case TRACE_KMEM_ALLOC:
- return kmemtrace_print_alloc_compress(iter);
- case TRACE_KMEM_FREE:
- return kmemtrace_print_free_compress(iter);
- default:
- return TRACE_TYPE_UNHANDLED;
- }
-}
-
-static struct trace_event_functions kmem_trace_alloc_funcs = {
- .trace = kmemtrace_print_alloc,
- .binary = kmemtrace_print_alloc_user,
-};
-
-static struct trace_event kmem_trace_alloc = {
- .type = TRACE_KMEM_ALLOC,
- .funcs = &kmem_trace_alloc_funcs,
-};
-
-static struct trace_event_functions kmem_trace_free_funcs = {
- .trace = kmemtrace_print_free,
- .binary = kmemtrace_print_free_user,
-};
-
-static struct trace_event kmem_trace_free = {
- .type = TRACE_KMEM_FREE,
- .funcs = &kmem_trace_free_funcs,
-};
-
-static struct tracer kmem_tracer __read_mostly = {
- .name = "kmemtrace",
- .init = kmem_trace_init,
- .reset = kmem_trace_reset,
- .print_line = kmemtrace_print_line,
- .print_header = kmemtrace_headers,
- .flags = &kmem_tracer_flags
-};
-
-void kmemtrace_init(void)
-{
- /* earliest opportunity to start kmem tracing */
-}
-
-static int __init init_kmem_tracer(void)
-{
- if (!register_ftrace_event(&kmem_trace_alloc)) {
- pr_warning("Warning: could not register kmem events\n");
- return 1;
- }
-
- if (!register_ftrace_event(&kmem_trace_free)) {
- pr_warning("Warning: could not register kmem events\n");
- return 1;
- }
-
- if (register_tracer(&kmem_tracer) != 0) {
- pr_warning("Warning: could not register the kmem tracer\n");
- return 1;
- }
-
- return 0;
-}
-device_initcall(init_kmem_tracer);
diff --git a/kernel/trace/ring_buffer.c b/kernel/trace/ring_buffer.c
index 1da7b6e..3632ce8 100644
--- a/kernel/trace/ring_buffer.c
+++ b/kernel/trace/ring_buffer.c
@@ -443,6 +443,7 @@
*/
struct ring_buffer_per_cpu {
int cpu;
+ atomic_t record_disabled;
struct ring_buffer *buffer;
spinlock_t reader_lock; /* serialize readers */
arch_spinlock_t lock;
@@ -462,7 +463,6 @@
unsigned long read;
u64 write_stamp;
u64 read_stamp;
- atomic_t record_disabled;
};
struct ring_buffer {
@@ -2242,8 +2242,6 @@
#endif
-static DEFINE_PER_CPU(int, rb_need_resched);
-
/**
* ring_buffer_lock_reserve - reserve a part of the buffer
* @buffer: the ring buffer to reserve from
@@ -2264,13 +2262,13 @@
{
struct ring_buffer_per_cpu *cpu_buffer;
struct ring_buffer_event *event;
- int cpu, resched;
+ int cpu;
if (ring_buffer_flags != RB_BUFFERS_ON)
return NULL;
/* If we are tracing schedule, we don't want to recurse */
- resched = ftrace_preempt_disable();
+ preempt_disable_notrace();
if (atomic_read(&buffer->record_disabled))
goto out_nocheck;
@@ -2295,21 +2293,13 @@
if (!event)
goto out;
- /*
- * Need to store resched state on this cpu.
- * Only the first needs to.
- */
-
- if (preempt_count() == 1)
- per_cpu(rb_need_resched, cpu) = resched;
-
return event;
out:
trace_recursive_unlock();
out_nocheck:
- ftrace_preempt_enable(resched);
+ preempt_enable_notrace();
return NULL;
}
EXPORT_SYMBOL_GPL(ring_buffer_lock_reserve);
@@ -2355,13 +2345,7 @@
trace_recursive_unlock();
- /*
- * Only the last preempt count needs to restore preemption.
- */
- if (preempt_count() == 1)
- ftrace_preempt_enable(per_cpu(rb_need_resched, cpu));
- else
- preempt_enable_no_resched_notrace();
+ preempt_enable_notrace();
return 0;
}
@@ -2469,13 +2453,7 @@
trace_recursive_unlock();
- /*
- * Only the last preempt count needs to restore preemption.
- */
- if (preempt_count() == 1)
- ftrace_preempt_enable(per_cpu(rb_need_resched, cpu));
- else
- preempt_enable_no_resched_notrace();
+ preempt_enable_notrace();
}
EXPORT_SYMBOL_GPL(ring_buffer_discard_commit);
@@ -2501,12 +2479,12 @@
struct ring_buffer_event *event;
void *body;
int ret = -EBUSY;
- int cpu, resched;
+ int cpu;
if (ring_buffer_flags != RB_BUFFERS_ON)
return -EBUSY;
- resched = ftrace_preempt_disable();
+ preempt_disable_notrace();
if (atomic_read(&buffer->record_disabled))
goto out;
@@ -2536,7 +2514,7 @@
ret = 0;
out:
- ftrace_preempt_enable(resched);
+ preempt_enable_notrace();
return ret;
}
diff --git a/kernel/trace/trace.c b/kernel/trace/trace.c
index d6736b9..ed1032d 100644
--- a/kernel/trace/trace.c
+++ b/kernel/trace/trace.c
@@ -341,7 +341,7 @@
/* trace_flags holds trace_options default values */
unsigned long trace_flags = TRACE_ITER_PRINT_PARENT | TRACE_ITER_PRINTK |
TRACE_ITER_ANNOTATE | TRACE_ITER_CONTEXT_INFO | TRACE_ITER_SLEEP_TIME |
- TRACE_ITER_GRAPH_TIME;
+ TRACE_ITER_GRAPH_TIME | TRACE_ITER_RECORD_CMD;
static int trace_stop_count;
static DEFINE_SPINLOCK(tracing_start_lock);
@@ -425,6 +425,7 @@
"latency-format",
"sleep-time",
"graph-time",
+ "record-cmd",
NULL
};
@@ -656,6 +657,10 @@
return;
WARN_ON_ONCE(!irqs_disabled());
+ if (!current_trace->use_max_tr) {
+ WARN_ON_ONCE(1);
+ return;
+ }
arch_spin_lock(&ftrace_max_lock);
tr->buffer = max_tr.buffer;
@@ -682,6 +687,11 @@
return;
WARN_ON_ONCE(!irqs_disabled());
+ if (!current_trace->use_max_tr) {
+ WARN_ON_ONCE(1);
+ return;
+ }
+
arch_spin_lock(&ftrace_max_lock);
ftrace_disable_cpu();
@@ -726,7 +736,7 @@
return -1;
}
- if (strlen(type->name) > MAX_TRACER_SIZE) {
+ if (strlen(type->name) >= MAX_TRACER_SIZE) {
pr_info("Tracer has a name longer than %d\n", MAX_TRACER_SIZE);
return -1;
}
@@ -1328,61 +1338,6 @@
#endif /* CONFIG_STACKTRACE */
-static void
-ftrace_trace_special(void *__tr,
- unsigned long arg1, unsigned long arg2, unsigned long arg3,
- int pc)
-{
- struct ftrace_event_call *call = &event_special;
- struct ring_buffer_event *event;
- struct trace_array *tr = __tr;
- struct ring_buffer *buffer = tr->buffer;
- struct special_entry *entry;
-
- event = trace_buffer_lock_reserve(buffer, TRACE_SPECIAL,
- sizeof(*entry), 0, pc);
- if (!event)
- return;
- entry = ring_buffer_event_data(event);
- entry->arg1 = arg1;
- entry->arg2 = arg2;
- entry->arg3 = arg3;
-
- if (!filter_check_discard(call, entry, buffer, event))
- trace_buffer_unlock_commit(buffer, event, 0, pc);
-}
-
-void
-__trace_special(void *__tr, void *__data,
- unsigned long arg1, unsigned long arg2, unsigned long arg3)
-{
- ftrace_trace_special(__tr, arg1, arg2, arg3, preempt_count());
-}
-
-void
-ftrace_special(unsigned long arg1, unsigned long arg2, unsigned long arg3)
-{
- struct trace_array *tr = &global_trace;
- struct trace_array_cpu *data;
- unsigned long flags;
- int cpu;
- int pc;
-
- if (tracing_disabled)
- return;
-
- pc = preempt_count();
- local_irq_save(flags);
- cpu = raw_smp_processor_id();
- data = tr->data[cpu];
-
- if (likely(atomic_inc_return(&data->disabled) == 1))
- ftrace_trace_special(tr, arg1, arg2, arg3, pc);
-
- atomic_dec(&data->disabled);
- local_irq_restore(flags);
-}
-
/**
* trace_vbprintk - write binary msg to tracing buffer
*
@@ -1401,7 +1356,6 @@
struct bprint_entry *entry;
unsigned long flags;
int disable;
- int resched;
int cpu, len = 0, size, pc;
if (unlikely(tracing_selftest_running || tracing_disabled))
@@ -1411,7 +1365,7 @@
pause_graph_tracing();
pc = preempt_count();
- resched = ftrace_preempt_disable();
+ preempt_disable_notrace();
cpu = raw_smp_processor_id();
data = tr->data[cpu];
@@ -1449,7 +1403,7 @@
out:
atomic_dec_return(&data->disabled);
- ftrace_preempt_enable(resched);
+ preempt_enable_notrace();
unpause_graph_tracing();
return len;
@@ -2386,6 +2340,7 @@
.open = show_traces_open,
.read = seq_read,
.release = seq_release,
+ .llseek = seq_lseek,
};
/*
@@ -2479,6 +2434,7 @@
.open = tracing_open_generic,
.read = tracing_cpumask_read,
.write = tracing_cpumask_write,
+ .llseek = generic_file_llseek,
};
static int tracing_trace_options_show(struct seq_file *m, void *v)
@@ -2554,6 +2510,9 @@
trace_flags |= mask;
else
trace_flags &= ~mask;
+
+ if (mask == TRACE_ITER_RECORD_CMD)
+ trace_event_enable_cmd_record(enabled);
}
static ssize_t
@@ -2645,6 +2604,7 @@
static const struct file_operations tracing_readme_fops = {
.open = tracing_open_generic,
.read = tracing_readme_read,
+ .llseek = generic_file_llseek,
};
static ssize_t
@@ -2695,6 +2655,7 @@
static const struct file_operations tracing_saved_cmdlines_fops = {
.open = tracing_open_generic,
.read = tracing_saved_cmdlines_read,
+ .llseek = generic_file_llseek,
};
static ssize_t
@@ -2790,6 +2751,9 @@
if (ret < 0)
return ret;
+ if (!current_trace->use_max_tr)
+ goto out;
+
ret = ring_buffer_resize(max_tr.buffer, size);
if (ret < 0) {
int r;
@@ -2817,11 +2781,14 @@
return ret;
}
+ max_tr.entries = size;
+ out:
global_trace.entries = size;
return ret;
}
+
/**
* tracing_update_buffers - used by tracing facility to expand ring buffers
*
@@ -2882,12 +2849,26 @@
trace_branch_disable();
if (current_trace && current_trace->reset)
current_trace->reset(tr);
-
+ if (current_trace && current_trace->use_max_tr) {
+ /*
+ * We don't free the ring buffer. instead, resize it because
+ * The max_tr ring buffer has some state (e.g. ring->clock) and
+ * we want preserve it.
+ */
+ ring_buffer_resize(max_tr.buffer, 1);
+ max_tr.entries = 1;
+ }
destroy_trace_option_files(topts);
current_trace = t;
topts = create_trace_option_files(current_trace);
+ if (current_trace->use_max_tr) {
+ ret = ring_buffer_resize(max_tr.buffer, global_trace.entries);
+ if (ret < 0)
+ goto out;
+ max_tr.entries = global_trace.entries;
+ }
if (t->init) {
ret = tracer_init(t, tr);
@@ -3024,6 +3005,7 @@
if (iter->trace->pipe_open)
iter->trace->pipe_open(iter);
+ nonseekable_open(inode, filp);
out:
mutex_unlock(&trace_types_lock);
return ret;
@@ -3469,7 +3451,6 @@
}
tracing_start();
- max_tr.entries = global_trace.entries;
mutex_unlock(&trace_types_lock);
return cnt;
@@ -3582,18 +3563,21 @@
.open = tracing_open_generic,
.read = tracing_max_lat_read,
.write = tracing_max_lat_write,
+ .llseek = generic_file_llseek,
};
static const struct file_operations tracing_ctrl_fops = {
.open = tracing_open_generic,
.read = tracing_ctrl_read,
.write = tracing_ctrl_write,
+ .llseek = generic_file_llseek,
};
static const struct file_operations set_tracer_fops = {
.open = tracing_open_generic,
.read = tracing_set_trace_read,
.write = tracing_set_trace_write,
+ .llseek = generic_file_llseek,
};
static const struct file_operations tracing_pipe_fops = {
@@ -3602,17 +3586,20 @@
.read = tracing_read_pipe,
.splice_read = tracing_splice_read_pipe,
.release = tracing_release_pipe,
+ .llseek = no_llseek,
};
static const struct file_operations tracing_entries_fops = {
.open = tracing_open_generic,
.read = tracing_entries_read,
.write = tracing_entries_write,
+ .llseek = generic_file_llseek,
};
static const struct file_operations tracing_mark_fops = {
.open = tracing_open_generic,
.write = tracing_mark_write,
+ .llseek = generic_file_llseek,
};
static const struct file_operations trace_clock_fops = {
@@ -3918,6 +3905,7 @@
static const struct file_operations tracing_stats_fops = {
.open = tracing_open_generic,
.read = tracing_stats_read,
+ .llseek = generic_file_llseek,
};
#ifdef CONFIG_DYNAMIC_FTRACE
@@ -3954,6 +3942,7 @@
static const struct file_operations tracing_dyn_info_fops = {
.open = tracing_open_generic,
.read = tracing_read_dyn_info,
+ .llseek = generic_file_llseek,
};
#endif
@@ -4107,6 +4096,7 @@
.open = tracing_open_generic,
.read = trace_options_read,
.write = trace_options_write,
+ .llseek = generic_file_llseek,
};
static ssize_t
@@ -4158,6 +4148,7 @@
.open = tracing_open_generic,
.read = trace_options_core_read,
.write = trace_options_core_write,
+ .llseek = generic_file_llseek,
};
struct dentry *trace_create_file(const char *name,
@@ -4347,9 +4338,6 @@
trace_create_file("dyn_ftrace_total_info", 0444, d_tracer,
&ftrace_update_tot_cnt, &tracing_dyn_info_fops);
#endif
-#ifdef CONFIG_SYSPROF_TRACER
- init_tracer_sysprof_debugfs(d_tracer);
-#endif
create_trace_options_dir();
@@ -4576,16 +4564,14 @@
#ifdef CONFIG_TRACER_MAX_TRACE
- max_tr.buffer = ring_buffer_alloc(ring_buf_size,
- TRACE_BUFFER_FLAGS);
+ max_tr.buffer = ring_buffer_alloc(1, TRACE_BUFFER_FLAGS);
if (!max_tr.buffer) {
printk(KERN_ERR "tracer: failed to allocate max ring buffer!\n");
WARN_ON(1);
ring_buffer_free(global_trace.buffer);
goto out_free_cpumask;
}
- max_tr.entries = ring_buffer_size(max_tr.buffer);
- WARN_ON(max_tr.entries != global_trace.entries);
+ max_tr.entries = 1;
#endif
/* Allocate the first page for all buffers */
@@ -4598,9 +4584,6 @@
register_tracer(&nop_trace);
current_trace = &nop_trace;
-#ifdef CONFIG_BOOT_TRACER
- register_tracer(&boot_tracer);
-#endif
/* All seems OK, enable tracing */
tracing_disabled = 0;
diff --git a/kernel/trace/trace.h b/kernel/trace/trace.h
index 0605fc0..d39b3c5 100644
--- a/kernel/trace/trace.h
+++ b/kernel/trace/trace.h
@@ -9,10 +9,7 @@
#include <linux/mmiotrace.h>
#include <linux/tracepoint.h>
#include <linux/ftrace.h>
-#include <trace/boot.h>
-#include <linux/kmemtrace.h>
#include <linux/hw_breakpoint.h>
-
#include <linux/trace_seq.h>
#include <linux/ftrace_event.h>
@@ -25,30 +22,17 @@
TRACE_STACK,
TRACE_PRINT,
TRACE_BPRINT,
- TRACE_SPECIAL,
TRACE_MMIO_RW,
TRACE_MMIO_MAP,
TRACE_BRANCH,
- TRACE_BOOT_CALL,
- TRACE_BOOT_RET,
TRACE_GRAPH_RET,
TRACE_GRAPH_ENT,
TRACE_USER_STACK,
- TRACE_KMEM_ALLOC,
- TRACE_KMEM_FREE,
TRACE_BLK,
- TRACE_KSYM,
__TRACE_LAST_TYPE,
};
-enum kmemtrace_type_id {
- KMEMTRACE_TYPE_KMALLOC = 0, /* kmalloc() or kfree(). */
- KMEMTRACE_TYPE_CACHE, /* kmem_cache_*(). */
- KMEMTRACE_TYPE_PAGES, /* __get_free_pages() and friends. */
-};
-
-extern struct tracer boot_tracer;
#undef __field
#define __field(type, item) type item;
@@ -204,23 +188,15 @@
IF_ASSIGN(var, ent, struct userstack_entry, TRACE_USER_STACK);\
IF_ASSIGN(var, ent, struct print_entry, TRACE_PRINT); \
IF_ASSIGN(var, ent, struct bprint_entry, TRACE_BPRINT); \
- IF_ASSIGN(var, ent, struct special_entry, 0); \
IF_ASSIGN(var, ent, struct trace_mmiotrace_rw, \
TRACE_MMIO_RW); \
IF_ASSIGN(var, ent, struct trace_mmiotrace_map, \
TRACE_MMIO_MAP); \
- IF_ASSIGN(var, ent, struct trace_boot_call, TRACE_BOOT_CALL);\
- IF_ASSIGN(var, ent, struct trace_boot_ret, TRACE_BOOT_RET);\
IF_ASSIGN(var, ent, struct trace_branch, TRACE_BRANCH); \
IF_ASSIGN(var, ent, struct ftrace_graph_ent_entry, \
TRACE_GRAPH_ENT); \
IF_ASSIGN(var, ent, struct ftrace_graph_ret_entry, \
TRACE_GRAPH_RET); \
- IF_ASSIGN(var, ent, struct kmemtrace_alloc_entry, \
- TRACE_KMEM_ALLOC); \
- IF_ASSIGN(var, ent, struct kmemtrace_free_entry, \
- TRACE_KMEM_FREE); \
- IF_ASSIGN(var, ent, struct ksym_trace_entry, TRACE_KSYM);\
__ftrace_bad_type(); \
} while (0)
@@ -298,6 +274,7 @@
struct tracer *next;
int print_max;
struct tracer_flags *flags;
+ int use_max_tr;
};
@@ -318,7 +295,6 @@
const struct file_operations *fops);
struct dentry *tracing_init_dentry(void);
-void init_tracer_sysprof_debugfs(struct dentry *d_tracer);
struct ring_buffer_event;
@@ -363,11 +339,6 @@
struct task_struct *wakee,
struct task_struct *cur,
unsigned long flags, int pc);
-void trace_special(struct trace_array *tr,
- struct trace_array_cpu *data,
- unsigned long arg1,
- unsigned long arg2,
- unsigned long arg3, int pc);
void trace_function(struct trace_array *tr,
unsigned long ip,
unsigned long parent_ip,
@@ -398,8 +369,6 @@
#define for_each_tracing_cpu(cpu) \
for_each_cpu(cpu, tracing_buffer_mask)
-extern int process_new_ksym_entry(char *ksymname, int op, unsigned long addr);
-
extern unsigned long nsecs_to_usecs(unsigned long nsecs);
extern unsigned long tracing_thresh;
@@ -469,12 +438,8 @@
struct trace_array *tr);
extern int trace_selftest_startup_sched_switch(struct tracer *trace,
struct trace_array *tr);
-extern int trace_selftest_startup_sysprof(struct tracer *trace,
- struct trace_array *tr);
extern int trace_selftest_startup_branch(struct tracer *trace,
struct trace_array *tr);
-extern int trace_selftest_startup_ksym(struct tracer *trace,
- struct trace_array *tr);
#endif /* CONFIG_FTRACE_STARTUP_TEST */
extern void *head_page(struct trace_array_cpu *data);
@@ -636,6 +601,7 @@
TRACE_ITER_LATENCY_FMT = 0x20000,
TRACE_ITER_SLEEP_TIME = 0x40000,
TRACE_ITER_GRAPH_TIME = 0x80000,
+ TRACE_ITER_RECORD_CMD = 0x100000,
};
/*
@@ -647,54 +613,6 @@
extern struct tracer nop_trace;
-/**
- * ftrace_preempt_disable - disable preemption scheduler safe
- *
- * When tracing can happen inside the scheduler, there exists
- * cases that the tracing might happen before the need_resched
- * flag is checked. If this happens and the tracer calls
- * preempt_enable (after a disable), a schedule might take place
- * causing an infinite recursion.
- *
- * To prevent this, we read the need_resched flag before
- * disabling preemption. When we want to enable preemption we
- * check the flag, if it is set, then we call preempt_enable_no_resched.
- * Otherwise, we call preempt_enable.
- *
- * The rational for doing the above is that if need_resched is set
- * and we have yet to reschedule, we are either in an atomic location
- * (where we do not need to check for scheduling) or we are inside
- * the scheduler and do not want to resched.
- */
-static inline int ftrace_preempt_disable(void)
-{
- int resched;
-
- resched = need_resched();
- preempt_disable_notrace();
-
- return resched;
-}
-
-/**
- * ftrace_preempt_enable - enable preemption scheduler safe
- * @resched: the return value from ftrace_preempt_disable
- *
- * This is a scheduler safe way to enable preemption and not miss
- * any preemption checks. The disabled saved the state of preemption.
- * If resched is set, then we are either inside an atomic or
- * are inside the scheduler (we would have already scheduled
- * otherwise). In this case, we do not want to call normal
- * preempt_enable, but preempt_enable_no_resched instead.
- */
-static inline void ftrace_preempt_enable(int resched)
-{
- if (resched)
- preempt_enable_no_resched_notrace();
- else
- preempt_enable_notrace();
-}
-
#ifdef CONFIG_BRANCH_TRACER
extern int enable_branch_tracing(struct trace_array *tr);
extern void disable_branch_tracing(void);
@@ -785,6 +703,8 @@
int pop_n;
};
+extern struct list_head ftrace_common_fields;
+
extern enum regex_type
filter_parse_regex(char *buff, int len, char **search, int *not);
extern void print_event_filter(struct ftrace_event_call *call,
@@ -814,6 +734,8 @@
return 0;
}
+extern void trace_event_enable_cmd_record(bool enable);
+
extern struct mutex event_mutex;
extern struct list_head ftrace_events;
diff --git a/kernel/trace/trace_boot.c b/kernel/trace/trace_boot.c
deleted file mode 100644
index c21d5f3..0000000
--- a/kernel/trace/trace_boot.c
+++ /dev/null
@@ -1,185 +0,0 @@
-/*
- * ring buffer based initcalls tracer
- *
- * Copyright (C) 2008 Frederic Weisbecker <fweisbec@gmail.com>
- *
- */
-
-#include <linux/init.h>
-#include <linux/debugfs.h>
-#include <linux/ftrace.h>
-#include <linux/kallsyms.h>
-#include <linux/time.h>
-
-#include "trace.h"
-#include "trace_output.h"
-
-static struct trace_array *boot_trace;
-static bool pre_initcalls_finished;
-
-/* Tells the boot tracer that the pre_smp_initcalls are finished.
- * So we are ready .
- * It doesn't enable sched events tracing however.
- * You have to call enable_boot_trace to do so.
- */
-void start_boot_trace(void)
-{
- pre_initcalls_finished = true;
-}
-
-void enable_boot_trace(void)
-{
- if (boot_trace && pre_initcalls_finished)
- tracing_start_sched_switch_record();
-}
-
-void disable_boot_trace(void)
-{
- if (boot_trace && pre_initcalls_finished)
- tracing_stop_sched_switch_record();
-}
-
-static int boot_trace_init(struct trace_array *tr)
-{
- boot_trace = tr;
-
- if (!tr)
- return 0;
-
- tracing_reset_online_cpus(tr);
-
- tracing_sched_switch_assign_trace(tr);
- return 0;
-}
-
-static enum print_line_t
-initcall_call_print_line(struct trace_iterator *iter)
-{
- struct trace_entry *entry = iter->ent;
- struct trace_seq *s = &iter->seq;
- struct trace_boot_call *field;
- struct boot_trace_call *call;
- u64 ts;
- unsigned long nsec_rem;
- int ret;
-
- trace_assign_type(field, entry);
- call = &field->boot_call;
- ts = iter->ts;
- nsec_rem = do_div(ts, NSEC_PER_SEC);
-
- ret = trace_seq_printf(s, "[%5ld.%09ld] calling %s @ %i\n",
- (unsigned long)ts, nsec_rem, call->func, call->caller);
-
- if (!ret)
- return TRACE_TYPE_PARTIAL_LINE;
- else
- return TRACE_TYPE_HANDLED;
-}
-
-static enum print_line_t
-initcall_ret_print_line(struct trace_iterator *iter)
-{
- struct trace_entry *entry = iter->ent;
- struct trace_seq *s = &iter->seq;
- struct trace_boot_ret *field;
- struct boot_trace_ret *init_ret;
- u64 ts;
- unsigned long nsec_rem;
- int ret;
-
- trace_assign_type(field, entry);
- init_ret = &field->boot_ret;
- ts = iter->ts;
- nsec_rem = do_div(ts, NSEC_PER_SEC);
-
- ret = trace_seq_printf(s, "[%5ld.%09ld] initcall %s "
- "returned %d after %llu msecs\n",
- (unsigned long) ts,
- nsec_rem,
- init_ret->func, init_ret->result, init_ret->duration);
-
- if (!ret)
- return TRACE_TYPE_PARTIAL_LINE;
- else
- return TRACE_TYPE_HANDLED;
-}
-
-static enum print_line_t initcall_print_line(struct trace_iterator *iter)
-{
- struct trace_entry *entry = iter->ent;
-
- switch (entry->type) {
- case TRACE_BOOT_CALL:
- return initcall_call_print_line(iter);
- case TRACE_BOOT_RET:
- return initcall_ret_print_line(iter);
- default:
- return TRACE_TYPE_UNHANDLED;
- }
-}
-
-struct tracer boot_tracer __read_mostly =
-{
- .name = "initcall",
- .init = boot_trace_init,
- .reset = tracing_reset_online_cpus,
- .print_line = initcall_print_line,
-};
-
-void trace_boot_call(struct boot_trace_call *bt, initcall_t fn)
-{
- struct ftrace_event_call *call = &event_boot_call;
- struct ring_buffer_event *event;
- struct ring_buffer *buffer;
- struct trace_boot_call *entry;
- struct trace_array *tr = boot_trace;
-
- if (!tr || !pre_initcalls_finished)
- return;
-
- /* Get its name now since this function could
- * disappear because it is in the .init section.
- */
- sprint_symbol(bt->func, (unsigned long)fn);
- preempt_disable();
-
- buffer = tr->buffer;
- event = trace_buffer_lock_reserve(buffer, TRACE_BOOT_CALL,
- sizeof(*entry), 0, 0);
- if (!event)
- goto out;
- entry = ring_buffer_event_data(event);
- entry->boot_call = *bt;
- if (!filter_check_discard(call, entry, buffer, event))
- trace_buffer_unlock_commit(buffer, event, 0, 0);
- out:
- preempt_enable();
-}
-
-void trace_boot_ret(struct boot_trace_ret *bt, initcall_t fn)
-{
- struct ftrace_event_call *call = &event_boot_ret;
- struct ring_buffer_event *event;
- struct ring_buffer *buffer;
- struct trace_boot_ret *entry;
- struct trace_array *tr = boot_trace;
-
- if (!tr || !pre_initcalls_finished)
- return;
-
- sprint_symbol(bt->func, (unsigned long)fn);
- preempt_disable();
-
- buffer = tr->buffer;
- event = trace_buffer_lock_reserve(buffer, TRACE_BOOT_RET,
- sizeof(*entry), 0, 0);
- if (!event)
- goto out;
- entry = ring_buffer_event_data(event);
- entry->boot_ret = *bt;
- if (!filter_check_discard(call, entry, buffer, event))
- trace_buffer_unlock_commit(buffer, event, 0, 0);
- out:
- preempt_enable();
-}
diff --git a/kernel/trace/trace_clock.c b/kernel/trace/trace_clock.c
index 9d589d8..52fda6c 100644
--- a/kernel/trace/trace_clock.c
+++ b/kernel/trace/trace_clock.c
@@ -32,16 +32,15 @@
u64 notrace trace_clock_local(void)
{
u64 clock;
- int resched;
/*
* sched_clock() is an architecture implemented, fast, scalable,
* lockless clock. It is not guaranteed to be coherent across
* CPUs, nor across CPU idle events.
*/
- resched = ftrace_preempt_disable();
+ preempt_disable_notrace();
clock = sched_clock();
- ftrace_preempt_enable(resched);
+ preempt_enable_notrace();
return clock;
}
diff --git a/kernel/trace/trace_entries.h b/kernel/trace/trace_entries.h
index dc008c1..e3dfeca 100644
--- a/kernel/trace/trace_entries.h
+++ b/kernel/trace/trace_entries.h
@@ -151,23 +151,6 @@
);
/*
- * Special (free-form) trace entry:
- */
-FTRACE_ENTRY(special, special_entry,
-
- TRACE_SPECIAL,
-
- F_STRUCT(
- __field( unsigned long, arg1 )
- __field( unsigned long, arg2 )
- __field( unsigned long, arg3 )
- ),
-
- F_printk("(%08lx) (%08lx) (%08lx)",
- __entry->arg1, __entry->arg2, __entry->arg3)
-);
-
-/*
* Stack-trace entry:
*/
@@ -271,33 +254,6 @@
__entry->map_id, __entry->opcode)
);
-FTRACE_ENTRY(boot_call, trace_boot_call,
-
- TRACE_BOOT_CALL,
-
- F_STRUCT(
- __field_struct( struct boot_trace_call, boot_call )
- __field_desc( pid_t, boot_call, caller )
- __array_desc( char, boot_call, func, KSYM_SYMBOL_LEN)
- ),
-
- F_printk("%d %s", __entry->caller, __entry->func)
-);
-
-FTRACE_ENTRY(boot_ret, trace_boot_ret,
-
- TRACE_BOOT_RET,
-
- F_STRUCT(
- __field_struct( struct boot_trace_ret, boot_ret )
- __array_desc( char, boot_ret, func, KSYM_SYMBOL_LEN)
- __field_desc( int, boot_ret, result )
- __field_desc( unsigned long, boot_ret, duration )
- ),
-
- F_printk("%s %d %lx",
- __entry->func, __entry->result, __entry->duration)
-);
#define TRACE_FUNC_SIZE 30
#define TRACE_FILE_SIZE 20
@@ -318,53 +274,3 @@
__entry->func, __entry->file, __entry->correct)
);
-FTRACE_ENTRY(kmem_alloc, kmemtrace_alloc_entry,
-
- TRACE_KMEM_ALLOC,
-
- F_STRUCT(
- __field( enum kmemtrace_type_id, type_id )
- __field( unsigned long, call_site )
- __field( const void *, ptr )
- __field( size_t, bytes_req )
- __field( size_t, bytes_alloc )
- __field( gfp_t, gfp_flags )
- __field( int, node )
- ),
-
- F_printk("type:%u call_site:%lx ptr:%p req:%zi alloc:%zi"
- " flags:%x node:%d",
- __entry->type_id, __entry->call_site, __entry->ptr,
- __entry->bytes_req, __entry->bytes_alloc,
- __entry->gfp_flags, __entry->node)
-);
-
-FTRACE_ENTRY(kmem_free, kmemtrace_free_entry,
-
- TRACE_KMEM_FREE,
-
- F_STRUCT(
- __field( enum kmemtrace_type_id, type_id )
- __field( unsigned long, call_site )
- __field( const void *, ptr )
- ),
-
- F_printk("type:%u call_site:%lx ptr:%p",
- __entry->type_id, __entry->call_site, __entry->ptr)
-);
-
-FTRACE_ENTRY(ksym_trace, ksym_trace_entry,
-
- TRACE_KSYM,
-
- F_STRUCT(
- __field( unsigned long, ip )
- __field( unsigned char, type )
- __array( char , cmd, TASK_COMM_LEN )
- __field( unsigned long, addr )
- ),
-
- F_printk("ip: %pF type: %d ksym_name: %pS cmd: %s",
- (void *)__entry->ip, (unsigned int)__entry->type,
- (void *)__entry->addr, __entry->cmd)
-);
diff --git a/kernel/trace/trace_event_perf.c b/kernel/trace/trace_event_perf.c
index 8a2b73f..000e6e8 100644
--- a/kernel/trace/trace_event_perf.c
+++ b/kernel/trace/trace_event_perf.c
@@ -9,8 +9,6 @@
#include <linux/kprobes.h>
#include "trace.h"
-EXPORT_SYMBOL_GPL(perf_arch_fetch_caller_regs);
-
static char *perf_trace_buf[4];
/*
@@ -56,13 +54,7 @@
}
}
- if (tp_event->class->reg)
- ret = tp_event->class->reg(tp_event, TRACE_REG_PERF_REGISTER);
- else
- ret = tracepoint_probe_register(tp_event->name,
- tp_event->class->perf_probe,
- tp_event);
-
+ ret = tp_event->class->reg(tp_event, TRACE_REG_PERF_REGISTER);
if (ret)
goto fail;
@@ -96,9 +88,7 @@
mutex_lock(&event_mutex);
list_for_each_entry(tp_event, &ftrace_events, list) {
if (tp_event->event.type == event_id &&
- tp_event->class &&
- (tp_event->class->perf_probe ||
- tp_event->class->reg) &&
+ tp_event->class && tp_event->class->reg &&
try_module_get(tp_event->mod)) {
ret = perf_trace_event_init(tp_event, p_event);
break;
@@ -138,18 +128,13 @@
if (--tp_event->perf_refcount > 0)
goto out;
- if (tp_event->class->reg)
- tp_event->class->reg(tp_event, TRACE_REG_PERF_UNREGISTER);
- else
- tracepoint_probe_unregister(tp_event->name,
- tp_event->class->perf_probe,
- tp_event);
+ tp_event->class->reg(tp_event, TRACE_REG_PERF_UNREGISTER);
/*
- * Ensure our callback won't be called anymore. See
- * tracepoint_probe_unregister() and __DO_TRACE().
+ * Ensure our callback won't be called anymore. The buffers
+ * will be freed after that.
*/
- synchronize_sched();
+ tracepoint_synchronize_unregister();
free_percpu(tp_event->perf_events);
tp_event->perf_events = NULL;
diff --git a/kernel/trace/trace_events.c b/kernel/trace/trace_events.c
index 53cffc0..09b4fa6 100644
--- a/kernel/trace/trace_events.c
+++ b/kernel/trace/trace_events.c
@@ -28,6 +28,7 @@
DEFINE_MUTEX(event_mutex);
LIST_HEAD(ftrace_events);
+LIST_HEAD(ftrace_common_fields);
struct list_head *
trace_get_fields(struct ftrace_event_call *event_call)
@@ -37,15 +38,11 @@
return event_call->class->get_fields(event_call);
}
-int trace_define_field(struct ftrace_event_call *call, const char *type,
- const char *name, int offset, int size, int is_signed,
- int filter_type)
+static int __trace_define_field(struct list_head *head, const char *type,
+ const char *name, int offset, int size,
+ int is_signed, int filter_type)
{
struct ftrace_event_field *field;
- struct list_head *head;
-
- if (WARN_ON(!call->class))
- return 0;
field = kzalloc(sizeof(*field), GFP_KERNEL);
if (!field)
@@ -68,7 +65,6 @@
field->size = size;
field->is_signed = is_signed;
- head = trace_get_fields(call);
list_add(&field->link, head);
return 0;
@@ -80,17 +76,32 @@
return -ENOMEM;
}
+
+int trace_define_field(struct ftrace_event_call *call, const char *type,
+ const char *name, int offset, int size, int is_signed,
+ int filter_type)
+{
+ struct list_head *head;
+
+ if (WARN_ON(!call->class))
+ return 0;
+
+ head = trace_get_fields(call);
+ return __trace_define_field(head, type, name, offset, size,
+ is_signed, filter_type);
+}
EXPORT_SYMBOL_GPL(trace_define_field);
#define __common_field(type, item) \
- ret = trace_define_field(call, #type, "common_" #item, \
- offsetof(typeof(ent), item), \
- sizeof(ent.item), \
- is_signed_type(type), FILTER_OTHER); \
+ ret = __trace_define_field(&ftrace_common_fields, #type, \
+ "common_" #item, \
+ offsetof(typeof(ent), item), \
+ sizeof(ent.item), \
+ is_signed_type(type), FILTER_OTHER); \
if (ret) \
return ret;
-static int trace_define_common_fields(struct ftrace_event_call *call)
+static int trace_define_common_fields(void)
{
int ret;
struct trace_entry ent;
@@ -130,6 +141,55 @@
}
EXPORT_SYMBOL_GPL(trace_event_raw_init);
+int ftrace_event_reg(struct ftrace_event_call *call, enum trace_reg type)
+{
+ switch (type) {
+ case TRACE_REG_REGISTER:
+ return tracepoint_probe_register(call->name,
+ call->class->probe,
+ call);
+ case TRACE_REG_UNREGISTER:
+ tracepoint_probe_unregister(call->name,
+ call->class->probe,
+ call);
+ return 0;
+
+#ifdef CONFIG_PERF_EVENTS
+ case TRACE_REG_PERF_REGISTER:
+ return tracepoint_probe_register(call->name,
+ call->class->perf_probe,
+ call);
+ case TRACE_REG_PERF_UNREGISTER:
+ tracepoint_probe_unregister(call->name,
+ call->class->perf_probe,
+ call);
+ return 0;
+#endif
+ }
+ return 0;
+}
+EXPORT_SYMBOL_GPL(ftrace_event_reg);
+
+void trace_event_enable_cmd_record(bool enable)
+{
+ struct ftrace_event_call *call;
+
+ mutex_lock(&event_mutex);
+ list_for_each_entry(call, &ftrace_events, list) {
+ if (!(call->flags & TRACE_EVENT_FL_ENABLED))
+ continue;
+
+ if (enable) {
+ tracing_start_cmdline_record();
+ call->flags |= TRACE_EVENT_FL_RECORDED_CMD;
+ } else {
+ tracing_stop_cmdline_record();
+ call->flags &= ~TRACE_EVENT_FL_RECORDED_CMD;
+ }
+ }
+ mutex_unlock(&event_mutex);
+}
+
static int ftrace_event_enable_disable(struct ftrace_event_call *call,
int enable)
{
@@ -139,24 +199,20 @@
case 0:
if (call->flags & TRACE_EVENT_FL_ENABLED) {
call->flags &= ~TRACE_EVENT_FL_ENABLED;
- tracing_stop_cmdline_record();
- if (call->class->reg)
- call->class->reg(call, TRACE_REG_UNREGISTER);
- else
- tracepoint_probe_unregister(call->name,
- call->class->probe,
- call);
+ if (call->flags & TRACE_EVENT_FL_RECORDED_CMD) {
+ tracing_stop_cmdline_record();
+ call->flags &= ~TRACE_EVENT_FL_RECORDED_CMD;
+ }
+ call->class->reg(call, TRACE_REG_UNREGISTER);
}
break;
case 1:
if (!(call->flags & TRACE_EVENT_FL_ENABLED)) {
- tracing_start_cmdline_record();
- if (call->class->reg)
- ret = call->class->reg(call, TRACE_REG_REGISTER);
- else
- ret = tracepoint_probe_register(call->name,
- call->class->probe,
- call);
+ if (trace_flags & TRACE_ITER_RECORD_CMD) {
+ tracing_start_cmdline_record();
+ call->flags |= TRACE_EVENT_FL_RECORDED_CMD;
+ }
+ ret = call->class->reg(call, TRACE_REG_REGISTER);
if (ret) {
tracing_stop_cmdline_record();
pr_info("event trace: Could not enable event "
@@ -194,8 +250,7 @@
mutex_lock(&event_mutex);
list_for_each_entry(call, &ftrace_events, list) {
- if (!call->name || !call->class ||
- (!call->class->probe && !call->class->reg))
+ if (!call->name || !call->class || !call->class->reg)
continue;
if (match &&
@@ -321,7 +376,7 @@
* The ftrace subsystem is for showing formats only.
* They can not be enabled or disabled via the event files.
*/
- if (call->class && (call->class->probe || call->class->reg))
+ if (call->class && call->class->reg)
return call;
}
@@ -474,8 +529,7 @@
mutex_lock(&event_mutex);
list_for_each_entry(call, &ftrace_events, list) {
- if (!call->name || !call->class ||
- (!call->class->probe && !call->class->reg))
+ if (!call->name || !call->class || !call->class->reg)
continue;
if (system && strcmp(call->class->system, system) != 0)
@@ -544,32 +598,10 @@
return ret;
}
-static ssize_t
-event_format_read(struct file *filp, char __user *ubuf, size_t cnt,
- loff_t *ppos)
+static void print_event_fields(struct trace_seq *s, struct list_head *head)
{
- struct ftrace_event_call *call = filp->private_data;
struct ftrace_event_field *field;
- struct list_head *head;
- struct trace_seq *s;
- int common_field_count = 5;
- char *buf;
- int r = 0;
- if (*ppos)
- return 0;
-
- s = kmalloc(sizeof(*s), GFP_KERNEL);
- if (!s)
- return -ENOMEM;
-
- trace_seq_init(s);
-
- trace_seq_printf(s, "name: %s\n", call->name);
- trace_seq_printf(s, "ID: %d\n", call->event.type);
- trace_seq_printf(s, "format:\n");
-
- head = trace_get_fields(call);
list_for_each_entry_reverse(field, head, link) {
/*
* Smartly shows the array type(except dynamic array).
@@ -584,29 +616,54 @@
array_descriptor = NULL;
if (!array_descriptor) {
- r = trace_seq_printf(s, "\tfield:%s %s;\toffset:%u;"
+ trace_seq_printf(s, "\tfield:%s %s;\toffset:%u;"
"\tsize:%u;\tsigned:%d;\n",
field->type, field->name, field->offset,
field->size, !!field->is_signed);
} else {
- r = trace_seq_printf(s, "\tfield:%.*s %s%s;\toffset:%u;"
+ trace_seq_printf(s, "\tfield:%.*s %s%s;\toffset:%u;"
"\tsize:%u;\tsigned:%d;\n",
(int)(array_descriptor - field->type),
field->type, field->name,
array_descriptor, field->offset,
field->size, !!field->is_signed);
}
-
- if (--common_field_count == 0)
- r = trace_seq_printf(s, "\n");
-
- if (!r)
- break;
}
+}
- if (r)
- r = trace_seq_printf(s, "\nprint fmt: %s\n",
- call->print_fmt);
+static ssize_t
+event_format_read(struct file *filp, char __user *ubuf, size_t cnt,
+ loff_t *ppos)
+{
+ struct ftrace_event_call *call = filp->private_data;
+ struct list_head *head;
+ struct trace_seq *s;
+ char *buf;
+ int r;
+
+ if (*ppos)
+ return 0;
+
+ s = kmalloc(sizeof(*s), GFP_KERNEL);
+ if (!s)
+ return -ENOMEM;
+
+ trace_seq_init(s);
+
+ trace_seq_printf(s, "name: %s\n", call->name);
+ trace_seq_printf(s, "ID: %d\n", call->event.type);
+ trace_seq_printf(s, "format:\n");
+
+ /* print common fields */
+ print_event_fields(s, &ftrace_common_fields);
+
+ trace_seq_putc(s, '\n');
+
+ /* print event specific fields */
+ head = trace_get_fields(call);
+ print_event_fields(s, head);
+
+ r = trace_seq_printf(s, "\nprint fmt: %s\n", call->print_fmt);
if (!r) {
/*
@@ -963,35 +1020,31 @@
return -1;
}
- if (call->class->probe || call->class->reg)
+ if (call->class->reg)
trace_create_file("enable", 0644, call->dir, call,
enable);
#ifdef CONFIG_PERF_EVENTS
- if (call->event.type && (call->class->perf_probe || call->class->reg))
+ if (call->event.type && call->class->reg)
trace_create_file("id", 0444, call->dir, call,
id);
#endif
- if (call->class->define_fields) {
- /*
- * Other events may have the same class. Only update
- * the fields if they are not already defined.
- */
- head = trace_get_fields(call);
- if (list_empty(head)) {
- ret = trace_define_common_fields(call);
- if (!ret)
- ret = call->class->define_fields(call);
- if (ret < 0) {
- pr_warning("Could not initialize trace point"
- " events/%s\n", call->name);
- return ret;
- }
+ /*
+ * Other events may have the same class. Only update
+ * the fields if they are not already defined.
+ */
+ head = trace_get_fields(call);
+ if (list_empty(head)) {
+ ret = call->class->define_fields(call);
+ if (ret < 0) {
+ pr_warning("Could not initialize trace point"
+ " events/%s\n", call->name);
+ return ret;
}
- trace_create_file("filter", 0644, call->dir, call,
- filter);
}
+ trace_create_file("filter", 0644, call->dir, call,
+ filter);
trace_create_file("format", 0444, call->dir, call,
format);
@@ -999,11 +1052,17 @@
return 0;
}
-static int __trace_add_event_call(struct ftrace_event_call *call)
+static int
+__trace_add_event_call(struct ftrace_event_call *call, struct module *mod,
+ const struct file_operations *id,
+ const struct file_operations *enable,
+ const struct file_operations *filter,
+ const struct file_operations *format)
{
struct dentry *d_events;
int ret;
+ /* The linker may leave blanks */
if (!call->name)
return -EINVAL;
@@ -1011,8 +1070,8 @@
ret = call->class->raw_init(call);
if (ret < 0) {
if (ret != -ENOSYS)
- pr_warning("Could not initialize trace "
- "events/%s\n", call->name);
+ pr_warning("Could not initialize trace events/%s\n",
+ call->name);
return ret;
}
}
@@ -1021,11 +1080,10 @@
if (!d_events)
return -ENOENT;
- ret = event_create_dir(call, d_events, &ftrace_event_id_fops,
- &ftrace_enable_fops, &ftrace_event_filter_fops,
- &ftrace_event_format_fops);
+ ret = event_create_dir(call, d_events, id, enable, filter, format);
if (!ret)
list_add(&call->list, &ftrace_events);
+ call->mod = mod;
return ret;
}
@@ -1035,7 +1093,10 @@
{
int ret;
mutex_lock(&event_mutex);
- ret = __trace_add_event_call(call);
+ ret = __trace_add_event_call(call, NULL, &ftrace_event_id_fops,
+ &ftrace_enable_fops,
+ &ftrace_event_filter_fops,
+ &ftrace_event_format_fops);
mutex_unlock(&event_mutex);
return ret;
}
@@ -1152,8 +1213,6 @@
{
struct ftrace_module_file_ops *file_ops = NULL;
struct ftrace_event_call *call, *start, *end;
- struct dentry *d_events;
- int ret;
start = mod->trace_events;
end = mod->trace_events + mod->num_trace_events;
@@ -1161,38 +1220,14 @@
if (start == end)
return;
- d_events = event_trace_events_dir();
- if (!d_events)
+ file_ops = trace_create_file_ops(mod);
+ if (!file_ops)
return;
for_each_event(call, start, end) {
- /* The linker may leave blanks */
- if (!call->name)
- continue;
- if (call->class->raw_init) {
- ret = call->class->raw_init(call);
- if (ret < 0) {
- if (ret != -ENOSYS)
- pr_warning("Could not initialize trace "
- "point events/%s\n", call->name);
- continue;
- }
- }
- /*
- * This module has events, create file ops for this module
- * if not already done.
- */
- if (!file_ops) {
- file_ops = trace_create_file_ops(mod);
- if (!file_ops)
- return;
- }
- call->mod = mod;
- ret = event_create_dir(call, d_events,
+ __trace_add_event_call(call, mod,
&file_ops->id, &file_ops->enable,
&file_ops->filter, &file_ops->format);
- if (!ret)
- list_add(&call->list, &ftrace_events);
}
}
@@ -1319,25 +1354,14 @@
trace_create_file("enable", 0644, d_events,
NULL, &ftrace_system_enable_fops);
+ if (trace_define_common_fields())
+ pr_warning("tracing: Failed to allocate common fields");
+
for_each_event(call, __start_ftrace_events, __stop_ftrace_events) {
- /* The linker may leave blanks */
- if (!call->name)
- continue;
- if (call->class->raw_init) {
- ret = call->class->raw_init(call);
- if (ret < 0) {
- if (ret != -ENOSYS)
- pr_warning("Could not initialize trace "
- "point events/%s\n", call->name);
- continue;
- }
- }
- ret = event_create_dir(call, d_events, &ftrace_event_id_fops,
+ __trace_add_event_call(call, NULL, &ftrace_event_id_fops,
&ftrace_enable_fops,
&ftrace_event_filter_fops,
&ftrace_event_format_fops);
- if (!ret)
- list_add(&call->list, &ftrace_events);
}
while (true) {
@@ -1524,12 +1548,11 @@
struct ftrace_entry *entry;
unsigned long flags;
long disabled;
- int resched;
int cpu;
int pc;
pc = preempt_count();
- resched = ftrace_preempt_disable();
+ preempt_disable_notrace();
cpu = raw_smp_processor_id();
disabled = atomic_inc_return(&per_cpu(ftrace_test_event_disable, cpu));
@@ -1551,7 +1574,7 @@
out:
atomic_dec(&per_cpu(ftrace_test_event_disable, cpu));
- ftrace_preempt_enable(resched);
+ preempt_enable_notrace();
}
static struct ftrace_ops trace_ops __initdata =
diff --git a/kernel/trace/trace_events_filter.c b/kernel/trace/trace_events_filter.c
index 57bb1bb..36d4010 100644
--- a/kernel/trace/trace_events_filter.c
+++ b/kernel/trace/trace_events_filter.c
@@ -497,12 +497,10 @@
}
static struct ftrace_event_field *
-find_event_field(struct ftrace_event_call *call, char *name)
+__find_event_field(struct list_head *head, char *name)
{
struct ftrace_event_field *field;
- struct list_head *head;
- head = trace_get_fields(call);
list_for_each_entry(field, head, link) {
if (!strcmp(field->name, name))
return field;
@@ -511,6 +509,20 @@
return NULL;
}
+static struct ftrace_event_field *
+find_event_field(struct ftrace_event_call *call, char *name)
+{
+ struct ftrace_event_field *field;
+ struct list_head *head;
+
+ field = __find_event_field(&ftrace_common_fields, name);
+ if (field)
+ return field;
+
+ head = trace_get_fields(call);
+ return __find_event_field(head, name);
+}
+
static void filter_free_pred(struct filter_pred *pred)
{
if (!pred)
@@ -627,9 +639,6 @@
int err;
list_for_each_entry(call, &ftrace_events, list) {
- if (!call->class || !call->class->define_fields)
- continue;
-
if (strcmp(call->class->system, system->name) != 0)
continue;
@@ -646,9 +655,6 @@
struct ftrace_event_call *call;
list_for_each_entry(call, &ftrace_events, list) {
- if (!call->class || !call->class->define_fields)
- continue;
-
if (strcmp(call->class->system, system->name) != 0)
continue;
@@ -1251,9 +1257,6 @@
list_for_each_entry(call, &ftrace_events, list) {
struct event_filter *filter = call->filter;
- if (!call->class || !call->class->define_fields)
- continue;
-
if (strcmp(call->class->system, system->name) != 0)
continue;
diff --git a/kernel/trace/trace_export.c b/kernel/trace/trace_export.c
index 8536e2a..4ba44de 100644
--- a/kernel/trace/trace_export.c
+++ b/kernel/trace/trace_export.c
@@ -125,12 +125,6 @@
#include "trace_entries.h"
-static int ftrace_raw_init_event(struct ftrace_event_call *call)
-{
- INIT_LIST_HEAD(&call->class->fields);
- return 0;
-}
-
#undef __entry
#define __entry REC
@@ -158,7 +152,7 @@
struct ftrace_event_class event_class_ftrace_##call = { \
.system = __stringify(TRACE_SYSTEM), \
.define_fields = ftrace_define_fields_##call, \
- .raw_init = ftrace_raw_init_event, \
+ .fields = LIST_HEAD_INIT(event_class_ftrace_##call.fields),\
}; \
\
struct ftrace_event_call __used \
diff --git a/kernel/trace/trace_functions.c b/kernel/trace/trace_functions.c
index b3f3776..16aee4d 100644
--- a/kernel/trace/trace_functions.c
+++ b/kernel/trace/trace_functions.c
@@ -54,14 +54,14 @@
struct trace_array_cpu *data;
unsigned long flags;
long disabled;
- int cpu, resched;
+ int cpu;
int pc;
if (unlikely(!ftrace_function_enabled))
return;
pc = preempt_count();
- resched = ftrace_preempt_disable();
+ preempt_disable_notrace();
local_save_flags(flags);
cpu = raw_smp_processor_id();
data = tr->data[cpu];
@@ -71,7 +71,7 @@
trace_function(tr, ip, parent_ip, flags, pc);
atomic_dec(&data->disabled);
- ftrace_preempt_enable(resched);
+ preempt_enable_notrace();
}
static void
diff --git a/kernel/trace/trace_functions_graph.c b/kernel/trace/trace_functions_graph.c
index 79f4bac..6bff236 100644
--- a/kernel/trace/trace_functions_graph.c
+++ b/kernel/trace/trace_functions_graph.c
@@ -641,7 +641,8 @@
/* Print nsecs (we don't want to exceed 7 numbers) */
if (len < 7) {
- snprintf(nsecs_str, 8 - len, "%03lu", nsecs_rem);
+ snprintf(nsecs_str, min(sizeof(nsecs_str), 8UL - len), "%03lu",
+ nsecs_rem);
ret = trace_seq_printf(s, ".%s", nsecs_str);
if (!ret)
return TRACE_TYPE_PARTIAL_LINE;
diff --git a/kernel/trace/trace_irqsoff.c b/kernel/trace/trace_irqsoff.c
index 6fd486e..73a6b06 100644
--- a/kernel/trace/trace_irqsoff.c
+++ b/kernel/trace/trace_irqsoff.c
@@ -649,6 +649,7 @@
#endif
.open = irqsoff_trace_open,
.close = irqsoff_trace_close,
+ .use_max_tr = 1,
};
# define register_irqsoff(trace) register_tracer(&trace)
#else
@@ -681,6 +682,7 @@
#endif
.open = irqsoff_trace_open,
.close = irqsoff_trace_close,
+ .use_max_tr = 1,
};
# define register_preemptoff(trace) register_tracer(&trace)
#else
@@ -715,6 +717,7 @@
#endif
.open = irqsoff_trace_open,
.close = irqsoff_trace_close,
+ .use_max_tr = 1,
};
# define register_preemptirqsoff(trace) register_tracer(&trace)
diff --git a/kernel/trace/trace_kprobe.c b/kernel/trace/trace_kprobe.c
index f52b5f5..8b27c98 100644
--- a/kernel/trace/trace_kprobe.c
+++ b/kernel/trace/trace_kprobe.c
@@ -30,6 +30,8 @@
#include <linux/ptrace.h>
#include <linux/perf_event.h>
#include <linux/stringify.h>
+#include <linux/limits.h>
+#include <linux/uaccess.h>
#include <asm/bitsperlong.h>
#include "trace.h"
@@ -38,6 +40,7 @@
#define MAX_TRACE_ARGS 128
#define MAX_ARGSTR_LEN 63
#define MAX_EVENT_NAME_LEN 64
+#define MAX_STRING_SIZE PATH_MAX
#define KPROBE_EVENT_SYSTEM "kprobes"
/* Reserved field names */
@@ -58,14 +61,16 @@
};
/* Printing function type */
-typedef int (*print_type_func_t)(struct trace_seq *, const char *, void *);
+typedef int (*print_type_func_t)(struct trace_seq *, const char *, void *,
+ void *);
#define PRINT_TYPE_FUNC_NAME(type) print_type_##type
#define PRINT_TYPE_FMT_NAME(type) print_type_format_##type
/* Printing in basic type function template */
#define DEFINE_BASIC_PRINT_TYPE_FUNC(type, fmt, cast) \
static __kprobes int PRINT_TYPE_FUNC_NAME(type)(struct trace_seq *s, \
- const char *name, void *data)\
+ const char *name, \
+ void *data, void *ent)\
{ \
return trace_seq_printf(s, " %s=" fmt, name, (cast)*(type *)data);\
} \
@@ -80,6 +85,49 @@
DEFINE_BASIC_PRINT_TYPE_FUNC(s32, "%ld", long)
DEFINE_BASIC_PRINT_TYPE_FUNC(s64, "%lld", long long)
+/* data_rloc: data relative location, compatible with u32 */
+#define make_data_rloc(len, roffs) \
+ (((u32)(len) << 16) | ((u32)(roffs) & 0xffff))
+#define get_rloc_len(dl) ((u32)(dl) >> 16)
+#define get_rloc_offs(dl) ((u32)(dl) & 0xffff)
+
+static inline void *get_rloc_data(u32 *dl)
+{
+ return (u8 *)dl + get_rloc_offs(*dl);
+}
+
+/* For data_loc conversion */
+static inline void *get_loc_data(u32 *dl, void *ent)
+{
+ return (u8 *)ent + get_rloc_offs(*dl);
+}
+
+/*
+ * Convert data_rloc to data_loc:
+ * data_rloc stores the offset from data_rloc itself, but data_loc
+ * stores the offset from event entry.
+ */
+#define convert_rloc_to_loc(dl, offs) ((u32)(dl) + (offs))
+
+/* For defining macros, define string/string_size types */
+typedef u32 string;
+typedef u32 string_size;
+
+/* Print type function for string type */
+static __kprobes int PRINT_TYPE_FUNC_NAME(string)(struct trace_seq *s,
+ const char *name,
+ void *data, void *ent)
+{
+ int len = *(u32 *)data >> 16;
+
+ if (!len)
+ return trace_seq_printf(s, " %s=(fault)", name);
+ else
+ return trace_seq_printf(s, " %s=\"%s\"", name,
+ (const char *)get_loc_data(data, ent));
+}
+static const char PRINT_TYPE_FMT_NAME(string)[] = "\\\"%s\\\"";
+
/* Data fetch function type */
typedef void (*fetch_func_t)(struct pt_regs *, void *, void *);
@@ -94,32 +142,38 @@
return fprm->fn(regs, fprm->data, dest);
}
-#define FETCH_FUNC_NAME(kind, type) fetch_##kind##_##type
+#define FETCH_FUNC_NAME(method, type) fetch_##method##_##type
/*
* Define macro for basic types - we don't need to define s* types, because
* we have to care only about bitwidth at recording time.
*/
-#define DEFINE_BASIC_FETCH_FUNCS(kind) \
-DEFINE_FETCH_##kind(u8) \
-DEFINE_FETCH_##kind(u16) \
-DEFINE_FETCH_##kind(u32) \
-DEFINE_FETCH_##kind(u64)
+#define DEFINE_BASIC_FETCH_FUNCS(method) \
+DEFINE_FETCH_##method(u8) \
+DEFINE_FETCH_##method(u16) \
+DEFINE_FETCH_##method(u32) \
+DEFINE_FETCH_##method(u64)
-#define CHECK_BASIC_FETCH_FUNCS(kind, fn) \
- ((FETCH_FUNC_NAME(kind, u8) == fn) || \
- (FETCH_FUNC_NAME(kind, u16) == fn) || \
- (FETCH_FUNC_NAME(kind, u32) == fn) || \
- (FETCH_FUNC_NAME(kind, u64) == fn))
+#define CHECK_FETCH_FUNCS(method, fn) \
+ (((FETCH_FUNC_NAME(method, u8) == fn) || \
+ (FETCH_FUNC_NAME(method, u16) == fn) || \
+ (FETCH_FUNC_NAME(method, u32) == fn) || \
+ (FETCH_FUNC_NAME(method, u64) == fn) || \
+ (FETCH_FUNC_NAME(method, string) == fn) || \
+ (FETCH_FUNC_NAME(method, string_size) == fn)) \
+ && (fn != NULL))
/* Data fetch function templates */
#define DEFINE_FETCH_reg(type) \
static __kprobes void FETCH_FUNC_NAME(reg, type)(struct pt_regs *regs, \
- void *offset, void *dest) \
+ void *offset, void *dest) \
{ \
*(type *)dest = (type)regs_get_register(regs, \
(unsigned int)((unsigned long)offset)); \
}
DEFINE_BASIC_FETCH_FUNCS(reg)
+/* No string on the register */
+#define fetch_reg_string NULL
+#define fetch_reg_string_size NULL
#define DEFINE_FETCH_stack(type) \
static __kprobes void FETCH_FUNC_NAME(stack, type)(struct pt_regs *regs,\
@@ -129,6 +183,9 @@
(unsigned int)((unsigned long)offset)); \
}
DEFINE_BASIC_FETCH_FUNCS(stack)
+/* No string on the stack entry */
+#define fetch_stack_string NULL
+#define fetch_stack_string_size NULL
#define DEFINE_FETCH_retval(type) \
static __kprobes void FETCH_FUNC_NAME(retval, type)(struct pt_regs *regs,\
@@ -137,6 +194,9 @@
*(type *)dest = (type)regs_return_value(regs); \
}
DEFINE_BASIC_FETCH_FUNCS(retval)
+/* No string on the retval */
+#define fetch_retval_string NULL
+#define fetch_retval_string_size NULL
#define DEFINE_FETCH_memory(type) \
static __kprobes void FETCH_FUNC_NAME(memory, type)(struct pt_regs *regs,\
@@ -149,6 +209,62 @@
*(type *)dest = retval; \
}
DEFINE_BASIC_FETCH_FUNCS(memory)
+/*
+ * Fetch a null-terminated string. Caller MUST set *(u32 *)dest with max
+ * length and relative data location.
+ */
+static __kprobes void FETCH_FUNC_NAME(memory, string)(struct pt_regs *regs,
+ void *addr, void *dest)
+{
+ long ret;
+ int maxlen = get_rloc_len(*(u32 *)dest);
+ u8 *dst = get_rloc_data(dest);
+ u8 *src = addr;
+ mm_segment_t old_fs = get_fs();
+ if (!maxlen)
+ return;
+ /*
+ * Try to get string again, since the string can be changed while
+ * probing.
+ */
+ set_fs(KERNEL_DS);
+ pagefault_disable();
+ do
+ ret = __copy_from_user_inatomic(dst++, src++, 1);
+ while (dst[-1] && ret == 0 && src - (u8 *)addr < maxlen);
+ dst[-1] = '\0';
+ pagefault_enable();
+ set_fs(old_fs);
+
+ if (ret < 0) { /* Failed to fetch string */
+ ((u8 *)get_rloc_data(dest))[0] = '\0';
+ *(u32 *)dest = make_data_rloc(0, get_rloc_offs(*(u32 *)dest));
+ } else
+ *(u32 *)dest = make_data_rloc(src - (u8 *)addr,
+ get_rloc_offs(*(u32 *)dest));
+}
+/* Return the length of string -- including null terminal byte */
+static __kprobes void FETCH_FUNC_NAME(memory, string_size)(struct pt_regs *regs,
+ void *addr, void *dest)
+{
+ int ret, len = 0;
+ u8 c;
+ mm_segment_t old_fs = get_fs();
+
+ set_fs(KERNEL_DS);
+ pagefault_disable();
+ do {
+ ret = __copy_from_user_inatomic(&c, (u8 *)addr + len, 1);
+ len++;
+ } while (c && ret == 0 && len < MAX_STRING_SIZE);
+ pagefault_enable();
+ set_fs(old_fs);
+
+ if (ret < 0) /* Failed to check the length */
+ *(u32 *)dest = 0;
+ else
+ *(u32 *)dest = len;
+}
/* Memory fetching by symbol */
struct symbol_cache {
@@ -203,6 +319,8 @@
*(type *)dest = 0; \
}
DEFINE_BASIC_FETCH_FUNCS(symbol)
+DEFINE_FETCH_symbol(string)
+DEFINE_FETCH_symbol(string_size)
/* Dereference memory access function */
struct deref_fetch_param {
@@ -224,12 +342,14 @@
*(type *)dest = 0; \
}
DEFINE_BASIC_FETCH_FUNCS(deref)
+DEFINE_FETCH_deref(string)
+DEFINE_FETCH_deref(string_size)
static __kprobes void free_deref_fetch_param(struct deref_fetch_param *data)
{
- if (CHECK_BASIC_FETCH_FUNCS(deref, data->orig.fn))
+ if (CHECK_FETCH_FUNCS(deref, data->orig.fn))
free_deref_fetch_param(data->orig.data);
- else if (CHECK_BASIC_FETCH_FUNCS(symbol, data->orig.fn))
+ else if (CHECK_FETCH_FUNCS(symbol, data->orig.fn))
free_symbol_cache(data->orig.data);
kfree(data);
}
@@ -240,23 +360,43 @@
#define DEFAULT_FETCH_TYPE _DEFAULT_FETCH_TYPE(BITS_PER_LONG)
#define DEFAULT_FETCH_TYPE_STR __stringify(DEFAULT_FETCH_TYPE)
-#define ASSIGN_FETCH_FUNC(kind, type) \
- .kind = FETCH_FUNC_NAME(kind, type)
+/* Fetch types */
+enum {
+ FETCH_MTD_reg = 0,
+ FETCH_MTD_stack,
+ FETCH_MTD_retval,
+ FETCH_MTD_memory,
+ FETCH_MTD_symbol,
+ FETCH_MTD_deref,
+ FETCH_MTD_END,
+};
-#define ASSIGN_FETCH_TYPE(ptype, ftype, sign) \
- {.name = #ptype, \
- .size = sizeof(ftype), \
- .is_signed = sign, \
- .print = PRINT_TYPE_FUNC_NAME(ptype), \
- .fmt = PRINT_TYPE_FMT_NAME(ptype), \
-ASSIGN_FETCH_FUNC(reg, ftype), \
-ASSIGN_FETCH_FUNC(stack, ftype), \
-ASSIGN_FETCH_FUNC(retval, ftype), \
-ASSIGN_FETCH_FUNC(memory, ftype), \
-ASSIGN_FETCH_FUNC(symbol, ftype), \
-ASSIGN_FETCH_FUNC(deref, ftype), \
+#define ASSIGN_FETCH_FUNC(method, type) \
+ [FETCH_MTD_##method] = FETCH_FUNC_NAME(method, type)
+
+#define __ASSIGN_FETCH_TYPE(_name, ptype, ftype, _size, sign, _fmttype) \
+ {.name = _name, \
+ .size = _size, \
+ .is_signed = sign, \
+ .print = PRINT_TYPE_FUNC_NAME(ptype), \
+ .fmt = PRINT_TYPE_FMT_NAME(ptype), \
+ .fmttype = _fmttype, \
+ .fetch = { \
+ASSIGN_FETCH_FUNC(reg, ftype), \
+ASSIGN_FETCH_FUNC(stack, ftype), \
+ASSIGN_FETCH_FUNC(retval, ftype), \
+ASSIGN_FETCH_FUNC(memory, ftype), \
+ASSIGN_FETCH_FUNC(symbol, ftype), \
+ASSIGN_FETCH_FUNC(deref, ftype), \
+ } \
}
+#define ASSIGN_FETCH_TYPE(ptype, ftype, sign) \
+ __ASSIGN_FETCH_TYPE(#ptype, ptype, ftype, sizeof(ftype), sign, #ptype)
+
+#define FETCH_TYPE_STRING 0
+#define FETCH_TYPE_STRSIZE 1
+
/* Fetch type information table */
static const struct fetch_type {
const char *name; /* Name of type */
@@ -264,14 +404,16 @@
int is_signed; /* Signed flag */
print_type_func_t print; /* Print functions */
const char *fmt; /* Fromat string */
+ const char *fmttype; /* Name in format file */
/* Fetch functions */
- fetch_func_t reg;
- fetch_func_t stack;
- fetch_func_t retval;
- fetch_func_t memory;
- fetch_func_t symbol;
- fetch_func_t deref;
+ fetch_func_t fetch[FETCH_MTD_END];
} fetch_type_table[] = {
+ /* Special types */
+ [FETCH_TYPE_STRING] = __ASSIGN_FETCH_TYPE("string", string, string,
+ sizeof(u32), 1, "__data_loc char[]"),
+ [FETCH_TYPE_STRSIZE] = __ASSIGN_FETCH_TYPE("string_size", u32,
+ string_size, sizeof(u32), 0, "u32"),
+ /* Basic types */
ASSIGN_FETCH_TYPE(u8, u8, 0),
ASSIGN_FETCH_TYPE(u16, u16, 0),
ASSIGN_FETCH_TYPE(u32, u32, 0),
@@ -302,12 +444,28 @@
*(unsigned long *)dest = kernel_stack_pointer(regs);
}
+static fetch_func_t get_fetch_size_function(const struct fetch_type *type,
+ fetch_func_t orig_fn)
+{
+ int i;
+
+ if (type != &fetch_type_table[FETCH_TYPE_STRING])
+ return NULL; /* Only string type needs size function */
+ for (i = 0; i < FETCH_MTD_END; i++)
+ if (type->fetch[i] == orig_fn)
+ return fetch_type_table[FETCH_TYPE_STRSIZE].fetch[i];
+
+ WARN_ON(1); /* This should not happen */
+ return NULL;
+}
+
/**
* Kprobe event core functions
*/
struct probe_arg {
struct fetch_param fetch;
+ struct fetch_param fetch_size;
unsigned int offset; /* Offset from argument entry */
const char *name; /* Name of this argument */
const char *comm; /* Command of this argument */
@@ -429,9 +587,9 @@
static void free_probe_arg(struct probe_arg *arg)
{
- if (CHECK_BASIC_FETCH_FUNCS(deref, arg->fetch.fn))
+ if (CHECK_FETCH_FUNCS(deref, arg->fetch.fn))
free_deref_fetch_param(arg->fetch.data);
- else if (CHECK_BASIC_FETCH_FUNCS(symbol, arg->fetch.fn))
+ else if (CHECK_FETCH_FUNCS(symbol, arg->fetch.fn))
free_symbol_cache(arg->fetch.data);
kfree(arg->name);
kfree(arg->comm);
@@ -548,7 +706,7 @@
if (strcmp(arg, "retval") == 0) {
if (is_return)
- f->fn = t->retval;
+ f->fn = t->fetch[FETCH_MTD_retval];
else
ret = -EINVAL;
} else if (strncmp(arg, "stack", 5) == 0) {
@@ -562,7 +720,7 @@
if (ret || param > PARAM_MAX_STACK)
ret = -EINVAL;
else {
- f->fn = t->stack;
+ f->fn = t->fetch[FETCH_MTD_stack];
f->data = (void *)param;
}
} else
@@ -588,7 +746,7 @@
case '%': /* named register */
ret = regs_query_register_offset(arg + 1);
if (ret >= 0) {
- f->fn = t->reg;
+ f->fn = t->fetch[FETCH_MTD_reg];
f->data = (void *)(unsigned long)ret;
ret = 0;
}
@@ -598,7 +756,7 @@
ret = strict_strtoul(arg + 1, 0, ¶m);
if (ret)
break;
- f->fn = t->memory;
+ f->fn = t->fetch[FETCH_MTD_memory];
f->data = (void *)param;
} else {
ret = split_symbol_offset(arg + 1, &offset);
@@ -606,7 +764,7 @@
break;
f->data = alloc_symbol_cache(arg + 1, offset);
if (f->data)
- f->fn = t->symbol;
+ f->fn = t->fetch[FETCH_MTD_symbol];
}
break;
case '+': /* deref memory */
@@ -636,14 +794,17 @@
if (ret)
kfree(dprm);
else {
- f->fn = t->deref;
+ f->fn = t->fetch[FETCH_MTD_deref];
f->data = (void *)dprm;
}
}
break;
}
- if (!ret && !f->fn)
+ if (!ret && !f->fn) { /* Parsed, but do not find fetch method */
+ pr_info("%s type has no corresponding fetch method.\n",
+ t->name);
ret = -EINVAL;
+ }
return ret;
}
@@ -652,6 +813,7 @@
struct probe_arg *parg, int is_return)
{
const char *t;
+ int ret;
if (strlen(arg) > MAX_ARGSTR_LEN) {
pr_info("Argument is too long.: %s\n", arg);
@@ -674,7 +836,13 @@
}
parg->offset = tp->size;
tp->size += parg->type->size;
- return __parse_probe_arg(arg, parg->type, &parg->fetch, is_return);
+ ret = __parse_probe_arg(arg, parg->type, &parg->fetch, is_return);
+ if (ret >= 0) {
+ parg->fetch_size.fn = get_fetch_size_function(parg->type,
+ parg->fetch.fn);
+ parg->fetch_size.data = parg->fetch.data;
+ }
+ return ret;
}
/* Return 1 if name is reserved or already used by another argument */
@@ -757,14 +925,17 @@
pr_info("Delete command needs an event name.\n");
return -EINVAL;
}
+ mutex_lock(&probe_lock);
tp = find_probe_event(event, group);
if (!tp) {
+ mutex_unlock(&probe_lock);
pr_info("Event %s/%s doesn't exist.\n", group, event);
return -ENOENT;
}
/* delete an event */
unregister_trace_probe(tp);
free_trace_probe(tp);
+ mutex_unlock(&probe_lock);
return 0;
}
@@ -1043,6 +1214,54 @@
.release = seq_release,
};
+/* Sum up total data length for dynamic arraies (strings) */
+static __kprobes int __get_data_size(struct trace_probe *tp,
+ struct pt_regs *regs)
+{
+ int i, ret = 0;
+ u32 len;
+
+ for (i = 0; i < tp->nr_args; i++)
+ if (unlikely(tp->args[i].fetch_size.fn)) {
+ call_fetch(&tp->args[i].fetch_size, regs, &len);
+ ret += len;
+ }
+
+ return ret;
+}
+
+/* Store the value of each argument */
+static __kprobes void store_trace_args(int ent_size, struct trace_probe *tp,
+ struct pt_regs *regs,
+ u8 *data, int maxlen)
+{
+ int i;
+ u32 end = tp->size;
+ u32 *dl; /* Data (relative) location */
+
+ for (i = 0; i < tp->nr_args; i++) {
+ if (unlikely(tp->args[i].fetch_size.fn)) {
+ /*
+ * First, we set the relative location and
+ * maximum data length to *dl
+ */
+ dl = (u32 *)(data + tp->args[i].offset);
+ *dl = make_data_rloc(maxlen, end - tp->args[i].offset);
+ /* Then try to fetch string or dynamic array data */
+ call_fetch(&tp->args[i].fetch, regs, dl);
+ /* Reduce maximum length */
+ end += get_rloc_len(*dl);
+ maxlen -= get_rloc_len(*dl);
+ /* Trick here, convert data_rloc to data_loc */
+ *dl = convert_rloc_to_loc(*dl,
+ ent_size + tp->args[i].offset);
+ } else
+ /* Just fetching data normally */
+ call_fetch(&tp->args[i].fetch, regs,
+ data + tp->args[i].offset);
+ }
+}
+
/* Kprobe handler */
static __kprobes void kprobe_trace_func(struct kprobe *kp, struct pt_regs *regs)
{
@@ -1050,8 +1269,7 @@
struct kprobe_trace_entry_head *entry;
struct ring_buffer_event *event;
struct ring_buffer *buffer;
- u8 *data;
- int size, i, pc;
+ int size, dsize, pc;
unsigned long irq_flags;
struct ftrace_event_call *call = &tp->call;
@@ -1060,7 +1278,8 @@
local_save_flags(irq_flags);
pc = preempt_count();
- size = sizeof(*entry) + tp->size;
+ dsize = __get_data_size(tp, regs);
+ size = sizeof(*entry) + tp->size + dsize;
event = trace_current_buffer_lock_reserve(&buffer, call->event.type,
size, irq_flags, pc);
@@ -1069,9 +1288,7 @@
entry = ring_buffer_event_data(event);
entry->ip = (unsigned long)kp->addr;
- data = (u8 *)&entry[1];
- for (i = 0; i < tp->nr_args; i++)
- call_fetch(&tp->args[i].fetch, regs, data + tp->args[i].offset);
+ store_trace_args(sizeof(*entry), tp, regs, (u8 *)&entry[1], dsize);
if (!filter_current_check_discard(buffer, call, entry, event))
trace_nowake_buffer_unlock_commit(buffer, event, irq_flags, pc);
@@ -1085,15 +1302,15 @@
struct kretprobe_trace_entry_head *entry;
struct ring_buffer_event *event;
struct ring_buffer *buffer;
- u8 *data;
- int size, i, pc;
+ int size, pc, dsize;
unsigned long irq_flags;
struct ftrace_event_call *call = &tp->call;
local_save_flags(irq_flags);
pc = preempt_count();
- size = sizeof(*entry) + tp->size;
+ dsize = __get_data_size(tp, regs);
+ size = sizeof(*entry) + tp->size + dsize;
event = trace_current_buffer_lock_reserve(&buffer, call->event.type,
size, irq_flags, pc);
@@ -1103,9 +1320,7 @@
entry = ring_buffer_event_data(event);
entry->func = (unsigned long)tp->rp.kp.addr;
entry->ret_ip = (unsigned long)ri->ret_addr;
- data = (u8 *)&entry[1];
- for (i = 0; i < tp->nr_args; i++)
- call_fetch(&tp->args[i].fetch, regs, data + tp->args[i].offset);
+ store_trace_args(sizeof(*entry), tp, regs, (u8 *)&entry[1], dsize);
if (!filter_current_check_discard(buffer, call, entry, event))
trace_nowake_buffer_unlock_commit(buffer, event, irq_flags, pc);
@@ -1137,7 +1352,7 @@
data = (u8 *)&field[1];
for (i = 0; i < tp->nr_args; i++)
if (!tp->args[i].type->print(s, tp->args[i].name,
- data + tp->args[i].offset))
+ data + tp->args[i].offset, field))
goto partial;
if (!trace_seq_puts(s, "\n"))
@@ -1179,7 +1394,7 @@
data = (u8 *)&field[1];
for (i = 0; i < tp->nr_args; i++)
if (!tp->args[i].type->print(s, tp->args[i].name,
- data + tp->args[i].offset))
+ data + tp->args[i].offset, field))
goto partial;
if (!trace_seq_puts(s, "\n"))
@@ -1214,11 +1429,6 @@
}
}
-static int probe_event_raw_init(struct ftrace_event_call *event_call)
-{
- return 0;
-}
-
#undef DEFINE_FIELD
#define DEFINE_FIELD(type, item, name, is_signed) \
do { \
@@ -1239,7 +1449,7 @@
DEFINE_FIELD(unsigned long, ip, FIELD_STRING_IP, 0);
/* Set argument names as fields */
for (i = 0; i < tp->nr_args; i++) {
- ret = trace_define_field(event_call, tp->args[i].type->name,
+ ret = trace_define_field(event_call, tp->args[i].type->fmttype,
tp->args[i].name,
sizeof(field) + tp->args[i].offset,
tp->args[i].type->size,
@@ -1261,7 +1471,7 @@
DEFINE_FIELD(unsigned long, ret_ip, FIELD_STRING_RETIP, 0);
/* Set argument names as fields */
for (i = 0; i < tp->nr_args; i++) {
- ret = trace_define_field(event_call, tp->args[i].type->name,
+ ret = trace_define_field(event_call, tp->args[i].type->fmttype,
tp->args[i].name,
sizeof(field) + tp->args[i].offset,
tp->args[i].type->size,
@@ -1301,8 +1511,13 @@
pos += snprintf(buf + pos, LEN_OR_ZERO, "\", %s", arg);
for (i = 0; i < tp->nr_args; i++) {
- pos += snprintf(buf + pos, LEN_OR_ZERO, ", REC->%s",
- tp->args[i].name);
+ if (strcmp(tp->args[i].type->name, "string") == 0)
+ pos += snprintf(buf + pos, LEN_OR_ZERO,
+ ", __get_str(%s)",
+ tp->args[i].name);
+ else
+ pos += snprintf(buf + pos, LEN_OR_ZERO, ", REC->%s",
+ tp->args[i].name);
}
#undef LEN_OR_ZERO
@@ -1339,11 +1554,11 @@
struct ftrace_event_call *call = &tp->call;
struct kprobe_trace_entry_head *entry;
struct hlist_head *head;
- u8 *data;
- int size, __size, i;
+ int size, __size, dsize;
int rctx;
- __size = sizeof(*entry) + tp->size;
+ dsize = __get_data_size(tp, regs);
+ __size = sizeof(*entry) + tp->size + dsize;
size = ALIGN(__size + sizeof(u32), sizeof(u64));
size -= sizeof(u32);
if (WARN_ONCE(size > PERF_MAX_TRACE_SIZE,
@@ -1355,9 +1570,8 @@
return;
entry->ip = (unsigned long)kp->addr;
- data = (u8 *)&entry[1];
- for (i = 0; i < tp->nr_args; i++)
- call_fetch(&tp->args[i].fetch, regs, data + tp->args[i].offset);
+ memset(&entry[1], 0, dsize);
+ store_trace_args(sizeof(*entry), tp, regs, (u8 *)&entry[1], dsize);
head = this_cpu_ptr(call->perf_events);
perf_trace_buf_submit(entry, size, rctx, entry->ip, 1, regs, head);
@@ -1371,11 +1585,11 @@
struct ftrace_event_call *call = &tp->call;
struct kretprobe_trace_entry_head *entry;
struct hlist_head *head;
- u8 *data;
- int size, __size, i;
+ int size, __size, dsize;
int rctx;
- __size = sizeof(*entry) + tp->size;
+ dsize = __get_data_size(tp, regs);
+ __size = sizeof(*entry) + tp->size + dsize;
size = ALIGN(__size + sizeof(u32), sizeof(u64));
size -= sizeof(u32);
if (WARN_ONCE(size > PERF_MAX_TRACE_SIZE,
@@ -1388,9 +1602,7 @@
entry->func = (unsigned long)tp->rp.kp.addr;
entry->ret_ip = (unsigned long)ri->ret_addr;
- data = (u8 *)&entry[1];
- for (i = 0; i < tp->nr_args; i++)
- call_fetch(&tp->args[i].fetch, regs, data + tp->args[i].offset);
+ store_trace_args(sizeof(*entry), tp, regs, (u8 *)&entry[1], dsize);
head = this_cpu_ptr(call->perf_events);
perf_trace_buf_submit(entry, size, rctx, entry->ret_ip, 1, regs, head);
@@ -1486,15 +1698,12 @@
int ret;
/* Initialize ftrace_event_call */
+ INIT_LIST_HEAD(&call->class->fields);
if (probe_is_return(tp)) {
- INIT_LIST_HEAD(&call->class->fields);
call->event.funcs = &kretprobe_funcs;
- call->class->raw_init = probe_event_raw_init;
call->class->define_fields = kretprobe_event_define_fields;
} else {
- INIT_LIST_HEAD(&call->class->fields);
call->event.funcs = &kprobe_funcs;
- call->class->raw_init = probe_event_raw_init;
call->class->define_fields = kprobe_event_define_fields;
}
if (set_print_fmt(tp) < 0)
diff --git a/kernel/trace/trace_ksym.c b/kernel/trace/trace_ksym.c
deleted file mode 100644
index 8eaf007..0000000
--- a/kernel/trace/trace_ksym.c
+++ /dev/null
@@ -1,508 +0,0 @@
-/*
- * trace_ksym.c - Kernel Symbol Tracer
- *
- * This program is free software; you can redistribute it and/or modify
- * it under the terms of the GNU General Public License as published by
- * the Free Software Foundation; either version 2 of the License, or
- * (at your option) any later version.
- *
- * This program is distributed in the hope that it will be useful,
- * but WITHOUT ANY WARRANTY; without even the implied warranty of
- * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
- * GNU General Public License for more details.
- *
- * You should have received a copy of the GNU General Public License
- * along with this program; if not, write to the Free Software
- * Foundation, Inc., 59 Temple Place - Suite 330, Boston, MA 02111-1307, USA.
- *
- * Copyright (C) IBM Corporation, 2009
- */
-
-#include <linux/kallsyms.h>
-#include <linux/uaccess.h>
-#include <linux/debugfs.h>
-#include <linux/ftrace.h>
-#include <linux/module.h>
-#include <linux/slab.h>
-#include <linux/fs.h>
-
-#include "trace_output.h"
-#include "trace.h"
-
-#include <linux/hw_breakpoint.h>
-#include <asm/hw_breakpoint.h>
-
-#include <asm/atomic.h>
-
-#define KSYM_TRACER_OP_LEN 3 /* rw- */
-
-struct trace_ksym {
- struct perf_event **ksym_hbp;
- struct perf_event_attr attr;
-#ifdef CONFIG_PROFILE_KSYM_TRACER
- atomic64_t counter;
-#endif
- struct hlist_node ksym_hlist;
-};
-
-static struct trace_array *ksym_trace_array;
-
-static unsigned int ksym_tracing_enabled;
-
-static HLIST_HEAD(ksym_filter_head);
-
-static DEFINE_MUTEX(ksym_tracer_mutex);
-
-#ifdef CONFIG_PROFILE_KSYM_TRACER
-
-#define MAX_UL_INT 0xffffffff
-
-void ksym_collect_stats(unsigned long hbp_hit_addr)
-{
- struct hlist_node *node;
- struct trace_ksym *entry;
-
- rcu_read_lock();
- hlist_for_each_entry_rcu(entry, node, &ksym_filter_head, ksym_hlist) {
- if (entry->attr.bp_addr == hbp_hit_addr) {
- atomic64_inc(&entry->counter);
- break;
- }
- }
- rcu_read_unlock();
-}
-#endif /* CONFIG_PROFILE_KSYM_TRACER */
-
-void ksym_hbp_handler(struct perf_event *hbp, int nmi,
- struct perf_sample_data *data,
- struct pt_regs *regs)
-{
- struct ring_buffer_event *event;
- struct ksym_trace_entry *entry;
- struct ring_buffer *buffer;
- int pc;
-
- if (!ksym_tracing_enabled)
- return;
-
- buffer = ksym_trace_array->buffer;
-
- pc = preempt_count();
-
- event = trace_buffer_lock_reserve(buffer, TRACE_KSYM,
- sizeof(*entry), 0, pc);
- if (!event)
- return;
-
- entry = ring_buffer_event_data(event);
- entry->ip = instruction_pointer(regs);
- entry->type = hw_breakpoint_type(hbp);
- entry->addr = hw_breakpoint_addr(hbp);
- strlcpy(entry->cmd, current->comm, TASK_COMM_LEN);
-
-#ifdef CONFIG_PROFILE_KSYM_TRACER
- ksym_collect_stats(hw_breakpoint_addr(hbp));
-#endif /* CONFIG_PROFILE_KSYM_TRACER */
-
- trace_buffer_unlock_commit(buffer, event, 0, pc);
-}
-
-/* Valid access types are represented as
- *
- * rw- : Set Read/Write Access Breakpoint
- * -w- : Set Write Access Breakpoint
- * --- : Clear Breakpoints
- * --x : Set Execution Break points (Not available yet)
- *
- */
-static int ksym_trace_get_access_type(char *str)
-{
- int access = 0;
-
- if (str[0] == 'r')
- access |= HW_BREAKPOINT_R;
-
- if (str[1] == 'w')
- access |= HW_BREAKPOINT_W;
-
- if (str[2] == 'x')
- access |= HW_BREAKPOINT_X;
-
- switch (access) {
- case HW_BREAKPOINT_R:
- case HW_BREAKPOINT_W:
- case HW_BREAKPOINT_W | HW_BREAKPOINT_R:
- return access;
- default:
- return -EINVAL;
- }
-}
-
-/*
- * There can be several possible malformed requests and we attempt to capture
- * all of them. We enumerate some of the rules
- * 1. We will not allow kernel symbols with ':' since it is used as a delimiter.
- * i.e. multiple ':' symbols disallowed. Possible uses are of the form
- * <module>:<ksym_name>:<op>.
- * 2. No delimiter symbol ':' in the input string
- * 3. Spurious operator symbols or symbols not in their respective positions
- * 4. <ksym_name>:--- i.e. clear breakpoint request when ksym_name not in file
- * 5. Kernel symbol not a part of /proc/kallsyms
- * 6. Duplicate requests
- */
-static int parse_ksym_trace_str(char *input_string, char **ksymname,
- unsigned long *addr)
-{
- int ret;
-
- *ksymname = strsep(&input_string, ":");
- *addr = kallsyms_lookup_name(*ksymname);
-
- /* Check for malformed request: (2), (1) and (5) */
- if ((!input_string) ||
- (strlen(input_string) != KSYM_TRACER_OP_LEN) ||
- (*addr == 0))
- return -EINVAL;;
-
- ret = ksym_trace_get_access_type(input_string);
-
- return ret;
-}
-
-int process_new_ksym_entry(char *ksymname, int op, unsigned long addr)
-{
- struct trace_ksym *entry;
- int ret = -ENOMEM;
-
- entry = kzalloc(sizeof(struct trace_ksym), GFP_KERNEL);
- if (!entry)
- return -ENOMEM;
-
- hw_breakpoint_init(&entry->attr);
-
- entry->attr.bp_type = op;
- entry->attr.bp_addr = addr;
- entry->attr.bp_len = HW_BREAKPOINT_LEN_4;
-
- entry->ksym_hbp = register_wide_hw_breakpoint(&entry->attr,
- ksym_hbp_handler);
-
- if (IS_ERR(entry->ksym_hbp)) {
- ret = PTR_ERR(entry->ksym_hbp);
- if (ret == -ENOSPC) {
- printk(KERN_ERR "ksym_tracer: Maximum limit reached."
- " No new requests for tracing can be accepted now.\n");
- } else {
- printk(KERN_INFO "ksym_tracer request failed. Try again"
- " later!!\n");
- }
- goto err;
- }
-
- hlist_add_head_rcu(&(entry->ksym_hlist), &ksym_filter_head);
-
- return 0;
-
-err:
- kfree(entry);
-
- return ret;
-}
-
-static ssize_t ksym_trace_filter_read(struct file *filp, char __user *ubuf,
- size_t count, loff_t *ppos)
-{
- struct trace_ksym *entry;
- struct hlist_node *node;
- struct trace_seq *s;
- ssize_t cnt = 0;
- int ret;
-
- s = kmalloc(sizeof(*s), GFP_KERNEL);
- if (!s)
- return -ENOMEM;
- trace_seq_init(s);
-
- mutex_lock(&ksym_tracer_mutex);
-
- hlist_for_each_entry(entry, node, &ksym_filter_head, ksym_hlist) {
- ret = trace_seq_printf(s, "%pS:",
- (void *)(unsigned long)entry->attr.bp_addr);
- if (entry->attr.bp_type == HW_BREAKPOINT_R)
- ret = trace_seq_puts(s, "r--\n");
- else if (entry->attr.bp_type == HW_BREAKPOINT_W)
- ret = trace_seq_puts(s, "-w-\n");
- else if (entry->attr.bp_type == (HW_BREAKPOINT_W | HW_BREAKPOINT_R))
- ret = trace_seq_puts(s, "rw-\n");
- WARN_ON_ONCE(!ret);
- }
-
- cnt = simple_read_from_buffer(ubuf, count, ppos, s->buffer, s->len);
-
- mutex_unlock(&ksym_tracer_mutex);
-
- kfree(s);
-
- return cnt;
-}
-
-static void __ksym_trace_reset(void)
-{
- struct trace_ksym *entry;
- struct hlist_node *node, *node1;
-
- mutex_lock(&ksym_tracer_mutex);
- hlist_for_each_entry_safe(entry, node, node1, &ksym_filter_head,
- ksym_hlist) {
- unregister_wide_hw_breakpoint(entry->ksym_hbp);
- hlist_del_rcu(&(entry->ksym_hlist));
- synchronize_rcu();
- kfree(entry);
- }
- mutex_unlock(&ksym_tracer_mutex);
-}
-
-static ssize_t ksym_trace_filter_write(struct file *file,
- const char __user *buffer,
- size_t count, loff_t *ppos)
-{
- struct trace_ksym *entry;
- struct hlist_node *node;
- char *buf, *input_string, *ksymname = NULL;
- unsigned long ksym_addr = 0;
- int ret, op, changed = 0;
-
- buf = kzalloc(count + 1, GFP_KERNEL);
- if (!buf)
- return -ENOMEM;
-
- ret = -EFAULT;
- if (copy_from_user(buf, buffer, count))
- goto out;
-
- buf[count] = '\0';
- input_string = strstrip(buf);
-
- /*
- * Clear all breakpoints if:
- * 1: echo > ksym_trace_filter
- * 2: echo 0 > ksym_trace_filter
- * 3: echo "*:---" > ksym_trace_filter
- */
- if (!input_string[0] || !strcmp(input_string, "0") ||
- !strcmp(input_string, "*:---")) {
- __ksym_trace_reset();
- ret = 0;
- goto out;
- }
-
- ret = op = parse_ksym_trace_str(input_string, &ksymname, &ksym_addr);
- if (ret < 0)
- goto out;
-
- mutex_lock(&ksym_tracer_mutex);
-
- ret = -EINVAL;
- hlist_for_each_entry(entry, node, &ksym_filter_head, ksym_hlist) {
- if (entry->attr.bp_addr == ksym_addr) {
- /* Check for malformed request: (6) */
- if (entry->attr.bp_type != op)
- changed = 1;
- else
- goto out_unlock;
- break;
- }
- }
- if (changed) {
- unregister_wide_hw_breakpoint(entry->ksym_hbp);
- entry->attr.bp_type = op;
- ret = 0;
- if (op > 0) {
- entry->ksym_hbp =
- register_wide_hw_breakpoint(&entry->attr,
- ksym_hbp_handler);
- if (IS_ERR(entry->ksym_hbp))
- ret = PTR_ERR(entry->ksym_hbp);
- else
- goto out_unlock;
- }
- /* Error or "symbol:---" case: drop it */
- hlist_del_rcu(&(entry->ksym_hlist));
- synchronize_rcu();
- kfree(entry);
- goto out_unlock;
- } else {
- /* Check for malformed request: (4) */
- if (op)
- ret = process_new_ksym_entry(ksymname, op, ksym_addr);
- }
-out_unlock:
- mutex_unlock(&ksym_tracer_mutex);
-out:
- kfree(buf);
- return !ret ? count : ret;
-}
-
-static const struct file_operations ksym_tracing_fops = {
- .open = tracing_open_generic,
- .read = ksym_trace_filter_read,
- .write = ksym_trace_filter_write,
-};
-
-static void ksym_trace_reset(struct trace_array *tr)
-{
- ksym_tracing_enabled = 0;
- __ksym_trace_reset();
-}
-
-static int ksym_trace_init(struct trace_array *tr)
-{
- int cpu, ret = 0;
-
- for_each_online_cpu(cpu)
- tracing_reset(tr, cpu);
- ksym_tracing_enabled = 1;
- ksym_trace_array = tr;
-
- return ret;
-}
-
-static void ksym_trace_print_header(struct seq_file *m)
-{
- seq_puts(m,
- "# TASK-PID CPU# Symbol "
- "Type Function\n");
- seq_puts(m,
- "# | | | "
- " | |\n");
-}
-
-static enum print_line_t ksym_trace_output(struct trace_iterator *iter)
-{
- struct trace_entry *entry = iter->ent;
- struct trace_seq *s = &iter->seq;
- struct ksym_trace_entry *field;
- char str[KSYM_SYMBOL_LEN];
- int ret;
-
- if (entry->type != TRACE_KSYM)
- return TRACE_TYPE_UNHANDLED;
-
- trace_assign_type(field, entry);
-
- ret = trace_seq_printf(s, "%11s-%-5d [%03d] %pS", field->cmd,
- entry->pid, iter->cpu, (char *)field->addr);
- if (!ret)
- return TRACE_TYPE_PARTIAL_LINE;
-
- switch (field->type) {
- case HW_BREAKPOINT_R:
- ret = trace_seq_printf(s, " R ");
- break;
- case HW_BREAKPOINT_W:
- ret = trace_seq_printf(s, " W ");
- break;
- case HW_BREAKPOINT_R | HW_BREAKPOINT_W:
- ret = trace_seq_printf(s, " RW ");
- break;
- default:
- return TRACE_TYPE_PARTIAL_LINE;
- }
-
- if (!ret)
- return TRACE_TYPE_PARTIAL_LINE;
-
- sprint_symbol(str, field->ip);
- ret = trace_seq_printf(s, "%s\n", str);
- if (!ret)
- return TRACE_TYPE_PARTIAL_LINE;
-
- return TRACE_TYPE_HANDLED;
-}
-
-struct tracer ksym_tracer __read_mostly =
-{
- .name = "ksym_tracer",
- .init = ksym_trace_init,
- .reset = ksym_trace_reset,
-#ifdef CONFIG_FTRACE_SELFTEST
- .selftest = trace_selftest_startup_ksym,
-#endif
- .print_header = ksym_trace_print_header,
- .print_line = ksym_trace_output
-};
-
-#ifdef CONFIG_PROFILE_KSYM_TRACER
-static int ksym_profile_show(struct seq_file *m, void *v)
-{
- struct hlist_node *node;
- struct trace_ksym *entry;
- int access_type = 0;
- char fn_name[KSYM_NAME_LEN];
-
- seq_puts(m, " Access Type ");
- seq_puts(m, " Symbol Counter\n");
- seq_puts(m, " ----------- ");
- seq_puts(m, " ------ -------\n");
-
- rcu_read_lock();
- hlist_for_each_entry_rcu(entry, node, &ksym_filter_head, ksym_hlist) {
-
- access_type = entry->attr.bp_type;
-
- switch (access_type) {
- case HW_BREAKPOINT_R:
- seq_puts(m, " R ");
- break;
- case HW_BREAKPOINT_W:
- seq_puts(m, " W ");
- break;
- case HW_BREAKPOINT_R | HW_BREAKPOINT_W:
- seq_puts(m, " RW ");
- break;
- default:
- seq_puts(m, " NA ");
- }
-
- if (lookup_symbol_name(entry->attr.bp_addr, fn_name) >= 0)
- seq_printf(m, " %-36s", fn_name);
- else
- seq_printf(m, " %-36s", "<NA>");
- seq_printf(m, " %15llu\n",
- (unsigned long long)atomic64_read(&entry->counter));
- }
- rcu_read_unlock();
-
- return 0;
-}
-
-static int ksym_profile_open(struct inode *node, struct file *file)
-{
- return single_open(file, ksym_profile_show, NULL);
-}
-
-static const struct file_operations ksym_profile_fops = {
- .open = ksym_profile_open,
- .read = seq_read,
- .llseek = seq_lseek,
- .release = single_release,
-};
-#endif /* CONFIG_PROFILE_KSYM_TRACER */
-
-__init static int init_ksym_trace(void)
-{
- struct dentry *d_tracer;
-
- d_tracer = tracing_init_dentry();
-
- trace_create_file("ksym_trace_filter", 0644, d_tracer,
- NULL, &ksym_tracing_fops);
-
-#ifdef CONFIG_PROFILE_KSYM_TRACER
- trace_create_file("ksym_profile", 0444, d_tracer,
- NULL, &ksym_profile_fops);
-#endif
-
- return register_tracer(&ksym_tracer);
-}
-device_initcall(init_ksym_trace);
diff --git a/kernel/trace/trace_output.c b/kernel/trace/trace_output.c
index 57c1b45..02272ba 100644
--- a/kernel/trace/trace_output.c
+++ b/kernel/trace/trace_output.c
@@ -16,9 +16,6 @@
DECLARE_RWSEM(trace_event_mutex);
-DEFINE_PER_CPU(struct trace_seq, ftrace_event_seq);
-EXPORT_PER_CPU_SYMBOL(ftrace_event_seq);
-
static struct hlist_head event_hash[EVENT_HASHSIZE] __read_mostly;
static int next_event_type = __TRACE_LAST_TYPE + 1;
@@ -1069,65 +1066,6 @@
.funcs = &trace_wake_funcs,
};
-/* TRACE_SPECIAL */
-static enum print_line_t trace_special_print(struct trace_iterator *iter,
- int flags, struct trace_event *event)
-{
- struct special_entry *field;
-
- trace_assign_type(field, iter->ent);
-
- if (!trace_seq_printf(&iter->seq, "# %ld %ld %ld\n",
- field->arg1,
- field->arg2,
- field->arg3))
- return TRACE_TYPE_PARTIAL_LINE;
-
- return TRACE_TYPE_HANDLED;
-}
-
-static enum print_line_t trace_special_hex(struct trace_iterator *iter,
- int flags, struct trace_event *event)
-{
- struct special_entry *field;
- struct trace_seq *s = &iter->seq;
-
- trace_assign_type(field, iter->ent);
-
- SEQ_PUT_HEX_FIELD_RET(s, field->arg1);
- SEQ_PUT_HEX_FIELD_RET(s, field->arg2);
- SEQ_PUT_HEX_FIELD_RET(s, field->arg3);
-
- return TRACE_TYPE_HANDLED;
-}
-
-static enum print_line_t trace_special_bin(struct trace_iterator *iter,
- int flags, struct trace_event *event)
-{
- struct special_entry *field;
- struct trace_seq *s = &iter->seq;
-
- trace_assign_type(field, iter->ent);
-
- SEQ_PUT_FIELD_RET(s, field->arg1);
- SEQ_PUT_FIELD_RET(s, field->arg2);
- SEQ_PUT_FIELD_RET(s, field->arg3);
-
- return TRACE_TYPE_HANDLED;
-}
-
-static struct trace_event_functions trace_special_funcs = {
- .trace = trace_special_print,
- .raw = trace_special_print,
- .hex = trace_special_hex,
- .binary = trace_special_bin,
-};
-
-static struct trace_event trace_special_event = {
- .type = TRACE_SPECIAL,
- .funcs = &trace_special_funcs,
-};
-
/* TRACE_STACK */
static enum print_line_t trace_stack_print(struct trace_iterator *iter,
@@ -1161,9 +1099,6 @@
static struct trace_event_functions trace_stack_funcs = {
.trace = trace_stack_print,
- .raw = trace_special_print,
- .hex = trace_special_hex,
- .binary = trace_special_bin,
};
static struct trace_event trace_stack_event = {
@@ -1194,9 +1129,6 @@
static struct trace_event_functions trace_user_stack_funcs = {
.trace = trace_user_stack_print,
- .raw = trace_special_print,
- .hex = trace_special_hex,
- .binary = trace_special_bin,
};
static struct trace_event trace_user_stack_event = {
@@ -1314,7 +1246,6 @@
&trace_fn_event,
&trace_ctx_event,
&trace_wake_event,
- &trace_special_event,
&trace_stack_event,
&trace_user_stack_event,
&trace_bprint_event,
diff --git a/kernel/trace/trace_sched_wakeup.c b/kernel/trace/trace_sched_wakeup.c
index 0e73bc2..4086eae6 100644
--- a/kernel/trace/trace_sched_wakeup.c
+++ b/kernel/trace/trace_sched_wakeup.c
@@ -46,7 +46,6 @@
struct trace_array_cpu *data;
unsigned long flags;
long disabled;
- int resched;
int cpu;
int pc;
@@ -54,7 +53,7 @@
return;
pc = preempt_count();
- resched = ftrace_preempt_disable();
+ preempt_disable_notrace();
cpu = raw_smp_processor_id();
if (cpu != wakeup_current_cpu)
@@ -74,7 +73,7 @@
out:
atomic_dec(&data->disabled);
out_enable:
- ftrace_preempt_enable(resched);
+ preempt_enable_notrace();
}
static struct ftrace_ops trace_ops __read_mostly =
@@ -383,6 +382,7 @@
#ifdef CONFIG_FTRACE_SELFTEST
.selftest = trace_selftest_startup_wakeup,
#endif
+ .use_max_tr = 1,
};
static struct tracer wakeup_rt_tracer __read_mostly =
@@ -397,6 +397,7 @@
#ifdef CONFIG_FTRACE_SELFTEST
.selftest = trace_selftest_startup_wakeup,
#endif
+ .use_max_tr = 1,
};
__init static int init_wakeup_tracer(void)
diff --git a/kernel/trace/trace_selftest.c b/kernel/trace/trace_selftest.c
index 250e7f9..155a415 100644
--- a/kernel/trace/trace_selftest.c
+++ b/kernel/trace/trace_selftest.c
@@ -13,11 +13,9 @@
case TRACE_WAKE:
case TRACE_STACK:
case TRACE_PRINT:
- case TRACE_SPECIAL:
case TRACE_BRANCH:
case TRACE_GRAPH_ENT:
case TRACE_GRAPH_RET:
- case TRACE_KSYM:
return 1;
}
return 0;
@@ -691,38 +689,6 @@
}
#endif /* CONFIG_CONTEXT_SWITCH_TRACER */
-#ifdef CONFIG_SYSPROF_TRACER
-int
-trace_selftest_startup_sysprof(struct tracer *trace, struct trace_array *tr)
-{
- unsigned long count;
- int ret;
-
- /* start the tracing */
- ret = tracer_init(trace, tr);
- if (ret) {
- warn_failed_init_tracer(trace, ret);
- return ret;
- }
-
- /* Sleep for a 1/10 of a second */
- msleep(100);
- /* stop the tracing. */
- tracing_stop();
- /* check the trace buffer */
- ret = trace_test_buffer(tr, &count);
- trace->reset(tr);
- tracing_start();
-
- if (!ret && !count) {
- printk(KERN_CONT ".. no entries found ..");
- ret = -1;
- }
-
- return ret;
-}
-#endif /* CONFIG_SYSPROF_TRACER */
-
#ifdef CONFIG_BRANCH_TRACER
int
trace_selftest_startup_branch(struct tracer *trace, struct trace_array *tr)
@@ -755,56 +721,3 @@
}
#endif /* CONFIG_BRANCH_TRACER */
-#ifdef CONFIG_KSYM_TRACER
-static int ksym_selftest_dummy;
-
-int
-trace_selftest_startup_ksym(struct tracer *trace, struct trace_array *tr)
-{
- unsigned long count;
- int ret;
-
- /* start the tracing */
- ret = tracer_init(trace, tr);
- if (ret) {
- warn_failed_init_tracer(trace, ret);
- return ret;
- }
-
- ksym_selftest_dummy = 0;
- /* Register the read-write tracing request */
-
- ret = process_new_ksym_entry("ksym_selftest_dummy",
- HW_BREAKPOINT_R | HW_BREAKPOINT_W,
- (unsigned long)(&ksym_selftest_dummy));
-
- if (ret < 0) {
- printk(KERN_CONT "ksym_trace read-write startup test failed\n");
- goto ret_path;
- }
- /* Perform a read and a write operation over the dummy variable to
- * trigger the tracer
- */
- if (ksym_selftest_dummy == 0)
- ksym_selftest_dummy++;
-
- /* stop the tracing. */
- tracing_stop();
- /* check the trace buffer */
- ret = trace_test_buffer(tr, &count);
- trace->reset(tr);
- tracing_start();
-
- /* read & write operations - one each is performed on the dummy variable
- * triggering two entries in the trace buffer
- */
- if (!ret && count != 2) {
- printk(KERN_CONT "Ksym tracer startup test failed");
- ret = -1;
- }
-
-ret_path:
- return ret;
-}
-#endif /* CONFIG_KSYM_TRACER */
-
diff --git a/kernel/trace/trace_stack.c b/kernel/trace/trace_stack.c
index f4bc9b2..056468e 100644
--- a/kernel/trace/trace_stack.c
+++ b/kernel/trace/trace_stack.c
@@ -110,12 +110,12 @@
static void
stack_trace_call(unsigned long ip, unsigned long parent_ip)
{
- int cpu, resched;
+ int cpu;
if (unlikely(!ftrace_enabled || stack_trace_disabled))
return;
- resched = ftrace_preempt_disable();
+ preempt_disable_notrace();
cpu = raw_smp_processor_id();
/* no atomic needed, we only modify this variable by this cpu */
@@ -127,7 +127,7 @@
out:
per_cpu(trace_active, cpu)--;
/* prevent recursion in schedule */
- ftrace_preempt_enable(resched);
+ preempt_enable_notrace();
}
static struct ftrace_ops trace_ops __read_mostly =
diff --git a/kernel/trace/trace_syscalls.c b/kernel/trace/trace_syscalls.c
index 34e3580..bac752f 100644
--- a/kernel/trace/trace_syscalls.c
+++ b/kernel/trace/trace_syscalls.c
@@ -23,6 +23,9 @@
static int syscall_enter_define_fields(struct ftrace_event_call *call);
static int syscall_exit_define_fields(struct ftrace_event_call *call);
+/* All syscall exit events have the same fields */
+static LIST_HEAD(syscall_exit_fields);
+
static struct list_head *
syscall_get_enter_fields(struct ftrace_event_call *call)
{
@@ -34,9 +37,7 @@
static struct list_head *
syscall_get_exit_fields(struct ftrace_event_call *call)
{
- struct syscall_metadata *entry = call->data;
-
- return &entry->exit_fields;
+ return &syscall_exit_fields;
}
struct trace_event_functions enter_syscall_print_funcs = {
diff --git a/kernel/trace/trace_sysprof.c b/kernel/trace/trace_sysprof.c
deleted file mode 100644
index a7974a5..0000000
--- a/kernel/trace/trace_sysprof.c
+++ /dev/null
@@ -1,329 +0,0 @@
-/*
- * trace stack traces
- *
- * Copyright (C) 2004-2008, Soeren Sandmann
- * Copyright (C) 2007 Steven Rostedt <srostedt@redhat.com>
- * Copyright (C) 2008 Ingo Molnar <mingo@redhat.com>
- */
-#include <linux/kallsyms.h>
-#include <linux/debugfs.h>
-#include <linux/hrtimer.h>
-#include <linux/uaccess.h>
-#include <linux/ftrace.h>
-#include <linux/module.h>
-#include <linux/irq.h>
-#include <linux/fs.h>
-
-#include <asm/stacktrace.h>
-
-#include "trace.h"
-
-static struct trace_array *sysprof_trace;
-static int __read_mostly tracer_enabled;
-
-/*
- * 1 msec sample interval by default:
- */
-static unsigned long sample_period = 1000000;
-static const unsigned int sample_max_depth = 512;
-
-static DEFINE_MUTEX(sample_timer_lock);
-/*
- * Per CPU hrtimers that do the profiling:
- */
-static DEFINE_PER_CPU(struct hrtimer, stack_trace_hrtimer);
-
-struct stack_frame {
- const void __user *next_fp;
- unsigned long return_address;
-};
-
-static int copy_stack_frame(const void __user *fp, struct stack_frame *frame)
-{
- int ret;
-
- if (!access_ok(VERIFY_READ, fp, sizeof(*frame)))
- return 0;
-
- ret = 1;
- pagefault_disable();
- if (__copy_from_user_inatomic(frame, fp, sizeof(*frame)))
- ret = 0;
- pagefault_enable();
-
- return ret;
-}
-
-struct backtrace_info {
- struct trace_array_cpu *data;
- struct trace_array *tr;
- int pos;
-};
-
-static void
-backtrace_warning_symbol(void *data, char *msg, unsigned long symbol)
-{
- /* Ignore warnings */
-}
-
-static void backtrace_warning(void *data, char *msg)
-{
- /* Ignore warnings */
-}
-
-static int backtrace_stack(void *data, char *name)
-{
- /* Don't bother with IRQ stacks for now */
- return -1;
-}
-
-static void backtrace_address(void *data, unsigned long addr, int reliable)
-{
- struct backtrace_info *info = data;
-
- if (info->pos < sample_max_depth && reliable) {
- __trace_special(info->tr, info->data, 1, addr, 0);
-
- info->pos++;
- }
-}
-
-static const struct stacktrace_ops backtrace_ops = {
- .warning = backtrace_warning,
- .warning_symbol = backtrace_warning_symbol,
- .stack = backtrace_stack,
- .address = backtrace_address,
- .walk_stack = print_context_stack,
-};
-
-static int
-trace_kernel(struct pt_regs *regs, struct trace_array *tr,
- struct trace_array_cpu *data)
-{
- struct backtrace_info info;
- unsigned long bp;
- char *stack;
-
- info.tr = tr;
- info.data = data;
- info.pos = 1;
-
- __trace_special(info.tr, info.data, 1, regs->ip, 0);
-
- stack = ((char *)regs + sizeof(struct pt_regs));
-#ifdef CONFIG_FRAME_POINTER
- bp = regs->bp;
-#else
- bp = 0;
-#endif
-
- dump_trace(NULL, regs, (void *)stack, bp, &backtrace_ops, &info);
-
- return info.pos;
-}
-
-static void timer_notify(struct pt_regs *regs, int cpu)
-{
- struct trace_array_cpu *data;
- struct stack_frame frame;
- struct trace_array *tr;
- const void __user *fp;
- int is_user;
- int i;
-
- if (!regs)
- return;
-
- tr = sysprof_trace;
- data = tr->data[cpu];
- is_user = user_mode(regs);
-
- if (!current || current->pid == 0)
- return;
-
- if (is_user && current->state != TASK_RUNNING)
- return;
-
- __trace_special(tr, data, 0, 0, current->pid);
-
- if (!is_user)
- i = trace_kernel(regs, tr, data);
- else
- i = 0;
-
- /*
- * Trace user stack if we are not a kernel thread
- */
- if (current->mm && i < sample_max_depth) {
- regs = (struct pt_regs *)current->thread.sp0 - 1;
-
- fp = (void __user *)regs->bp;
-
- __trace_special(tr, data, 2, regs->ip, 0);
-
- while (i < sample_max_depth) {
- frame.next_fp = NULL;
- frame.return_address = 0;
- if (!copy_stack_frame(fp, &frame))
- break;
- if ((unsigned long)fp < regs->sp)
- break;
-
- __trace_special(tr, data, 2, frame.return_address,
- (unsigned long)fp);
- fp = frame.next_fp;
-
- i++;
- }
-
- }
-
- /*
- * Special trace entry if we overflow the max depth:
- */
- if (i == sample_max_depth)
- __trace_special(tr, data, -1, -1, -1);
-
- __trace_special(tr, data, 3, current->pid, i);
-}
-
-static enum hrtimer_restart stack_trace_timer_fn(struct hrtimer *hrtimer)
-{
- /* trace here */
- timer_notify(get_irq_regs(), smp_processor_id());
-
- hrtimer_forward_now(hrtimer, ns_to_ktime(sample_period));
-
- return HRTIMER_RESTART;
-}
-
-static void start_stack_timer(void *unused)
-{
- struct hrtimer *hrtimer = &__get_cpu_var(stack_trace_hrtimer);
-
- hrtimer_init(hrtimer, CLOCK_MONOTONIC, HRTIMER_MODE_REL);
- hrtimer->function = stack_trace_timer_fn;
-
- hrtimer_start(hrtimer, ns_to_ktime(sample_period),
- HRTIMER_MODE_REL_PINNED);
-}
-
-static void start_stack_timers(void)
-{
- on_each_cpu(start_stack_timer, NULL, 1);
-}
-
-static void stop_stack_timer(int cpu)
-{
- struct hrtimer *hrtimer = &per_cpu(stack_trace_hrtimer, cpu);
-
- hrtimer_cancel(hrtimer);
-}
-
-static void stop_stack_timers(void)
-{
- int cpu;
-
- for_each_online_cpu(cpu)
- stop_stack_timer(cpu);
-}
-
-static void stop_stack_trace(struct trace_array *tr)
-{
- mutex_lock(&sample_timer_lock);
- stop_stack_timers();
- tracer_enabled = 0;
- mutex_unlock(&sample_timer_lock);
-}
-
-static int stack_trace_init(struct trace_array *tr)
-{
- sysprof_trace = tr;
-
- tracing_start_cmdline_record();
-
- mutex_lock(&sample_timer_lock);
- start_stack_timers();
- tracer_enabled = 1;
- mutex_unlock(&sample_timer_lock);
- return 0;
-}
-
-static void stack_trace_reset(struct trace_array *tr)
-{
- tracing_stop_cmdline_record();
- stop_stack_trace(tr);
-}
-
-static struct tracer stack_trace __read_mostly =
-{
- .name = "sysprof",
- .init = stack_trace_init,
- .reset = stack_trace_reset,
-#ifdef CONFIG_FTRACE_SELFTEST
- .selftest = trace_selftest_startup_sysprof,
-#endif
-};
-
-__init static int init_stack_trace(void)
-{
- return register_tracer(&stack_trace);
-}
-device_initcall(init_stack_trace);
-
-#define MAX_LONG_DIGITS 22
-
-static ssize_t
-sysprof_sample_read(struct file *filp, char __user *ubuf,
- size_t cnt, loff_t *ppos)
-{
- char buf[MAX_LONG_DIGITS];
- int r;
-
- r = sprintf(buf, "%ld\n", nsecs_to_usecs(sample_period));
-
- return simple_read_from_buffer(ubuf, cnt, ppos, buf, r);
-}
-
-static ssize_t
-sysprof_sample_write(struct file *filp, const char __user *ubuf,
- size_t cnt, loff_t *ppos)
-{
- char buf[MAX_LONG_DIGITS];
- unsigned long val;
-
- if (cnt > MAX_LONG_DIGITS-1)
- cnt = MAX_LONG_DIGITS-1;
-
- if (copy_from_user(&buf, ubuf, cnt))
- return -EFAULT;
-
- buf[cnt] = 0;
-
- val = simple_strtoul(buf, NULL, 10);
- /*
- * Enforce a minimum sample period of 100 usecs:
- */
- if (val < 100)
- val = 100;
-
- mutex_lock(&sample_timer_lock);
- stop_stack_timers();
- sample_period = val * 1000;
- start_stack_timers();
- mutex_unlock(&sample_timer_lock);
-
- return cnt;
-}
-
-static const struct file_operations sysprof_sample_fops = {
- .read = sysprof_sample_read,
- .write = sysprof_sample_write,
-};
-
-void init_tracer_sysprof_debugfs(struct dentry *d_tracer)
-{
-
- trace_create_file("sysprof_sample_period", 0644,
- d_tracer, NULL, &sysprof_sample_fops);
-}
diff --git a/kernel/watchdog.c b/kernel/watchdog.c
new file mode 100644
index 0000000..613bc1f
--- /dev/null
+++ b/kernel/watchdog.c
@@ -0,0 +1,567 @@
+/*
+ * Detect hard and soft lockups on a system
+ *
+ * started by Don Zickus, Copyright (C) 2010 Red Hat, Inc.
+ *
+ * this code detects hard lockups: incidents in where on a CPU
+ * the kernel does not respond to anything except NMI.
+ *
+ * Note: Most of this code is borrowed heavily from softlockup.c,
+ * so thanks to Ingo for the initial implementation.
+ * Some chunks also taken from arch/x86/kernel/apic/nmi.c, thanks
+ * to those contributors as well.
+ */
+
+#include <linux/mm.h>
+#include <linux/cpu.h>
+#include <linux/nmi.h>
+#include <linux/init.h>
+#include <linux/delay.h>
+#include <linux/freezer.h>
+#include <linux/kthread.h>
+#include <linux/lockdep.h>
+#include <linux/notifier.h>
+#include <linux/module.h>
+#include <linux/sysctl.h>
+
+#include <asm/irq_regs.h>
+#include <linux/perf_event.h>
+
+int watchdog_enabled;
+int __read_mostly softlockup_thresh = 60;
+
+static DEFINE_PER_CPU(unsigned long, watchdog_touch_ts);
+static DEFINE_PER_CPU(struct task_struct *, softlockup_watchdog);
+static DEFINE_PER_CPU(struct hrtimer, watchdog_hrtimer);
+static DEFINE_PER_CPU(bool, softlockup_touch_sync);
+static DEFINE_PER_CPU(bool, soft_watchdog_warn);
+#ifdef CONFIG_HARDLOCKUP_DETECTOR
+static DEFINE_PER_CPU(bool, hard_watchdog_warn);
+static DEFINE_PER_CPU(bool, watchdog_nmi_touch);
+static DEFINE_PER_CPU(unsigned long, hrtimer_interrupts);
+static DEFINE_PER_CPU(unsigned long, hrtimer_interrupts_saved);
+static DEFINE_PER_CPU(struct perf_event *, watchdog_ev);
+#endif
+
+static int __read_mostly did_panic;
+static int __initdata no_watchdog;
+
+
+/* boot commands */
+/*
+ * Should we panic when a soft-lockup or hard-lockup occurs:
+ */
+#ifdef CONFIG_HARDLOCKUP_DETECTOR
+static int hardlockup_panic;
+
+static int __init hardlockup_panic_setup(char *str)
+{
+ if (!strncmp(str, "panic", 5))
+ hardlockup_panic = 1;
+ return 1;
+}
+__setup("nmi_watchdog=", hardlockup_panic_setup);
+#endif
+
+unsigned int __read_mostly softlockup_panic =
+ CONFIG_BOOTPARAM_SOFTLOCKUP_PANIC_VALUE;
+
+static int __init softlockup_panic_setup(char *str)
+{
+ softlockup_panic = simple_strtoul(str, NULL, 0);
+
+ return 1;
+}
+__setup("softlockup_panic=", softlockup_panic_setup);
+
+static int __init nowatchdog_setup(char *str)
+{
+ no_watchdog = 1;
+ return 1;
+}
+__setup("nowatchdog", nowatchdog_setup);
+
+/* deprecated */
+static int __init nosoftlockup_setup(char *str)
+{
+ no_watchdog = 1;
+ return 1;
+}
+__setup("nosoftlockup", nosoftlockup_setup);
+/* */
+
+
+/*
+ * Returns seconds, approximately. We don't need nanosecond
+ * resolution, and we don't need to waste time with a big divide when
+ * 2^30ns == 1.074s.
+ */
+static unsigned long get_timestamp(int this_cpu)
+{
+ return cpu_clock(this_cpu) >> 30LL; /* 2^30 ~= 10^9 */
+}
+
+static unsigned long get_sample_period(void)
+{
+ /*
+ * convert softlockup_thresh from seconds to ns
+ * the divide by 5 is to give hrtimer 5 chances to
+ * increment before the hardlockup detector generates
+ * a warning
+ */
+ return softlockup_thresh / 5 * NSEC_PER_SEC;
+}
+
+/* Commands for resetting the watchdog */
+static void __touch_watchdog(void)
+{
+ int this_cpu = smp_processor_id();
+
+ __get_cpu_var(watchdog_touch_ts) = get_timestamp(this_cpu);
+}
+
+void touch_softlockup_watchdog(void)
+{
+ __get_cpu_var(watchdog_touch_ts) = 0;
+}
+EXPORT_SYMBOL(touch_softlockup_watchdog);
+
+void touch_all_softlockup_watchdogs(void)
+{
+ int cpu;
+
+ /*
+ * this is done lockless
+ * do we care if a 0 races with a timestamp?
+ * all it means is the softlock check starts one cycle later
+ */
+ for_each_online_cpu(cpu)
+ per_cpu(watchdog_touch_ts, cpu) = 0;
+}
+
+#ifdef CONFIG_HARDLOCKUP_DETECTOR
+void touch_nmi_watchdog(void)
+{
+ __get_cpu_var(watchdog_nmi_touch) = true;
+ touch_softlockup_watchdog();
+}
+EXPORT_SYMBOL(touch_nmi_watchdog);
+
+#endif
+
+void touch_softlockup_watchdog_sync(void)
+{
+ __raw_get_cpu_var(softlockup_touch_sync) = true;
+ __raw_get_cpu_var(watchdog_touch_ts) = 0;
+}
+
+#ifdef CONFIG_HARDLOCKUP_DETECTOR
+/* watchdog detector functions */
+static int is_hardlockup(void)
+{
+ unsigned long hrint = __get_cpu_var(hrtimer_interrupts);
+
+ if (__get_cpu_var(hrtimer_interrupts_saved) == hrint)
+ return 1;
+
+ __get_cpu_var(hrtimer_interrupts_saved) = hrint;
+ return 0;
+}
+#endif
+
+static int is_softlockup(unsigned long touch_ts)
+{
+ unsigned long now = get_timestamp(smp_processor_id());
+
+ /* Warn about unreasonable delays: */
+ if (time_after(now, touch_ts + softlockup_thresh))
+ return now - touch_ts;
+
+ return 0;
+}
+
+static int
+watchdog_panic(struct notifier_block *this, unsigned long event, void *ptr)
+{
+ did_panic = 1;
+
+ return NOTIFY_DONE;
+}
+
+static struct notifier_block panic_block = {
+ .notifier_call = watchdog_panic,
+};
+
+#ifdef CONFIG_HARDLOCKUP_DETECTOR
+static struct perf_event_attr wd_hw_attr = {
+ .type = PERF_TYPE_HARDWARE,
+ .config = PERF_COUNT_HW_CPU_CYCLES,
+ .size = sizeof(struct perf_event_attr),
+ .pinned = 1,
+ .disabled = 1,
+};
+
+/* Callback function for perf event subsystem */
+void watchdog_overflow_callback(struct perf_event *event, int nmi,
+ struct perf_sample_data *data,
+ struct pt_regs *regs)
+{
+ if (__get_cpu_var(watchdog_nmi_touch) == true) {
+ __get_cpu_var(watchdog_nmi_touch) = false;
+ return;
+ }
+
+ /* check for a hardlockup
+ * This is done by making sure our timer interrupt
+ * is incrementing. The timer interrupt should have
+ * fired multiple times before we overflow'd. If it hasn't
+ * then this is a good indication the cpu is stuck
+ */
+ if (is_hardlockup()) {
+ int this_cpu = smp_processor_id();
+
+ /* only print hardlockups once */
+ if (__get_cpu_var(hard_watchdog_warn) == true)
+ return;
+
+ if (hardlockup_panic)
+ panic("Watchdog detected hard LOCKUP on cpu %d", this_cpu);
+ else
+ WARN(1, "Watchdog detected hard LOCKUP on cpu %d", this_cpu);
+
+ __get_cpu_var(hard_watchdog_warn) = true;
+ return;
+ }
+
+ __get_cpu_var(hard_watchdog_warn) = false;
+ return;
+}
+static void watchdog_interrupt_count(void)
+{
+ __get_cpu_var(hrtimer_interrupts)++;
+}
+#else
+static inline void watchdog_interrupt_count(void) { return; }
+#endif /* CONFIG_HARDLOCKUP_DETECTOR */
+
+/* watchdog kicker functions */
+static enum hrtimer_restart watchdog_timer_fn(struct hrtimer *hrtimer)
+{
+ unsigned long touch_ts = __get_cpu_var(watchdog_touch_ts);
+ struct pt_regs *regs = get_irq_regs();
+ int duration;
+
+ /* kick the hardlockup detector */
+ watchdog_interrupt_count();
+
+ /* kick the softlockup detector */
+ wake_up_process(__get_cpu_var(softlockup_watchdog));
+
+ /* .. and repeat */
+ hrtimer_forward_now(hrtimer, ns_to_ktime(get_sample_period()));
+
+ if (touch_ts == 0) {
+ if (unlikely(__get_cpu_var(softlockup_touch_sync))) {
+ /*
+ * If the time stamp was touched atomically
+ * make sure the scheduler tick is up to date.
+ */
+ __get_cpu_var(softlockup_touch_sync) = false;
+ sched_clock_tick();
+ }
+ __touch_watchdog();
+ return HRTIMER_RESTART;
+ }
+
+ /* check for a softlockup
+ * This is done by making sure a high priority task is
+ * being scheduled. The task touches the watchdog to
+ * indicate it is getting cpu time. If it hasn't then
+ * this is a good indication some task is hogging the cpu
+ */
+ duration = is_softlockup(touch_ts);
+ if (unlikely(duration)) {
+ /* only warn once */
+ if (__get_cpu_var(soft_watchdog_warn) == true)
+ return HRTIMER_RESTART;
+
+ printk(KERN_ERR "BUG: soft lockup - CPU#%d stuck for %us! [%s:%d]\n",
+ smp_processor_id(), duration,
+ current->comm, task_pid_nr(current));
+ print_modules();
+ print_irqtrace_events(current);
+ if (regs)
+ show_regs(regs);
+ else
+ dump_stack();
+
+ if (softlockup_panic)
+ panic("softlockup: hung tasks");
+ __get_cpu_var(soft_watchdog_warn) = true;
+ } else
+ __get_cpu_var(soft_watchdog_warn) = false;
+
+ return HRTIMER_RESTART;
+}
+
+
+/*
+ * The watchdog thread - touches the timestamp.
+ */
+static int watchdog(void *unused)
+{
+ struct sched_param param = { .sched_priority = MAX_RT_PRIO-1 };
+ struct hrtimer *hrtimer = &__raw_get_cpu_var(watchdog_hrtimer);
+
+ sched_setscheduler(current, SCHED_FIFO, ¶m);
+
+ /* initialize timestamp */
+ __touch_watchdog();
+
+ /* kick off the timer for the hardlockup detector */
+ /* done here because hrtimer_start can only pin to smp_processor_id() */
+ hrtimer_start(hrtimer, ns_to_ktime(get_sample_period()),
+ HRTIMER_MODE_REL_PINNED);
+
+ set_current_state(TASK_INTERRUPTIBLE);
+ /*
+ * Run briefly once per second to reset the softlockup timestamp.
+ * If this gets delayed for more than 60 seconds then the
+ * debug-printout triggers in watchdog_timer_fn().
+ */
+ while (!kthread_should_stop()) {
+ __touch_watchdog();
+ schedule();
+
+ if (kthread_should_stop())
+ break;
+
+ set_current_state(TASK_INTERRUPTIBLE);
+ }
+ __set_current_state(TASK_RUNNING);
+
+ return 0;
+}
+
+
+#ifdef CONFIG_HARDLOCKUP_DETECTOR
+static int watchdog_nmi_enable(int cpu)
+{
+ struct perf_event_attr *wd_attr;
+ struct perf_event *event = per_cpu(watchdog_ev, cpu);
+
+ /* is it already setup and enabled? */
+ if (event && event->state > PERF_EVENT_STATE_OFF)
+ goto out;
+
+ /* it is setup but not enabled */
+ if (event != NULL)
+ goto out_enable;
+
+ /* Try to register using hardware perf events */
+ wd_attr = &wd_hw_attr;
+ wd_attr->sample_period = hw_nmi_get_sample_period();
+ event = perf_event_create_kernel_counter(wd_attr, cpu, -1, watchdog_overflow_callback);
+ if (!IS_ERR(event)) {
+ printk(KERN_INFO "NMI watchdog enabled, takes one hw-pmu counter.\n");
+ goto out_save;
+ }
+
+ printk(KERN_ERR "NMI watchdog failed to create perf event on cpu%i: %p\n", cpu, event);
+ return -1;
+
+ /* success path */
+out_save:
+ per_cpu(watchdog_ev, cpu) = event;
+out_enable:
+ perf_event_enable(per_cpu(watchdog_ev, cpu));
+out:
+ return 0;
+}
+
+static void watchdog_nmi_disable(int cpu)
+{
+ struct perf_event *event = per_cpu(watchdog_ev, cpu);
+
+ if (event) {
+ perf_event_disable(event);
+ per_cpu(watchdog_ev, cpu) = NULL;
+
+ /* should be in cleanup, but blocks oprofile */
+ perf_event_release_kernel(event);
+ }
+ return;
+}
+#else
+static int watchdog_nmi_enable(int cpu) { return 0; }
+static void watchdog_nmi_disable(int cpu) { return; }
+#endif /* CONFIG_HARDLOCKUP_DETECTOR */
+
+/* prepare/enable/disable routines */
+static int watchdog_prepare_cpu(int cpu)
+{
+ struct hrtimer *hrtimer = &per_cpu(watchdog_hrtimer, cpu);
+
+ WARN_ON(per_cpu(softlockup_watchdog, cpu));
+ hrtimer_init(hrtimer, CLOCK_MONOTONIC, HRTIMER_MODE_REL);
+ hrtimer->function = watchdog_timer_fn;
+
+ return 0;
+}
+
+static int watchdog_enable(int cpu)
+{
+ struct task_struct *p = per_cpu(softlockup_watchdog, cpu);
+
+ /* enable the perf event */
+ if (watchdog_nmi_enable(cpu) != 0)
+ return -1;
+
+ /* create the watchdog thread */
+ if (!p) {
+ p = kthread_create(watchdog, (void *)(unsigned long)cpu, "watchdog/%d", cpu);
+ if (IS_ERR(p)) {
+ printk(KERN_ERR "softlockup watchdog for %i failed\n", cpu);
+ return -1;
+ }
+ kthread_bind(p, cpu);
+ per_cpu(watchdog_touch_ts, cpu) = 0;
+ per_cpu(softlockup_watchdog, cpu) = p;
+ wake_up_process(p);
+ }
+
+ return 0;
+}
+
+static void watchdog_disable(int cpu)
+{
+ struct task_struct *p = per_cpu(softlockup_watchdog, cpu);
+ struct hrtimer *hrtimer = &per_cpu(watchdog_hrtimer, cpu);
+
+ /*
+ * cancel the timer first to stop incrementing the stats
+ * and waking up the kthread
+ */
+ hrtimer_cancel(hrtimer);
+
+ /* disable the perf event */
+ watchdog_nmi_disable(cpu);
+
+ /* stop the watchdog thread */
+ if (p) {
+ per_cpu(softlockup_watchdog, cpu) = NULL;
+ kthread_stop(p);
+ }
+
+ /* if any cpu succeeds, watchdog is considered enabled for the system */
+ watchdog_enabled = 1;
+}
+
+static void watchdog_enable_all_cpus(void)
+{
+ int cpu;
+ int result = 0;
+
+ for_each_online_cpu(cpu)
+ result += watchdog_enable(cpu);
+
+ if (result)
+ printk(KERN_ERR "watchdog: failed to be enabled on some cpus\n");
+
+}
+
+static void watchdog_disable_all_cpus(void)
+{
+ int cpu;
+
+ for_each_online_cpu(cpu)
+ watchdog_disable(cpu);
+
+ /* if all watchdogs are disabled, then they are disabled for the system */
+ watchdog_enabled = 0;
+}
+
+
+/* sysctl functions */
+#ifdef CONFIG_SYSCTL
+/*
+ * proc handler for /proc/sys/kernel/nmi_watchdog
+ */
+
+int proc_dowatchdog_enabled(struct ctl_table *table, int write,
+ void __user *buffer, size_t *length, loff_t *ppos)
+{
+ proc_dointvec(table, write, buffer, length, ppos);
+
+ if (watchdog_enabled)
+ watchdog_enable_all_cpus();
+ else
+ watchdog_disable_all_cpus();
+ return 0;
+}
+
+int proc_dowatchdog_thresh(struct ctl_table *table, int write,
+ void __user *buffer,
+ size_t *lenp, loff_t *ppos)
+{
+ return proc_dointvec_minmax(table, write, buffer, lenp, ppos);
+}
+#endif /* CONFIG_SYSCTL */
+
+
+/*
+ * Create/destroy watchdog threads as CPUs come and go:
+ */
+static int __cpuinit
+cpu_callback(struct notifier_block *nfb, unsigned long action, void *hcpu)
+{
+ int hotcpu = (unsigned long)hcpu;
+
+ switch (action) {
+ case CPU_UP_PREPARE:
+ case CPU_UP_PREPARE_FROZEN:
+ if (watchdog_prepare_cpu(hotcpu))
+ return NOTIFY_BAD;
+ break;
+ case CPU_ONLINE:
+ case CPU_ONLINE_FROZEN:
+ if (watchdog_enable(hotcpu))
+ return NOTIFY_BAD;
+ break;
+#ifdef CONFIG_HOTPLUG_CPU
+ case CPU_UP_CANCELED:
+ case CPU_UP_CANCELED_FROZEN:
+ watchdog_disable(hotcpu);
+ break;
+ case CPU_DEAD:
+ case CPU_DEAD_FROZEN:
+ watchdog_disable(hotcpu);
+ break;
+#endif /* CONFIG_HOTPLUG_CPU */
+ }
+ return NOTIFY_OK;
+}
+
+static struct notifier_block __cpuinitdata cpu_nfb = {
+ .notifier_call = cpu_callback
+};
+
+static int __init spawn_watchdog_task(void)
+{
+ void *cpu = (void *)(long)smp_processor_id();
+ int err;
+
+ if (no_watchdog)
+ return 0;
+
+ err = cpu_callback(&cpu_nfb, CPU_UP_PREPARE, cpu);
+ WARN_ON(err == NOTIFY_BAD);
+
+ cpu_callback(&cpu_nfb, CPU_ONLINE, cpu);
+ register_cpu_notifier(&cpu_nfb);
+
+ atomic_notifier_chain_register(&panic_notifier_list, &panic_block);
+
+ return 0;
+}
+early_initcall(spawn_watchdog_task);
diff --git a/lib/Kconfig.debug b/lib/Kconfig.debug
index e80d6bf..ff87ddc 100644
--- a/lib/Kconfig.debug
+++ b/lib/Kconfig.debug
@@ -152,28 +152,33 @@
Drivers ought to be able to handle interrupts coming in at those
points; some don't and need to be caught.
-config DETECT_SOFTLOCKUP
- bool "Detect Soft Lockups"
+config LOCKUP_DETECTOR
+ bool "Detect Hard and Soft Lockups"
depends on DEBUG_KERNEL && !S390
- default y
help
- Say Y here to enable the kernel to detect "soft lockups",
- which are bugs that cause the kernel to loop in kernel
+ Say Y here to enable the kernel to act as a watchdog to detect
+ hard and soft lockups.
+
+ Softlockups are bugs that cause the kernel to loop in kernel
mode for more than 60 seconds, without giving other tasks a
- chance to run.
+ chance to run. The current stack trace is displayed upon
+ detection and the system will stay locked up.
- When a soft-lockup is detected, the kernel will print the
- current stack trace (which you should report), but the
- system will stay locked up. This feature has negligible
- overhead.
+ Hardlockups are bugs that cause the CPU to loop in kernel mode
+ for more than 60 seconds, without letting other interrupts have a
+ chance to run. The current stack trace is displayed upon detection
+ and the system will stay locked up.
- (Note that "hard lockups" are separate type of bugs that
- can be detected via the NMI-watchdog, on platforms that
- support it.)
+ The overhead should be minimal. A periodic hrtimer runs to
+ generate interrupts and kick the watchdog task every 10-12 seconds.
+ An NMI is generated every 60 seconds or so to check for hardlockups.
+
+config HARDLOCKUP_DETECTOR
+ def_bool LOCKUP_DETECTOR && PERF_EVENTS && HAVE_PERF_EVENTS_NMI
config BOOTPARAM_SOFTLOCKUP_PANIC
bool "Panic (Reboot) On Soft Lockups"
- depends on DETECT_SOFTLOCKUP
+ depends on LOCKUP_DETECTOR
help
Say Y here to enable the kernel to panic on "soft lockups",
which are bugs that cause the kernel to loop in kernel
@@ -190,7 +195,7 @@
config BOOTPARAM_SOFTLOCKUP_PANIC_VALUE
int
- depends on DETECT_SOFTLOCKUP
+ depends on LOCKUP_DETECTOR
range 0 1
default 0 if !BOOTPARAM_SOFTLOCKUP_PANIC
default 1 if BOOTPARAM_SOFTLOCKUP_PANIC
diff --git a/mm/mmap.c b/mm/mmap.c
index 456ec6f..e38e910 100644
--- a/mm/mmap.c
+++ b/mm/mmap.c
@@ -1734,8 +1734,10 @@
grow = (address - vma->vm_end) >> PAGE_SHIFT;
error = acct_stack_growth(vma, size, grow);
- if (!error)
+ if (!error) {
vma->vm_end = address;
+ perf_event_mmap(vma);
+ }
}
anon_vma_unlock(vma);
return error;
@@ -1781,6 +1783,7 @@
if (!error) {
vma->vm_start = address;
vma->vm_pgoff -= grow;
+ perf_event_mmap(vma);
}
}
anon_vma_unlock(vma);
@@ -2208,6 +2211,7 @@
vma->vm_page_prot = vm_get_page_prot(flags);
vma_link(mm, vma, prev, rb_link, rb_parent);
out:
+ perf_event_mmap(vma);
mm->total_vm += len >> PAGE_SHIFT;
if (flags & VM_LOCKED) {
if (!mlock_vma_pages_range(vma, addr, addr + len))
diff --git a/mm/slab.c b/mm/slab.c
index e49f8f4..47360c3e 100644
--- a/mm/slab.c
+++ b/mm/slab.c
@@ -102,7 +102,6 @@
#include <linux/cpu.h>
#include <linux/sysctl.h>
#include <linux/module.h>
-#include <linux/kmemtrace.h>
#include <linux/rcupdate.h>
#include <linux/string.h>
#include <linux/uaccess.h>
diff --git a/mm/slob.c b/mm/slob.c
index 19d2e5d..3f19a34 100644
--- a/mm/slob.c
+++ b/mm/slob.c
@@ -66,8 +66,10 @@
#include <linux/module.h>
#include <linux/rcupdate.h>
#include <linux/list.h>
-#include <linux/kmemtrace.h>
#include <linux/kmemleak.h>
+
+#include <trace/events/kmem.h>
+
#include <asm/atomic.h>
/*
diff --git a/mm/slub.c b/mm/slub.c
index 578f68f..7bb7940 100644
--- a/mm/slub.c
+++ b/mm/slub.c
@@ -17,7 +17,6 @@
#include <linux/slab.h>
#include <linux/proc_fs.h>
#include <linux/seq_file.h>
-#include <linux/kmemtrace.h>
#include <linux/kmemcheck.h>
#include <linux/cpu.h>
#include <linux/cpuset.h>
diff --git a/scripts/package/Makefile b/scripts/package/Makefile
index d2c29b6..d0b931b 100644
--- a/scripts/package/Makefile
+++ b/scripts/package/Makefile
@@ -111,13 +111,38 @@
clean-dirs += $(objtree)/tar-install/
+# perf-pkg - generate a source tarball with perf source
+# ---------------------------------------------------------------------------
+
+perf-tar=perf-$(KERNELVERSION)
+
+quiet_cmd_perf_tar = TAR
+ cmd_perf_tar = \
+git archive --prefix=$(perf-tar)/ HEAD^{tree} \
+ $$(cat $(srctree)/tools/perf/MANIFEST) -o $(perf-tar).tar; \
+mkdir -p $(perf-tar); \
+git rev-parse HEAD > $(perf-tar)/HEAD; \
+tar rf $(perf-tar).tar $(perf-tar)/HEAD; \
+rm -r $(perf-tar); \
+$(if $(findstring tar-src,$@),, \
+$(if $(findstring bz2,$@),bzip2, \
+$(if $(findstring gz,$@),gzip, \
+$(error unknown target $@))) \
+ -f -9 $(perf-tar).tar)
+
+perf-%pkg: FORCE
+ $(call cmd,perf_tar)
+
# Help text displayed when executing 'make help'
# ---------------------------------------------------------------------------
help: FORCE
- @echo ' rpm-pkg - Build both source and binary RPM kernel packages'
- @echo ' binrpm-pkg - Build only the binary kernel package'
- @echo ' deb-pkg - Build the kernel as an deb package'
- @echo ' tar-pkg - Build the kernel as an uncompressed tarball'
- @echo ' targz-pkg - Build the kernel as a gzip compressed tarball'
- @echo ' tarbz2-pkg - Build the kernel as a bzip2 compressed tarball'
+ @echo ' rpm-pkg - Build both source and binary RPM kernel packages'
+ @echo ' binrpm-pkg - Build only the binary kernel package'
+ @echo ' deb-pkg - Build the kernel as an deb package'
+ @echo ' tar-pkg - Build the kernel as an uncompressed tarball'
+ @echo ' targz-pkg - Build the kernel as a gzip compressed tarball'
+ @echo ' tarbz2-pkg - Build the kernel as a bzip2 compressed tarball'
+ @echo ' perf-tar-src-pkg - Build $(perf-tar).tar source tarball'
+ @echo ' perf-targz-src-pkg - Build $(perf-tar).tar.gz source tarball'
+ @echo ' perf-tarbz2-src-pkg - Build $(perf-tar).tar.bz2 source tarball'
diff --git a/scripts/recordmcount.pl b/scripts/recordmcount.pl
index f3c9c0a..0171060 100755
--- a/scripts/recordmcount.pl
+++ b/scripts/recordmcount.pl
@@ -326,7 +326,7 @@
# 14: R_MIPS_NONE *ABS*
# 18: 00020021 nop
if ($is_module eq "0") {
- $mcount_regex = "^\\s*([0-9a-fA-F]+):.*\\s_mcount\$";
+ $mcount_regex = "^\\s*([0-9a-fA-F]+): R_MIPS_26\\s+_mcount\$";
} else {
$mcount_regex = "^\\s*([0-9a-fA-F]+): R_MIPS_HI16\\s+_mcount\$";
}
diff --git a/tools/perf/.gitignore b/tools/perf/.gitignore
index e1d60d7..cb43289 100644
--- a/tools/perf/.gitignore
+++ b/tools/perf/.gitignore
@@ -18,3 +18,5 @@
tags
TAGS
cscope*
+config.mak
+config.mak.autogen
diff --git a/tools/perf/Documentation/perf-buildid-cache.txt b/tools/perf/Documentation/perf-buildid-cache.txt
index 5d1a950..c105770 100644
--- a/tools/perf/Documentation/perf-buildid-cache.txt
+++ b/tools/perf/Documentation/perf-buildid-cache.txt
@@ -12,9 +12,9 @@
DESCRIPTION
-----------
-This command manages the build-id cache. It can add and remove files to the
-cache. In the future it should as well purge older entries, set upper limits
-for the space used by the cache, etc.
+This command manages the build-id cache. It can add and remove files to/from
+the cache. In the future it should as well purge older entries, set upper
+limits for the space used by the cache, etc.
OPTIONS
-------
@@ -23,7 +23,7 @@
Add specified file to the cache.
-r::
--remove=::
- Remove specified file to the cache.
+ Remove specified file from the cache.
-v::
--verbose::
Be more verbose.
diff --git a/tools/perf/Documentation/perf-probe.txt b/tools/perf/Documentation/perf-probe.txt
index 94a258c..27d52da 100644
--- a/tools/perf/Documentation/perf-probe.txt
+++ b/tools/perf/Documentation/perf-probe.txt
@@ -31,6 +31,10 @@
--vmlinux=PATH::
Specify vmlinux path which has debuginfo (Dwarf binary).
+-s::
+--source=PATH::
+ Specify path to kernel source.
+
-v::
--verbose::
Be more verbose (show parsed arguments, etc).
@@ -90,8 +94,8 @@
[NAME=]LOCALVAR|$retval|%REG|@SYMBOL[:TYPE]
-'NAME' specifies the name of this argument (optional). You can use the name of local variable, local data structure member (e.g. var->field, var.field2), or kprobe-tracer argument format (e.g. $retval, %ax, etc). Note that the name of this argument will be set as the last member name if you specify a local data structure member (e.g. field2 for 'var->field1.field2'.)
-'TYPE' casts the type of this argument (optional). If omitted, perf probe automatically set the type based on debuginfo.
+'NAME' specifies the name of this argument (optional). You can use the name of local variable, local data structure member (e.g. var->field, var.field2), local array with fixed index (e.g. array[1], var->array[0], var->pointer[2]), or kprobe-tracer argument format (e.g. $retval, %ax, etc). Note that the name of this argument will be set as the last member name if you specify a local data structure member (e.g. field2 for 'var->field1.field2'.)
+'TYPE' casts the type of this argument (optional). If omitted, perf probe automatically set the type based on debuginfo. You can specify 'string' type only for the local variable or structure member which is an array of or a pointer to 'char' or 'unsigned char' type.
LINE SYNTAX
-----------
diff --git a/tools/perf/Documentation/perf-record.txt b/tools/perf/Documentation/perf-record.txt
index 34e255f..3ee27dc 100644
--- a/tools/perf/Documentation/perf-record.txt
+++ b/tools/perf/Documentation/perf-record.txt
@@ -103,6 +103,19 @@
--raw-samples::
Collect raw sample records from all opened counters (default for tracepoint counters).
+-C::
+--cpu::
+Collect samples only on the list of cpus provided. Multiple CPUs can be provided as a
+comma-sperated list with no space: 0,1. Ranges of CPUs are specified with -: 0-2.
+In per-thread mode with inheritance mode on (default), samples are captured only when
+the thread executes on the designated CPUs. Default is to monitor all CPUs.
+
+-N::
+--no-buildid-cache::
+Do not update the builid cache. This saves some overhead in situations
+where the information in the perf.data file (which includes buildids)
+is sufficient.
+
SEE ALSO
--------
linkperf:perf-stat[1], linkperf:perf-list[1]
diff --git a/tools/perf/Documentation/perf-stat.txt b/tools/perf/Documentation/perf-stat.txt
index 909fa76..4b3a2d4 100644
--- a/tools/perf/Documentation/perf-stat.txt
+++ b/tools/perf/Documentation/perf-stat.txt
@@ -46,6 +46,13 @@
-B::
print large numbers with thousands' separators according to locale
+-C::
+--cpu=::
+Count only on the list of cpus provided. Multiple CPUs can be provided as a
+comma-sperated list with no space: 0,1. Ranges of CPUs are specified with -: 0-2.
+In per-thread mode, this option is ignored. The -a option is still necessary
+to activate system-wide monitoring. Default is to count on all CPUs.
+
EXAMPLES
--------
diff --git a/tools/perf/Documentation/perf-top.txt b/tools/perf/Documentation/perf-top.txt
index 785b9fc..1f96876 100644
--- a/tools/perf/Documentation/perf-top.txt
+++ b/tools/perf/Documentation/perf-top.txt
@@ -25,9 +25,11 @@
--count=<count>::
Event period to sample.
--C <cpu>::
---CPU=<cpu>::
- CPU to profile.
+-C <cpu-list>::
+--cpu=<cpu>::
+Monitor only on the list of cpus provided. Multiple CPUs can be provided as a
+comma-sperated list with no space: 0,1. Ranges of CPUs are specified with -: 0-2.
+Default is to monitor all CPUS.
-d <seconds>::
--delay=<seconds>::
diff --git a/tools/perf/MANIFEST b/tools/perf/MANIFEST
new file mode 100644
index 0000000..8c7fc0c
--- /dev/null
+++ b/tools/perf/MANIFEST
@@ -0,0 +1,12 @@
+tools/perf
+include/linux/perf_event.h
+include/linux/rbtree.h
+include/linux/list.h
+include/linux/hash.h
+include/linux/stringify.h
+lib/rbtree.c
+include/linux/swab.h
+arch/*/include/asm/unistd*.h
+include/linux/poison.h
+include/linux/magic.h
+include/linux/hw_breakpoint.h
diff --git a/tools/perf/Makefile b/tools/perf/Makefile
index d75c28a..26f626d 100644
--- a/tools/perf/Makefile
+++ b/tools/perf/Makefile
@@ -285,14 +285,10 @@
QUIET_STDERR = ">/dev/null 2>&1"
endif
-BITBUCKET = "/dev/null"
+-include feature-tests.mak
-ifneq ($(shell sh -c "(echo '\#include <stdio.h>'; echo 'int main(void) { return puts(\"hi\"); }') | $(CC) -x c - $(ALL_CFLAGS) -o $(BITBUCKET) "$(QUIET_STDERR)" && echo y"), y)
- BITBUCKET = .perf.dev.null
-endif
-
-ifeq ($(shell sh -c "echo 'int foo(void) {char X[2]; return 3;}' | $(CC) -x c -c -Werror -fstack-protector-all - -o $(BITBUCKET) "$(QUIET_STDERR)" && echo y"), y)
- CFLAGS := $(CFLAGS) -fstack-protector-all
+ifeq ($(call try-cc,$(SOURCE_HELLO),-Werror -fstack-protector-all),y)
+ CFLAGS := $(CFLAGS) -fstack-protector-all
endif
@@ -508,7 +504,8 @@
-include config.mak
ifndef NO_DWARF
-ifneq ($(shell sh -c "(echo '\#include <dwarf.h>'; echo '\#include <libdw.h>'; echo '\#include <version.h>'; echo '\#ifndef _ELFUTILS_PREREQ'; echo '\#error'; echo '\#endif'; echo 'int main(void) { Dwarf *dbg; dbg = dwarf_begin(0, DWARF_C_READ); return (long)dbg; }') | $(CC) -x c - $(ALL_CFLAGS) -I/usr/include/elfutils -ldw -lelf -o $(BITBUCKET) $(ALL_LDFLAGS) $(EXTLIBS) "$(QUIET_STDERR)" && echo y"), y)
+FLAGS_DWARF=$(ALL_CFLAGS) -I/usr/include/elfutils -ldw -lelf $(ALL_LDFLAGS) $(EXTLIBS)
+ifneq ($(call try-cc,$(SOURCE_DWARF),$(FLAGS_DWARF)),y)
msg := $(warning No libdw.h found or old libdw.h found or elfutils is older than 0.138, disables dwarf support. Please install new elfutils-devel/libdw-dev);
NO_DWARF := 1
endif # Dwarf support
@@ -536,16 +533,18 @@
BASIC_CFLAGS += -I$(OUTPUT)
endif
-ifeq ($(shell sh -c "(echo '\#include <libelf.h>'; echo 'int main(void) { Elf * elf = elf_begin(0, ELF_C_READ, 0); return (long)elf; }') | $(CC) -x c - $(ALL_CFLAGS) -o $(BITBUCKET) $(ALL_LDFLAGS) $(EXTLIBS) "$(QUIET_STDERR)" && echo y"), y)
-ifneq ($(shell sh -c "(echo '\#include <gnu/libc-version.h>'; echo 'int main(void) { const char * version = gnu_get_libc_version(); return (long)version; }') | $(CC) -x c - $(ALL_CFLAGS) -o $(BITBUCKET) $(ALL_LDFLAGS) $(EXTLIBS) "$(QUIET_STDERR)" && echo y"), y)
- msg := $(error No gnu/libc-version.h found, please install glibc-dev[el]/glibc-static);
+FLAGS_LIBELF=$(ALL_CFLAGS) $(ALL_LDFLAGS) $(EXTLIBS)
+ifneq ($(call try-cc,$(SOURCE_LIBELF),$(FLAGS_LIBELF)),y)
+ FLAGS_GLIBC=$(ALL_CFLAGS) $(ALL_LDFLAGS)
+ ifneq ($(call try-cc,$(SOURCE_GLIBC),$(FLAGS_GLIBC)),y)
+ msg := $(error No gnu/libc-version.h found, please install glibc-dev[el]/glibc-static);
+ else
+ msg := $(error No libelf.h/libelf found, please install libelf-dev/elfutils-libelf-devel);
+ endif
endif
- ifneq ($(shell sh -c "(echo '\#include <libelf.h>'; echo 'int main(void) { Elf * elf = elf_begin(0, ELF_C_READ_MMAP, 0); return (long)elf; }') | $(CC) -x c - $(ALL_CFLAGS) -o $(BITBUCKET) $(ALL_LDFLAGS) $(EXTLIBS) "$(QUIET_STDERR)" && echo y"), y)
- BASIC_CFLAGS += -DLIBELF_NO_MMAP
- endif
-else
- msg := $(error No libelf.h/libelf found, please install libelf-dev/elfutils-libelf-devel and glibc-dev[el]);
+ifneq ($(call try-cc,$(SOURCE_ELF_MMAP),$(FLAGS_COMMON)),y)
+ BASIC_CFLAGS += -DLIBELF_NO_MMAP
endif
ifndef NO_DWARF
@@ -561,64 +560,73 @@
ifdef NO_NEWT
BASIC_CFLAGS += -DNO_NEWT_SUPPORT
else
-ifneq ($(shell sh -c "(echo '\#include <newt.h>'; echo 'int main(void) { newtInit(); newtCls(); return newtFinished(); }') | $(CC) -x c - $(ALL_CFLAGS) -D_LARGEFILE64_SOURCE -D_FILE_OFFSET_BITS=64 -lnewt -o $(BITBUCKET) $(ALL_LDFLAGS) $(EXTLIBS) "$(QUIET_STDERR)" && echo y"), y)
- msg := $(warning newt not found, disables TUI support. Please install newt-devel or libnewt-dev);
- BASIC_CFLAGS += -DNO_NEWT_SUPPORT
-else
- # Fedora has /usr/include/slang/slang.h, but ubuntu /usr/include/slang.h
- BASIC_CFLAGS += -I/usr/include/slang
- EXTLIBS += -lnewt -lslang
- LIB_OBJS += $(OUTPUT)util/newt.o
-endif
-endif # NO_NEWT
-
-ifndef NO_LIBPERL
-PERL_EMBED_LDOPTS = `perl -MExtUtils::Embed -e ldopts 2>/dev/null`
-PERL_EMBED_CCOPTS = `perl -MExtUtils::Embed -e ccopts 2>/dev/null`
+ FLAGS_NEWT=$(ALL_CFLAGS) $(ALL_LDFLAGS) $(EXTLIBS) -lnewt
+ ifneq ($(call try-cc,$(SOURCE_NEWT),$(FLAGS_NEWT)),y)
+ msg := $(warning newt not found, disables TUI support. Please install newt-devel or libnewt-dev);
+ BASIC_CFLAGS += -DNO_NEWT_SUPPORT
+ else
+ # Fedora has /usr/include/slang/slang.h, but ubuntu /usr/include/slang.h
+ BASIC_CFLAGS += -I/usr/include/slang
+ EXTLIBS += -lnewt -lslang
+ LIB_OBJS += $(OUTPUT)util/newt.o
+ endif
endif
-ifneq ($(shell sh -c "(echo '\#include <EXTERN.h>'; echo '\#include <perl.h>'; echo 'int main(void) { perl_alloc(); return 0; }') | $(CC) -x c - $(PERL_EMBED_CCOPTS) -o $(BITBUCKET) $(PERL_EMBED_LDOPTS) > /dev/null 2>&1 && echo y"), y)
+ifdef NO_LIBPERL
BASIC_CFLAGS += -DNO_LIBPERL
else
- ALL_LDFLAGS += $(PERL_EMBED_LDOPTS)
- LIB_OBJS += $(OUTPUT)util/scripting-engines/trace-event-perl.o
- LIB_OBJS += $(OUTPUT)scripts/perl/Perf-Trace-Util/Context.o
+ PERL_EMBED_LDOPTS = `perl -MExtUtils::Embed -e ldopts 2>/dev/null`
+ PERL_EMBED_CCOPTS = `perl -MExtUtils::Embed -e ccopts 2>/dev/null`
+ FLAGS_PERL_EMBED=$(PERL_EMBED_CCOPTS) $(PERL_EMBED_LDOPTS)
+
+ ifneq ($(call try-cc,$(SOURCE_PERL_EMBED),$(FLAGS_PERL_EMBED)),y)
+ BASIC_CFLAGS += -DNO_LIBPERL
+ else
+ ALL_LDFLAGS += $(PERL_EMBED_LDOPTS)
+ LIB_OBJS += $(OUTPUT)util/scripting-engines/trace-event-perl.o
+ LIB_OBJS += $(OUTPUT)scripts/perl/Perf-Trace-Util/Context.o
+ endif
endif
-ifndef NO_LIBPYTHON
-PYTHON_EMBED_LDOPTS = `python-config --ldflags 2>/dev/null`
-PYTHON_EMBED_CCOPTS = `python-config --cflags 2>/dev/null`
-endif
-
-ifneq ($(shell sh -c "(echo '\#include <Python.h>'; echo 'int main(void) { Py_Initialize(); return 0; }') | $(CC) -x c - $(PYTHON_EMBED_CCOPTS) -o $(BITBUCKET) $(PYTHON_EMBED_LDOPTS) > /dev/null 2>&1 && echo y"), y)
+ifdef NO_LIBPYTHON
BASIC_CFLAGS += -DNO_LIBPYTHON
else
- ALL_LDFLAGS += $(PYTHON_EMBED_LDOPTS)
- LIB_OBJS += $(OUTPUT)util/scripting-engines/trace-event-python.o
- LIB_OBJS += $(OUTPUT)scripts/python/Perf-Trace-Util/Context.o
+ PYTHON_EMBED_LDOPTS = `python-config --ldflags 2>/dev/null`
+ PYTHON_EMBED_CCOPTS = `python-config --cflags 2>/dev/null`
+ FLAGS_PYTHON_EMBED=$(PYTHON_EMBED_CCOPTS) $(PYTHON_EMBED_LDOPTS)
+ ifneq ($(call try-cc,$(SOURCE_PYTHON_EMBED),$(FLAGS_PYTHON_EMBED)),y)
+ BASIC_CFLAGS += -DNO_LIBPYTHON
+ else
+ ALL_LDFLAGS += $(PYTHON_EMBED_LDOPTS)
+ LIB_OBJS += $(OUTPUT)util/scripting-engines/trace-event-python.o
+ LIB_OBJS += $(OUTPUT)scripts/python/Perf-Trace-Util/Context.o
+ endif
endif
ifdef NO_DEMANGLE
BASIC_CFLAGS += -DNO_DEMANGLE
else
- ifdef HAVE_CPLUS_DEMANGLE
+ ifdef HAVE_CPLUS_DEMANGLE
EXTLIBS += -liberty
BASIC_CFLAGS += -DHAVE_CPLUS_DEMANGLE
- else
- has_bfd := $(shell sh -c "(echo '\#include <bfd.h>'; echo 'int main(void) { bfd_demangle(0, 0, 0); return 0; }') | $(CC) -x c - $(ALL_CFLAGS) -o $(BITBUCKET) $(ALL_LDFLAGS) $(EXTLIBS) -lbfd "$(QUIET_STDERR)" && echo y")
-
+ else
+ FLAGS_BFD=$(ALL_CFLAGS) $(ALL_LDFLAGS) $(EXTLIBS) -lbfd
+ has_bfd := $(call try-cc,$(SOURCE_BFD),$(FLAGS_BFD))
ifeq ($(has_bfd),y)
EXTLIBS += -lbfd
else
- has_bfd_iberty := $(shell sh -c "(echo '\#include <bfd.h>'; echo 'int main(void) { bfd_demangle(0, 0, 0); return 0; }') | $(CC) -x c - $(ALL_CFLAGS) -o $(BITBUCKET) $(ALL_LDFLAGS) $(EXTLIBS) -lbfd -liberty "$(QUIET_STDERR)" && echo y")
+ FLAGS_BFD_IBERTY=$(FLAGS_BFD) -liberty
+ has_bfd_iberty := $(call try-cc,$(SOURCE_BFD),$(FLAGS_BFD_IBERTY))
ifeq ($(has_bfd_iberty),y)
EXTLIBS += -lbfd -liberty
else
- has_bfd_iberty_z := $(shell sh -c "(echo '\#include <bfd.h>'; echo 'int main(void) { bfd_demangle(0, 0, 0); return 0; }') | $(CC) -x c - $(ALL_CFLAGS) -o $(BITBUCKET) $(ALL_LDFLAGS) $(EXTLIBS) -lbfd -liberty -lz "$(QUIET_STDERR)" && echo y")
+ FLAGS_BFD_IBERTY_Z=$(FLAGS_BFD_IBERTY) -lz
+ has_bfd_iberty_z := $(call try-cc,$(SOURCE_BFD),$(FLAGS_BFD_IBERTY_Z))
ifeq ($(has_bfd_iberty_z),y)
EXTLIBS += -lbfd -liberty -lz
else
- has_cplus_demangle := $(shell sh -c "(echo 'extern char *cplus_demangle(const char *, int);'; echo 'int main(void) { cplus_demangle(0, 0); return 0; }') | $(CC) -x c - $(ALL_CFLAGS) -o $(BITBUCKET) $(ALL_LDFLAGS) $(EXTLIBS) -liberty "$(QUIET_STDERR)" && echo y")
+ FLAGS_CPLUS_DEMANGLE=$(ALL_CFLAGS) $(ALL_LDFLAGS) $(EXTLIBS) -liberty
+ has_cplus_demangle := $(call try-cc,$(SOURCE_CPLUS_DEMANGLE),$(FLAGS_CPLUS_DEMANGLE))
ifeq ($(has_cplus_demangle),y)
EXTLIBS += -liberty
BASIC_CFLAGS += -DHAVE_CPLUS_DEMANGLE
@@ -867,7 +875,7 @@
SHELL = $(SHELL_PATH)
-all:: .perf.dev.null shell_compatibility_test $(ALL_PROGRAMS) $(BUILT_INS) $(OTHER_PROGRAMS) $(OUTPUT)PERF-BUILD-OPTIONS
+all:: shell_compatibility_test $(ALL_PROGRAMS) $(BUILT_INS) $(OTHER_PROGRAMS) $(OUTPUT)PERF-BUILD-OPTIONS
ifneq (,$X)
$(foreach p,$(patsubst %$X,%,$(filter %$X,$(ALL_PROGRAMS) $(BUILT_INS) perf$X)), test '$p' -ef '$p$X' || $(RM) '$p';)
endif
@@ -1197,11 +1205,6 @@
.PHONY: .FORCE-PERF-VERSION-FILE TAGS tags cscope .FORCE-PERF-CFLAGS
.PHONY: .FORCE-PERF-BUILD-OPTIONS
-.perf.dev.null:
- touch .perf.dev.null
-
-.INTERMEDIATE: .perf.dev.null
-
### Make sure built-ins do not have dups and listed in perf.c
#
check-builtins::
diff --git a/tools/perf/arch/sh/Makefile b/tools/perf/arch/sh/Makefile
new file mode 100644
index 0000000..15130b50
--- /dev/null
+++ b/tools/perf/arch/sh/Makefile
@@ -0,0 +1,4 @@
+ifndef NO_DWARF
+PERF_HAVE_DWARF_REGS := 1
+LIB_OBJS += $(OUTPUT)arch/$(ARCH)/util/dwarf-regs.o
+endif
diff --git a/tools/perf/arch/sh/util/dwarf-regs.c b/tools/perf/arch/sh/util/dwarf-regs.c
new file mode 100644
index 0000000..a11edb0
--- /dev/null
+++ b/tools/perf/arch/sh/util/dwarf-regs.c
@@ -0,0 +1,55 @@
+/*
+ * Mapping of DWARF debug register numbers into register names.
+ *
+ * Copyright (C) 2010 Matt Fleming <matt@console-pimps.org>
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License as published by
+ * the Free Software Foundation; either version 2 of the License, or
+ * (at your option) any later version.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program; if not, write to the Free Software
+ * Foundation, Inc., 59 Temple Place - Suite 330, Boston, MA 02111-1307, USA.
+ *
+ */
+
+#include <libio.h>
+#include <dwarf-regs.h>
+
+/*
+ * Generic dwarf analysis helpers
+ */
+
+#define SH_MAX_REGS 18
+const char *sh_regs_table[SH_MAX_REGS] = {
+ "r0",
+ "r1",
+ "r2",
+ "r3",
+ "r4",
+ "r5",
+ "r6",
+ "r7",
+ "r8",
+ "r9",
+ "r10",
+ "r11",
+ "r12",
+ "r13",
+ "r14",
+ "r15",
+ "pc",
+ "pr",
+};
+
+/* Return architecture dependent register string (for kprobe-tracer) */
+const char *get_arch_regstr(unsigned int n)
+{
+ return (n <= SH_MAX_REGS) ? sh_regs_table[n] : NULL;
+}
diff --git a/tools/perf/builtin-annotate.c b/tools/perf/builtin-annotate.c
index 96db524..fd20670 100644
--- a/tools/perf/builtin-annotate.c
+++ b/tools/perf/builtin-annotate.c
@@ -61,11 +61,9 @@
static int process_sample_event(event_t *event, struct perf_session *session)
{
struct addr_location al;
+ struct sample_data data;
- dump_printf("(IP, %d): %d: %#Lx\n", event->header.misc,
- event->ip.pid, event->ip.ip);
-
- if (event__preprocess_sample(event, session, &al, NULL) < 0) {
+ if (event__preprocess_sample(event, session, &al, &data, NULL) < 0) {
pr_warning("problem processing %d event, skipping it.\n",
event->header.type);
return -1;
diff --git a/tools/perf/builtin-buildid-cache.c b/tools/perf/builtin-buildid-cache.c
index f8e3d18..29ad20e 100644
--- a/tools/perf/builtin-buildid-cache.c
+++ b/tools/perf/builtin-buildid-cache.c
@@ -78,8 +78,7 @@
struct str_node *pos;
char debugdir[PATH_MAX];
- snprintf(debugdir, sizeof(debugdir), "%s/%s", getenv("HOME"),
- DEBUG_CACHE_DIR);
+ snprintf(debugdir, sizeof(debugdir), "%s", buildid_dir);
if (add_name_list_str) {
list = strlist__new(true, add_name_list_str);
diff --git a/tools/perf/builtin-buildid-list.c b/tools/perf/builtin-buildid-list.c
index 9989072..44a47e1 100644
--- a/tools/perf/builtin-buildid-list.c
+++ b/tools/perf/builtin-buildid-list.c
@@ -43,10 +43,8 @@
if (session == NULL)
return -1;
- if (with_hits) {
- symbol_conf.full_paths = true;
+ if (with_hits)
perf_session__process_events(session, &build_id__mark_dso_hit_ops);
- }
perf_session__fprintf_dsos_buildid(session, stdout, with_hits);
diff --git a/tools/perf/builtin-diff.c b/tools/perf/builtin-diff.c
index a6e2fdc..fca1d44 100644
--- a/tools/perf/builtin-diff.c
+++ b/tools/perf/builtin-diff.c
@@ -35,10 +35,7 @@
struct addr_location al;
struct sample_data data = { .period = 1, };
- dump_printf("(IP, %d): %d: %#Lx\n", event->header.misc,
- event->ip.pid, event->ip.ip);
-
- if (event__preprocess_sample(event, session, &al, NULL) < 0) {
+ if (event__preprocess_sample(event, session, &al, &data, NULL) < 0) {
pr_warning("problem processing %d event, skipping it.\n",
event->header.type);
return -1;
@@ -47,8 +44,6 @@
if (al.filtered || al.sym == NULL)
return 0;
- event__parse_sample(event, session->sample_type, &data);
-
if (hists__add_entry(&session->hists, &al, data.period)) {
pr_warning("problem incrementing symbol period, skipping event\n");
return -1;
@@ -185,8 +180,6 @@
OPT_BOOLEAN('f', "force", &force, "don't complain, do it"),
OPT_BOOLEAN('m', "modules", &symbol_conf.use_modules,
"load module symbols - WARNING: use only with -k and LIVE kernel"),
- OPT_BOOLEAN('P', "full-paths", &symbol_conf.full_paths,
- "Don't shorten the pathnames taking into account the cwd"),
OPT_STRING('d', "dsos", &symbol_conf.dso_list_str, "dso[,dso...]",
"only consider symbols in these dsos"),
OPT_STRING('C', "comms", &symbol_conf.comm_list_str, "comm[,comm...]",
diff --git a/tools/perf/builtin-probe.c b/tools/perf/builtin-probe.c
index e4a4da3..199d5e1 100644
--- a/tools/perf/builtin-probe.c
+++ b/tools/perf/builtin-probe.c
@@ -182,6 +182,8 @@
"Show source code lines.", opt_show_lines),
OPT_STRING('k', "vmlinux", &symbol_conf.vmlinux_name,
"file", "vmlinux pathname"),
+ OPT_STRING('s', "source", &symbol_conf.source_prefix,
+ "directory", "path to kernel source"),
#endif
OPT__DRY_RUN(&probe_event_dry_run),
OPT_INTEGER('\0', "max-probes", ¶ms.max_probe_points,
@@ -265,4 +267,3 @@
}
return 0;
}
-
diff --git a/tools/perf/builtin-record.c b/tools/perf/builtin-record.c
index 711745f..ff77b80 100644
--- a/tools/perf/builtin-record.c
+++ b/tools/perf/builtin-record.c
@@ -49,7 +49,6 @@
static int realtime_prio = 0;
static bool raw_samples = false;
static bool system_wide = false;
-static int profile_cpu = -1;
static pid_t target_pid = -1;
static pid_t target_tid = -1;
static pid_t *all_tids = NULL;
@@ -61,6 +60,7 @@
static bool inherit_stat = false;
static bool no_samples = false;
static bool sample_address = false;
+static bool no_buildid = false;
static long samples = 0;
static u64 bytes_written = 0;
@@ -74,6 +74,7 @@
static off_t post_processing_offset;
static struct perf_session *session;
+static const char *cpu_list;
struct mmap_data {
int counter;
@@ -268,12 +269,17 @@
if (inherit_stat)
attr->inherit_stat = 1;
- if (sample_address)
+ if (sample_address) {
attr->sample_type |= PERF_SAMPLE_ADDR;
+ attr->mmap_data = track;
+ }
if (call_graph)
attr->sample_type |= PERF_SAMPLE_CALLCHAIN;
+ if (system_wide)
+ attr->sample_type |= PERF_SAMPLE_CPU;
+
if (raw_samples) {
attr->sample_type |= PERF_SAMPLE_TIME;
attr->sample_type |= PERF_SAMPLE_RAW;
@@ -300,7 +306,7 @@
die("Permission error - are you root?\n"
"\t Consider tweaking"
" /proc/sys/kernel/perf_event_paranoid.\n");
- else if (err == ENODEV && profile_cpu != -1) {
+ else if (err == ENODEV && cpu_list) {
die("No such device - did you specify"
" an out-of-range profile CPU?\n");
}
@@ -433,14 +439,14 @@
process_buildids();
perf_header__write(&session->header, output, true);
+ perf_session__delete(session);
+ symbol__exit();
}
}
static void event__synthesize_guest_os(struct machine *machine, void *data)
{
int err;
- char *guest_kallsyms;
- char path[PATH_MAX];
struct perf_session *psession = data;
if (machine__is_host(machine))
@@ -460,13 +466,6 @@
pr_err("Couldn't record guest kernel [%d]'s reference"
" relocation symbol.\n", machine->pid);
- if (machine__is_default_guest(machine))
- guest_kallsyms = (char *) symbol_conf.default_guest_kallsyms;
- else {
- sprintf(path, "%s/proc/kallsyms", machine->root_dir);
- guest_kallsyms = path;
- }
-
/*
* We use _stext for guest kernel because guest kernel's /proc/kallsyms
* have no _text sometimes.
@@ -561,12 +560,15 @@
if (!file_new) {
err = perf_header__read(session, output);
if (err < 0)
- return err;
+ goto out_delete_session;
}
if (have_tracepoints(attrs, nr_counters))
perf_header__set_feat(&session->header, HEADER_TRACE_INFO);
+ /*
+ * perf_session__delete(session) will be called at atexit_header()
+ */
atexit(atexit_header);
if (forks) {
@@ -622,10 +624,15 @@
close(child_ready_pipe[0]);
}
- if ((!system_wide && no_inherit) || profile_cpu != -1) {
- open_counters(profile_cpu);
+ nr_cpus = read_cpu_map(cpu_list);
+ if (nr_cpus < 1) {
+ perror("failed to collect number of CPUs\n");
+ return -1;
+ }
+
+ if (!system_wide && no_inherit && !cpu_list) {
+ open_counters(-1);
} else {
- nr_cpus = read_cpu_map();
for (i = 0; i < nr_cpus; i++)
open_counters(cpumap[i]);
}
@@ -704,7 +711,7 @@
if (perf_guest)
perf_session__process_machines(session, event__synthesize_guest_os);
- if (!system_wide && profile_cpu == -1)
+ if (!system_wide)
event__synthesize_thread(target_tid, process_synthesized_event,
session);
else
@@ -766,6 +773,10 @@
bytes_written / 24);
return 0;
+
+out_delete_session:
+ perf_session__delete(session);
+ return err;
}
static const char * const record_usage[] = {
@@ -794,8 +805,8 @@
"system-wide collection from all CPUs"),
OPT_BOOLEAN('A', "append", &append_file,
"append to the output file to do incremental profiling"),
- OPT_INTEGER('C', "profile_cpu", &profile_cpu,
- "CPU to profile on"),
+ OPT_STRING('C', "cpu", &cpu_list, "cpu",
+ "list of cpus to monitor"),
OPT_BOOLEAN('f', "force", &force,
"overwrite existing data file (deprecated)"),
OPT_U64('c', "count", &user_interval, "event period to sample"),
@@ -815,17 +826,19 @@
"Sample addresses"),
OPT_BOOLEAN('n', "no-samples", &no_samples,
"don't sample"),
+ OPT_BOOLEAN('N', "no-buildid-cache", &no_buildid,
+ "do not update the buildid cache"),
OPT_END()
};
int cmd_record(int argc, const char **argv, const char *prefix __used)
{
- int i,j;
+ int i, j, err = -ENOMEM;
argc = parse_options(argc, argv, options, record_usage,
PARSE_OPT_STOP_AT_NON_OPTION);
if (!argc && target_pid == -1 && target_tid == -1 &&
- !system_wide && profile_cpu == -1)
+ !system_wide && !cpu_list)
usage_with_options(record_usage, options);
if (force && append_file) {
@@ -839,6 +852,8 @@
}
symbol__init();
+ if (no_buildid)
+ disable_buildid_cache();
if (!nr_counters) {
nr_counters = 1;
@@ -857,7 +872,7 @@
} else {
all_tids=malloc(sizeof(pid_t));
if (!all_tids)
- return -ENOMEM;
+ goto out_symbol_exit;
all_tids[0] = target_tid;
thread_num = 1;
@@ -867,13 +882,13 @@
for (j = 0; j < MAX_COUNTERS; j++) {
fd[i][j] = malloc(sizeof(int)*thread_num);
if (!fd[i][j])
- return -ENOMEM;
+ goto out_free_fd;
}
}
event_array = malloc(
sizeof(struct pollfd)*MAX_NR_CPUS*MAX_COUNTERS*thread_num);
if (!event_array)
- return -ENOMEM;
+ goto out_free_fd;
if (user_interval != ULLONG_MAX)
default_interval = user_interval;
@@ -889,8 +904,22 @@
default_interval = freq;
} else {
fprintf(stderr, "frequency and count are zero, aborting\n");
- exit(EXIT_FAILURE);
+ err = -EINVAL;
+ goto out_free_event_array;
}
- return __cmd_record(argc, argv);
+ err = __cmd_record(argc, argv);
+
+out_free_event_array:
+ free(event_array);
+out_free_fd:
+ for (i = 0; i < MAX_NR_CPUS; i++) {
+ for (j = 0; j < MAX_COUNTERS; j++)
+ free(fd[i][j]);
+ }
+ free(all_tids);
+ all_tids = NULL;
+out_symbol_exit:
+ symbol__exit();
+ return err;
}
diff --git a/tools/perf/builtin-report.c b/tools/perf/builtin-report.c
index fd7407c..2f4b929 100644
--- a/tools/perf/builtin-report.c
+++ b/tools/perf/builtin-report.c
@@ -155,30 +155,7 @@
struct addr_location al;
struct perf_event_attr *attr;
- event__parse_sample(event, session->sample_type, &data);
-
- dump_printf("(IP, %d): %d/%d: %#Lx period: %Ld\n", event->header.misc,
- data.pid, data.tid, data.ip, data.period);
-
- if (session->sample_type & PERF_SAMPLE_CALLCHAIN) {
- unsigned int i;
-
- dump_printf("... chain: nr:%Lu\n", data.callchain->nr);
-
- if (!ip_callchain__valid(data.callchain, event)) {
- pr_debug("call-chain problem with event, "
- "skipping it.\n");
- return 0;
- }
-
- if (dump_trace) {
- for (i = 0; i < data.callchain->nr; i++)
- dump_printf("..... %2d: %016Lx\n",
- i, data.callchain->ips[i]);
- }
- }
-
- if (event__preprocess_sample(event, session, &al, NULL) < 0) {
+ if (event__preprocess_sample(event, session, &al, &data, NULL) < 0) {
fprintf(stderr, "problem processing %d event, skipping it.\n",
event->header.type);
return -1;
@@ -464,8 +441,6 @@
"pretty printing style key: normal raw"),
OPT_STRING('s', "sort", &sort_order, "key[,key2...]",
"sort by key(s): pid, comm, dso, symbol, parent"),
- OPT_BOOLEAN('P', "full-paths", &symbol_conf.full_paths,
- "Don't shorten the pathnames taking into account the cwd"),
OPT_BOOLEAN(0, "showcpuutilization", &symbol_conf.show_cpu_utilization,
"Show sample percentage for different cpu modes"),
OPT_STRING('p', "parent", &parent_pattern, "regex",
diff --git a/tools/perf/builtin-stat.c b/tools/perf/builtin-stat.c
index 9a39ca3..a6b4d44 100644
--- a/tools/perf/builtin-stat.c
+++ b/tools/perf/builtin-stat.c
@@ -69,7 +69,7 @@
};
static bool system_wide = false;
-static unsigned int nr_cpus = 0;
+static int nr_cpus = 0;
static int run_idx = 0;
static int run_count = 1;
@@ -82,6 +82,7 @@
static pid_t child_pid = -1;
static bool null_run = false;
static bool big_num = false;
+static const char *cpu_list;
static int *fd[MAX_NR_CPUS][MAX_COUNTERS];
@@ -158,7 +159,7 @@
PERF_FORMAT_TOTAL_TIME_RUNNING;
if (system_wide) {
- unsigned int cpu;
+ int cpu;
for (cpu = 0; cpu < nr_cpus; cpu++) {
fd[cpu][counter][0] = sys_perf_event_open(attr,
@@ -208,7 +209,7 @@
static void read_counter(int counter)
{
u64 count[3], single_count[3];
- unsigned int cpu;
+ int cpu;
size_t res, nv;
int scaled;
int i, thread;
@@ -542,6 +543,8 @@
"null run - dont start any counters"),
OPT_BOOLEAN('B', "big-num", &big_num,
"print large numbers with thousands\' separators"),
+ OPT_STRING('C', "cpu", &cpu_list, "cpu",
+ "list of cpus to monitor in system-wide"),
OPT_END()
};
@@ -566,10 +569,13 @@
}
if (system_wide)
- nr_cpus = read_cpu_map();
+ nr_cpus = read_cpu_map(cpu_list);
else
nr_cpus = 1;
+ if (nr_cpus < 1)
+ usage_with_options(stat_usage, options);
+
if (target_pid != -1) {
target_tid = target_pid;
thread_num = find_all_tid(target_pid, &all_tids);
diff --git a/tools/perf/builtin-top.c b/tools/perf/builtin-top.c
index a66f427..b513e40 100644
--- a/tools/perf/builtin-top.c
+++ b/tools/perf/builtin-top.c
@@ -102,6 +102,7 @@
static int sym_pcnt_filter = 5;
static int sym_counter = 0;
static int display_weighted = -1;
+static const char *cpu_list;
/*
* Symbols
@@ -982,6 +983,7 @@
u64 ip = self->ip.ip;
struct sym_entry *syme;
struct addr_location al;
+ struct sample_data data;
struct machine *machine;
u8 origin = self->header.misc & PERF_RECORD_MISC_CPUMODE_MASK;
@@ -1024,7 +1026,8 @@
if (self->header.misc & PERF_RECORD_MISC_EXACT_IP)
exact_samples++;
- if (event__preprocess_sample(self, session, &al, symbol_filter) < 0 ||
+ if (event__preprocess_sample(self, session, &al, &data,
+ symbol_filter) < 0 ||
al.filtered)
return;
@@ -1079,26 +1082,6 @@
}
}
-static int event__process(event_t *event, struct perf_session *session)
-{
- switch (event->header.type) {
- case PERF_RECORD_COMM:
- event__process_comm(event, session);
- break;
- case PERF_RECORD_MMAP:
- event__process_mmap(event, session);
- break;
- case PERF_RECORD_FORK:
- case PERF_RECORD_EXIT:
- event__process_task(event, session);
- break;
- default:
- break;
- }
-
- return 0;
-}
-
struct mmap_data {
int counter;
void *base;
@@ -1351,8 +1334,8 @@
"profile events on existing thread id"),
OPT_BOOLEAN('a', "all-cpus", &system_wide,
"system-wide collection from all CPUs"),
- OPT_INTEGER('C', "CPU", &profile_cpu,
- "CPU to profile on"),
+ OPT_STRING('C', "cpu", &cpu_list, "cpu",
+ "list of cpus to monitor"),
OPT_STRING('k', "vmlinux", &symbol_conf.vmlinux_name,
"file", "vmlinux pathname"),
OPT_BOOLEAN('K', "hide_kernel_symbols", &hide_kernel_symbols,
@@ -1428,10 +1411,10 @@
return -ENOMEM;
/* CPU and PID are mutually exclusive */
- if (target_tid > 0 && profile_cpu != -1) {
+ if (target_tid > 0 && cpu_list) {
printf("WARNING: PID switch overriding CPU\n");
sleep(1);
- profile_cpu = -1;
+ cpu_list = NULL;
}
if (!nr_counters)
@@ -1469,10 +1452,13 @@
attrs[counter].sample_period = default_interval;
}
- if (target_tid != -1 || profile_cpu != -1)
+ if (target_tid != -1)
nr_cpus = 1;
else
- nr_cpus = read_cpu_map();
+ nr_cpus = read_cpu_map(cpu_list);
+
+ if (nr_cpus < 1)
+ usage_with_options(top_usage, options);
get_term_dimensions(&winsize);
if (print_entries == 0) {
diff --git a/tools/perf/builtin-trace.c b/tools/perf/builtin-trace.c
index dddf3f0..294da72 100644
--- a/tools/perf/builtin-trace.c
+++ b/tools/perf/builtin-trace.c
@@ -11,8 +11,9 @@
static char const *script_name;
static char const *generate_script_lang;
-static bool debug_ordering;
+static bool debug_mode;
static u64 last_timestamp;
+static u64 nr_unordered;
static int default_start_script(const char *script __unused,
int argc __unused,
@@ -91,13 +92,15 @@
}
if (session->sample_type & PERF_SAMPLE_RAW) {
- if (debug_ordering) {
+ if (debug_mode) {
if (data.time < last_timestamp) {
pr_err("Samples misordered, previous: %llu "
"this: %llu\n", last_timestamp,
data.time);
+ nr_unordered++;
}
last_timestamp = data.time;
+ return 0;
}
/*
* FIXME: better resolve from pid from the struct trace_entry
@@ -113,6 +116,15 @@
return 0;
}
+static u64 nr_lost;
+
+static int process_lost_event(event_t *event, struct perf_session *session __used)
+{
+ nr_lost += event->lost.lost;
+
+ return 0;
+}
+
static struct perf_event_ops event_ops = {
.sample = process_sample_event,
.comm = event__process_comm,
@@ -120,6 +132,7 @@
.event_type = event__process_event_type,
.tracing_data = event__process_tracing_data,
.build_id = event__process_build_id,
+ .lost = process_lost_event,
.ordered_samples = true,
};
@@ -132,9 +145,18 @@
static int __cmd_trace(struct perf_session *session)
{
+ int ret;
+
signal(SIGINT, sig_handler);
- return perf_session__process_events(session, &event_ops);
+ ret = perf_session__process_events(session, &event_ops);
+
+ if (debug_mode) {
+ pr_err("Misordered timestamps: %llu\n", nr_unordered);
+ pr_err("Lost events: %llu\n", nr_lost);
+ }
+
+ return ret;
}
struct script_spec {
@@ -544,8 +566,8 @@
"generate perf-trace.xx script in specified language"),
OPT_STRING('i', "input", &input_name, "file",
"input file name"),
- OPT_BOOLEAN('d', "debug-ordering", &debug_ordering,
- "check that samples time ordering is monotonic"),
+ OPT_BOOLEAN('d', "debug-mode", &debug_mode,
+ "do various checks like samples ordering and lost events"),
OPT_END()
};
diff --git a/tools/perf/feature-tests.mak b/tools/perf/feature-tests.mak
new file mode 100644
index 0000000..ddb68e6
--- /dev/null
+++ b/tools/perf/feature-tests.mak
@@ -0,0 +1,119 @@
+define SOURCE_HELLO
+#include <stdio.h>
+int main(void)
+{
+ return puts(\"hi\");
+}
+endef
+
+ifndef NO_DWARF
+define SOURCE_DWARF
+#include <dwarf.h>
+#include <libdw.h>
+#include <version.h>
+#ifndef _ELFUTILS_PREREQ
+#error
+#endif
+
+int main(void)
+{
+ Dwarf *dbg = dwarf_begin(0, DWARF_C_READ);
+ return (long)dbg;
+}
+endef
+endif
+
+define SOURCE_LIBELF
+#include <libelf.h>
+
+int main(void)
+{
+ Elf *elf = elf_begin(0, ELF_C_READ, 0);
+ return (long)elf;
+}
+endef
+
+define SOURCE_GLIBC
+#include <gnu/libc-version.h>
+
+int main(void)
+{
+ const char *version = gnu_get_libc_version();
+ return (long)version;
+}
+endef
+
+define SOURCE_ELF_MMAP
+#include <libelf.h>
+int main(void)
+{
+ Elf *elf = elf_begin(0, ELF_C_READ_MMAP, 0);
+ return (long)elf;
+}
+endef
+
+ifndef NO_NEWT
+define SOURCE_NEWT
+#include <newt.h>
+
+int main(void)
+{
+ newtInit();
+ newtCls();
+ return newtFinished();
+}
+endef
+endif
+
+ifndef NO_LIBPERL
+define SOURCE_PERL_EMBED
+#include <EXTERN.h>
+#include <perl.h>
+
+int main(void)
+{
+perl_alloc();
+return 0;
+}
+endef
+endif
+
+ifndef NO_LIBPYTHON
+define SOURCE_PYTHON_EMBED
+#include <Python.h>
+
+int main(void)
+{
+ Py_Initialize();
+ return 0;
+}
+endef
+endif
+
+define SOURCE_BFD
+#include <bfd.h>
+
+int main(void)
+{
+ bfd_demangle(0, 0, 0);
+ return 0;
+}
+endef
+
+define SOURCE_CPLUS_DEMANGLE
+extern char *cplus_demangle(const char *, int);
+
+int main(void)
+{
+ cplus_demangle(0, 0);
+ return 0;
+}
+endef
+
+# try-cc
+# Usage: option = $(call try-cc, source-to-build, cc-options)
+try-cc = $(shell sh -c \
+ 'TMP="$(TMPOUT).$$$$"; \
+ echo "$(1)" | \
+ $(CC) -x c - $(2) -o "$$TMP" > /dev/null 2>&1 && echo y; \
+ rm -f "$$TMP"')
diff --git a/tools/perf/perf-archive.sh b/tools/perf/perf-archive.sh
index 2e7a4f4..677e59d 100644
--- a/tools/perf/perf-archive.sh
+++ b/tools/perf/perf-archive.sh
@@ -7,7 +7,17 @@
PERF_DATA=$1
fi
-DEBUGDIR=~/.debug/
+#
+# PERF_BUILDID_DIR environment variable set by perf
+# path to buildid directory, default to $HOME/.debug
+#
+if [ -z $PERF_BUILDID_DIR ]; then
+ PERF_BUILDID_DIR=~/.debug/
+else
+ # append / to make substitutions work
+ PERF_BUILDID_DIR=$PERF_BUILDID_DIR/
+fi
+
BUILDIDS=$(mktemp /tmp/perf-archive-buildids.XXXXXX)
NOBUILDID=0000000000000000000000000000000000000000
@@ -22,13 +32,13 @@
cut -d ' ' -f 1 $BUILDIDS | \
while read build_id ; do
- linkname=$DEBUGDIR.build-id/${build_id:0:2}/${build_id:2}
+ linkname=$PERF_BUILDID_DIR.build-id/${build_id:0:2}/${build_id:2}
filename=$(readlink -f $linkname)
- echo ${linkname#$DEBUGDIR} >> $MANIFEST
- echo ${filename#$DEBUGDIR} >> $MANIFEST
+ echo ${linkname#$PERF_BUILDID_DIR} >> $MANIFEST
+ echo ${filename#$PERF_BUILDID_DIR} >> $MANIFEST
done
-tar cfj $PERF_DATA.tar.bz2 -C $DEBUGDIR -T $MANIFEST
+tar cfj $PERF_DATA.tar.bz2 -C $PERF_BUILDID_DIR -T $MANIFEST
rm -f $MANIFEST $BUILDIDS
echo -e "Now please run:\n"
echo -e "$ tar xvf $PERF_DATA.tar.bz2 -C ~/.debug\n"
diff --git a/tools/perf/perf.c b/tools/perf/perf.c
index 6e48711..cdd6c03 100644
--- a/tools/perf/perf.c
+++ b/tools/perf/perf.c
@@ -458,6 +458,8 @@
handle_options(&argv, &argc, NULL);
commit_pager_choice();
set_debugfs_path();
+ set_buildid_dir();
+
if (argc > 0) {
if (!prefixcmp(argv[0], "--"))
argv[0] += 2;
diff --git a/tools/perf/scripts/python/Perf-Trace-Util/lib/Perf/Trace/Core.py b/tools/perf/scripts/python/Perf-Trace-Util/lib/Perf/Trace/Core.py
index 1dc464e..aad7525 100644
--- a/tools/perf/scripts/python/Perf-Trace-Util/lib/Perf/Trace/Core.py
+++ b/tools/perf/scripts/python/Perf-Trace-Util/lib/Perf/Trace/Core.py
@@ -89,3 +89,33 @@
value &= ~idx
return string
+
+
+def taskState(state):
+ states = {
+ 0 : "R",
+ 1 : "S",
+ 2 : "D",
+ 64: "DEAD"
+ }
+
+ if state not in states:
+ return "Unknown"
+
+ return states[state]
+
+
+class EventHeaders:
+ def __init__(self, common_cpu, common_secs, common_nsecs,
+ common_pid, common_comm):
+ self.cpu = common_cpu
+ self.secs = common_secs
+ self.nsecs = common_nsecs
+ self.pid = common_pid
+ self.comm = common_comm
+
+ def ts(self):
+ return (self.secs * (10 ** 9)) + self.nsecs
+
+ def ts_format(self):
+ return "%d.%d" % (self.secs, int(self.nsecs / 1000))
diff --git a/tools/perf/scripts/python/Perf-Trace-Util/lib/Perf/Trace/SchedGui.py b/tools/perf/scripts/python/Perf-Trace-Util/lib/Perf/Trace/SchedGui.py
new file mode 100644
index 0000000..ae9a56e
--- /dev/null
+++ b/tools/perf/scripts/python/Perf-Trace-Util/lib/Perf/Trace/SchedGui.py
@@ -0,0 +1,184 @@
+# SchedGui.py - Python extension for perf trace, basic GUI code for
+# traces drawing and overview.
+#
+# Copyright (C) 2010 by Frederic Weisbecker <fweisbec@gmail.com>
+#
+# This software is distributed under the terms of the GNU General
+# Public License ("GPL") version 2 as published by the Free Software
+# Foundation.
+
+
+try:
+ import wx
+except ImportError:
+ raise ImportError, "You need to install the wxpython lib for this script"
+
+
+class RootFrame(wx.Frame):
+ Y_OFFSET = 100
+ RECT_HEIGHT = 100
+ RECT_SPACE = 50
+ EVENT_MARKING_WIDTH = 5
+
+ def __init__(self, sched_tracer, title, parent = None, id = -1):
+ wx.Frame.__init__(self, parent, id, title)
+
+ (self.screen_width, self.screen_height) = wx.GetDisplaySize()
+ self.screen_width -= 10
+ self.screen_height -= 10
+ self.zoom = 0.5
+ self.scroll_scale = 20
+ self.sched_tracer = sched_tracer
+ self.sched_tracer.set_root_win(self)
+ (self.ts_start, self.ts_end) = sched_tracer.interval()
+ self.update_width_virtual()
+ self.nr_rects = sched_tracer.nr_rectangles() + 1
+ self.height_virtual = RootFrame.Y_OFFSET + (self.nr_rects * (RootFrame.RECT_HEIGHT + RootFrame.RECT_SPACE))
+
+ # whole window panel
+ self.panel = wx.Panel(self, size=(self.screen_width, self.screen_height))
+
+ # scrollable container
+ self.scroll = wx.ScrolledWindow(self.panel)
+ self.scroll.SetScrollbars(self.scroll_scale, self.scroll_scale, self.width_virtual / self.scroll_scale, self.height_virtual / self.scroll_scale)
+ self.scroll.EnableScrolling(True, True)
+ self.scroll.SetFocus()
+
+ # scrollable drawing area
+ self.scroll_panel = wx.Panel(self.scroll, size=(self.screen_width - 15, self.screen_height / 2))
+ self.scroll_panel.Bind(wx.EVT_PAINT, self.on_paint)
+ self.scroll_panel.Bind(wx.EVT_KEY_DOWN, self.on_key_press)
+ self.scroll_panel.Bind(wx.EVT_LEFT_DOWN, self.on_mouse_down)
+ self.scroll.Bind(wx.EVT_PAINT, self.on_paint)
+ self.scroll.Bind(wx.EVT_KEY_DOWN, self.on_key_press)
+ self.scroll.Bind(wx.EVT_LEFT_DOWN, self.on_mouse_down)
+
+ self.scroll.Fit()
+ self.Fit()
+
+ self.scroll_panel.SetDimensions(-1, -1, self.width_virtual, self.height_virtual, wx.SIZE_USE_EXISTING)
+
+ self.txt = None
+
+ self.Show(True)
+
+ def us_to_px(self, val):
+ return val / (10 ** 3) * self.zoom
+
+ def px_to_us(self, val):
+ return (val / self.zoom) * (10 ** 3)
+
+ def scroll_start(self):
+ (x, y) = self.scroll.GetViewStart()
+ return (x * self.scroll_scale, y * self.scroll_scale)
+
+ def scroll_start_us(self):
+ (x, y) = self.scroll_start()
+ return self.px_to_us(x)
+
+ def paint_rectangle_zone(self, nr, color, top_color, start, end):
+ offset_px = self.us_to_px(start - self.ts_start)
+ width_px = self.us_to_px(end - self.ts_start)
+
+ offset_py = RootFrame.Y_OFFSET + (nr * (RootFrame.RECT_HEIGHT + RootFrame.RECT_SPACE))
+ width_py = RootFrame.RECT_HEIGHT
+
+ dc = self.dc
+
+ if top_color is not None:
+ (r, g, b) = top_color
+ top_color = wx.Colour(r, g, b)
+ brush = wx.Brush(top_color, wx.SOLID)
+ dc.SetBrush(brush)
+ dc.DrawRectangle(offset_px, offset_py, width_px, RootFrame.EVENT_MARKING_WIDTH)
+ width_py -= RootFrame.EVENT_MARKING_WIDTH
+ offset_py += RootFrame.EVENT_MARKING_WIDTH
+
+ (r ,g, b) = color
+ color = wx.Colour(r, g, b)
+ brush = wx.Brush(color, wx.SOLID)
+ dc.SetBrush(brush)
+ dc.DrawRectangle(offset_px, offset_py, width_px, width_py)
+
+ def update_rectangles(self, dc, start, end):
+ start += self.ts_start
+ end += self.ts_start
+ self.sched_tracer.fill_zone(start, end)
+
+ def on_paint(self, event):
+ dc = wx.PaintDC(self.scroll_panel)
+ self.dc = dc
+
+ width = min(self.width_virtual, self.screen_width)
+ (x, y) = self.scroll_start()
+ start = self.px_to_us(x)
+ end = self.px_to_us(x + width)
+ self.update_rectangles(dc, start, end)
+
+ def rect_from_ypixel(self, y):
+ y -= RootFrame.Y_OFFSET
+ rect = y / (RootFrame.RECT_HEIGHT + RootFrame.RECT_SPACE)
+ height = y % (RootFrame.RECT_HEIGHT + RootFrame.RECT_SPACE)
+
+ if rect < 0 or rect > self.nr_rects - 1 or height > RootFrame.RECT_HEIGHT:
+ return -1
+
+ return rect
+
+ def update_summary(self, txt):
+ if self.txt:
+ self.txt.Destroy()
+ self.txt = wx.StaticText(self.panel, -1, txt, (0, (self.screen_height / 2) + 50))
+
+
+ def on_mouse_down(self, event):
+ (x, y) = event.GetPositionTuple()
+ rect = self.rect_from_ypixel(y)
+ if rect == -1:
+ return
+
+ t = self.px_to_us(x) + self.ts_start
+
+ self.sched_tracer.mouse_down(rect, t)
+
+
+ def update_width_virtual(self):
+ self.width_virtual = self.us_to_px(self.ts_end - self.ts_start)
+
+ def __zoom(self, x):
+ self.update_width_virtual()
+ (xpos, ypos) = self.scroll.GetViewStart()
+ xpos = self.us_to_px(x) / self.scroll_scale
+ self.scroll.SetScrollbars(self.scroll_scale, self.scroll_scale, self.width_virtual / self.scroll_scale, self.height_virtual / self.scroll_scale, xpos, ypos)
+ self.Refresh()
+
+ def zoom_in(self):
+ x = self.scroll_start_us()
+ self.zoom *= 2
+ self.__zoom(x)
+
+ def zoom_out(self):
+ x = self.scroll_start_us()
+ self.zoom /= 2
+ self.__zoom(x)
+
+
+ def on_key_press(self, event):
+ key = event.GetRawKeyCode()
+ if key == ord("+"):
+ self.zoom_in()
+ return
+ if key == ord("-"):
+ self.zoom_out()
+ return
+
+ key = event.GetKeyCode()
+ (x, y) = self.scroll.GetViewStart()
+ if key == wx.WXK_RIGHT:
+ self.scroll.Scroll(x + 1, y)
+ elif key == wx.WXK_LEFT:
+ self.scroll.Scroll(x - 1, y)
+ elif key == wx.WXK_DOWN:
+ self.scroll.Scroll(x, y + 1)
+ elif key == wx.WXK_UP:
+ self.scroll.Scroll(x, y - 1)
diff --git a/tools/perf/scripts/python/bin/sched-migration-record b/tools/perf/scripts/python/bin/sched-migration-record
new file mode 100644
index 0000000..17a3e9b
--- /dev/null
+++ b/tools/perf/scripts/python/bin/sched-migration-record
@@ -0,0 +1,2 @@
+#!/bin/bash
+perf record -m 16384 -a -e sched:sched_wakeup -e sched:sched_wakeup_new -e sched:sched_switch -e sched:sched_migrate_task $@
diff --git a/tools/perf/scripts/python/bin/sched-migration-report b/tools/perf/scripts/python/bin/sched-migration-report
new file mode 100644
index 0000000..61d05f7
--- /dev/null
+++ b/tools/perf/scripts/python/bin/sched-migration-report
@@ -0,0 +1,3 @@
+#!/bin/bash
+# description: sched migration overview
+perf trace $@ -s ~/libexec/perf-core/scripts/python/sched-migration.py
diff --git a/tools/perf/scripts/python/sched-migration.py b/tools/perf/scripts/python/sched-migration.py
new file mode 100644
index 0000000..b934383
--- /dev/null
+++ b/tools/perf/scripts/python/sched-migration.py
@@ -0,0 +1,461 @@
+#!/usr/bin/python
+#
+# Cpu task migration overview toy
+#
+# Copyright (C) 2010 Frederic Weisbecker <fweisbec@gmail.com>
+#
+# perf trace event handlers have been generated by perf trace -g python
+#
+# This software is distributed under the terms of the GNU General
+# Public License ("GPL") version 2 as published by the Free Software
+# Foundation.
+
+
+import os
+import sys
+
+from collections import defaultdict
+from UserList import UserList
+
+sys.path.append(os.environ['PERF_EXEC_PATH'] + \
+ '/scripts/python/Perf-Trace-Util/lib/Perf/Trace')
+sys.path.append('scripts/python/Perf-Trace-Util/lib/Perf/Trace')
+
+from perf_trace_context import *
+from Core import *
+from SchedGui import *
+
+
+threads = { 0 : "idle"}
+
+def thread_name(pid):
+ return "%s:%d" % (threads[pid], pid)
+
+class RunqueueEventUnknown:
+ @staticmethod
+ def color():
+ return None
+
+ def __repr__(self):
+ return "unknown"
+
+class RunqueueEventSleep:
+ @staticmethod
+ def color():
+ return (0, 0, 0xff)
+
+ def __init__(self, sleeper):
+ self.sleeper = sleeper
+
+ def __repr__(self):
+ return "%s gone to sleep" % thread_name(self.sleeper)
+
+class RunqueueEventWakeup:
+ @staticmethod
+ def color():
+ return (0xff, 0xff, 0)
+
+ def __init__(self, wakee):
+ self.wakee = wakee
+
+ def __repr__(self):
+ return "%s woke up" % thread_name(self.wakee)
+
+class RunqueueEventFork:
+ @staticmethod
+ def color():
+ return (0, 0xff, 0)
+
+ def __init__(self, child):
+ self.child = child
+
+ def __repr__(self):
+ return "new forked task %s" % thread_name(self.child)
+
+class RunqueueMigrateIn:
+ @staticmethod
+ def color():
+ return (0, 0xf0, 0xff)
+
+ def __init__(self, new):
+ self.new = new
+
+ def __repr__(self):
+ return "task migrated in %s" % thread_name(self.new)
+
+class RunqueueMigrateOut:
+ @staticmethod
+ def color():
+ return (0xff, 0, 0xff)
+
+ def __init__(self, old):
+ self.old = old
+
+ def __repr__(self):
+ return "task migrated out %s" % thread_name(self.old)
+
+class RunqueueSnapshot:
+ def __init__(self, tasks = [0], event = RunqueueEventUnknown()):
+ self.tasks = tuple(tasks)
+ self.event = event
+
+ def sched_switch(self, prev, prev_state, next):
+ event = RunqueueEventUnknown()
+
+ if taskState(prev_state) == "R" and next in self.tasks \
+ and prev in self.tasks:
+ return self
+
+ if taskState(prev_state) != "R":
+ event = RunqueueEventSleep(prev)
+
+ next_tasks = list(self.tasks[:])
+ if prev in self.tasks:
+ if taskState(prev_state) != "R":
+ next_tasks.remove(prev)
+ elif taskState(prev_state) == "R":
+ next_tasks.append(prev)
+
+ if next not in next_tasks:
+ next_tasks.append(next)
+
+ return RunqueueSnapshot(next_tasks, event)
+
+ def migrate_out(self, old):
+ if old not in self.tasks:
+ return self
+ next_tasks = [task for task in self.tasks if task != old]
+
+ return RunqueueSnapshot(next_tasks, RunqueueMigrateOut(old))
+
+ def __migrate_in(self, new, event):
+ if new in self.tasks:
+ self.event = event
+ return self
+ next_tasks = self.tasks[:] + tuple([new])
+
+ return RunqueueSnapshot(next_tasks, event)
+
+ def migrate_in(self, new):
+ return self.__migrate_in(new, RunqueueMigrateIn(new))
+
+ def wake_up(self, new):
+ return self.__migrate_in(new, RunqueueEventWakeup(new))
+
+ def wake_up_new(self, new):
+ return self.__migrate_in(new, RunqueueEventFork(new))
+
+ def load(self):
+ """ Provide the number of tasks on the runqueue.
+ Don't count idle"""
+ return len(self.tasks) - 1
+
+ def __repr__(self):
+ ret = self.tasks.__repr__()
+ ret += self.origin_tostring()
+
+ return ret
+
+class TimeSlice:
+ def __init__(self, start, prev):
+ self.start = start
+ self.prev = prev
+ self.end = start
+ # cpus that triggered the event
+ self.event_cpus = []
+ if prev is not None:
+ self.total_load = prev.total_load
+ self.rqs = prev.rqs.copy()
+ else:
+ self.rqs = defaultdict(RunqueueSnapshot)
+ self.total_load = 0
+
+ def __update_total_load(self, old_rq, new_rq):
+ diff = new_rq.load() - old_rq.load()
+ self.total_load += diff
+
+ def sched_switch(self, ts_list, prev, prev_state, next, cpu):
+ old_rq = self.prev.rqs[cpu]
+ new_rq = old_rq.sched_switch(prev, prev_state, next)
+
+ if old_rq is new_rq:
+ return
+
+ self.rqs[cpu] = new_rq
+ self.__update_total_load(old_rq, new_rq)
+ ts_list.append(self)
+ self.event_cpus = [cpu]
+
+ def migrate(self, ts_list, new, old_cpu, new_cpu):
+ if old_cpu == new_cpu:
+ return
+ old_rq = self.prev.rqs[old_cpu]
+ out_rq = old_rq.migrate_out(new)
+ self.rqs[old_cpu] = out_rq
+ self.__update_total_load(old_rq, out_rq)
+
+ new_rq = self.prev.rqs[new_cpu]
+ in_rq = new_rq.migrate_in(new)
+ self.rqs[new_cpu] = in_rq
+ self.__update_total_load(new_rq, in_rq)
+
+ ts_list.append(self)
+
+ if old_rq is not out_rq:
+ self.event_cpus.append(old_cpu)
+ self.event_cpus.append(new_cpu)
+
+ def wake_up(self, ts_list, pid, cpu, fork):
+ old_rq = self.prev.rqs[cpu]
+ if fork:
+ new_rq = old_rq.wake_up_new(pid)
+ else:
+ new_rq = old_rq.wake_up(pid)
+
+ if new_rq is old_rq:
+ return
+ self.rqs[cpu] = new_rq
+ self.__update_total_load(old_rq, new_rq)
+ ts_list.append(self)
+ self.event_cpus = [cpu]
+
+ def next(self, t):
+ self.end = t
+ return TimeSlice(t, self)
+
+class TimeSliceList(UserList):
+ def __init__(self, arg = []):
+ self.data = arg
+
+ def get_time_slice(self, ts):
+ if len(self.data) == 0:
+ slice = TimeSlice(ts, TimeSlice(-1, None))
+ else:
+ slice = self.data[-1].next(ts)
+ return slice
+
+ def find_time_slice(self, ts):
+ start = 0
+ end = len(self.data)
+ found = -1
+ searching = True
+ while searching:
+ if start == end or start == end - 1:
+ searching = False
+
+ i = (end + start) / 2
+ if self.data[i].start <= ts and self.data[i].end >= ts:
+ found = i
+ end = i
+ continue
+
+ if self.data[i].end < ts:
+ start = i
+
+ elif self.data[i].start > ts:
+ end = i
+
+ return found
+
+ def set_root_win(self, win):
+ self.root_win = win
+
+ def mouse_down(self, cpu, t):
+ idx = self.find_time_slice(t)
+ if idx == -1:
+ return
+
+ ts = self[idx]
+ rq = ts.rqs[cpu]
+ raw = "CPU: %d\n" % cpu
+ raw += "Last event : %s\n" % rq.event.__repr__()
+ raw += "Timestamp : %d.%06d\n" % (ts.start / (10 ** 9), (ts.start % (10 ** 9)) / 1000)
+ raw += "Duration : %6d us\n" % ((ts.end - ts.start) / (10 ** 6))
+ raw += "Load = %d\n" % rq.load()
+ for t in rq.tasks:
+ raw += "%s \n" % thread_name(t)
+
+ self.root_win.update_summary(raw)
+
+ def update_rectangle_cpu(self, slice, cpu):
+ rq = slice.rqs[cpu]
+
+ if slice.total_load != 0:
+ load_rate = rq.load() / float(slice.total_load)
+ else:
+ load_rate = 0
+
+ red_power = int(0xff - (0xff * load_rate))
+ color = (0xff, red_power, red_power)
+
+ top_color = None
+
+ if cpu in slice.event_cpus:
+ top_color = rq.event.color()
+
+ self.root_win.paint_rectangle_zone(cpu, color, top_color, slice.start, slice.end)
+
+ def fill_zone(self, start, end):
+ i = self.find_time_slice(start)
+ if i == -1:
+ return
+
+ for i in xrange(i, len(self.data)):
+ timeslice = self.data[i]
+ if timeslice.start > end:
+ return
+
+ for cpu in timeslice.rqs:
+ self.update_rectangle_cpu(timeslice, cpu)
+
+ def interval(self):
+ if len(self.data) == 0:
+ return (0, 0)
+
+ return (self.data[0].start, self.data[-1].end)
+
+ def nr_rectangles(self):
+ last_ts = self.data[-1]
+ max_cpu = 0
+ for cpu in last_ts.rqs:
+ if cpu > max_cpu:
+ max_cpu = cpu
+ return max_cpu
+
+
+class SchedEventProxy:
+ def __init__(self):
+ self.current_tsk = defaultdict(lambda : -1)
+ self.timeslices = TimeSliceList()
+
+ def sched_switch(self, headers, prev_comm, prev_pid, prev_prio, prev_state,
+ next_comm, next_pid, next_prio):
+ """ Ensure the task we sched out this cpu is really the one
+ we logged. Otherwise we may have missed traces """
+
+ on_cpu_task = self.current_tsk[headers.cpu]
+
+ if on_cpu_task != -1 and on_cpu_task != prev_pid:
+ print "Sched switch event rejected ts: %s cpu: %d prev: %s(%d) next: %s(%d)" % \
+ (headers.ts_format(), headers.cpu, prev_comm, prev_pid, next_comm, next_pid)
+
+ threads[prev_pid] = prev_comm
+ threads[next_pid] = next_comm
+ self.current_tsk[headers.cpu] = next_pid
+
+ ts = self.timeslices.get_time_slice(headers.ts())
+ ts.sched_switch(self.timeslices, prev_pid, prev_state, next_pid, headers.cpu)
+
+ def migrate(self, headers, pid, prio, orig_cpu, dest_cpu):
+ ts = self.timeslices.get_time_slice(headers.ts())
+ ts.migrate(self.timeslices, pid, orig_cpu, dest_cpu)
+
+ def wake_up(self, headers, comm, pid, success, target_cpu, fork):
+ if success == 0:
+ return
+ ts = self.timeslices.get_time_slice(headers.ts())
+ ts.wake_up(self.timeslices, pid, target_cpu, fork)
+
+
+def trace_begin():
+ global parser
+ parser = SchedEventProxy()
+
+def trace_end():
+ app = wx.App(False)
+ timeslices = parser.timeslices
+ frame = RootFrame(timeslices, "Migration")
+ app.MainLoop()
+
+def sched__sched_stat_runtime(event_name, context, common_cpu,
+ common_secs, common_nsecs, common_pid, common_comm,
+ comm, pid, runtime, vruntime):
+ pass
+
+def sched__sched_stat_iowait(event_name, context, common_cpu,
+ common_secs, common_nsecs, common_pid, common_comm,
+ comm, pid, delay):
+ pass
+
+def sched__sched_stat_sleep(event_name, context, common_cpu,
+ common_secs, common_nsecs, common_pid, common_comm,
+ comm, pid, delay):
+ pass
+
+def sched__sched_stat_wait(event_name, context, common_cpu,
+ common_secs, common_nsecs, common_pid, common_comm,
+ comm, pid, delay):
+ pass
+
+def sched__sched_process_fork(event_name, context, common_cpu,
+ common_secs, common_nsecs, common_pid, common_comm,
+ parent_comm, parent_pid, child_comm, child_pid):
+ pass
+
+def sched__sched_process_wait(event_name, context, common_cpu,
+ common_secs, common_nsecs, common_pid, common_comm,
+ comm, pid, prio):
+ pass
+
+def sched__sched_process_exit(event_name, context, common_cpu,
+ common_secs, common_nsecs, common_pid, common_comm,
+ comm, pid, prio):
+ pass
+
+def sched__sched_process_free(event_name, context, common_cpu,
+ common_secs, common_nsecs, common_pid, common_comm,
+ comm, pid, prio):
+ pass
+
+def sched__sched_migrate_task(event_name, context, common_cpu,
+ common_secs, common_nsecs, common_pid, common_comm,
+ comm, pid, prio, orig_cpu,
+ dest_cpu):
+ headers = EventHeaders(common_cpu, common_secs, common_nsecs,
+ common_pid, common_comm)
+ parser.migrate(headers, pid, prio, orig_cpu, dest_cpu)
+
+def sched__sched_switch(event_name, context, common_cpu,
+ common_secs, common_nsecs, common_pid, common_comm,
+ prev_comm, prev_pid, prev_prio, prev_state,
+ next_comm, next_pid, next_prio):
+
+ headers = EventHeaders(common_cpu, common_secs, common_nsecs,
+ common_pid, common_comm)
+ parser.sched_switch(headers, prev_comm, prev_pid, prev_prio, prev_state,
+ next_comm, next_pid, next_prio)
+
+def sched__sched_wakeup_new(event_name, context, common_cpu,
+ common_secs, common_nsecs, common_pid, common_comm,
+ comm, pid, prio, success,
+ target_cpu):
+ headers = EventHeaders(common_cpu, common_secs, common_nsecs,
+ common_pid, common_comm)
+ parser.wake_up(headers, comm, pid, success, target_cpu, 1)
+
+def sched__sched_wakeup(event_name, context, common_cpu,
+ common_secs, common_nsecs, common_pid, common_comm,
+ comm, pid, prio, success,
+ target_cpu):
+ headers = EventHeaders(common_cpu, common_secs, common_nsecs,
+ common_pid, common_comm)
+ parser.wake_up(headers, comm, pid, success, target_cpu, 0)
+
+def sched__sched_wait_task(event_name, context, common_cpu,
+ common_secs, common_nsecs, common_pid, common_comm,
+ comm, pid, prio):
+ pass
+
+def sched__sched_kthread_stop_ret(event_name, context, common_cpu,
+ common_secs, common_nsecs, common_pid, common_comm,
+ ret):
+ pass
+
+def sched__sched_kthread_stop(event_name, context, common_cpu,
+ common_secs, common_nsecs, common_pid, common_comm,
+ comm, pid):
+ pass
+
+def trace_unhandled(event_name, context, common_cpu, common_secs, common_nsecs,
+ common_pid, common_comm):
+ pass
diff --git a/tools/perf/util/build-id.c b/tools/perf/util/build-id.c
index 70c5cf8..e437edb 100644
--- a/tools/perf/util/build-id.c
+++ b/tools/perf/util/build-id.c
@@ -12,6 +12,7 @@
#include "event.h"
#include "symbol.h"
#include <linux/kernel.h>
+#include "debug.h"
static int build_id__mark_dso_hit(event_t *event, struct perf_session *session)
{
@@ -34,28 +35,43 @@
return 0;
}
+static int event__exit_del_thread(event_t *self, struct perf_session *session)
+{
+ struct thread *thread = perf_session__findnew(session, self->fork.tid);
+
+ dump_printf("(%d:%d):(%d:%d)\n", self->fork.pid, self->fork.tid,
+ self->fork.ppid, self->fork.ptid);
+
+ if (thread) {
+ rb_erase(&thread->rb_node, &session->threads);
+ session->last_match = NULL;
+ thread__delete(thread);
+ }
+
+ return 0;
+}
+
struct perf_event_ops build_id__mark_dso_hit_ops = {
.sample = build_id__mark_dso_hit,
.mmap = event__process_mmap,
.fork = event__process_task,
+ .exit = event__exit_del_thread,
};
char *dso__build_id_filename(struct dso *self, char *bf, size_t size)
{
char build_id_hex[BUILD_ID_SIZE * 2 + 1];
- const char *home;
if (!self->has_build_id)
return NULL;
build_id__sprintf(self->build_id, sizeof(self->build_id), build_id_hex);
- home = getenv("HOME");
if (bf == NULL) {
- if (asprintf(&bf, "%s/%s/.build-id/%.2s/%s", home,
- DEBUG_CACHE_DIR, build_id_hex, build_id_hex + 2) < 0)
+ if (asprintf(&bf, "%s/.build-id/%.2s/%s", buildid_dir,
+ build_id_hex, build_id_hex + 2) < 0)
return NULL;
} else
- snprintf(bf, size, "%s/%s/.build-id/%.2s/%s", home,
- DEBUG_CACHE_DIR, build_id_hex, build_id_hex + 2);
+ snprintf(bf, size, "%s/.build-id/%.2s/%s", buildid_dir,
+ build_id_hex, build_id_hex + 2);
return bf;
}
diff --git a/tools/perf/util/cache.h b/tools/perf/util/cache.h
index 65fe664..27e9ebe 100644
--- a/tools/perf/util/cache.h
+++ b/tools/perf/util/cache.h
@@ -23,6 +23,7 @@
extern int perf_config_int(const char *, const char *);
extern int perf_config_bool(const char *, const char *);
extern int config_error_nonbool(const char *);
+extern const char *perf_config_dirname(const char *, const char *);
/* pager.c */
extern void setup_pager(void);
diff --git a/tools/perf/util/callchain.c b/tools/perf/util/callchain.c
index 52c777e..f231f43 100644
--- a/tools/perf/util/callchain.c
+++ b/tools/perf/util/callchain.c
@@ -18,7 +18,7 @@
#include "util.h"
#include "callchain.h"
-bool ip_callchain__valid(struct ip_callchain *chain, event_t *event)
+bool ip_callchain__valid(struct ip_callchain *chain, const event_t *event)
{
unsigned int chain_size = event->header.size;
chain_size -= (unsigned long)&event->ip.__more_data - (unsigned long)event;
diff --git a/tools/perf/util/callchain.h b/tools/perf/util/callchain.h
index f2e9ee1..624a96c 100644
--- a/tools/perf/util/callchain.h
+++ b/tools/perf/util/callchain.h
@@ -63,5 +63,5 @@
int append_chain(struct callchain_node *root, struct ip_callchain *chain,
struct map_symbol *syms, u64 period);
-bool ip_callchain__valid(struct ip_callchain *chain, event_t *event);
+bool ip_callchain__valid(struct ip_callchain *chain, const event_t *event);
#endif /* __PERF_CALLCHAIN_H */
diff --git a/tools/perf/util/config.c b/tools/perf/util/config.c
index dabe892..e02d78c 100644
--- a/tools/perf/util/config.c
+++ b/tools/perf/util/config.c
@@ -11,6 +11,11 @@
#define MAXNAME (256)
+#define DEBUG_CACHE_DIR ".debug"
+
+
+char buildid_dir[MAXPATHLEN]; /* root dir for buildid, binary cache */
+
static FILE *config_file;
static const char *config_file_name;
static int config_linenr;
@@ -127,7 +132,7 @@
break;
if (!iskeychar(c))
break;
- name[len++] = tolower(c);
+ name[len++] = c;
if (len >= MAXNAME)
return -1;
}
@@ -327,6 +332,13 @@
return !!perf_config_bool_or_int(name, value, &discard);
}
+const char *perf_config_dirname(const char *name, const char *value)
+{
+ if (!name)
+ return NULL;
+ return value;
+}
+
static int perf_default_core_config(const char *var __used, const char *value __used)
{
/* Add other config variables here and to Documentation/config.txt. */
@@ -428,3 +440,53 @@
{
return error("Missing value for '%s'", var);
}
+
+struct buildid_dir_config {
+ char *dir;
+};
+
+static int buildid_dir_command_config(const char *var, const char *value,
+ void *data)
+{
+ struct buildid_dir_config *c = data;
+ const char *v;
+
+ /* same dir for all commands */
+ if (!prefixcmp(var, "buildid.") && !strcmp(var + 8, "dir")) {
+ v = perf_config_dirname(var, value);
+ if (!v)
+ return -1;
+ strncpy(c->dir, v, MAXPATHLEN-1);
+ c->dir[MAXPATHLEN-1] = '\0';
+ }
+ return 0;
+}
+
+static void check_buildid_dir_config(void)
+{
+ struct buildid_dir_config c;
+ c.dir = buildid_dir;
+ perf_config(buildid_dir_command_config, &c);
+}
+
+void set_buildid_dir(void)
+{
+ buildid_dir[0] = '\0';
+
+ /* try config file */
+ check_buildid_dir_config();
+
+ /* default to $HOME/.debug */
+ if (buildid_dir[0] == '\0') {
+ char *v = getenv("HOME");
+ if (v) {
+ snprintf(buildid_dir, MAXPATHLEN-1, "%s/%s",
+ v, DEBUG_CACHE_DIR);
+ } else {
+ strncpy(buildid_dir, DEBUG_CACHE_DIR, MAXPATHLEN-1);
+ }
+ buildid_dir[MAXPATHLEN-1] = '\0';
+ }
+ /* for communicating with external commands */
+ setenv("PERF_BUILDID_DIR", buildid_dir, 1);
+}
diff --git a/tools/perf/util/cpumap.c b/tools/perf/util/cpumap.c
index 4e01490..0f9b8d7 100644
--- a/tools/perf/util/cpumap.c
+++ b/tools/perf/util/cpumap.c
@@ -20,7 +20,7 @@
return nr_cpus;
}
-int read_cpu_map(void)
+static int read_all_cpu_map(void)
{
FILE *onlnf;
int nr_cpus = 0;
@@ -57,3 +57,58 @@
return default_cpu_map();
}
+
+int read_cpu_map(const char *cpu_list)
+{
+ unsigned long start_cpu, end_cpu = 0;
+ char *p = NULL;
+ int i, nr_cpus = 0;
+
+ if (!cpu_list)
+ return read_all_cpu_map();
+
+ if (!isdigit(*cpu_list))
+ goto invalid;
+
+ while (isdigit(*cpu_list)) {
+ p = NULL;
+ start_cpu = strtoul(cpu_list, &p, 0);
+ if (start_cpu >= INT_MAX
+ || (*p != '\0' && *p != ',' && *p != '-'))
+ goto invalid;
+
+ if (*p == '-') {
+ cpu_list = ++p;
+ p = NULL;
+ end_cpu = strtoul(cpu_list, &p, 0);
+
+ if (end_cpu >= INT_MAX || (*p != '\0' && *p != ','))
+ goto invalid;
+
+ if (end_cpu < start_cpu)
+ goto invalid;
+ } else {
+ end_cpu = start_cpu;
+ }
+
+ for (; start_cpu <= end_cpu; start_cpu++) {
+ /* check for duplicates */
+ for (i = 0; i < nr_cpus; i++)
+ if (cpumap[i] == (int)start_cpu)
+ goto invalid;
+
+ assert(nr_cpus < MAX_NR_CPUS);
+ cpumap[nr_cpus++] = (int)start_cpu;
+ }
+ if (*p)
+ ++p;
+
+ cpu_list = p;
+ }
+ if (nr_cpus > 0)
+ return nr_cpus;
+
+ return default_cpu_map();
+invalid:
+ return -1;
+}
diff --git a/tools/perf/util/cpumap.h b/tools/perf/util/cpumap.h
index 86c78bb..3e60f56 100644
--- a/tools/perf/util/cpumap.h
+++ b/tools/perf/util/cpumap.h
@@ -1,7 +1,7 @@
#ifndef __PERF_CPUMAP_H
#define __PERF_CPUMAP_H
-extern int read_cpu_map(void);
+extern int read_cpu_map(const char *cpu_list);
extern int cpumap[];
#endif /* __PERF_CPUMAP_H */
diff --git a/tools/perf/util/debug.c b/tools/perf/util/debug.c
index 6cddff2..318dab1 100644
--- a/tools/perf/util/debug.c
+++ b/tools/perf/util/debug.c
@@ -86,12 +86,10 @@
dump_printf_color(" ", color);
for (j = 0; j < 15-(i & 15); j++)
dump_printf_color(" ", color);
- for (j = 0; j < (i & 15); j++) {
- if (isprint(raw_event[i-15+j]))
- dump_printf_color("%c", color,
- raw_event[i-15+j]);
- else
- dump_printf_color(".", color);
+ for (j = i & ~15; j <= i; j++) {
+ dump_printf_color("%c", color,
+ isprint(raw_event[j]) ?
+ raw_event[j] : '.');
}
dump_printf_color("\n", color);
}
diff --git a/tools/perf/util/event.c b/tools/perf/util/event.c
index 2fbf6a4..dab9e75 100644
--- a/tools/perf/util/event.c
+++ b/tools/perf/util/event.c
@@ -151,7 +151,6 @@
continue;
pbf += n + 3;
if (*pbf == 'x') { /* vm_exec */
- u64 vm_pgoff;
char *execname = strchr(bf, '/');
/* Catch VDSO */
@@ -162,12 +161,7 @@
continue;
pbf += 3;
- n = hex2u64(pbf, &vm_pgoff);
- /* pgoff is in bytes, not pages */
- if (n >= 0)
- ev.mmap.pgoff = vm_pgoff << getpagesize();
- else
- ev.mmap.pgoff = 0;
+ n = hex2u64(pbf, &ev.mmap.pgoff);
size = strlen(execname);
execname[size - 1] = '\0'; /* Remove \n */
@@ -340,30 +334,29 @@
return process(&ev, session);
}
-static void thread__comm_adjust(struct thread *self)
+static void thread__comm_adjust(struct thread *self, struct hists *hists)
{
char *comm = self->comm;
if (!symbol_conf.col_width_list_str && !symbol_conf.field_sep &&
(!symbol_conf.comm_list ||
strlist__has_entry(symbol_conf.comm_list, comm))) {
- unsigned int slen = strlen(comm);
+ u16 slen = strlen(comm);
- if (slen > comms__col_width) {
- comms__col_width = slen;
- threads__col_width = slen + 6;
- }
+ if (hists__new_col_len(hists, HISTC_COMM, slen))
+ hists__set_col_len(hists, HISTC_THREAD, slen + 6);
}
}
-static int thread__set_comm_adjust(struct thread *self, const char *comm)
+static int thread__set_comm_adjust(struct thread *self, const char *comm,
+ struct hists *hists)
{
int ret = thread__set_comm(self, comm);
if (ret)
return ret;
- thread__comm_adjust(self);
+ thread__comm_adjust(self, hists);
return 0;
}
@@ -374,7 +367,8 @@
dump_printf(": %s:%d\n", self->comm.comm, self->comm.tid);
- if (thread == NULL || thread__set_comm_adjust(thread, self->comm.comm)) {
+ if (thread == NULL || thread__set_comm_adjust(thread, self->comm.comm,
+ &session->hists)) {
dump_printf("problem processing PERF_RECORD_COMM, skipping event.\n");
return -1;
}
@@ -456,6 +450,7 @@
goto out_problem;
map->dso->short_name = name;
+ map->dso->sname_alloc = 1;
map->end = map->start + self->mmap.len;
} else if (is_kernel_mmap) {
const char *symbol_name = (self->mmap.filename +
@@ -514,12 +509,13 @@
if (machine == NULL)
goto out_problem;
thread = perf_session__findnew(session, self->mmap.pid);
+ if (thread == NULL)
+ goto out_problem;
map = map__new(&machine->user_dsos, self->mmap.start,
self->mmap.len, self->mmap.pgoff,
self->mmap.pid, self->mmap.filename,
- MAP__FUNCTION, session->cwd, session->cwdlen);
-
- if (thread == NULL || map == NULL)
+ MAP__FUNCTION);
+ if (map == NULL)
goto out_problem;
thread__insert_map(thread, map);
@@ -552,6 +548,26 @@
return 0;
}
+int event__process(event_t *event, struct perf_session *session)
+{
+ switch (event->header.type) {
+ case PERF_RECORD_COMM:
+ event__process_comm(event, session);
+ break;
+ case PERF_RECORD_MMAP:
+ event__process_mmap(event, session);
+ break;
+ case PERF_RECORD_FORK:
+ case PERF_RECORD_EXIT:
+ event__process_task(event, session);
+ break;
+ default:
+ break;
+ }
+
+ return 0;
+}
+
void thread__find_addr_map(struct thread *self,
struct perf_session *session, u8 cpumode,
enum map_type type, pid_t pid, u64 addr,
@@ -641,27 +657,49 @@
al->sym = NULL;
}
-static void dso__calc_col_width(struct dso *self)
+static void dso__calc_col_width(struct dso *self, struct hists *hists)
{
if (!symbol_conf.col_width_list_str && !symbol_conf.field_sep &&
(!symbol_conf.dso_list ||
strlist__has_entry(symbol_conf.dso_list, self->name))) {
- u16 slen = self->short_name_len;
- if (verbose)
- slen = self->long_name_len;
- if (dsos__col_width < slen)
- dsos__col_width = slen;
+ u16 slen = dso__name_len(self);
+ hists__new_col_len(hists, HISTC_DSO, slen);
}
self->slen_calculated = 1;
}
int event__preprocess_sample(const event_t *self, struct perf_session *session,
- struct addr_location *al, symbol_filter_t filter)
+ struct addr_location *al, struct sample_data *data,
+ symbol_filter_t filter)
{
u8 cpumode = self->header.misc & PERF_RECORD_MISC_CPUMODE_MASK;
- struct thread *thread = perf_session__findnew(session, self->ip.pid);
+ struct thread *thread;
+ event__parse_sample(self, session->sample_type, data);
+
+ dump_printf("(IP, %d): %d/%d: %#Lx period: %Ld cpu:%d\n",
+ self->header.misc, data->pid, data->tid, data->ip,
+ data->period, data->cpu);
+
+ if (session->sample_type & PERF_SAMPLE_CALLCHAIN) {
+ unsigned int i;
+
+ dump_printf("... chain: nr:%Lu\n", data->callchain->nr);
+
+ if (!ip_callchain__valid(data->callchain, self)) {
+ pr_debug("call-chain problem with event, "
+ "skipping it.\n");
+ goto out_filtered;
+ }
+
+ if (dump_trace) {
+ for (i = 0; i < data->callchain->nr; i++)
+ dump_printf("..... %2d: %016Lx\n",
+ i, data->callchain->ips[i]);
+ }
+ }
+ thread = perf_session__findnew(session, self->ip.pid);
if (thread == NULL)
return -1;
@@ -687,6 +725,7 @@
al->map ? al->map->dso->long_name :
al->level == 'H' ? "[hypervisor]" : "<not found>");
al->sym = NULL;
+ al->cpu = data->cpu;
if (al->map) {
if (symbol_conf.dso_list &&
@@ -703,16 +742,17 @@
* sampled.
*/
if (!sort_dso.elide && !al->map->dso->slen_calculated)
- dso__calc_col_width(al->map->dso);
+ dso__calc_col_width(al->map->dso, &session->hists);
al->sym = map__find_symbol(al->map, al->addr, filter);
} else {
const unsigned int unresolved_col_width = BITS_PER_LONG / 4;
- if (dsos__col_width < unresolved_col_width &&
+ if (hists__col_len(&session->hists, HISTC_DSO) < unresolved_col_width &&
!symbol_conf.col_width_list_str && !symbol_conf.field_sep &&
!symbol_conf.dso_list)
- dsos__col_width = unresolved_col_width;
+ hists__set_col_len(&session->hists, HISTC_DSO,
+ unresolved_col_width);
}
if (symbol_conf.sym_list && al->sym &&
@@ -726,9 +766,9 @@
return 0;
}
-int event__parse_sample(event_t *event, u64 type, struct sample_data *data)
+int event__parse_sample(const event_t *event, u64 type, struct sample_data *data)
{
- u64 *array = event->sample.array;
+ const u64 *array = event->sample.array;
if (type & PERF_SAMPLE_IP) {
data->ip = event->ip.ip;
@@ -767,7 +807,8 @@
u32 *p = (u32 *)array;
data->cpu = *p;
array++;
- }
+ } else
+ data->cpu = -1;
if (type & PERF_SAMPLE_PERIOD) {
data->period = *array;
diff --git a/tools/perf/util/event.h b/tools/perf/util/event.h
index 8577085..8e790da 100644
--- a/tools/perf/util/event.h
+++ b/tools/perf/util/event.h
@@ -154,11 +154,13 @@
int event__process_lost(event_t *self, struct perf_session *session);
int event__process_mmap(event_t *self, struct perf_session *session);
int event__process_task(event_t *self, struct perf_session *session);
+int event__process(event_t *event, struct perf_session *session);
struct addr_location;
int event__preprocess_sample(const event_t *self, struct perf_session *session,
- struct addr_location *al, symbol_filter_t filter);
-int event__parse_sample(event_t *event, u64 type, struct sample_data *data);
+ struct addr_location *al, struct sample_data *data,
+ symbol_filter_t filter);
+int event__parse_sample(const event_t *event, u64 type, struct sample_data *data);
extern const char *event__name[];
diff --git a/tools/perf/util/header.c b/tools/perf/util/header.c
index 1f62435..d7e67b1 100644
--- a/tools/perf/util/header.c
+++ b/tools/perf/util/header.c
@@ -16,6 +16,8 @@
#include "symbol.h"
#include "debug.h"
+static bool no_buildid_cache = false;
+
/*
* Create new perf.data header attribute:
*/
@@ -385,8 +387,7 @@
int ret;
char debugdir[PATH_MAX];
- snprintf(debugdir, sizeof(debugdir), "%s/%s", getenv("HOME"),
- DEBUG_CACHE_DIR);
+ snprintf(debugdir, sizeof(debugdir), "%s", buildid_dir);
if (mkdir(debugdir, 0755) != 0 && errno != EEXIST)
return -1;
@@ -471,7 +472,8 @@
}
buildid_sec->size = lseek(fd, 0, SEEK_CUR) -
buildid_sec->offset;
- perf_session__cache_build_ids(session);
+ if (!no_buildid_cache)
+ perf_session__cache_build_ids(session);
}
lseek(fd, sec_start, SEEK_SET);
@@ -1190,3 +1192,8 @@
session);
return 0;
}
+
+void disable_buildid_cache(void)
+{
+ no_buildid_cache = true;
+}
diff --git a/tools/perf/util/hist.c b/tools/perf/util/hist.c
index 784ee0b..e7263d4 100644
--- a/tools/perf/util/hist.c
+++ b/tools/perf/util/hist.c
@@ -5,11 +5,61 @@
#include "sort.h"
#include <math.h>
+enum hist_filter {
+ HIST_FILTER__DSO,
+ HIST_FILTER__THREAD,
+ HIST_FILTER__PARENT,
+};
+
struct callchain_param callchain_param = {
.mode = CHAIN_GRAPH_REL,
.min_percent = 0.5
};
+u16 hists__col_len(struct hists *self, enum hist_column col)
+{
+ return self->col_len[col];
+}
+
+void hists__set_col_len(struct hists *self, enum hist_column col, u16 len)
+{
+ self->col_len[col] = len;
+}
+
+bool hists__new_col_len(struct hists *self, enum hist_column col, u16 len)
+{
+ if (len > hists__col_len(self, col)) {
+ hists__set_col_len(self, col, len);
+ return true;
+ }
+ return false;
+}
+
+static void hists__reset_col_len(struct hists *self)
+{
+ enum hist_column col;
+
+ for (col = 0; col < HISTC_NR_COLS; ++col)
+ hists__set_col_len(self, col, 0);
+}
+
+static void hists__calc_col_len(struct hists *self, struct hist_entry *h)
+{
+ u16 len;
+
+ if (h->ms.sym)
+ hists__new_col_len(self, HISTC_SYMBOL, h->ms.sym->namelen);
+
+ len = thread__comm_len(h->thread);
+ if (hists__new_col_len(self, HISTC_COMM, len))
+ hists__set_col_len(self, HISTC_THREAD, len + 6);
+
+ if (h->ms.map) {
+ len = dso__name_len(h->ms.map->dso);
+ hists__new_col_len(self, HISTC_DSO, len);
+ }
+}
+
static void hist_entry__add_cpumode_period(struct hist_entry *self,
unsigned int cpumode, u64 period)
{
@@ -43,6 +93,8 @@
if (self != NULL) {
*self = *template;
self->nr_events = 1;
+ if (self->ms.map)
+ self->ms.map->referenced = true;
if (symbol_conf.use_callchain)
callchain_init(self->callchain);
}
@@ -50,11 +102,19 @@
return self;
}
-static void hists__inc_nr_entries(struct hists *self, struct hist_entry *entry)
+static void hists__inc_nr_entries(struct hists *self, struct hist_entry *h)
{
- if (entry->ms.sym && self->max_sym_namelen < entry->ms.sym->namelen)
- self->max_sym_namelen = entry->ms.sym->namelen;
- ++self->nr_entries;
+ if (!h->filtered) {
+ hists__calc_col_len(self, h);
+ ++self->nr_entries;
+ }
+}
+
+static u8 symbol__parent_filter(const struct symbol *parent)
+{
+ if (symbol_conf.exclude_other && parent == NULL)
+ return 1 << HIST_FILTER__PARENT;
+ return 0;
}
struct hist_entry *__hists__add_entry(struct hists *self,
@@ -70,10 +130,12 @@
.map = al->map,
.sym = al->sym,
},
+ .cpu = al->cpu,
.ip = al->addr,
.level = al->level,
.period = period,
.parent = sym_parent,
+ .filtered = symbol__parent_filter(sym_parent),
};
int cmp;
@@ -191,7 +253,7 @@
tmp = RB_ROOT;
next = rb_first(&self->entries);
self->nr_entries = 0;
- self->max_sym_namelen = 0;
+ hists__reset_col_len(self);
while (next) {
n = rb_entry(next, struct hist_entry, rb_node);
@@ -248,7 +310,7 @@
next = rb_first(&self->entries);
self->nr_entries = 0;
- self->max_sym_namelen = 0;
+ hists__reset_col_len(self);
while (next) {
n = rb_entry(next, struct hist_entry, rb_node);
@@ -515,8 +577,9 @@
}
int hist_entry__snprintf(struct hist_entry *self, char *s, size_t size,
- struct hists *pair_hists, bool show_displacement,
- long displacement, bool color, u64 session_total)
+ struct hists *hists, struct hists *pair_hists,
+ bool show_displacement, long displacement,
+ bool color, u64 session_total)
{
struct sort_entry *se;
u64 period, total, period_sys, period_us, period_guest_sys, period_guest_us;
@@ -620,29 +683,25 @@
ret += snprintf(s + ret, size - ret, "%s", sep ?: " ");
ret += se->se_snprintf(self, s + ret, size - ret,
- se->se_width ? *se->se_width : 0);
+ hists__col_len(hists, se->se_width_idx));
}
return ret;
}
-int hist_entry__fprintf(struct hist_entry *self, struct hists *pair_hists,
- bool show_displacement, long displacement, FILE *fp,
- u64 session_total)
+int hist_entry__fprintf(struct hist_entry *self, struct hists *hists,
+ struct hists *pair_hists, bool show_displacement,
+ long displacement, FILE *fp, u64 session_total)
{
char bf[512];
- int ret;
-
- ret = hist_entry__snprintf(self, bf, sizeof(bf), pair_hists,
- show_displacement, displacement,
- true, session_total);
- if (!ret)
- return 0;
-
+ hist_entry__snprintf(self, bf, sizeof(bf), hists, pair_hists,
+ show_displacement, displacement,
+ true, session_total);
return fprintf(fp, "%s\n", bf);
}
-static size_t hist_entry__fprintf_callchain(struct hist_entry *self, FILE *fp,
+static size_t hist_entry__fprintf_callchain(struct hist_entry *self,
+ struct hists *hists, FILE *fp,
u64 session_total)
{
int left_margin = 0;
@@ -650,7 +709,7 @@
if (sort__first_dimension == SORT_COMM) {
struct sort_entry *se = list_first_entry(&hist_entry__sort_list,
typeof(*se), list);
- left_margin = se->se_width ? *se->se_width : 0;
+ left_margin = hists__col_len(hists, se->se_width_idx);
left_margin -= thread__comm_len(self->thread);
}
@@ -721,17 +780,17 @@
continue;
}
width = strlen(se->se_header);
- if (se->se_width) {
- if (symbol_conf.col_width_list_str) {
- if (col_width) {
- *se->se_width = atoi(col_width);
- col_width = strchr(col_width, ',');
- if (col_width)
- ++col_width;
- }
+ if (symbol_conf.col_width_list_str) {
+ if (col_width) {
+ hists__set_col_len(self, se->se_width_idx,
+ atoi(col_width));
+ col_width = strchr(col_width, ',');
+ if (col_width)
+ ++col_width;
}
- width = *se->se_width = max(*se->se_width, width);
}
+ if (!hists__new_col_len(self, se->se_width_idx, width))
+ width = hists__col_len(self, se->se_width_idx);
fprintf(fp, " %*s", width, se->se_header);
}
fprintf(fp, "\n");
@@ -754,9 +813,8 @@
continue;
fprintf(fp, " ");
- if (se->se_width)
- width = *se->se_width;
- else
+ width = hists__col_len(self, se->se_width_idx);
+ if (width == 0)
width = strlen(se->se_header);
for (i = 0; i < width; i++)
fprintf(fp, ".");
@@ -767,7 +825,6 @@
print_entries:
for (nd = rb_first(&self->entries); nd; nd = rb_next(nd)) {
struct hist_entry *h = rb_entry(nd, struct hist_entry, rb_node);
- int cnt;
if (show_displacement) {
if (h->pair != NULL)
@@ -777,17 +834,12 @@
displacement = 0;
++position;
}
- cnt = hist_entry__fprintf(h, pair, show_displacement,
- displacement, fp, self->stats.total_period);
- /* Ignore those that didn't match the parent filter */
- if (!cnt)
- continue;
-
- ret += cnt;
+ ret += hist_entry__fprintf(h, self, pair, show_displacement,
+ displacement, fp, self->stats.total_period);
if (symbol_conf.use_callchain)
- ret += hist_entry__fprintf_callchain(h, fp, self->stats.total_period);
-
+ ret += hist_entry__fprintf_callchain(h, self, fp,
+ self->stats.total_period);
if (h->ms.map == NULL && verbose > 1) {
__map_groups__fprintf_maps(&h->thread->mg,
MAP__FUNCTION, verbose, fp);
@@ -800,10 +852,49 @@
return ret;
}
-enum hist_filter {
- HIST_FILTER__DSO,
- HIST_FILTER__THREAD,
-};
+/*
+ * See hists__fprintf to match the column widths
+ */
+unsigned int hists__sort_list_width(struct hists *self)
+{
+ struct sort_entry *se;
+ int ret = 9; /* total % */
+
+ if (symbol_conf.show_cpu_utilization) {
+ ret += 7; /* count_sys % */
+ ret += 6; /* count_us % */
+ if (perf_guest) {
+ ret += 13; /* count_guest_sys % */
+ ret += 12; /* count_guest_us % */
+ }
+ }
+
+ if (symbol_conf.show_nr_samples)
+ ret += 11;
+
+ list_for_each_entry(se, &hist_entry__sort_list, list)
+ if (!se->elide)
+ ret += 2 + hists__col_len(self, se->se_width_idx);
+
+ return ret;
+}
+
+static void hists__remove_entry_filter(struct hists *self, struct hist_entry *h,
+ enum hist_filter filter)
+{
+ h->filtered &= ~(1 << filter);
+ if (h->filtered)
+ return;
+
+ ++self->nr_entries;
+ if (h->ms.unfolded)
+ self->nr_entries += h->nr_rows;
+ h->row_offset = 0;
+ self->stats.total_period += h->period;
+ self->stats.nr_events[PERF_RECORD_SAMPLE] += h->nr_events;
+
+ hists__calc_col_len(self, h);
+}
void hists__filter_by_dso(struct hists *self, const struct dso *dso)
{
@@ -811,7 +902,7 @@
self->nr_entries = self->stats.total_period = 0;
self->stats.nr_events[PERF_RECORD_SAMPLE] = 0;
- self->max_sym_namelen = 0;
+ hists__reset_col_len(self);
for (nd = rb_first(&self->entries); nd; nd = rb_next(nd)) {
struct hist_entry *h = rb_entry(nd, struct hist_entry, rb_node);
@@ -824,15 +915,7 @@
continue;
}
- h->filtered &= ~(1 << HIST_FILTER__DSO);
- if (!h->filtered) {
- ++self->nr_entries;
- self->stats.total_period += h->period;
- self->stats.nr_events[PERF_RECORD_SAMPLE] += h->nr_events;
- if (h->ms.sym &&
- self->max_sym_namelen < h->ms.sym->namelen)
- self->max_sym_namelen = h->ms.sym->namelen;
- }
+ hists__remove_entry_filter(self, h, HIST_FILTER__DSO);
}
}
@@ -842,7 +925,7 @@
self->nr_entries = self->stats.total_period = 0;
self->stats.nr_events[PERF_RECORD_SAMPLE] = 0;
- self->max_sym_namelen = 0;
+ hists__reset_col_len(self);
for (nd = rb_first(&self->entries); nd; nd = rb_next(nd)) {
struct hist_entry *h = rb_entry(nd, struct hist_entry, rb_node);
@@ -851,15 +934,8 @@
h->filtered |= (1 << HIST_FILTER__THREAD);
continue;
}
- h->filtered &= ~(1 << HIST_FILTER__THREAD);
- if (!h->filtered) {
- ++self->nr_entries;
- self->stats.total_period += h->period;
- self->stats.nr_events[PERF_RECORD_SAMPLE] += h->nr_events;
- if (h->ms.sym &&
- self->max_sym_namelen < h->ms.sym->namelen)
- self->max_sym_namelen = h->ms.sym->namelen;
- }
+
+ hists__remove_entry_filter(self, h, HIST_FILTER__THREAD);
}
}
@@ -1052,7 +1128,7 @@
dso, dso->long_name, sym, sym->name);
snprintf(command, sizeof(command),
- "objdump --start-address=0x%016Lx --stop-address=0x%016Lx -dS %s|grep -v %s|expand",
+ "objdump --start-address=0x%016Lx --stop-address=0x%016Lx -dS -C %s|grep -v %s|expand",
map__rip_2objdump(map, sym->start),
map__rip_2objdump(map, sym->end),
filename, filename);
diff --git a/tools/perf/util/hist.h b/tools/perf/util/hist.h
index 83fa33a..65a48db 100644
--- a/tools/perf/util/hist.h
+++ b/tools/perf/util/hist.h
@@ -56,6 +56,16 @@
u32 nr_unknown_events;
};
+enum hist_column {
+ HISTC_SYMBOL,
+ HISTC_DSO,
+ HISTC_THREAD,
+ HISTC_COMM,
+ HISTC_PARENT,
+ HISTC_CPU,
+ HISTC_NR_COLS, /* Last entry */
+};
+
struct hists {
struct rb_node rb_node;
struct rb_root entries;
@@ -64,7 +74,7 @@
u64 config;
u64 event_stream;
u32 type;
- u32 max_sym_namelen;
+ u16 col_len[HISTC_NR_COLS];
};
struct hist_entry *__hists__add_entry(struct hists *self,
@@ -72,12 +82,13 @@
struct symbol *parent, u64 period);
extern int64_t hist_entry__cmp(struct hist_entry *, struct hist_entry *);
extern int64_t hist_entry__collapse(struct hist_entry *, struct hist_entry *);
-int hist_entry__fprintf(struct hist_entry *self, struct hists *pair_hists,
- bool show_displacement, long displacement, FILE *fp,
- u64 total);
+int hist_entry__fprintf(struct hist_entry *self, struct hists *hists,
+ struct hists *pair_hists, bool show_displacement,
+ long displacement, FILE *fp, u64 total);
int hist_entry__snprintf(struct hist_entry *self, char *bf, size_t size,
- struct hists *pair_hists, bool show_displacement,
- long displacement, bool color, u64 total);
+ struct hists *hists, struct hists *pair_hists,
+ bool show_displacement, long displacement,
+ bool color, u64 total);
void hist_entry__free(struct hist_entry *);
void hists__output_resort(struct hists *self);
@@ -95,6 +106,10 @@
void hists__filter_by_dso(struct hists *self, const struct dso *dso);
void hists__filter_by_thread(struct hists *self, const struct thread *thread);
+u16 hists__col_len(struct hists *self, enum hist_column col);
+void hists__set_col_len(struct hists *self, enum hist_column col, u16 len);
+bool hists__new_col_len(struct hists *self, enum hist_column col, u16 len);
+
#ifdef NO_NEWT_SUPPORT
static inline int hists__browse(struct hists *self __used,
const char *helpline __used,
@@ -126,4 +141,7 @@
int hists__tui_browse_tree(struct rb_root *self, const char *help);
#endif
+
+unsigned int hists__sort_list_width(struct hists *self);
+
#endif /* __PERF_HIST_H */
diff --git a/tools/perf/util/map.c b/tools/perf/util/map.c
index e672f2f..3a7eb6e 100644
--- a/tools/perf/util/map.c
+++ b/tools/perf/util/map.c
@@ -17,16 +17,6 @@
return strcmp(filename, "//anon") == 0;
}
-static int strcommon(const char *pathname, char *cwd, int cwdlen)
-{
- int n = 0;
-
- while (n < cwdlen && pathname[n] == cwd[n])
- ++n;
-
- return n;
-}
-
void map__init(struct map *self, enum map_type type,
u64 start, u64 end, u64 pgoff, struct dso *dso)
{
@@ -39,11 +29,12 @@
self->unmap_ip = map__unmap_ip;
RB_CLEAR_NODE(&self->rb_node);
self->groups = NULL;
+ self->referenced = false;
}
struct map *map__new(struct list_head *dsos__list, u64 start, u64 len,
u64 pgoff, u32 pid, char *filename,
- enum map_type type, char *cwd, int cwdlen)
+ enum map_type type)
{
struct map *self = malloc(sizeof(*self));
@@ -52,16 +43,6 @@
struct dso *dso;
int anon;
- if (cwd) {
- int n = strcommon(filename, cwd, cwdlen);
-
- if (n == cwdlen) {
- snprintf(newfilename, sizeof(newfilename),
- ".%s", filename + n);
- filename = newfilename;
- }
- }
-
anon = is_anon_memory(filename);
if (anon) {
@@ -248,6 +229,39 @@
self->machine = NULL;
}
+static void maps__delete(struct rb_root *self)
+{
+ struct rb_node *next = rb_first(self);
+
+ while (next) {
+ struct map *pos = rb_entry(next, struct map, rb_node);
+
+ next = rb_next(&pos->rb_node);
+ rb_erase(&pos->rb_node, self);
+ map__delete(pos);
+ }
+}
+
+static void maps__delete_removed(struct list_head *self)
+{
+ struct map *pos, *n;
+
+ list_for_each_entry_safe(pos, n, self, node) {
+ list_del(&pos->node);
+ map__delete(pos);
+ }
+}
+
+void map_groups__exit(struct map_groups *self)
+{
+ int i;
+
+ for (i = 0; i < MAP__NR_TYPES; ++i) {
+ maps__delete(&self->maps[i]);
+ maps__delete_removed(&self->removed_maps[i]);
+ }
+}
+
void map_groups__flush(struct map_groups *self)
{
int type;
@@ -374,6 +388,7 @@
{
struct rb_root *root = &self->maps[map->type];
struct rb_node *next = rb_first(root);
+ int err = 0;
while (next) {
struct map *pos = rb_entry(next, struct map, rb_node);
@@ -390,20 +405,16 @@
rb_erase(&pos->rb_node, root);
/*
- * We may have references to this map, for instance in some
- * hist_entry instances, so just move them to a separate
- * list.
- */
- list_add_tail(&pos->node, &self->removed_maps[map->type]);
- /*
* Now check if we need to create new maps for areas not
* overlapped by the new map:
*/
if (map->start > pos->start) {
struct map *before = map__clone(pos);
- if (before == NULL)
- return -ENOMEM;
+ if (before == NULL) {
+ err = -ENOMEM;
+ goto move_map;
+ }
before->end = map->start - 1;
map_groups__insert(self, before);
@@ -414,14 +425,27 @@
if (map->end < pos->end) {
struct map *after = map__clone(pos);
- if (after == NULL)
- return -ENOMEM;
+ if (after == NULL) {
+ err = -ENOMEM;
+ goto move_map;
+ }
after->start = map->end + 1;
map_groups__insert(self, after);
if (verbose >= 2)
map__fprintf(after, fp);
}
+move_map:
+ /*
+ * If we have references, just move them to a separate list.
+ */
+ if (pos->referenced)
+ list_add_tail(&pos->node, &self->removed_maps[map->type]);
+ else
+ map__delete(pos);
+
+ if (err)
+ return err;
}
return 0;
@@ -493,6 +517,11 @@
rb_insert_color(&map->rb_node, maps);
}
+void maps__remove(struct rb_root *self, struct map *map)
+{
+ rb_erase(&map->rb_node, self);
+}
+
struct map *maps__find(struct rb_root *maps, u64 ip)
{
struct rb_node **p = &maps->rb_node;
@@ -526,6 +555,31 @@
return self->root_dir == NULL ? -ENOMEM : 0;
}
+static void dsos__delete(struct list_head *self)
+{
+ struct dso *pos, *n;
+
+ list_for_each_entry_safe(pos, n, self, node) {
+ list_del(&pos->node);
+ dso__delete(pos);
+ }
+}
+
+void machine__exit(struct machine *self)
+{
+ map_groups__exit(&self->kmaps);
+ dsos__delete(&self->user_dsos);
+ dsos__delete(&self->kernel_dsos);
+ free(self->root_dir);
+ self->root_dir = NULL;
+}
+
+void machine__delete(struct machine *self)
+{
+ machine__exit(self);
+ free(self);
+}
+
struct machine *machines__add(struct rb_root *self, pid_t pid,
const char *root_dir)
{
diff --git a/tools/perf/util/map.h b/tools/perf/util/map.h
index f391345..7857579 100644
--- a/tools/perf/util/map.h
+++ b/tools/perf/util/map.h
@@ -29,7 +29,8 @@
};
u64 start;
u64 end;
- enum map_type type;
+ u8 /* enum map_type */ type;
+ bool referenced;
u32 priv;
u64 pgoff;
@@ -106,7 +107,7 @@
u64 start, u64 end, u64 pgoff, struct dso *dso);
struct map *map__new(struct list_head *dsos__list, u64 start, u64 len,
u64 pgoff, u32 pid, char *filename,
- enum map_type type, char *cwd, int cwdlen);
+ enum map_type type);
void map__delete(struct map *self);
struct map *map__clone(struct map *self);
int map__overlap(struct map *l, struct map *r);
@@ -125,8 +126,10 @@
size_t __map_groups__fprintf_maps(struct map_groups *self,
enum map_type type, int verbose, FILE *fp);
void maps__insert(struct rb_root *maps, struct map *map);
+void maps__remove(struct rb_root *self, struct map *map);
struct map *maps__find(struct rb_root *maps, u64 addr);
void map_groups__init(struct map_groups *self);
+void map_groups__exit(struct map_groups *self);
int map_groups__clone(struct map_groups *self,
struct map_groups *parent, enum map_type type);
size_t map_groups__fprintf(struct map_groups *self, int verbose, FILE *fp);
@@ -142,6 +145,8 @@
struct machine *machines__findnew(struct rb_root *self, pid_t pid);
char *machine__mmap_name(struct machine *self, char *bf, size_t size);
int machine__init(struct machine *self, const char *root_dir, pid_t pid);
+void machine__exit(struct machine *self);
+void machine__delete(struct machine *self);
/*
* Default guest kernel is defined by parameter --guestkallsyms
@@ -163,6 +168,11 @@
map->groups = self;
}
+static inline void map_groups__remove(struct map_groups *self, struct map *map)
+{
+ maps__remove(&self->maps[map->type], map);
+}
+
static inline struct map *map_groups__find(struct map_groups *self,
enum map_type type, u64 addr)
{
diff --git a/tools/perf/util/newt.c b/tools/perf/util/newt.c
index 7537ca1..91de99b 100644
--- a/tools/perf/util/newt.c
+++ b/tools/perf/util/newt.c
@@ -11,6 +11,7 @@
#define HAVE_LONG_LONG __GLIBC_HAVE_LONG_LONG
#endif
#include <slang.h>
+#include <signal.h>
#include <stdlib.h>
#include <newt.h>
#include <sys/ttydefaults.h>
@@ -278,9 +279,48 @@
void *first_visible_entry, *entries;
u16 top, left, width, height;
void *priv;
+ unsigned int (*refresh_entries)(struct ui_browser *self);
+ void (*seek)(struct ui_browser *self,
+ off_t offset, int whence);
u32 nr_entries;
};
+static void ui_browser__list_head_seek(struct ui_browser *self,
+ off_t offset, int whence)
+{
+ struct list_head *head = self->entries;
+ struct list_head *pos;
+
+ switch (whence) {
+ case SEEK_SET:
+ pos = head->next;
+ break;
+ case SEEK_CUR:
+ pos = self->first_visible_entry;
+ break;
+ case SEEK_END:
+ pos = head->prev;
+ break;
+ default:
+ return;
+ }
+
+ if (offset > 0) {
+ while (offset-- != 0)
+ pos = pos->next;
+ } else {
+ while (offset++ != 0)
+ pos = pos->prev;
+ }
+
+ self->first_visible_entry = pos;
+}
+
+static bool ui_browser__is_current_entry(struct ui_browser *self, unsigned row)
+{
+ return (self->first_visible_entry_idx + row) == self->index;
+}
+
static void ui_browser__refresh_dimensions(struct ui_browser *self)
{
int cols, rows;
@@ -297,8 +337,36 @@
static void ui_browser__reset_index(struct ui_browser *self)
{
- self->index = self->first_visible_entry_idx = 0;
- self->first_visible_entry = NULL;
+ self->index = self->first_visible_entry_idx = 0;
+ self->seek(self, 0, SEEK_SET);
+}
+
+static int ui_browser__show(struct ui_browser *self, const char *title)
+{
+ if (self->form != NULL) {
+ newtFormDestroy(self->form);
+ newtPopWindow();
+ }
+ ui_browser__refresh_dimensions(self);
+ newtCenteredWindow(self->width, self->height, title);
+ self->form = newt_form__new();
+ if (self->form == NULL)
+ return -1;
+
+ self->sb = newtVerticalScrollbar(self->width, 0, self->height,
+ HE_COLORSET_NORMAL,
+ HE_COLORSET_SELECTED);
+ if (self->sb == NULL)
+ return -1;
+
+ newtFormAddHotKey(self->form, NEWT_KEY_UP);
+ newtFormAddHotKey(self->form, NEWT_KEY_DOWN);
+ newtFormAddHotKey(self->form, NEWT_KEY_PGUP);
+ newtFormAddHotKey(self->form, NEWT_KEY_PGDN);
+ newtFormAddHotKey(self->form, NEWT_KEY_HOME);
+ newtFormAddHotKey(self->form, NEWT_KEY_END);
+ newtFormAddComponent(self->form, self->sb);
+ return 0;
}
static int objdump_line__show(struct objdump_line *self, struct list_head *head,
@@ -352,26 +420,10 @@
static int ui_browser__refresh_entries(struct ui_browser *self)
{
- struct objdump_line *pos;
- struct list_head *head = self->entries;
- struct hist_entry *he = self->priv;
- int row = 0;
- int len = he->ms.sym->end - he->ms.sym->start;
+ int row;
- if (self->first_visible_entry == NULL || self->first_visible_entry == self->entries)
- self->first_visible_entry = head->next;
-
- pos = list_entry(self->first_visible_entry, struct objdump_line, node);
-
- list_for_each_entry_from(pos, head, node) {
- bool current_entry = (self->first_visible_entry_idx + row) == self->index;
- SLsmg_gotorc(self->top + row, self->left);
- objdump_line__show(pos, head, self->width,
- he, len, current_entry);
- if (++row == self->height)
- break;
- }
-
+ newtScrollbarSet(self->sb, self->index, self->nr_entries - 1);
+ row = self->refresh_entries(self);
SLsmg_set_color(HE_COLORSET_NORMAL);
SLsmg_fill_region(self->top + row, self->left,
self->height - row, self->width, ' ');
@@ -379,42 +431,13 @@
return 0;
}
-static int ui_browser__run(struct ui_browser *self, const char *title,
- struct newtExitStruct *es)
+static int ui_browser__run(struct ui_browser *self, struct newtExitStruct *es)
{
- if (self->form) {
- newtFormDestroy(self->form);
- newtPopWindow();
- }
-
- ui_browser__refresh_dimensions(self);
- newtCenteredWindow(self->width + 2, self->height, title);
- self->form = newt_form__new();
- if (self->form == NULL)
- return -1;
-
- self->sb = newtVerticalScrollbar(self->width + 1, 0, self->height,
- HE_COLORSET_NORMAL,
- HE_COLORSET_SELECTED);
- if (self->sb == NULL)
- return -1;
-
- newtFormAddHotKey(self->form, NEWT_KEY_UP);
- newtFormAddHotKey(self->form, NEWT_KEY_DOWN);
- newtFormAddHotKey(self->form, NEWT_KEY_PGUP);
- newtFormAddHotKey(self->form, NEWT_KEY_PGDN);
- newtFormAddHotKey(self->form, ' ');
- newtFormAddHotKey(self->form, NEWT_KEY_HOME);
- newtFormAddHotKey(self->form, NEWT_KEY_END);
- newtFormAddHotKey(self->form, NEWT_KEY_TAB);
- newtFormAddHotKey(self->form, NEWT_KEY_RIGHT);
-
if (ui_browser__refresh_entries(self) < 0)
return -1;
- newtFormAddComponent(self->form, self->sb);
while (1) {
- unsigned int offset;
+ off_t offset;
newtFormRun(self->form, es);
@@ -428,9 +451,8 @@
break;
++self->index;
if (self->index == self->first_visible_entry_idx + self->height) {
- struct list_head *pos = self->first_visible_entry;
++self->first_visible_entry_idx;
- self->first_visible_entry = pos->next;
+ self->seek(self, +1, SEEK_CUR);
}
break;
case NEWT_KEY_UP:
@@ -438,9 +460,8 @@
break;
--self->index;
if (self->index < self->first_visible_entry_idx) {
- struct list_head *pos = self->first_visible_entry;
--self->first_visible_entry_idx;
- self->first_visible_entry = pos->prev;
+ self->seek(self, -1, SEEK_CUR);
}
break;
case NEWT_KEY_PGDN:
@@ -453,12 +474,7 @@
offset = self->nr_entries - 1 - self->index;
self->index += offset;
self->first_visible_entry_idx += offset;
-
- while (offset--) {
- struct list_head *pos = self->first_visible_entry;
- self->first_visible_entry = pos->next;
- }
-
+ self->seek(self, +offset, SEEK_CUR);
break;
case NEWT_KEY_PGUP:
if (self->first_visible_entry_idx == 0)
@@ -471,36 +487,22 @@
self->index -= offset;
self->first_visible_entry_idx -= offset;
-
- while (offset--) {
- struct list_head *pos = self->first_visible_entry;
- self->first_visible_entry = pos->prev;
- }
+ self->seek(self, -offset, SEEK_CUR);
break;
case NEWT_KEY_HOME:
ui_browser__reset_index(self);
break;
- case NEWT_KEY_END: {
- struct list_head *head = self->entries;
+ case NEWT_KEY_END:
offset = self->height - 1;
+ if (offset >= self->nr_entries)
+ offset = self->nr_entries - 1;
- if (offset > self->nr_entries)
- offset = self->nr_entries;
-
- self->index = self->first_visible_entry_idx = self->nr_entries - 1 - offset;
- self->first_visible_entry = head->prev;
- while (offset-- != 0) {
- struct list_head *pos = self->first_visible_entry;
- self->first_visible_entry = pos->prev;
- }
- }
+ self->index = self->nr_entries - 1;
+ self->first_visible_entry_idx = self->index - offset;
+ self->seek(self, -offset, SEEK_END);
break;
- case NEWT_KEY_RIGHT:
- case NEWT_KEY_LEFT:
- case NEWT_KEY_TAB:
- return es->u.key;
default:
- continue;
+ return es->u.key;
}
if (ui_browser__refresh_entries(self) < 0)
return -1;
@@ -508,38 +510,6 @@
return 0;
}
-/*
- * When debugging newt problems it was useful to be able to "unroll"
- * the calls to newtCheckBoxTreeAdd{Array,Item}, so that we can generate
- * a source file with the sequence of calls to these methods, to then
- * tweak the arrays to get the intended results, so I'm keeping this code
- * here, may be useful again in the future.
- */
-#undef NEWT_DEBUG
-
-static void newt_checkbox_tree__add(newtComponent tree, const char *str,
- void *priv, int *indexes)
-{
-#ifdef NEWT_DEBUG
- /* Print the newtCheckboxTreeAddArray to tinker with its index arrays */
- int i = 0, len = 40 - strlen(str);
-
- fprintf(stderr,
- "\tnewtCheckboxTreeAddItem(tree, %*.*s\"%s\", (void *)%p, 0, ",
- len, len, " ", str, priv);
- while (indexes[i] != NEWT_ARG_LAST) {
- if (indexes[i] != NEWT_ARG_APPEND)
- fprintf(stderr, " %d,", indexes[i]);
- else
- fprintf(stderr, " %s,", "NEWT_ARG_APPEND");
- ++i;
- }
- fprintf(stderr, " %s", " NEWT_ARG_LAST);\n");
- fflush(stderr);
-#endif
- newtCheckboxTreeAddArray(tree, str, priv, 0, indexes);
-}
-
static char *callchain_list__sym_name(struct callchain_list *self,
char *bf, size_t bfsize)
{
@@ -550,144 +520,29 @@
return bf;
}
-static void __callchain__append_graph_browser(struct callchain_node *self,
- newtComponent tree, u64 total,
- int *indexes, int depth)
+static unsigned int hist_entry__annotate_browser_refresh(struct ui_browser *self)
{
- struct rb_node *node;
- u64 new_total, remaining;
- int idx = 0;
+ struct objdump_line *pos;
+ struct list_head *head = self->entries;
+ struct hist_entry *he = self->priv;
+ int row = 0;
+ int len = he->ms.sym->end - he->ms.sym->start;
- if (callchain_param.mode == CHAIN_GRAPH_REL)
- new_total = self->children_hit;
- else
- new_total = total;
+ if (self->first_visible_entry == NULL || self->first_visible_entry == self->entries)
+ self->first_visible_entry = head->next;
- remaining = new_total;
- node = rb_first(&self->rb_root);
- while (node) {
- struct callchain_node *child = rb_entry(node, struct callchain_node, rb_node);
- struct rb_node *next = rb_next(node);
- u64 cumul = cumul_hits(child);
- struct callchain_list *chain;
- int first = true, printed = 0;
- int chain_idx = -1;
- remaining -= cumul;
+ pos = list_entry(self->first_visible_entry, struct objdump_line, node);
- indexes[depth] = NEWT_ARG_APPEND;
- indexes[depth + 1] = NEWT_ARG_LAST;
-
- list_for_each_entry(chain, &child->val, list) {
- char ipstr[BITS_PER_LONG / 4 + 1],
- *alloc_str = NULL;
- const char *str = callchain_list__sym_name(chain, ipstr, sizeof(ipstr));
-
- if (first) {
- double percent = cumul * 100.0 / new_total;
-
- first = false;
- if (asprintf(&alloc_str, "%2.2f%% %s", percent, str) < 0)
- str = "Not enough memory!";
- else
- str = alloc_str;
- } else {
- indexes[depth] = idx;
- indexes[depth + 1] = NEWT_ARG_APPEND;
- indexes[depth + 2] = NEWT_ARG_LAST;
- ++chain_idx;
- }
- newt_checkbox_tree__add(tree, str, &chain->ms, indexes);
- free(alloc_str);
- ++printed;
- }
-
- indexes[depth] = idx;
- if (chain_idx != -1)
- indexes[depth + 1] = chain_idx;
- if (printed != 0)
- ++idx;
- __callchain__append_graph_browser(child, tree, new_total, indexes,
- depth + (chain_idx != -1 ? 2 : 1));
- node = next;
- }
-}
-
-static void callchain__append_graph_browser(struct callchain_node *self,
- newtComponent tree, u64 total,
- int *indexes, int parent_idx)
-{
- struct callchain_list *chain;
- int i = 0;
-
- indexes[1] = NEWT_ARG_APPEND;
- indexes[2] = NEWT_ARG_LAST;
-
- list_for_each_entry(chain, &self->val, list) {
- char ipstr[BITS_PER_LONG / 4 + 1], *str;
-
- if (chain->ip >= PERF_CONTEXT_MAX)
- continue;
-
- if (!i++ && sort__first_dimension == SORT_SYM)
- continue;
-
- str = callchain_list__sym_name(chain, ipstr, sizeof(ipstr));
- newt_checkbox_tree__add(tree, str, &chain->ms, indexes);
+ list_for_each_entry_from(pos, head, node) {
+ bool current_entry = ui_browser__is_current_entry(self, row);
+ SLsmg_gotorc(self->top + row, self->left);
+ objdump_line__show(pos, head, self->width,
+ he, len, current_entry);
+ if (++row == self->height)
+ break;
}
- indexes[1] = parent_idx;
- indexes[2] = NEWT_ARG_APPEND;
- indexes[3] = NEWT_ARG_LAST;
- __callchain__append_graph_browser(self, tree, total, indexes, 2);
-}
-
-static void hist_entry__append_callchain_browser(struct hist_entry *self,
- newtComponent tree, u64 total, int parent_idx)
-{
- struct rb_node *rb_node;
- int indexes[1024] = { [0] = parent_idx, };
- int idx = 0;
- struct callchain_node *chain;
-
- rb_node = rb_first(&self->sorted_chain);
- while (rb_node) {
- chain = rb_entry(rb_node, struct callchain_node, rb_node);
- switch (callchain_param.mode) {
- case CHAIN_FLAT:
- break;
- case CHAIN_GRAPH_ABS: /* falldown */
- case CHAIN_GRAPH_REL:
- callchain__append_graph_browser(chain, tree, total, indexes, idx++);
- break;
- case CHAIN_NONE:
- default:
- break;
- }
- rb_node = rb_next(rb_node);
- }
-}
-
-static size_t hist_entry__append_browser(struct hist_entry *self,
- newtComponent tree, u64 total)
-{
- char s[256];
- size_t ret;
-
- if (symbol_conf.exclude_other && !self->parent)
- return 0;
-
- ret = hist_entry__snprintf(self, s, sizeof(s), NULL,
- false, 0, false, total);
- if (symbol_conf.use_callchain) {
- int indexes[2];
-
- indexes[0] = NEWT_ARG_APPEND;
- indexes[1] = NEWT_ARG_LAST;
- newt_checkbox_tree__add(tree, s, &self->ms, indexes);
- } else
- newtListboxAppendEntry(tree, s, &self->ms);
-
- return ret;
+ return row;
}
int hist_entry__tui_annotate(struct hist_entry *self)
@@ -712,7 +567,9 @@
ui_helpline__push("Press <- or ESC to exit");
memset(&browser, 0, sizeof(browser));
- browser.entries = &head;
+ browser.entries = &head;
+ browser.refresh_entries = hist_entry__annotate_browser_refresh;
+ browser.seek = ui_browser__list_head_seek;
browser.priv = self;
list_for_each_entry(pos, &head, node) {
size_t line_len = strlen(pos->line);
@@ -722,7 +579,9 @@
}
browser.width += 18; /* Percentage */
- ret = ui_browser__run(&browser, self->ms.sym->name, &es);
+ ui_browser__show(&browser, self->ms.sym->name);
+ newtFormAddHotKey(browser.form, ' ');
+ ret = ui_browser__run(&browser, &es);
newtFormDestroy(browser.form);
newtPopWindow();
list_for_each_entry_safe(pos, n, &head, node) {
@@ -733,157 +592,48 @@
return ret;
}
-static const void *newt__symbol_tree_get_current(newtComponent self)
-{
- if (symbol_conf.use_callchain)
- return newtCheckboxTreeGetCurrent(self);
- return newtListboxGetCurrent(self);
-}
-
-static void hist_browser__selection(newtComponent self, void *data)
-{
- const struct map_symbol **symbol_ptr = data;
- *symbol_ptr = newt__symbol_tree_get_current(self);
-}
-
struct hist_browser {
- newtComponent form, tree;
- const struct map_symbol *selection;
+ struct ui_browser b;
+ struct hists *hists;
+ struct hist_entry *he_selection;
+ struct map_symbol *selection;
};
-static struct hist_browser *hist_browser__new(void)
-{
- struct hist_browser *self = malloc(sizeof(*self));
+static void hist_browser__reset(struct hist_browser *self);
+static int hist_browser__run(struct hist_browser *self, const char *title,
+ struct newtExitStruct *es);
+static unsigned int hist_browser__refresh_entries(struct ui_browser *self);
+static void ui_browser__hists_seek(struct ui_browser *self,
+ off_t offset, int whence);
- if (self != NULL)
- self->form = NULL;
+static struct hist_browser *hist_browser__new(struct hists *hists)
+{
+ struct hist_browser *self = zalloc(sizeof(*self));
+
+ if (self) {
+ self->hists = hists;
+ self->b.refresh_entries = hist_browser__refresh_entries;
+ self->b.seek = ui_browser__hists_seek;
+ }
return self;
}
static void hist_browser__delete(struct hist_browser *self)
{
- newtFormDestroy(self->form);
+ newtFormDestroy(self->b.form);
newtPopWindow();
free(self);
}
-static int hist_browser__populate(struct hist_browser *self, struct hists *hists,
- const char *title)
-{
- int max_len = 0, idx, cols, rows;
- struct ui_progress *progress;
- struct rb_node *nd;
- u64 curr_hist = 0;
- char seq[] = ".", unit;
- char str[256];
- unsigned long nr_events = hists->stats.nr_events[PERF_RECORD_SAMPLE];
-
- if (self->form) {
- newtFormDestroy(self->form);
- newtPopWindow();
- }
-
- nr_events = convert_unit(nr_events, &unit);
- snprintf(str, sizeof(str), "Events: %lu%c ",
- nr_events, unit);
- newtDrawRootText(0, 0, str);
-
- newtGetScreenSize(NULL, &rows);
-
- if (symbol_conf.use_callchain)
- self->tree = newtCheckboxTreeMulti(0, 0, rows - 5, seq,
- NEWT_FLAG_SCROLL);
- else
- self->tree = newtListbox(0, 0, rows - 5,
- (NEWT_FLAG_SCROLL |
- NEWT_FLAG_RETURNEXIT));
-
- newtComponentAddCallback(self->tree, hist_browser__selection,
- &self->selection);
-
- progress = ui_progress__new("Adding entries to the browser...",
- hists->nr_entries);
- if (progress == NULL)
- return -1;
-
- idx = 0;
- for (nd = rb_first(&hists->entries); nd; nd = rb_next(nd)) {
- struct hist_entry *h = rb_entry(nd, struct hist_entry, rb_node);
- int len;
-
- if (h->filtered)
- continue;
-
- len = hist_entry__append_browser(h, self->tree, hists->stats.total_period);
- if (len > max_len)
- max_len = len;
- if (symbol_conf.use_callchain)
- hist_entry__append_callchain_browser(h, self->tree,
- hists->stats.total_period, idx++);
- ++curr_hist;
- if (curr_hist % 5)
- ui_progress__update(progress, curr_hist);
- }
-
- ui_progress__delete(progress);
-
- newtGetScreenSize(&cols, &rows);
-
- if (max_len > cols)
- max_len = cols - 3;
-
- if (!symbol_conf.use_callchain)
- newtListboxSetWidth(self->tree, max_len);
-
- newtCenteredWindow(max_len + (symbol_conf.use_callchain ? 5 : 0),
- rows - 5, title);
- self->form = newt_form__new();
- if (self->form == NULL)
- return -1;
-
- newtFormAddHotKey(self->form, 'A');
- newtFormAddHotKey(self->form, 'a');
- newtFormAddHotKey(self->form, 'D');
- newtFormAddHotKey(self->form, 'd');
- newtFormAddHotKey(self->form, 'T');
- newtFormAddHotKey(self->form, 't');
- newtFormAddHotKey(self->form, '?');
- newtFormAddHotKey(self->form, 'H');
- newtFormAddHotKey(self->form, 'h');
- newtFormAddHotKey(self->form, NEWT_KEY_F1);
- newtFormAddHotKey(self->form, NEWT_KEY_RIGHT);
- newtFormAddHotKey(self->form, NEWT_KEY_TAB);
- newtFormAddHotKey(self->form, NEWT_KEY_UNTAB);
- newtFormAddComponents(self->form, self->tree, NULL);
- self->selection = newt__symbol_tree_get_current(self->tree);
-
- return 0;
-}
-
static struct hist_entry *hist_browser__selected_entry(struct hist_browser *self)
{
- int *indexes;
-
- if (!symbol_conf.use_callchain)
- goto out;
-
- indexes = newtCheckboxTreeFindItem(self->tree, (void *)self->selection);
- if (indexes) {
- bool is_hist_entry = indexes[1] == NEWT_ARG_LAST;
- free(indexes);
- if (is_hist_entry)
- goto out;
- }
- return NULL;
-out:
- return container_of(self->selection, struct hist_entry, ms);
+ return self->he_selection;
}
static struct thread *hist_browser__selected_thread(struct hist_browser *self)
{
- struct hist_entry *he = hist_browser__selected_entry(self);
- return he ? he->thread : NULL;
+ return self->he_selection->thread;
}
static int hist_browser__title(char *bf, size_t size, const char *ev_name,
@@ -905,7 +655,7 @@
int hists__browse(struct hists *self, const char *helpline, const char *ev_name)
{
- struct hist_browser *browser = hist_browser__new();
+ struct hist_browser *browser = hist_browser__new(self);
struct pstack *fstack;
const struct thread *thread_filter = NULL;
const struct dso *dso_filter = NULL;
@@ -924,8 +674,6 @@
hist_browser__title(msg, sizeof(msg), ev_name,
dso_filter, thread_filter);
- if (hist_browser__populate(browser, self, msg) < 0)
- goto out_free_stack;
while (1) {
const struct thread *thread;
@@ -934,7 +682,8 @@
int nr_options = 0, choice = 0, i,
annotate = -2, zoom_dso = -2, zoom_thread = -2;
- newtFormRun(browser->form, &es);
+ if (hist_browser__run(browser, msg, &es))
+ break;
thread = hist_browser__selected_thread(browser);
dso = browser->selection->map ? browser->selection->map->dso : NULL;
@@ -1069,8 +818,7 @@
hists__filter_by_dso(self, dso_filter);
hist_browser__title(msg, sizeof(msg), ev_name,
dso_filter, thread_filter);
- if (hist_browser__populate(browser, self, msg) < 0)
- goto out;
+ hist_browser__reset(browser);
} else if (choice == zoom_thread) {
zoom_thread:
if (thread_filter) {
@@ -1088,8 +836,7 @@
hists__filter_by_thread(self, thread_filter);
hist_browser__title(msg, sizeof(msg), ev_name,
dso_filter, thread_filter);
- if (hist_browser__populate(browser, self, msg) < 0)
- goto out;
+ hist_browser__reset(browser);
}
}
out_free_stack:
@@ -1145,6 +892,13 @@
"blue", "lightgray",
};
+static void newt_suspend(void *d __used)
+{
+ newtSuspend();
+ raise(SIGTSTP);
+ newtResume();
+}
+
void setup_browser(void)
{
struct newtPercentTreeColors *c = &defaultPercentTreeColors;
@@ -1158,6 +912,7 @@
use_browser = 1;
newtInit();
newtCls();
+ newtSetSuspendCallback(newt_suspend, NULL);
ui_helpline__puts(" ");
sltt_set_color(HE_COLORSET_TOP, NULL, c->topColorFg, c->topColorBg);
sltt_set_color(HE_COLORSET_MEDIUM, NULL, c->mediumColorFg, c->mediumColorBg);
@@ -1176,3 +931,638 @@
newtFinished();
}
}
+
+static void hist_browser__refresh_dimensions(struct hist_browser *self)
+{
+ /* 3 == +/- toggle symbol before actual hist_entry rendering */
+ self->b.width = 3 + (hists__sort_list_width(self->hists) +
+ sizeof("[k]"));
+}
+
+static void hist_browser__reset(struct hist_browser *self)
+{
+ self->b.nr_entries = self->hists->nr_entries;
+ hist_browser__refresh_dimensions(self);
+ ui_browser__reset_index(&self->b);
+}
+
+static char tree__folded_sign(bool unfolded)
+{
+ return unfolded ? '-' : '+';
+}
+
+static char map_symbol__folded(const struct map_symbol *self)
+{
+ return self->has_children ? tree__folded_sign(self->unfolded) : ' ';
+}
+
+static char hist_entry__folded(const struct hist_entry *self)
+{
+ return map_symbol__folded(&self->ms);
+}
+
+static char callchain_list__folded(const struct callchain_list *self)
+{
+ return map_symbol__folded(&self->ms);
+}
+
+static bool map_symbol__toggle_fold(struct map_symbol *self)
+{
+ if (!self->has_children)
+ return false;
+
+ self->unfolded = !self->unfolded;
+ return true;
+}
+
+#define LEVEL_OFFSET_STEP 3
+
+static int hist_browser__show_callchain_node_rb_tree(struct hist_browser *self,
+ struct callchain_node *chain_node,
+ u64 total, int level,
+ unsigned short row,
+ off_t *row_offset,
+ bool *is_current_entry)
+{
+ struct rb_node *node;
+ int first_row = row, width, offset = level * LEVEL_OFFSET_STEP;
+ u64 new_total, remaining;
+
+ if (callchain_param.mode == CHAIN_GRAPH_REL)
+ new_total = chain_node->children_hit;
+ else
+ new_total = total;
+
+ remaining = new_total;
+ node = rb_first(&chain_node->rb_root);
+ while (node) {
+ struct callchain_node *child = rb_entry(node, struct callchain_node, rb_node);
+ struct rb_node *next = rb_next(node);
+ u64 cumul = cumul_hits(child);
+ struct callchain_list *chain;
+ char folded_sign = ' ';
+ int first = true;
+ int extra_offset = 0;
+
+ remaining -= cumul;
+
+ list_for_each_entry(chain, &child->val, list) {
+ char ipstr[BITS_PER_LONG / 4 + 1], *alloc_str;
+ const char *str;
+ int color;
+ bool was_first = first;
+
+ if (first) {
+ first = false;
+ chain->ms.has_children = chain->list.next != &child->val ||
+ rb_first(&child->rb_root) != NULL;
+ } else {
+ extra_offset = LEVEL_OFFSET_STEP;
+ chain->ms.has_children = chain->list.next == &child->val &&
+ rb_first(&child->rb_root) != NULL;
+ }
+
+ folded_sign = callchain_list__folded(chain);
+ if (*row_offset != 0) {
+ --*row_offset;
+ goto do_next;
+ }
+
+ alloc_str = NULL;
+ str = callchain_list__sym_name(chain, ipstr, sizeof(ipstr));
+ if (was_first) {
+ double percent = cumul * 100.0 / new_total;
+
+ if (asprintf(&alloc_str, "%2.2f%% %s", percent, str) < 0)
+ str = "Not enough memory!";
+ else
+ str = alloc_str;
+ }
+
+ color = HE_COLORSET_NORMAL;
+ width = self->b.width - (offset + extra_offset + 2);
+ if (ui_browser__is_current_entry(&self->b, row)) {
+ self->selection = &chain->ms;
+ color = HE_COLORSET_SELECTED;
+ *is_current_entry = true;
+ }
+
+ SLsmg_set_color(color);
+ SLsmg_gotorc(self->b.top + row, self->b.left);
+ slsmg_write_nstring(" ", offset + extra_offset);
+ slsmg_printf("%c ", folded_sign);
+ slsmg_write_nstring(str, width);
+ free(alloc_str);
+
+ if (++row == self->b.height)
+ goto out;
+do_next:
+ if (folded_sign == '+')
+ break;
+ }
+
+ if (folded_sign == '-') {
+ const int new_level = level + (extra_offset ? 2 : 1);
+ row += hist_browser__show_callchain_node_rb_tree(self, child, new_total,
+ new_level, row, row_offset,
+ is_current_entry);
+ }
+ if (row == self->b.height)
+ goto out;
+ node = next;
+ }
+out:
+ return row - first_row;
+}
+
+static int hist_browser__show_callchain_node(struct hist_browser *self,
+ struct callchain_node *node,
+ int level, unsigned short row,
+ off_t *row_offset,
+ bool *is_current_entry)
+{
+ struct callchain_list *chain;
+ int first_row = row,
+ offset = level * LEVEL_OFFSET_STEP,
+ width = self->b.width - offset;
+ char folded_sign = ' ';
+
+ list_for_each_entry(chain, &node->val, list) {
+ char ipstr[BITS_PER_LONG / 4 + 1], *s;
+ int color;
+ /*
+ * FIXME: This should be moved to somewhere else,
+ * probably when the callchain is created, so as not to
+ * traverse it all over again
+ */
+ chain->ms.has_children = rb_first(&node->rb_root) != NULL;
+ folded_sign = callchain_list__folded(chain);
+
+ if (*row_offset != 0) {
+ --*row_offset;
+ continue;
+ }
+
+ color = HE_COLORSET_NORMAL;
+ if (ui_browser__is_current_entry(&self->b, row)) {
+ self->selection = &chain->ms;
+ color = HE_COLORSET_SELECTED;
+ *is_current_entry = true;
+ }
+
+ s = callchain_list__sym_name(chain, ipstr, sizeof(ipstr));
+ SLsmg_gotorc(self->b.top + row, self->b.left);
+ SLsmg_set_color(color);
+ slsmg_write_nstring(" ", offset);
+ slsmg_printf("%c ", folded_sign);
+ slsmg_write_nstring(s, width - 2);
+
+ if (++row == self->b.height)
+ goto out;
+ }
+
+ if (folded_sign == '-')
+ row += hist_browser__show_callchain_node_rb_tree(self, node,
+ self->hists->stats.total_period,
+ level + 1, row,
+ row_offset,
+ is_current_entry);
+out:
+ return row - first_row;
+}
+
+static int hist_browser__show_callchain(struct hist_browser *self,
+ struct rb_root *chain,
+ int level, unsigned short row,
+ off_t *row_offset,
+ bool *is_current_entry)
+{
+ struct rb_node *nd;
+ int first_row = row;
+
+ for (nd = rb_first(chain); nd; nd = rb_next(nd)) {
+ struct callchain_node *node = rb_entry(nd, struct callchain_node, rb_node);
+
+ row += hist_browser__show_callchain_node(self, node, level,
+ row, row_offset,
+ is_current_entry);
+ if (row == self->b.height)
+ break;
+ }
+
+ return row - first_row;
+}
+
+static int hist_browser__show_entry(struct hist_browser *self,
+ struct hist_entry *entry,
+ unsigned short row)
+{
+ char s[256];
+ double percent;
+ int printed = 0;
+ int color, width = self->b.width;
+ char folded_sign = ' ';
+ bool current_entry = ui_browser__is_current_entry(&self->b, row);
+ off_t row_offset = entry->row_offset;
+
+ if (current_entry) {
+ self->he_selection = entry;
+ self->selection = &entry->ms;
+ }
+
+ if (symbol_conf.use_callchain) {
+ entry->ms.has_children = !RB_EMPTY_ROOT(&entry->sorted_chain);
+ folded_sign = hist_entry__folded(entry);
+ }
+
+ if (row_offset == 0) {
+ hist_entry__snprintf(entry, s, sizeof(s), self->hists, NULL, false,
+ 0, false, self->hists->stats.total_period);
+ percent = (entry->period * 100.0) / self->hists->stats.total_period;
+
+ color = HE_COLORSET_SELECTED;
+ if (!current_entry) {
+ if (percent >= MIN_RED)
+ color = HE_COLORSET_TOP;
+ else if (percent >= MIN_GREEN)
+ color = HE_COLORSET_MEDIUM;
+ else
+ color = HE_COLORSET_NORMAL;
+ }
+
+ SLsmg_set_color(color);
+ SLsmg_gotorc(self->b.top + row, self->b.left);
+ if (symbol_conf.use_callchain) {
+ slsmg_printf("%c ", folded_sign);
+ width -= 2;
+ }
+ slsmg_write_nstring(s, width);
+ ++row;
+ ++printed;
+ } else
+ --row_offset;
+
+ if (folded_sign == '-' && row != self->b.height) {
+ printed += hist_browser__show_callchain(self, &entry->sorted_chain,
+ 1, row, &row_offset,
+ ¤t_entry);
+ if (current_entry)
+ self->he_selection = entry;
+ }
+
+ return printed;
+}
+
+static unsigned int hist_browser__refresh_entries(struct ui_browser *self)
+{
+ unsigned row = 0;
+ struct rb_node *nd;
+ struct hist_browser *hb = container_of(self, struct hist_browser, b);
+
+ if (self->first_visible_entry == NULL)
+ self->first_visible_entry = rb_first(&hb->hists->entries);
+
+ for (nd = self->first_visible_entry; nd; nd = rb_next(nd)) {
+ struct hist_entry *h = rb_entry(nd, struct hist_entry, rb_node);
+
+ if (h->filtered)
+ continue;
+
+ row += hist_browser__show_entry(hb, h, row);
+ if (row == self->height)
+ break;
+ }
+
+ return row;
+}
+
+static void callchain_node__init_have_children_rb_tree(struct callchain_node *self)
+{
+ struct rb_node *nd = rb_first(&self->rb_root);
+
+ for (nd = rb_first(&self->rb_root); nd; nd = rb_next(nd)) {
+ struct callchain_node *child = rb_entry(nd, struct callchain_node, rb_node);
+ struct callchain_list *chain;
+ int first = true;
+
+ list_for_each_entry(chain, &child->val, list) {
+ if (first) {
+ first = false;
+ chain->ms.has_children = chain->list.next != &child->val ||
+ rb_first(&child->rb_root) != NULL;
+ } else
+ chain->ms.has_children = chain->list.next == &child->val &&
+ rb_first(&child->rb_root) != NULL;
+ }
+
+ callchain_node__init_have_children_rb_tree(child);
+ }
+}
+
+static void callchain_node__init_have_children(struct callchain_node *self)
+{
+ struct callchain_list *chain;
+
+ list_for_each_entry(chain, &self->val, list)
+ chain->ms.has_children = rb_first(&self->rb_root) != NULL;
+
+ callchain_node__init_have_children_rb_tree(self);
+}
+
+static void callchain__init_have_children(struct rb_root *self)
+{
+ struct rb_node *nd;
+
+ for (nd = rb_first(self); nd; nd = rb_next(nd)) {
+ struct callchain_node *node = rb_entry(nd, struct callchain_node, rb_node);
+ callchain_node__init_have_children(node);
+ }
+}
+
+static void hist_entry__init_have_children(struct hist_entry *self)
+{
+ if (!self->init_have_children) {
+ callchain__init_have_children(&self->sorted_chain);
+ self->init_have_children = true;
+ }
+}
+
+static struct rb_node *hists__filter_entries(struct rb_node *nd)
+{
+ while (nd != NULL) {
+ struct hist_entry *h = rb_entry(nd, struct hist_entry, rb_node);
+ if (!h->filtered)
+ return nd;
+
+ nd = rb_next(nd);
+ }
+
+ return NULL;
+}
+
+static struct rb_node *hists__filter_prev_entries(struct rb_node *nd)
+{
+ while (nd != NULL) {
+ struct hist_entry *h = rb_entry(nd, struct hist_entry, rb_node);
+ if (!h->filtered)
+ return nd;
+
+ nd = rb_prev(nd);
+ }
+
+ return NULL;
+}
+
+static void ui_browser__hists_seek(struct ui_browser *self,
+ off_t offset, int whence)
+{
+ struct hist_entry *h;
+ struct rb_node *nd;
+ bool first = true;
+
+ switch (whence) {
+ case SEEK_SET:
+ nd = hists__filter_entries(rb_first(self->entries));
+ break;
+ case SEEK_CUR:
+ nd = self->first_visible_entry;
+ goto do_offset;
+ case SEEK_END:
+ nd = hists__filter_prev_entries(rb_last(self->entries));
+ first = false;
+ break;
+ default:
+ return;
+ }
+
+ /*
+ * Moves not relative to the first visible entry invalidates its
+ * row_offset:
+ */
+ h = rb_entry(self->first_visible_entry, struct hist_entry, rb_node);
+ h->row_offset = 0;
+
+ /*
+ * Here we have to check if nd is expanded (+), if it is we can't go
+ * the next top level hist_entry, instead we must compute an offset of
+ * what _not_ to show and not change the first visible entry.
+ *
+ * This offset increments when we are going from top to bottom and
+ * decreases when we're going from bottom to top.
+ *
+ * As we don't have backpointers to the top level in the callchains
+ * structure, we need to always print the whole hist_entry callchain,
+ * skipping the first ones that are before the first visible entry
+ * and stop when we printed enough lines to fill the screen.
+ */
+do_offset:
+ if (offset > 0) {
+ do {
+ h = rb_entry(nd, struct hist_entry, rb_node);
+ if (h->ms.unfolded) {
+ u16 remaining = h->nr_rows - h->row_offset;
+ if (offset > remaining) {
+ offset -= remaining;
+ h->row_offset = 0;
+ } else {
+ h->row_offset += offset;
+ offset = 0;
+ self->first_visible_entry = nd;
+ break;
+ }
+ }
+ nd = hists__filter_entries(rb_next(nd));
+ if (nd == NULL)
+ break;
+ --offset;
+ self->first_visible_entry = nd;
+ } while (offset != 0);
+ } else if (offset < 0) {
+ while (1) {
+ h = rb_entry(nd, struct hist_entry, rb_node);
+ if (h->ms.unfolded) {
+ if (first) {
+ if (-offset > h->row_offset) {
+ offset += h->row_offset;
+ h->row_offset = 0;
+ } else {
+ h->row_offset += offset;
+ offset = 0;
+ self->first_visible_entry = nd;
+ break;
+ }
+ } else {
+ if (-offset > h->nr_rows) {
+ offset += h->nr_rows;
+ h->row_offset = 0;
+ } else {
+ h->row_offset = h->nr_rows + offset;
+ offset = 0;
+ self->first_visible_entry = nd;
+ break;
+ }
+ }
+ }
+
+ nd = hists__filter_prev_entries(rb_prev(nd));
+ if (nd == NULL)
+ break;
+ ++offset;
+ self->first_visible_entry = nd;
+ if (offset == 0) {
+ /*
+ * Last unfiltered hist_entry, check if it is
+ * unfolded, if it is then we should have
+ * row_offset at its last entry.
+ */
+ h = rb_entry(nd, struct hist_entry, rb_node);
+ if (h->ms.unfolded)
+ h->row_offset = h->nr_rows;
+ break;
+ }
+ first = false;
+ }
+ } else {
+ self->first_visible_entry = nd;
+ h = rb_entry(nd, struct hist_entry, rb_node);
+ h->row_offset = 0;
+ }
+}
+
+static int callchain_node__count_rows_rb_tree(struct callchain_node *self)
+{
+ int n = 0;
+ struct rb_node *nd;
+
+ for (nd = rb_first(&self->rb_root); nd; nd = rb_next(nd)) {
+ struct callchain_node *child = rb_entry(nd, struct callchain_node, rb_node);
+ struct callchain_list *chain;
+ char folded_sign = ' '; /* No children */
+
+ list_for_each_entry(chain, &child->val, list) {
+ ++n;
+ /* We need this because we may not have children */
+ folded_sign = callchain_list__folded(chain);
+ if (folded_sign == '+')
+ break;
+ }
+
+ if (folded_sign == '-') /* Have children and they're unfolded */
+ n += callchain_node__count_rows_rb_tree(child);
+ }
+
+ return n;
+}
+
+static int callchain_node__count_rows(struct callchain_node *node)
+{
+ struct callchain_list *chain;
+ bool unfolded = false;
+ int n = 0;
+
+ list_for_each_entry(chain, &node->val, list) {
+ ++n;
+ unfolded = chain->ms.unfolded;
+ }
+
+ if (unfolded)
+ n += callchain_node__count_rows_rb_tree(node);
+
+ return n;
+}
+
+static int callchain__count_rows(struct rb_root *chain)
+{
+ struct rb_node *nd;
+ int n = 0;
+
+ for (nd = rb_first(chain); nd; nd = rb_next(nd)) {
+ struct callchain_node *node = rb_entry(nd, struct callchain_node, rb_node);
+ n += callchain_node__count_rows(node);
+ }
+
+ return n;
+}
+
+static bool hist_browser__toggle_fold(struct hist_browser *self)
+{
+ if (map_symbol__toggle_fold(self->selection)) {
+ struct hist_entry *he = self->he_selection;
+
+ hist_entry__init_have_children(he);
+ self->hists->nr_entries -= he->nr_rows;
+
+ if (he->ms.unfolded)
+ he->nr_rows = callchain__count_rows(&he->sorted_chain);
+ else
+ he->nr_rows = 0;
+ self->hists->nr_entries += he->nr_rows;
+ self->b.nr_entries = self->hists->nr_entries;
+
+ return true;
+ }
+
+ /* If it doesn't have children, no toggling performed */
+ return false;
+}
+
+static int hist_browser__run(struct hist_browser *self, const char *title,
+ struct newtExitStruct *es)
+{
+ char str[256], unit;
+ unsigned long nr_events = self->hists->stats.nr_events[PERF_RECORD_SAMPLE];
+
+ self->b.entries = &self->hists->entries;
+ self->b.nr_entries = self->hists->nr_entries;
+
+ hist_browser__refresh_dimensions(self);
+
+ nr_events = convert_unit(nr_events, &unit);
+ snprintf(str, sizeof(str), "Events: %lu%c ",
+ nr_events, unit);
+ newtDrawRootText(0, 0, str);
+
+ if (ui_browser__show(&self->b, title) < 0)
+ return -1;
+
+ newtFormAddHotKey(self->b.form, 'A');
+ newtFormAddHotKey(self->b.form, 'a');
+ newtFormAddHotKey(self->b.form, '?');
+ newtFormAddHotKey(self->b.form, 'h');
+ newtFormAddHotKey(self->b.form, 'H');
+ newtFormAddHotKey(self->b.form, 'd');
+
+ newtFormAddHotKey(self->b.form, NEWT_KEY_LEFT);
+ newtFormAddHotKey(self->b.form, NEWT_KEY_RIGHT);
+ newtFormAddHotKey(self->b.form, NEWT_KEY_ENTER);
+
+ while (1) {
+ ui_browser__run(&self->b, es);
+
+ if (es->reason != NEWT_EXIT_HOTKEY)
+ break;
+ switch (es->u.key) {
+ case 'd': { /* Debug */
+ static int seq;
+ struct hist_entry *h = rb_entry(self->b.first_visible_entry,
+ struct hist_entry, rb_node);
+ ui_helpline__pop();
+ ui_helpline__fpush("%d: nr_ent=(%d,%d), height=%d, idx=%d, fve: idx=%d, row_off=%d, nrows=%d",
+ seq++, self->b.nr_entries,
+ self->hists->nr_entries,
+ self->b.height,
+ self->b.index,
+ self->b.first_visible_entry_idx,
+ h->row_offset, h->nr_rows);
+ }
+ continue;
+ case NEWT_KEY_ENTER:
+ if (hist_browser__toggle_fold(self))
+ break;
+ /* fall thru */
+ default:
+ return 0;
+ }
+ }
+ return 0;
+}
diff --git a/tools/perf/util/parse-events.c b/tools/perf/util/parse-events.c
index 9bf0f40..4af5bd5 100644
--- a/tools/perf/util/parse-events.c
+++ b/tools/perf/util/parse-events.c
@@ -602,8 +602,15 @@
return EVT_FAILED;
}
- /* We should find a nice way to override the access type */
- attr->bp_len = HW_BREAKPOINT_LEN_4;
+ /*
+ * We should find a nice way to override the access length
+ * Provide some defaults for now
+ */
+ if (attr->bp_type == HW_BREAKPOINT_X)
+ attr->bp_len = sizeof(long);
+ else
+ attr->bp_len = HW_BREAKPOINT_LEN_4;
+
attr->type = PERF_TYPE_BREAKPOINT;
return EVT_HANDLED;
diff --git a/tools/perf/util/probe-event.c b/tools/perf/util/probe-event.c
index 914c670..2e665cb 100644
--- a/tools/perf/util/probe-event.c
+++ b/tools/perf/util/probe-event.c
@@ -1,5 +1,5 @@
/*
- * probe-event.c : perf-probe definition to kprobe_events format converter
+ * probe-event.c : perf-probe definition to probe_events format converter
*
* Written by Masami Hiramatsu <mhiramat@redhat.com>
*
@@ -120,8 +120,11 @@
return open(machine.vmlinux_maps[MAP__FUNCTION]->dso->long_name, O_RDONLY);
}
-/* Convert trace point to probe point with debuginfo */
-static int convert_to_perf_probe_point(struct kprobe_trace_point *tp,
+/*
+ * Convert trace point to probe point with debuginfo
+ * Currently only handles kprobes.
+ */
+static int kprobe_convert_to_perf_probe(struct probe_trace_point *tp,
struct perf_probe_point *pp)
{
struct symbol *sym;
@@ -151,8 +154,8 @@
}
/* Try to find perf_probe_event with debuginfo */
-static int try_to_find_kprobe_trace_events(struct perf_probe_event *pev,
- struct kprobe_trace_event **tevs,
+static int try_to_find_probe_trace_events(struct perf_probe_event *pev,
+ struct probe_trace_event **tevs,
int max_tevs)
{
bool need_dwarf = perf_probe_event_need_dwarf(pev);
@@ -169,11 +172,11 @@
}
/* Searching trace events corresponding to probe event */
- ntevs = find_kprobe_trace_events(fd, pev, tevs, max_tevs);
+ ntevs = find_probe_trace_events(fd, pev, tevs, max_tevs);
close(fd);
if (ntevs > 0) { /* Succeeded to find trace events */
- pr_debug("find %d kprobe_trace_events.\n", ntevs);
+ pr_debug("find %d probe_trace_events.\n", ntevs);
return ntevs;
}
@@ -195,6 +198,65 @@
return ntevs;
}
+/*
+ * Find a src file from a DWARF tag path. Prepend optional source path prefix
+ * and chop off leading directories that do not exist. Result is passed back as
+ * a newly allocated path on success.
+ * Return 0 if file was found and readable, -errno otherwise.
+ */
+static int get_real_path(const char *raw_path, const char *comp_dir,
+ char **new_path)
+{
+ const char *prefix = symbol_conf.source_prefix;
+
+ if (!prefix) {
+ if (raw_path[0] != '/' && comp_dir)
+ /* If not an absolute path, try to use comp_dir */
+ prefix = comp_dir;
+ else {
+ if (access(raw_path, R_OK) == 0) {
+ *new_path = strdup(raw_path);
+ return 0;
+ } else
+ return -errno;
+ }
+ }
+
+ *new_path = malloc((strlen(prefix) + strlen(raw_path) + 2));
+ if (!*new_path)
+ return -ENOMEM;
+
+ for (;;) {
+ sprintf(*new_path, "%s/%s", prefix, raw_path);
+
+ if (access(*new_path, R_OK) == 0)
+ return 0;
+
+ if (!symbol_conf.source_prefix)
+ /* In case of searching comp_dir, don't retry */
+ return -errno;
+
+ switch (errno) {
+ case ENAMETOOLONG:
+ case ENOENT:
+ case EROFS:
+ case EFAULT:
+ raw_path = strchr(++raw_path, '/');
+ if (!raw_path) {
+ free(*new_path);
+ *new_path = NULL;
+ return -ENOENT;
+ }
+ continue;
+
+ default:
+ free(*new_path);
+ *new_path = NULL;
+ return -errno;
+ }
+ }
+}
+
#define LINEBUF_SIZE 256
#define NR_ADDITIONAL_LINES 2
@@ -244,6 +306,7 @@
struct line_node *ln;
FILE *fp;
int fd, ret;
+ char *tmp;
/* Search a line range */
ret = init_vmlinux();
@@ -266,6 +329,15 @@
return ret;
}
+ /* Convert source file path */
+ tmp = lr->path;
+ ret = get_real_path(tmp, lr->comp_dir, &lr->path);
+ free(tmp); /* Free old path */
+ if (ret < 0) {
+ pr_warning("Failed to find source file. (%d)\n", ret);
+ return ret;
+ }
+
setup_pager();
if (lr->function)
@@ -308,8 +380,8 @@
#else /* !DWARF_SUPPORT */
-static int convert_to_perf_probe_point(struct kprobe_trace_point *tp,
- struct perf_probe_point *pp)
+static int kprobe_convert_to_perf_probe(struct probe_trace_point *tp,
+ struct perf_probe_point *pp)
{
pp->function = strdup(tp->symbol);
if (pp->function == NULL)
@@ -320,8 +392,8 @@
return 0;
}
-static int try_to_find_kprobe_trace_events(struct perf_probe_event *pev,
- struct kprobe_trace_event **tevs __unused,
+static int try_to_find_probe_trace_events(struct perf_probe_event *pev,
+ struct probe_trace_event **tevs __unused,
int max_tevs __unused)
{
if (perf_probe_event_need_dwarf(pev)) {
@@ -557,7 +629,7 @@
/* Parse perf-probe event argument */
static int parse_perf_probe_arg(char *str, struct perf_probe_arg *arg)
{
- char *tmp;
+ char *tmp, *goodname;
struct perf_probe_arg_field **fieldp;
pr_debug("parsing arg: %s into ", str);
@@ -580,7 +652,7 @@
pr_debug("type:%s ", arg->type);
}
- tmp = strpbrk(str, "-.");
+ tmp = strpbrk(str, "-.[");
if (!is_c_varname(str) || !tmp) {
/* A variable, register, symbol or special value */
arg->var = strdup(str);
@@ -590,10 +662,11 @@
return 0;
}
- /* Structure fields */
+ /* Structure fields or array element */
arg->var = strndup(str, tmp - str);
if (arg->var == NULL)
return -ENOMEM;
+ goodname = arg->var;
pr_debug("%s, ", arg->var);
fieldp = &arg->field;
@@ -601,22 +674,38 @@
*fieldp = zalloc(sizeof(struct perf_probe_arg_field));
if (*fieldp == NULL)
return -ENOMEM;
- if (*tmp == '.') {
- str = tmp + 1;
- (*fieldp)->ref = false;
- } else if (tmp[1] == '>') {
- str = tmp + 2;
+ if (*tmp == '[') { /* Array */
+ str = tmp;
+ (*fieldp)->index = strtol(str + 1, &tmp, 0);
(*fieldp)->ref = true;
- } else {
- semantic_error("Argument parse error: %s\n", str);
- return -EINVAL;
+ if (*tmp != ']' || tmp == str + 1) {
+ semantic_error("Array index must be a"
+ " number.\n");
+ return -EINVAL;
+ }
+ tmp++;
+ if (*tmp == '\0')
+ tmp = NULL;
+ } else { /* Structure */
+ if (*tmp == '.') {
+ str = tmp + 1;
+ (*fieldp)->ref = false;
+ } else if (tmp[1] == '>') {
+ str = tmp + 2;
+ (*fieldp)->ref = true;
+ } else {
+ semantic_error("Argument parse error: %s\n",
+ str);
+ return -EINVAL;
+ }
+ tmp = strpbrk(str, "-.[");
}
-
- tmp = strpbrk(str, "-.");
if (tmp) {
(*fieldp)->name = strndup(str, tmp - str);
if ((*fieldp)->name == NULL)
return -ENOMEM;
+ if (*str != '[')
+ goodname = (*fieldp)->name;
pr_debug("%s(%d), ", (*fieldp)->name, (*fieldp)->ref);
fieldp = &(*fieldp)->next;
}
@@ -624,11 +713,13 @@
(*fieldp)->name = strdup(str);
if ((*fieldp)->name == NULL)
return -ENOMEM;
+ if (*str != '[')
+ goodname = (*fieldp)->name;
pr_debug("%s(%d)\n", (*fieldp)->name, (*fieldp)->ref);
- /* If no name is specified, set the last field name */
+ /* If no name is specified, set the last field name (not array index)*/
if (!arg->name) {
- arg->name = strdup((*fieldp)->name);
+ arg->name = strdup(goodname);
if (arg->name == NULL)
return -ENOMEM;
}
@@ -693,16 +784,17 @@
return false;
}
-/* Parse kprobe_events event into struct probe_point */
-int parse_kprobe_trace_command(const char *cmd, struct kprobe_trace_event *tev)
+/* Parse probe_events event into struct probe_point */
+static int parse_probe_trace_command(const char *cmd,
+ struct probe_trace_event *tev)
{
- struct kprobe_trace_point *tp = &tev->point;
+ struct probe_trace_point *tp = &tev->point;
char pr;
char *p;
int ret, i, argc;
char **argv;
- pr_debug("Parsing kprobe_events: %s\n", cmd);
+ pr_debug("Parsing probe_events: %s\n", cmd);
argv = argv_split(cmd, &argc);
if (!argv) {
pr_debug("Failed to split arguments.\n");
@@ -734,7 +826,7 @@
tp->offset = 0;
tev->nargs = argc - 2;
- tev->args = zalloc(sizeof(struct kprobe_trace_arg) * tev->nargs);
+ tev->args = zalloc(sizeof(struct probe_trace_arg) * tev->nargs);
if (tev->args == NULL) {
ret = -ENOMEM;
goto out;
@@ -776,8 +868,11 @@
len -= ret;
while (field) {
- ret = e_snprintf(tmp, len, "%s%s", field->ref ? "->" : ".",
- field->name);
+ if (field->name[0] == '[')
+ ret = e_snprintf(tmp, len, "%s", field->name);
+ else
+ ret = e_snprintf(tmp, len, "%s%s",
+ field->ref ? "->" : ".", field->name);
if (ret <= 0)
goto error;
tmp += ret;
@@ -877,13 +972,13 @@
}
#endif
-static int __synthesize_kprobe_trace_arg_ref(struct kprobe_trace_arg_ref *ref,
+static int __synthesize_probe_trace_arg_ref(struct probe_trace_arg_ref *ref,
char **buf, size_t *buflen,
int depth)
{
int ret;
if (ref->next) {
- depth = __synthesize_kprobe_trace_arg_ref(ref->next, buf,
+ depth = __synthesize_probe_trace_arg_ref(ref->next, buf,
buflen, depth + 1);
if (depth < 0)
goto out;
@@ -901,9 +996,10 @@
}
-static int synthesize_kprobe_trace_arg(struct kprobe_trace_arg *arg,
+static int synthesize_probe_trace_arg(struct probe_trace_arg *arg,
char *buf, size_t buflen)
{
+ struct probe_trace_arg_ref *ref = arg->ref;
int ret, depth = 0;
char *tmp = buf;
@@ -917,16 +1013,24 @@
buf += ret;
buflen -= ret;
+ /* Special case: @XXX */
+ if (arg->value[0] == '@' && arg->ref)
+ ref = ref->next;
+
/* Dereferencing arguments */
- if (arg->ref) {
- depth = __synthesize_kprobe_trace_arg_ref(arg->ref, &buf,
+ if (ref) {
+ depth = __synthesize_probe_trace_arg_ref(ref, &buf,
&buflen, 1);
if (depth < 0)
return depth;
}
/* Print argument value */
- ret = e_snprintf(buf, buflen, "%s", arg->value);
+ if (arg->value[0] == '@' && arg->ref)
+ ret = e_snprintf(buf, buflen, "%s%+ld", arg->value,
+ arg->ref->offset);
+ else
+ ret = e_snprintf(buf, buflen, "%s", arg->value);
if (ret < 0)
return ret;
buf += ret;
@@ -951,9 +1055,9 @@
return buf - tmp;
}
-char *synthesize_kprobe_trace_command(struct kprobe_trace_event *tev)
+char *synthesize_probe_trace_command(struct probe_trace_event *tev)
{
- struct kprobe_trace_point *tp = &tev->point;
+ struct probe_trace_point *tp = &tev->point;
char *buf;
int i, len, ret;
@@ -969,7 +1073,7 @@
goto error;
for (i = 0; i < tev->nargs; i++) {
- ret = synthesize_kprobe_trace_arg(&tev->args[i], buf + len,
+ ret = synthesize_probe_trace_arg(&tev->args[i], buf + len,
MAX_CMDLEN - len);
if (ret <= 0)
goto error;
@@ -982,7 +1086,7 @@
return NULL;
}
-int convert_to_perf_probe_event(struct kprobe_trace_event *tev,
+static int convert_to_perf_probe_event(struct probe_trace_event *tev,
struct perf_probe_event *pev)
{
char buf[64] = "";
@@ -995,7 +1099,7 @@
return -ENOMEM;
/* Convert trace_point to probe_point */
- ret = convert_to_perf_probe_point(&tev->point, &pev->point);
+ ret = kprobe_convert_to_perf_probe(&tev->point, &pev->point);
if (ret < 0)
return ret;
@@ -1008,7 +1112,7 @@
if (tev->args[i].name)
pev->args[i].name = strdup(tev->args[i].name);
else {
- ret = synthesize_kprobe_trace_arg(&tev->args[i],
+ ret = synthesize_probe_trace_arg(&tev->args[i],
buf, 64);
pev->args[i].name = strdup(buf);
}
@@ -1059,9 +1163,9 @@
memset(pev, 0, sizeof(*pev));
}
-void clear_kprobe_trace_event(struct kprobe_trace_event *tev)
+static void clear_probe_trace_event(struct probe_trace_event *tev)
{
- struct kprobe_trace_arg_ref *ref, *next;
+ struct probe_trace_arg_ref *ref, *next;
int i;
if (tev->event)
@@ -1122,7 +1226,7 @@
}
/* Get raw string list of current kprobe_events */
-static struct strlist *get_kprobe_trace_command_rawlist(int fd)
+static struct strlist *get_probe_trace_command_rawlist(int fd)
{
int ret, idx;
FILE *fp;
@@ -1190,7 +1294,7 @@
int show_perf_probe_events(void)
{
int fd, ret;
- struct kprobe_trace_event tev;
+ struct probe_trace_event tev;
struct perf_probe_event pev;
struct strlist *rawlist;
struct str_node *ent;
@@ -1207,20 +1311,20 @@
if (fd < 0)
return fd;
- rawlist = get_kprobe_trace_command_rawlist(fd);
+ rawlist = get_probe_trace_command_rawlist(fd);
close(fd);
if (!rawlist)
return -ENOENT;
strlist__for_each(ent, rawlist) {
- ret = parse_kprobe_trace_command(ent->s, &tev);
+ ret = parse_probe_trace_command(ent->s, &tev);
if (ret >= 0) {
ret = convert_to_perf_probe_event(&tev, &pev);
if (ret >= 0)
ret = show_perf_probe_event(&pev);
}
clear_perf_probe_event(&pev);
- clear_kprobe_trace_event(&tev);
+ clear_probe_trace_event(&tev);
if (ret < 0)
break;
}
@@ -1230,20 +1334,19 @@
}
/* Get current perf-probe event names */
-static struct strlist *get_kprobe_trace_event_names(int fd, bool include_group)
+static struct strlist *get_probe_trace_event_names(int fd, bool include_group)
{
char buf[128];
struct strlist *sl, *rawlist;
struct str_node *ent;
- struct kprobe_trace_event tev;
+ struct probe_trace_event tev;
int ret = 0;
memset(&tev, 0, sizeof(tev));
-
- rawlist = get_kprobe_trace_command_rawlist(fd);
+ rawlist = get_probe_trace_command_rawlist(fd);
sl = strlist__new(true, NULL);
strlist__for_each(ent, rawlist) {
- ret = parse_kprobe_trace_command(ent->s, &tev);
+ ret = parse_probe_trace_command(ent->s, &tev);
if (ret < 0)
break;
if (include_group) {
@@ -1253,7 +1356,7 @@
ret = strlist__add(sl, buf);
} else
ret = strlist__add(sl, tev.event);
- clear_kprobe_trace_event(&tev);
+ clear_probe_trace_event(&tev);
if (ret < 0)
break;
}
@@ -1266,13 +1369,13 @@
return sl;
}
-static int write_kprobe_trace_event(int fd, struct kprobe_trace_event *tev)
+static int write_probe_trace_event(int fd, struct probe_trace_event *tev)
{
int ret = 0;
- char *buf = synthesize_kprobe_trace_command(tev);
+ char *buf = synthesize_probe_trace_command(tev);
if (!buf) {
- pr_debug("Failed to synthesize kprobe trace event.\n");
+ pr_debug("Failed to synthesize probe trace event.\n");
return -EINVAL;
}
@@ -1325,12 +1428,12 @@
return ret;
}
-static int __add_kprobe_trace_events(struct perf_probe_event *pev,
- struct kprobe_trace_event *tevs,
+static int __add_probe_trace_events(struct perf_probe_event *pev,
+ struct probe_trace_event *tevs,
int ntevs, bool allow_suffix)
{
int i, fd, ret;
- struct kprobe_trace_event *tev = NULL;
+ struct probe_trace_event *tev = NULL;
char buf[64];
const char *event, *group;
struct strlist *namelist;
@@ -1339,7 +1442,7 @@
if (fd < 0)
return fd;
/* Get current event names */
- namelist = get_kprobe_trace_event_names(fd, false);
+ namelist = get_probe_trace_event_names(fd, false);
if (!namelist) {
pr_debug("Failed to get current event list.\n");
return -EIO;
@@ -1374,7 +1477,7 @@
ret = -ENOMEM;
break;
}
- ret = write_kprobe_trace_event(fd, tev);
+ ret = write_probe_trace_event(fd, tev);
if (ret < 0)
break;
/* Add added event name to namelist */
@@ -1411,21 +1514,21 @@
return ret;
}
-static int convert_to_kprobe_trace_events(struct perf_probe_event *pev,
- struct kprobe_trace_event **tevs,
+static int convert_to_probe_trace_events(struct perf_probe_event *pev,
+ struct probe_trace_event **tevs,
int max_tevs)
{
struct symbol *sym;
int ret = 0, i;
- struct kprobe_trace_event *tev;
+ struct probe_trace_event *tev;
/* Convert perf_probe_event with debuginfo */
- ret = try_to_find_kprobe_trace_events(pev, tevs, max_tevs);
+ ret = try_to_find_probe_trace_events(pev, tevs, max_tevs);
if (ret != 0)
return ret;
/* Allocate trace event buffer */
- tev = *tevs = zalloc(sizeof(struct kprobe_trace_event));
+ tev = *tevs = zalloc(sizeof(struct probe_trace_event));
if (tev == NULL)
return -ENOMEM;
@@ -1438,7 +1541,7 @@
tev->point.offset = pev->point.offset;
tev->nargs = pev->nargs;
if (tev->nargs) {
- tev->args = zalloc(sizeof(struct kprobe_trace_arg)
+ tev->args = zalloc(sizeof(struct probe_trace_arg)
* tev->nargs);
if (tev->args == NULL) {
ret = -ENOMEM;
@@ -1479,7 +1582,7 @@
return 1;
error:
- clear_kprobe_trace_event(tev);
+ clear_probe_trace_event(tev);
free(tev);
*tevs = NULL;
return ret;
@@ -1487,7 +1590,7 @@
struct __event_package {
struct perf_probe_event *pev;
- struct kprobe_trace_event *tevs;
+ struct probe_trace_event *tevs;
int ntevs;
};
@@ -1510,7 +1613,7 @@
for (i = 0; i < npevs; i++) {
pkgs[i].pev = &pevs[i];
/* Convert with or without debuginfo */
- ret = convert_to_kprobe_trace_events(pkgs[i].pev,
+ ret = convert_to_probe_trace_events(pkgs[i].pev,
&pkgs[i].tevs, max_tevs);
if (ret < 0)
goto end;
@@ -1519,24 +1622,24 @@
/* Loop 2: add all events */
for (i = 0; i < npevs && ret >= 0; i++)
- ret = __add_kprobe_trace_events(pkgs[i].pev, pkgs[i].tevs,
+ ret = __add_probe_trace_events(pkgs[i].pev, pkgs[i].tevs,
pkgs[i].ntevs, force_add);
end:
/* Loop 3: cleanup trace events */
for (i = 0; i < npevs; i++)
for (j = 0; j < pkgs[i].ntevs; j++)
- clear_kprobe_trace_event(&pkgs[i].tevs[j]);
+ clear_probe_trace_event(&pkgs[i].tevs[j]);
return ret;
}
-static int __del_trace_kprobe_event(int fd, struct str_node *ent)
+static int __del_trace_probe_event(int fd, struct str_node *ent)
{
char *p;
char buf[128];
int ret;
- /* Convert from perf-probe event to trace-kprobe event */
+ /* Convert from perf-probe event to trace-probe event */
ret = e_snprintf(buf, 128, "-:%s", ent->s);
if (ret < 0)
goto error;
@@ -1562,7 +1665,7 @@
return ret;
}
-static int del_trace_kprobe_event(int fd, const char *group,
+static int del_trace_probe_event(int fd, const char *group,
const char *event, struct strlist *namelist)
{
char buf[128];
@@ -1579,7 +1682,7 @@
strlist__for_each_safe(ent, n, namelist)
if (strglobmatch(ent->s, buf)) {
found++;
- ret = __del_trace_kprobe_event(fd, ent);
+ ret = __del_trace_probe_event(fd, ent);
if (ret < 0)
break;
strlist__remove(namelist, ent);
@@ -1588,7 +1691,7 @@
ent = strlist__find(namelist, buf);
if (ent) {
found++;
- ret = __del_trace_kprobe_event(fd, ent);
+ ret = __del_trace_probe_event(fd, ent);
if (ret >= 0)
strlist__remove(namelist, ent);
}
@@ -1612,7 +1715,7 @@
return fd;
/* Get current event names */
- namelist = get_kprobe_trace_event_names(fd, true);
+ namelist = get_probe_trace_event_names(fd, true);
if (namelist == NULL)
return -EINVAL;
@@ -1633,7 +1736,7 @@
event = str;
}
pr_debug("Group: %s, Event: %s\n", group, event);
- ret = del_trace_kprobe_event(fd, group, event, namelist);
+ ret = del_trace_probe_event(fd, group, event, namelist);
free(str);
if (ret < 0)
break;
diff --git a/tools/perf/util/probe-event.h b/tools/perf/util/probe-event.h
index e9db1a2..5af3924 100644
--- a/tools/perf/util/probe-event.h
+++ b/tools/perf/util/probe-event.h
@@ -7,33 +7,33 @@
extern bool probe_event_dry_run;
/* kprobe-tracer tracing point */
-struct kprobe_trace_point {
+struct probe_trace_point {
char *symbol; /* Base symbol */
unsigned long offset; /* Offset from symbol */
bool retprobe; /* Return probe flag */
};
-/* kprobe-tracer tracing argument referencing offset */
-struct kprobe_trace_arg_ref {
- struct kprobe_trace_arg_ref *next; /* Next reference */
+/* probe-tracer tracing argument referencing offset */
+struct probe_trace_arg_ref {
+ struct probe_trace_arg_ref *next; /* Next reference */
long offset; /* Offset value */
};
/* kprobe-tracer tracing argument */
-struct kprobe_trace_arg {
+struct probe_trace_arg {
char *name; /* Argument name */
char *value; /* Base value */
char *type; /* Type name */
- struct kprobe_trace_arg_ref *ref; /* Referencing offset */
+ struct probe_trace_arg_ref *ref; /* Referencing offset */
};
/* kprobe-tracer tracing event (point + arg) */
-struct kprobe_trace_event {
+struct probe_trace_event {
char *event; /* Event name */
char *group; /* Group name */
- struct kprobe_trace_point point; /* Trace point */
+ struct probe_trace_point point; /* Trace point */
int nargs; /* Number of args */
- struct kprobe_trace_arg *args; /* Arguments */
+ struct probe_trace_arg *args; /* Arguments */
};
/* Perf probe probing point */
@@ -50,6 +50,7 @@
struct perf_probe_arg_field {
struct perf_probe_arg_field *next; /* Next field */
char *name; /* Name of the field */
+ long index; /* Array index number */
bool ref; /* Referencing flag */
};
@@ -85,31 +86,25 @@
int end; /* End line number */
int offset; /* Start line offset */
char *path; /* Real path name */
+ char *comp_dir; /* Compile directory */
struct list_head line_list; /* Visible lines */
};
/* Command string to events */
extern int parse_perf_probe_command(const char *cmd,
struct perf_probe_event *pev);
-extern int parse_kprobe_trace_command(const char *cmd,
- struct kprobe_trace_event *tev);
/* Events to command string */
extern char *synthesize_perf_probe_command(struct perf_probe_event *pev);
-extern char *synthesize_kprobe_trace_command(struct kprobe_trace_event *tev);
+extern char *synthesize_probe_trace_command(struct probe_trace_event *tev);
extern int synthesize_perf_probe_arg(struct perf_probe_arg *pa, char *buf,
size_t len);
/* Check the perf_probe_event needs debuginfo */
extern bool perf_probe_event_need_dwarf(struct perf_probe_event *pev);
-/* Convert from kprobe_trace_event to perf_probe_event */
-extern int convert_to_perf_probe_event(struct kprobe_trace_event *tev,
- struct perf_probe_event *pev);
-
/* Release event contents */
extern void clear_perf_probe_event(struct perf_probe_event *pev);
-extern void clear_kprobe_trace_event(struct kprobe_trace_event *tev);
/* Command string to line-range */
extern int parse_line_range_desc(const char *cmd, struct line_range *lr);
diff --git a/tools/perf/util/probe-finder.c b/tools/perf/util/probe-finder.c
index d964cb1..840f1aa 100644
--- a/tools/perf/util/probe-finder.c
+++ b/tools/perf/util/probe-finder.c
@@ -37,6 +37,7 @@
#include "event.h"
#include "debug.h"
#include "util.h"
+#include "symbol.h"
#include "probe-finder.h"
/* Kprobe tracer basic type is up to u64 */
@@ -143,12 +144,21 @@
return src;
}
+/* Get DW_AT_comp_dir (should be NULL with older gcc) */
+static const char *cu_get_comp_dir(Dwarf_Die *cu_die)
+{
+ Dwarf_Attribute attr;
+ if (dwarf_attr(cu_die, DW_AT_comp_dir, &attr) == NULL)
+ return NULL;
+ return dwarf_formstring(&attr);
+}
+
/* Compare diename and tname */
static bool die_compare_name(Dwarf_Die *dw_die, const char *tname)
{
const char *name;
name = dwarf_diename(dw_die);
- return name ? strcmp(tname, name) : -1;
+ return name ? (strcmp(tname, name) == 0) : false;
}
/* Get type die, but skip qualifiers and typedef */
@@ -319,7 +329,7 @@
tag = dwarf_tag(die_mem);
if ((tag == DW_TAG_formal_parameter ||
tag == DW_TAG_variable) &&
- (die_compare_name(die_mem, name) == 0))
+ die_compare_name(die_mem, name))
return DIE_FIND_CB_FOUND;
return DIE_FIND_CB_CONTINUE;
@@ -338,7 +348,7 @@
const char *name = data;
if ((dwarf_tag(die_mem) == DW_TAG_member) &&
- (die_compare_name(die_mem, name) == 0))
+ die_compare_name(die_mem, name))
return DIE_FIND_CB_FOUND;
return DIE_FIND_CB_SIBLING;
@@ -356,14 +366,50 @@
* Probe finder related functions
*/
-/* Show a location */
-static int convert_location(Dwarf_Op *op, struct probe_finder *pf)
+static struct probe_trace_arg_ref *alloc_trace_arg_ref(long offs)
{
+ struct probe_trace_arg_ref *ref;
+ ref = zalloc(sizeof(struct probe_trace_arg_ref));
+ if (ref != NULL)
+ ref->offset = offs;
+ return ref;
+}
+
+/* Show a location */
+static int convert_variable_location(Dwarf_Die *vr_die, struct probe_finder *pf)
+{
+ Dwarf_Attribute attr;
+ Dwarf_Op *op;
+ size_t nops;
unsigned int regn;
Dwarf_Word offs = 0;
bool ref = false;
const char *regs;
- struct kprobe_trace_arg *tvar = pf->tvar;
+ struct probe_trace_arg *tvar = pf->tvar;
+ int ret;
+
+ /* TODO: handle more than 1 exprs */
+ if (dwarf_attr(vr_die, DW_AT_location, &attr) == NULL ||
+ dwarf_getlocation_addr(&attr, pf->addr, &op, &nops, 1) <= 0 ||
+ nops == 0) {
+ /* TODO: Support const_value */
+ pr_err("Failed to find the location of %s at this address.\n"
+ " Perhaps, it has been optimized out.\n", pf->pvar->var);
+ return -ENOENT;
+ }
+
+ if (op->atom == DW_OP_addr) {
+ /* Static variables on memory (not stack), make @varname */
+ ret = strlen(dwarf_diename(vr_die));
+ tvar->value = zalloc(ret + 2);
+ if (tvar->value == NULL)
+ return -ENOMEM;
+ snprintf(tvar->value, ret + 2, "@%s", dwarf_diename(vr_die));
+ tvar->ref = alloc_trace_arg_ref((long)offs);
+ if (tvar->ref == NULL)
+ return -ENOMEM;
+ return 0;
+ }
/* If this is based on frame buffer, set the offset */
if (op->atom == DW_OP_fbreg) {
@@ -405,27 +451,72 @@
return -ENOMEM;
if (ref) {
- tvar->ref = zalloc(sizeof(struct kprobe_trace_arg_ref));
+ tvar->ref = alloc_trace_arg_ref((long)offs);
if (tvar->ref == NULL)
return -ENOMEM;
- tvar->ref->offset = (long)offs;
}
return 0;
}
static int convert_variable_type(Dwarf_Die *vr_die,
- struct kprobe_trace_arg *targ)
+ struct probe_trace_arg *tvar,
+ const char *cast)
{
+ struct probe_trace_arg_ref **ref_ptr = &tvar->ref;
Dwarf_Die type;
char buf[16];
int ret;
+ /* TODO: check all types */
+ if (cast && strcmp(cast, "string") != 0) {
+ /* Non string type is OK */
+ tvar->type = strdup(cast);
+ return (tvar->type == NULL) ? -ENOMEM : 0;
+ }
+
if (die_get_real_type(vr_die, &type) == NULL) {
pr_warning("Failed to get a type information of %s.\n",
dwarf_diename(vr_die));
return -ENOENT;
}
+ pr_debug("%s type is %s.\n",
+ dwarf_diename(vr_die), dwarf_diename(&type));
+
+ if (cast && strcmp(cast, "string") == 0) { /* String type */
+ ret = dwarf_tag(&type);
+ if (ret != DW_TAG_pointer_type &&
+ ret != DW_TAG_array_type) {
+ pr_warning("Failed to cast into string: "
+ "%s(%s) is not a pointer nor array.",
+ dwarf_diename(vr_die), dwarf_diename(&type));
+ return -EINVAL;
+ }
+ if (ret == DW_TAG_pointer_type) {
+ if (die_get_real_type(&type, &type) == NULL) {
+ pr_warning("Failed to get a type information.");
+ return -ENOENT;
+ }
+ while (*ref_ptr)
+ ref_ptr = &(*ref_ptr)->next;
+ /* Add new reference with offset +0 */
+ *ref_ptr = zalloc(sizeof(struct probe_trace_arg_ref));
+ if (*ref_ptr == NULL) {
+ pr_warning("Out of memory error\n");
+ return -ENOMEM;
+ }
+ }
+ if (!die_compare_name(&type, "char") &&
+ !die_compare_name(&type, "unsigned char")) {
+ pr_warning("Failed to cast into string: "
+ "%s is not (unsigned) char *.",
+ dwarf_diename(vr_die));
+ return -EINVAL;
+ }
+ tvar->type = strdup(cast);
+ return (tvar->type == NULL) ? -ENOMEM : 0;
+ }
+
ret = die_get_byte_size(&type) * 8;
if (ret) {
/* Check the bitwidth */
@@ -445,8 +536,8 @@
strerror(-ret));
return ret;
}
- targ->type = strdup(buf);
- if (targ->type == NULL)
+ tvar->type = strdup(buf);
+ if (tvar->type == NULL)
return -ENOMEM;
}
return 0;
@@ -454,22 +545,50 @@
static int convert_variable_fields(Dwarf_Die *vr_die, const char *varname,
struct perf_probe_arg_field *field,
- struct kprobe_trace_arg_ref **ref_ptr,
+ struct probe_trace_arg_ref **ref_ptr,
Dwarf_Die *die_mem)
{
- struct kprobe_trace_arg_ref *ref = *ref_ptr;
+ struct probe_trace_arg_ref *ref = *ref_ptr;
Dwarf_Die type;
Dwarf_Word offs;
- int ret;
+ int ret, tag;
pr_debug("converting %s in %s\n", field->name, varname);
if (die_get_real_type(vr_die, &type) == NULL) {
pr_warning("Failed to get the type of %s.\n", varname);
return -ENOENT;
}
+ pr_debug2("Var real type: (%x)\n", (unsigned)dwarf_dieoffset(&type));
+ tag = dwarf_tag(&type);
- /* Check the pointer and dereference */
- if (dwarf_tag(&type) == DW_TAG_pointer_type) {
+ if (field->name[0] == '[' &&
+ (tag == DW_TAG_array_type || tag == DW_TAG_pointer_type)) {
+ if (field->next)
+ /* Save original type for next field */
+ memcpy(die_mem, &type, sizeof(*die_mem));
+ /* Get the type of this array */
+ if (die_get_real_type(&type, &type) == NULL) {
+ pr_warning("Failed to get the type of %s.\n", varname);
+ return -ENOENT;
+ }
+ pr_debug2("Array real type: (%x)\n",
+ (unsigned)dwarf_dieoffset(&type));
+ if (tag == DW_TAG_pointer_type) {
+ ref = zalloc(sizeof(struct probe_trace_arg_ref));
+ if (ref == NULL)
+ return -ENOMEM;
+ if (*ref_ptr)
+ (*ref_ptr)->next = ref;
+ else
+ *ref_ptr = ref;
+ }
+ ref->offset += die_get_byte_size(&type) * field->index;
+ if (!field->next)
+ /* Save vr_die for converting types */
+ memcpy(die_mem, vr_die, sizeof(*die_mem));
+ goto next;
+ } else if (tag == DW_TAG_pointer_type) {
+ /* Check the pointer and dereference */
if (!field->ref) {
pr_err("Semantic error: %s must be referred by '->'\n",
field->name);
@@ -486,7 +605,7 @@
return -EINVAL;
}
- ref = zalloc(sizeof(struct kprobe_trace_arg_ref));
+ ref = zalloc(sizeof(struct probe_trace_arg_ref));
if (ref == NULL)
return -ENOMEM;
if (*ref_ptr)
@@ -495,10 +614,15 @@
*ref_ptr = ref;
} else {
/* Verify it is a data structure */
- if (dwarf_tag(&type) != DW_TAG_structure_type) {
+ if (tag != DW_TAG_structure_type) {
pr_warning("%s is not a data structure.\n", varname);
return -EINVAL;
}
+ if (field->name[0] == '[') {
+ pr_err("Semantic error: %s is not a pointor nor array.",
+ varname);
+ return -EINVAL;
+ }
if (field->ref) {
pr_err("Semantic error: %s must be referred by '.'\n",
field->name);
@@ -525,6 +649,7 @@
}
ref->offset += (long)offs;
+next:
/* Converting next field */
if (field->next)
return convert_variable_fields(die_mem, field->name,
@@ -536,51 +661,32 @@
/* Show a variables in kprobe event format */
static int convert_variable(Dwarf_Die *vr_die, struct probe_finder *pf)
{
- Dwarf_Attribute attr;
Dwarf_Die die_mem;
- Dwarf_Op *expr;
- size_t nexpr;
int ret;
- if (dwarf_attr(vr_die, DW_AT_location, &attr) == NULL)
- goto error;
- /* TODO: handle more than 1 exprs */
- ret = dwarf_getlocation_addr(&attr, pf->addr, &expr, &nexpr, 1);
- if (ret <= 0 || nexpr == 0)
- goto error;
+ pr_debug("Converting variable %s into trace event.\n",
+ dwarf_diename(vr_die));
- ret = convert_location(expr, pf);
+ ret = convert_variable_location(vr_die, pf);
if (ret == 0 && pf->pvar->field) {
ret = convert_variable_fields(vr_die, pf->pvar->var,
pf->pvar->field, &pf->tvar->ref,
&die_mem);
vr_die = &die_mem;
}
- if (ret == 0) {
- if (pf->pvar->type) {
- pf->tvar->type = strdup(pf->pvar->type);
- if (pf->tvar->type == NULL)
- ret = -ENOMEM;
- } else
- ret = convert_variable_type(vr_die, pf->tvar);
- }
+ if (ret == 0)
+ ret = convert_variable_type(vr_die, pf->tvar, pf->pvar->type);
/* *expr will be cached in libdw. Don't free it. */
return ret;
-error:
- /* TODO: Support const_value */
- pr_err("Failed to find the location of %s at this address.\n"
- " Perhaps, it has been optimized out.\n", pf->pvar->var);
- return -ENOENT;
}
/* Find a variable in a subprogram die */
static int find_variable(Dwarf_Die *sp_die, struct probe_finder *pf)
{
- Dwarf_Die vr_die;
+ Dwarf_Die vr_die, *scopes;
char buf[32], *ptr;
- int ret;
+ int ret, nscopes;
- /* TODO: Support arrays */
if (pf->pvar->name)
pf->tvar->name = strdup(pf->pvar->name);
else {
@@ -607,18 +713,32 @@
pr_debug("Searching '%s' variable in context.\n",
pf->pvar->var);
/* Search child die for local variables and parameters. */
- if (!die_find_variable(sp_die, pf->pvar->var, &vr_die)) {
+ if (die_find_variable(sp_die, pf->pvar->var, &vr_die))
+ ret = convert_variable(&vr_die, pf);
+ else {
+ /* Search upper class */
+ nscopes = dwarf_getscopes_die(sp_die, &scopes);
+ if (nscopes > 0) {
+ ret = dwarf_getscopevar(scopes, nscopes, pf->pvar->var,
+ 0, NULL, 0, 0, &vr_die);
+ if (ret >= 0)
+ ret = convert_variable(&vr_die, pf);
+ else
+ ret = -ENOENT;
+ free(scopes);
+ } else
+ ret = -ENOENT;
+ }
+ if (ret < 0)
pr_warning("Failed to find '%s' in this function.\n",
pf->pvar->var);
- return -ENOENT;
- }
- return convert_variable(&vr_die, pf);
+ return ret;
}
/* Show a probe point to output buffer */
static int convert_probe_point(Dwarf_Die *sp_die, struct probe_finder *pf)
{
- struct kprobe_trace_event *tev;
+ struct probe_trace_event *tev;
Dwarf_Addr eaddr;
Dwarf_Die die_mem;
const char *name;
@@ -683,7 +803,7 @@
/* Find each argument */
tev->nargs = pf->pev->nargs;
- tev->args = zalloc(sizeof(struct kprobe_trace_arg) * tev->nargs);
+ tev->args = zalloc(sizeof(struct probe_trace_arg) * tev->nargs);
if (tev->args == NULL)
return -ENOMEM;
for (i = 0; i < pf->pev->nargs; i++) {
@@ -897,7 +1017,7 @@
/* Check tag and diename */
if (dwarf_tag(sp_die) != DW_TAG_subprogram ||
- die_compare_name(sp_die, pp->function) != 0)
+ !die_compare_name(sp_die, pp->function))
return DWARF_CB_OK;
pf->fname = dwarf_decl_file(sp_die);
@@ -940,9 +1060,9 @@
return _param.retval;
}
-/* Find kprobe_trace_events specified by perf_probe_event from debuginfo */
-int find_kprobe_trace_events(int fd, struct perf_probe_event *pev,
- struct kprobe_trace_event **tevs, int max_tevs)
+/* Find probe_trace_events specified by perf_probe_event from debuginfo */
+int find_probe_trace_events(int fd, struct perf_probe_event *pev,
+ struct probe_trace_event **tevs, int max_tevs)
{
struct probe_finder pf = {.pev = pev, .max_tevs = max_tevs};
struct perf_probe_point *pp = &pev->point;
@@ -952,7 +1072,7 @@
Dwarf *dbg;
int ret = 0;
- pf.tevs = zalloc(sizeof(struct kprobe_trace_event) * max_tevs);
+ pf.tevs = zalloc(sizeof(struct probe_trace_event) * max_tevs);
if (pf.tevs == NULL)
return -ENOMEM;
*tevs = pf.tevs;
@@ -1096,7 +1216,7 @@
static int line_range_add_line(const char *src, unsigned int lineno,
struct line_range *lr)
{
- /* Copy real path */
+ /* Copy source path */
if (!lr->path) {
lr->path = strdup(src);
if (lr->path == NULL)
@@ -1220,7 +1340,7 @@
struct line_range *lr = lf->lr;
if (dwarf_tag(sp_die) == DW_TAG_subprogram &&
- die_compare_name(sp_die, lr->function) == 0) {
+ die_compare_name(sp_die, lr->function)) {
lf->fname = dwarf_decl_file(sp_die);
dwarf_decl_line(sp_die, &lr->offset);
pr_debug("fname: %s, lineno:%d\n", lf->fname, lr->offset);
@@ -1263,6 +1383,7 @@
size_t cuhl;
Dwarf_Die *diep;
Dwarf *dbg;
+ const char *comp_dir;
dbg = dwarf_begin(fd, DWARF_C_READ);
if (!dbg) {
@@ -1298,7 +1419,18 @@
}
off = noff;
}
- pr_debug("path: %lx\n", (unsigned long)lr->path);
+
+ /* Store comp_dir */
+ if (lf.found) {
+ comp_dir = cu_get_comp_dir(&lf.cu_die);
+ if (comp_dir) {
+ lr->comp_dir = strdup(comp_dir);
+ if (!lr->comp_dir)
+ ret = -ENOMEM;
+ }
+ }
+
+ pr_debug("path: %s\n", lr->path);
dwarf_end(dbg);
return (ret < 0) ? ret : lf.found;
diff --git a/tools/perf/util/probe-finder.h b/tools/perf/util/probe-finder.h
index e1f61dc..4507d51 100644
--- a/tools/perf/util/probe-finder.h
+++ b/tools/perf/util/probe-finder.h
@@ -16,9 +16,9 @@
}
#ifdef DWARF_SUPPORT
-/* Find kprobe_trace_events specified by perf_probe_event from debuginfo */
-extern int find_kprobe_trace_events(int fd, struct perf_probe_event *pev,
- struct kprobe_trace_event **tevs,
+/* Find probe_trace_events specified by perf_probe_event from debuginfo */
+extern int find_probe_trace_events(int fd, struct perf_probe_event *pev,
+ struct probe_trace_event **tevs,
int max_tevs);
/* Find a perf_probe_point from debuginfo */
@@ -33,7 +33,7 @@
struct probe_finder {
struct perf_probe_event *pev; /* Target probe event */
- struct kprobe_trace_event *tevs; /* Result trace events */
+ struct probe_trace_event *tevs; /* Result trace events */
int ntevs; /* Number of trace events */
int max_tevs; /* Max number of trace events */
@@ -50,7 +50,7 @@
#endif
Dwarf_Op *fb_ops; /* Frame base attribute */
struct perf_probe_arg *pvar; /* Current target variable */
- struct kprobe_trace_arg *tvar; /* Current result variable */
+ struct probe_trace_arg *tvar; /* Current result variable */
};
struct line_finder {
diff --git a/tools/perf/util/session.c b/tools/perf/util/session.c
index c422cd6..fa9d652 100644
--- a/tools/perf/util/session.c
+++ b/tools/perf/util/session.c
@@ -27,8 +27,10 @@
self->fd = open(self->filename, O_RDONLY);
if (self->fd < 0) {
- pr_err("failed to open file: %s", self->filename);
- if (!strcmp(self->filename, "perf.data"))
+ int err = errno;
+
+ pr_err("failed to open %s: %s", self->filename, strerror(err));
+ if (err == ENOENT && !strcmp(self->filename, "perf.data"))
pr_err(" (try 'perf record' first)");
pr_err("\n");
return -errno;
@@ -77,6 +79,12 @@
return ret;
}
+static void perf_session__destroy_kernel_maps(struct perf_session *self)
+{
+ machine__destroy_kernel_maps(&self->host_machine);
+ machines__destroy_guest_kernel_maps(&self->machines);
+}
+
struct perf_session *perf_session__new(const char *filename, int mode, bool force, bool repipe)
{
size_t len = filename ? strlen(filename) + 1 : 0;
@@ -94,8 +102,6 @@
self->hists_tree = RB_ROOT;
self->last_match = NULL;
self->mmap_window = 32;
- self->cwd = NULL;
- self->cwdlen = 0;
self->machines = RB_ROOT;
self->repipe = repipe;
INIT_LIST_HEAD(&self->ordered_samples.samples_head);
@@ -124,16 +130,43 @@
return NULL;
}
+static void perf_session__delete_dead_threads(struct perf_session *self)
+{
+ struct thread *n, *t;
+
+ list_for_each_entry_safe(t, n, &self->dead_threads, node) {
+ list_del(&t->node);
+ thread__delete(t);
+ }
+}
+
+static void perf_session__delete_threads(struct perf_session *self)
+{
+ struct rb_node *nd = rb_first(&self->threads);
+
+ while (nd) {
+ struct thread *t = rb_entry(nd, struct thread, rb_node);
+
+ rb_erase(&t->rb_node, &self->threads);
+ nd = rb_next(nd);
+ thread__delete(t);
+ }
+}
+
void perf_session__delete(struct perf_session *self)
{
perf_header__exit(&self->header);
+ perf_session__destroy_kernel_maps(self);
+ perf_session__delete_dead_threads(self);
+ perf_session__delete_threads(self);
+ machine__exit(&self->host_machine);
close(self->fd);
- free(self->cwd);
free(self);
}
void perf_session__remove_thread(struct perf_session *self, struct thread *th)
{
+ self->last_match = NULL;
rb_erase(&th->rb_node, &self->threads);
/*
* We may have references to this thread, for instance in some hist_entry
@@ -830,23 +863,6 @@
if (perf_session__register_idle_thread(self) == NULL)
return -ENOMEM;
- if (!symbol_conf.full_paths) {
- char bf[PATH_MAX];
-
- if (getcwd(bf, sizeof(bf)) == NULL) {
- err = -errno;
-out_getcwd_err:
- pr_err("failed to get the current directory\n");
- goto out_err;
- }
- self->cwd = strdup(bf);
- if (self->cwd == NULL) {
- err = -ENOMEM;
- goto out_getcwd_err;
- }
- self->cwdlen = strlen(self->cwd);
- }
-
if (!self->fd_pipe)
err = __perf_session__process_events(self,
self->header.data_offset,
@@ -854,7 +870,7 @@
self->size, ops);
else
err = __perf_session__process_pipe_events(self, ops);
-out_err:
+
return err;
}
diff --git a/tools/perf/util/sort.c b/tools/perf/util/sort.c
index 2316cb5..1c61a4f 100644
--- a/tools/perf/util/sort.c
+++ b/tools/perf/util/sort.c
@@ -1,4 +1,5 @@
#include "sort.h"
+#include "hist.h"
regex_t parent_regex;
const char default_parent_pattern[] = "^sys_|^do_page_fault";
@@ -10,10 +11,6 @@
enum sort_type sort__first_dimension;
-unsigned int dsos__col_width;
-unsigned int comms__col_width;
-unsigned int threads__col_width;
-static unsigned int parent_symbol__col_width;
char * field_sep;
LIST_HEAD(hist_entry__sort_list);
@@ -28,12 +25,14 @@
size_t size, unsigned int width);
static int hist_entry__parent_snprintf(struct hist_entry *self, char *bf,
size_t size, unsigned int width);
+static int hist_entry__cpu_snprintf(struct hist_entry *self, char *bf,
+ size_t size, unsigned int width);
struct sort_entry sort_thread = {
.se_header = "Command: Pid",
.se_cmp = sort__thread_cmp,
.se_snprintf = hist_entry__thread_snprintf,
- .se_width = &threads__col_width,
+ .se_width_idx = HISTC_THREAD,
};
struct sort_entry sort_comm = {
@@ -41,27 +40,35 @@
.se_cmp = sort__comm_cmp,
.se_collapse = sort__comm_collapse,
.se_snprintf = hist_entry__comm_snprintf,
- .se_width = &comms__col_width,
+ .se_width_idx = HISTC_COMM,
};
struct sort_entry sort_dso = {
.se_header = "Shared Object",
.se_cmp = sort__dso_cmp,
.se_snprintf = hist_entry__dso_snprintf,
- .se_width = &dsos__col_width,
+ .se_width_idx = HISTC_DSO,
};
struct sort_entry sort_sym = {
.se_header = "Symbol",
.se_cmp = sort__sym_cmp,
.se_snprintf = hist_entry__sym_snprintf,
+ .se_width_idx = HISTC_SYMBOL,
};
struct sort_entry sort_parent = {
.se_header = "Parent symbol",
.se_cmp = sort__parent_cmp,
.se_snprintf = hist_entry__parent_snprintf,
- .se_width = &parent_symbol__col_width,
+ .se_width_idx = HISTC_PARENT,
+};
+
+struct sort_entry sort_cpu = {
+ .se_header = "CPU",
+ .se_cmp = sort__cpu_cmp,
+ .se_snprintf = hist_entry__cpu_snprintf,
+ .se_width_idx = HISTC_CPU,
};
struct sort_dimension {
@@ -76,6 +83,7 @@
{ .name = "dso", .entry = &sort_dso, },
{ .name = "symbol", .entry = &sort_sym, },
{ .name = "parent", .entry = &sort_parent, },
+ { .name = "cpu", .entry = &sort_cpu, },
};
int64_t cmp_null(void *l, void *r)
@@ -242,6 +250,20 @@
self->parent ? self->parent->name : "[other]");
}
+/* --sort cpu */
+
+int64_t
+sort__cpu_cmp(struct hist_entry *left, struct hist_entry *right)
+{
+ return right->cpu - left->cpu;
+}
+
+static int hist_entry__cpu_snprintf(struct hist_entry *self, char *bf,
+ size_t size, unsigned int width)
+{
+ return repsep_snprintf(bf, size, "%-*d", width, self->cpu);
+}
+
int sort_dimension__add(const char *tok)
{
unsigned int i;
@@ -281,6 +303,8 @@
sort__first_dimension = SORT_SYM;
else if (!strcmp(sd->name, "parent"))
sort__first_dimension = SORT_PARENT;
+ else if (!strcmp(sd->name, "cpu"))
+ sort__first_dimension = SORT_CPU;
}
list_add_tail(&sd->entry->list, &hist_entry__sort_list);
diff --git a/tools/perf/util/sort.h b/tools/perf/util/sort.h
index 0d61c40..46e531d 100644
--- a/tools/perf/util/sort.h
+++ b/tools/perf/util/sort.h
@@ -36,11 +36,14 @@
extern struct sort_entry sort_dso;
extern struct sort_entry sort_sym;
extern struct sort_entry sort_parent;
-extern unsigned int dsos__col_width;
-extern unsigned int comms__col_width;
-extern unsigned int threads__col_width;
extern enum sort_type sort__first_dimension;
+/**
+ * struct hist_entry - histogram entry
+ *
+ * @row_offset - offset from the first callchain expanded to appear on screen
+ * @nr_rows - rows expanded in callchain, recalculated on folding/unfolding
+ */
struct hist_entry {
struct rb_node rb_node;
u64 period;
@@ -51,7 +54,14 @@
struct map_symbol ms;
struct thread *thread;
u64 ip;
+ s32 cpu;
u32 nr_events;
+
+ /* XXX These two should move to some tree widget lib */
+ u16 row_offset;
+ u16 nr_rows;
+
+ bool init_have_children;
char level;
u8 filtered;
struct symbol *parent;
@@ -68,7 +78,8 @@
SORT_COMM,
SORT_DSO,
SORT_SYM,
- SORT_PARENT
+ SORT_PARENT,
+ SORT_CPU,
};
/*
@@ -84,7 +95,7 @@
int64_t (*se_collapse)(struct hist_entry *, struct hist_entry *);
int (*se_snprintf)(struct hist_entry *self, char *bf, size_t size,
unsigned int width);
- unsigned int *se_width;
+ u8 se_width_idx;
bool elide;
};
@@ -104,6 +115,7 @@
extern int64_t sort__dso_cmp(struct hist_entry *, struct hist_entry *);
extern int64_t sort__sym_cmp(struct hist_entry *, struct hist_entry *);
extern int64_t sort__parent_cmp(struct hist_entry *, struct hist_entry *);
+int64_t sort__cpu_cmp(struct hist_entry *left, struct hist_entry *right);
extern size_t sort__parent_print(FILE *, struct hist_entry *, unsigned int);
extern int sort_dimension__add(const char *);
void sort_entry__setup_elide(struct sort_entry *self, struct strlist *list,
diff --git a/tools/perf/util/symbol.c b/tools/perf/util/symbol.c
index 5b27683..6f0dd90 100644
--- a/tools/perf/util/symbol.c
+++ b/tools/perf/util/symbol.c
@@ -12,6 +12,7 @@
#include <fcntl.h>
#include <unistd.h>
#include "build-id.h"
+#include "debug.h"
#include "symbol.h"
#include "strlist.h"
@@ -25,6 +26,8 @@
#define NT_GNU_BUILD_ID 3
#endif
+static bool dso__build_id_equal(const struct dso *self, u8 *build_id);
+static int elf_read_build_id(Elf *elf, void *bf, size_t size);
static void dsos__add(struct list_head *head, struct dso *dso);
static struct map *map__new2(u64 start, struct dso *dso, enum map_type type);
static int dso__load_kernel_sym(struct dso *self, struct map *map,
@@ -40,6 +43,14 @@
.try_vmlinux_path = true,
};
+int dso__name_len(const struct dso *self)
+{
+ if (verbose)
+ return self->long_name_len;
+
+ return self->short_name_len;
+}
+
bool dso__loaded(const struct dso *self, enum map_type type)
{
return self->loaded & (1 << type);
@@ -215,7 +226,9 @@
int i;
for (i = 0; i < MAP__NR_TYPES; ++i)
symbols__delete(&self->symbols[i]);
- if (self->long_name != self->name)
+ if (self->sname_alloc)
+ free((char *)self->short_name);
+ if (self->lname_alloc)
free(self->long_name);
free(self);
}
@@ -933,8 +946,28 @@
}
}
+static size_t elf_addr_to_index(Elf *elf, GElf_Addr addr)
+{
+ Elf_Scn *sec = NULL;
+ GElf_Shdr shdr;
+ size_t cnt = 1;
+
+ while ((sec = elf_nextscn(elf, sec)) != NULL) {
+ gelf_getshdr(sec, &shdr);
+
+ if ((addr >= shdr.sh_addr) &&
+ (addr < (shdr.sh_addr + shdr.sh_size)))
+ return cnt;
+
+ ++cnt;
+ }
+
+ return -1;
+}
+
static int dso__load_sym(struct dso *self, struct map *map, const char *name,
- int fd, symbol_filter_t filter, int kmodule)
+ int fd, symbol_filter_t filter, int kmodule,
+ int want_symtab)
{
struct kmap *kmap = self->kernel ? map__kmap(map) : NULL;
struct map *curr_map = map;
@@ -944,31 +977,51 @@
int err = -1;
uint32_t idx;
GElf_Ehdr ehdr;
- GElf_Shdr shdr;
- Elf_Data *syms;
+ GElf_Shdr shdr, opdshdr;
+ Elf_Data *syms, *opddata = NULL;
GElf_Sym sym;
- Elf_Scn *sec, *sec_strndx;
+ Elf_Scn *sec, *sec_strndx, *opdsec;
Elf *elf;
int nr = 0;
+ size_t opdidx = 0;
elf = elf_begin(fd, PERF_ELF_C_READ_MMAP, NULL);
if (elf == NULL) {
- pr_err("%s: cannot read %s ELF file.\n", __func__, name);
+ pr_debug("%s: cannot read %s ELF file.\n", __func__, name);
goto out_close;
}
if (gelf_getehdr(elf, &ehdr) == NULL) {
- pr_err("%s: cannot get elf header.\n", __func__);
+ pr_debug("%s: cannot get elf header.\n", __func__);
goto out_elf_end;
}
+ /* Always reject images with a mismatched build-id: */
+ if (self->has_build_id) {
+ u8 build_id[BUILD_ID_SIZE];
+
+ if (elf_read_build_id(elf, build_id,
+ BUILD_ID_SIZE) != BUILD_ID_SIZE)
+ goto out_elf_end;
+
+ if (!dso__build_id_equal(self, build_id))
+ goto out_elf_end;
+ }
+
sec = elf_section_by_name(elf, &ehdr, &shdr, ".symtab", NULL);
if (sec == NULL) {
+ if (want_symtab)
+ goto out_elf_end;
+
sec = elf_section_by_name(elf, &ehdr, &shdr, ".dynsym", NULL);
if (sec == NULL)
goto out_elf_end;
}
+ opdsec = elf_section_by_name(elf, &ehdr, &opdshdr, ".opd", &opdidx);
+ if (opdsec)
+ opddata = elf_rawdata(opdsec, NULL);
+
syms = elf_getdata(sec, NULL);
if (syms == NULL)
goto out_elf_end;
@@ -1013,6 +1066,13 @@
if (!is_label && !elf_sym__is_a(&sym, map->type))
continue;
+ if (opdsec && sym.st_shndx == opdidx) {
+ u32 offset = sym.st_value - opdshdr.sh_addr;
+ u64 *opd = opddata->d_buf + offset;
+ sym.st_value = *opd;
+ sym.st_shndx = elf_addr_to_index(elf, sym.st_value);
+ }
+
sec = elf_getscn(elf, sym.st_shndx);
if (!sec)
goto out_elf_end;
@@ -1151,37 +1211,26 @@
*/
#define NOTE_ALIGN(n) (((n) + 3) & -4U)
-int filename__read_build_id(const char *filename, void *bf, size_t size)
+static int elf_read_build_id(Elf *elf, void *bf, size_t size)
{
- int fd, err = -1;
+ int err = -1;
GElf_Ehdr ehdr;
GElf_Shdr shdr;
Elf_Data *data;
Elf_Scn *sec;
Elf_Kind ek;
void *ptr;
- Elf *elf;
if (size < BUILD_ID_SIZE)
goto out;
- fd = open(filename, O_RDONLY);
- if (fd < 0)
- goto out;
-
- elf = elf_begin(fd, PERF_ELF_C_READ_MMAP, NULL);
- if (elf == NULL) {
- pr_debug2("%s: cannot read %s ELF file.\n", __func__, filename);
- goto out_close;
- }
-
ek = elf_kind(elf);
if (ek != ELF_K_ELF)
- goto out_elf_end;
+ goto out;
if (gelf_getehdr(elf, &ehdr) == NULL) {
pr_err("%s: cannot get elf header.\n", __func__);
- goto out_elf_end;
+ goto out;
}
sec = elf_section_by_name(elf, &ehdr, &shdr,
@@ -1190,12 +1239,12 @@
sec = elf_section_by_name(elf, &ehdr, &shdr,
".notes", NULL);
if (sec == NULL)
- goto out_elf_end;
+ goto out;
}
data = elf_getdata(sec, NULL);
if (data == NULL)
- goto out_elf_end;
+ goto out;
ptr = data->d_buf;
while (ptr < (data->d_buf + data->d_size)) {
@@ -1217,7 +1266,31 @@
}
ptr += descsz;
}
-out_elf_end:
+
+out:
+ return err;
+}
+
+int filename__read_build_id(const char *filename, void *bf, size_t size)
+{
+ int fd, err = -1;
+ Elf *elf;
+
+ if (size < BUILD_ID_SIZE)
+ goto out;
+
+ fd = open(filename, O_RDONLY);
+ if (fd < 0)
+ goto out;
+
+ elf = elf_begin(fd, PERF_ELF_C_READ_MMAP, NULL);
+ if (elf == NULL) {
+ pr_debug2("%s: cannot read %s ELF file.\n", __func__, filename);
+ goto out_close;
+ }
+
+ err = elf_read_build_id(elf, bf, size);
+
elf_end(elf);
out_close:
close(fd);
@@ -1293,11 +1366,11 @@
{
int size = PATH_MAX;
char *name;
- u8 build_id[BUILD_ID_SIZE];
int ret = -1;
int fd;
struct machine *machine;
const char *root_dir;
+ int want_symtab;
dso__set_loaded(self, map->type);
@@ -1324,13 +1397,18 @@
return ret;
}
- self->origin = DSO__ORIG_BUILD_ID_CACHE;
- if (dso__build_id_filename(self, name, size) != NULL)
- goto open_file;
-more:
- do {
- self->origin++;
+ /* Iterate over candidate debug images.
+ * On the first pass, only load images if they have a full symtab.
+ * Failing that, do a second pass where we accept .dynsym also
+ */
+ for (self->origin = DSO__ORIG_BUILD_ID_CACHE, want_symtab = 1;
+ self->origin != DSO__ORIG_NOT_FOUND;
+ self->origin++) {
switch (self->origin) {
+ case DSO__ORIG_BUILD_ID_CACHE:
+ if (dso__build_id_filename(self, name, size) == NULL)
+ continue;
+ break;
case DSO__ORIG_FEDORA:
snprintf(name, size, "/usr/lib/debug%s.debug",
self->long_name);
@@ -1339,21 +1417,20 @@
snprintf(name, size, "/usr/lib/debug%s",
self->long_name);
break;
- case DSO__ORIG_BUILDID:
- if (filename__read_build_id(self->long_name, build_id,
- sizeof(build_id))) {
- char build_id_hex[BUILD_ID_SIZE * 2 + 1];
- build_id__sprintf(build_id, sizeof(build_id),
- build_id_hex);
- snprintf(name, size,
- "/usr/lib/debug/.build-id/%.2s/%s.debug",
- build_id_hex, build_id_hex + 2);
- if (self->has_build_id)
- goto compare_build_id;
- break;
+ case DSO__ORIG_BUILDID: {
+ char build_id_hex[BUILD_ID_SIZE * 2 + 1];
+
+ if (!self->has_build_id)
+ continue;
+
+ build_id__sprintf(self->build_id,
+ sizeof(self->build_id),
+ build_id_hex);
+ snprintf(name, size,
+ "/usr/lib/debug/.build-id/%.2s/%s.debug",
+ build_id_hex, build_id_hex + 2);
}
- self->origin++;
- /* Fall thru */
+ break;
case DSO__ORIG_DSO:
snprintf(name, size, "%s", self->long_name);
break;
@@ -1366,36 +1443,41 @@
break;
default:
- goto out;
+ /*
+ * If we wanted a full symtab but no image had one,
+ * relax our requirements and repeat the search.
+ */
+ if (want_symtab) {
+ want_symtab = 0;
+ self->origin = DSO__ORIG_BUILD_ID_CACHE;
+ } else
+ continue;
}
- if (self->has_build_id) {
- if (filename__read_build_id(name, build_id,
- sizeof(build_id)) < 0)
- goto more;
-compare_build_id:
- if (!dso__build_id_equal(self, build_id))
- goto more;
- }
-open_file:
+ /* Name is now the name of the next image to try */
fd = open(name, O_RDONLY);
- } while (fd < 0);
+ if (fd < 0)
+ continue;
- ret = dso__load_sym(self, map, name, fd, filter, 0);
- close(fd);
+ ret = dso__load_sym(self, map, name, fd, filter, 0,
+ want_symtab);
+ close(fd);
- /*
- * Some people seem to have debuginfo files _WITHOUT_ debug info!?!?
- */
- if (!ret)
- goto more;
+ /*
+ * Some people seem to have debuginfo files _WITHOUT_ debug
+ * info!?!?
+ */
+ if (!ret)
+ continue;
- if (ret > 0) {
- int nr_plt = dso__synthesize_plt_symbols(self, map, filter);
- if (nr_plt > 0)
- ret += nr_plt;
+ if (ret > 0) {
+ int nr_plt = dso__synthesize_plt_symbols(self, map, filter);
+ if (nr_plt > 0)
+ ret += nr_plt;
+ break;
+ }
}
-out:
+
free(name);
if (ret < 0 && strstr(self->name, " (deleted)") != NULL)
return 0;
@@ -1494,6 +1576,7 @@
goto out;
}
dso__set_long_name(map->dso, long_name);
+ map->dso->lname_alloc = 1;
dso__kernel_module_get_build_id(map->dso, "");
}
}
@@ -1656,36 +1739,12 @@
{
int err = -1, fd;
- if (self->has_build_id) {
- u8 build_id[BUILD_ID_SIZE];
-
- if (filename__read_build_id(vmlinux, build_id,
- sizeof(build_id)) < 0) {
- pr_debug("No build_id in %s, ignoring it\n", vmlinux);
- return -1;
- }
- if (!dso__build_id_equal(self, build_id)) {
- char expected_build_id[BUILD_ID_SIZE * 2 + 1],
- vmlinux_build_id[BUILD_ID_SIZE * 2 + 1];
-
- build_id__sprintf(self->build_id,
- sizeof(self->build_id),
- expected_build_id);
- build_id__sprintf(build_id, sizeof(build_id),
- vmlinux_build_id);
- pr_debug("build_id in %s is %s while expected is %s, "
- "ignoring it\n", vmlinux, vmlinux_build_id,
- expected_build_id);
- return -1;
- }
- }
-
fd = open(vmlinux, O_RDONLY);
if (fd < 0)
return -1;
dso__set_loaded(self, map->type);
- err = dso__load_sym(self, map, vmlinux, fd, filter, 0);
+ err = dso__load_sym(self, map, vmlinux, fd, filter, 0, 0);
close(fd);
if (err > 0)
@@ -2048,6 +2107,36 @@
return 0;
}
+void machine__destroy_kernel_maps(struct machine *self)
+{
+ enum map_type type;
+
+ for (type = 0; type < MAP__NR_TYPES; ++type) {
+ struct kmap *kmap;
+
+ if (self->vmlinux_maps[type] == NULL)
+ continue;
+
+ kmap = map__kmap(self->vmlinux_maps[type]);
+ map_groups__remove(&self->kmaps, self->vmlinux_maps[type]);
+ if (kmap->ref_reloc_sym) {
+ /*
+ * ref_reloc_sym is shared among all maps, so free just
+ * on one of them.
+ */
+ if (type == MAP__FUNCTION) {
+ free((char *)kmap->ref_reloc_sym->name);
+ kmap->ref_reloc_sym->name = NULL;
+ free(kmap->ref_reloc_sym);
+ }
+ kmap->ref_reloc_sym = NULL;
+ }
+
+ map__delete(self->vmlinux_maps[type]);
+ self->vmlinux_maps[type] = NULL;
+ }
+}
+
int machine__create_kernel_maps(struct machine *self)
{
struct dso *kernel = machine__create_kernel(self);
@@ -2189,6 +2278,15 @@
return -1;
}
+void symbol__exit(void)
+{
+ strlist__delete(symbol_conf.sym_list);
+ strlist__delete(symbol_conf.dso_list);
+ strlist__delete(symbol_conf.comm_list);
+ vmlinux_path__exit();
+ symbol_conf.sym_list = symbol_conf.dso_list = symbol_conf.comm_list = NULL;
+}
+
int machines__create_kernel_maps(struct rb_root *self, pid_t pid)
{
struct machine *machine = machines__findnew(self, pid);
@@ -2283,6 +2381,19 @@
return ret;
}
+void machines__destroy_guest_kernel_maps(struct rb_root *self)
+{
+ struct rb_node *next = rb_first(self);
+
+ while (next) {
+ struct machine *pos = rb_entry(next, struct machine, rb_node);
+
+ next = rb_next(&pos->rb_node);
+ rb_erase(&pos->rb_node, self);
+ machine__delete(pos);
+ }
+}
+
int machine__load_kallsyms(struct machine *self, const char *filename,
enum map_type type, symbol_filter_t filter)
{
diff --git a/tools/perf/util/symbol.h b/tools/perf/util/symbol.h
index 5e02d2c..906be20 100644
--- a/tools/perf/util/symbol.h
+++ b/tools/perf/util/symbol.h
@@ -9,8 +9,6 @@
#include <linux/rbtree.h>
#include <stdio.h>
-#define DEBUG_CACHE_DIR ".debug"
-
#ifdef HAVE_CPLUS_DEMANGLE
extern char *cplus_demangle(const char *, int);
@@ -70,9 +68,9 @@
show_nr_samples,
use_callchain,
exclude_other,
- full_paths,
show_cpu_utilization;
const char *vmlinux_name,
+ *source_prefix,
*field_sep;
const char *default_guest_vmlinux_name,
*default_guest_kallsyms,
@@ -103,6 +101,8 @@
struct map_symbol {
struct map *map;
struct symbol *sym;
+ bool unfolded;
+ bool has_children;
};
struct addr_location {
@@ -112,7 +112,8 @@
u64 addr;
char level;
bool filtered;
- unsigned int cpumode;
+ u8 cpumode;
+ s32 cpu;
};
enum dso_kernel_type {
@@ -125,12 +126,14 @@
struct list_head node;
struct rb_root symbols[MAP__NR_TYPES];
struct rb_root symbol_names[MAP__NR_TYPES];
+ enum dso_kernel_type kernel;
u8 adjust_symbols:1;
u8 slen_calculated:1;
u8 has_build_id:1;
- enum dso_kernel_type kernel;
u8 hit:1;
u8 annotate_warned:1;
+ u8 sname_alloc:1;
+ u8 lname_alloc:1;
unsigned char origin;
u8 sorted_by_name;
u8 loaded;
@@ -146,6 +149,8 @@
struct dso *dso__new_kernel(const char *name);
void dso__delete(struct dso *self);
+int dso__name_len(const struct dso *self);
+
bool dso__loaded(const struct dso *self, enum map_type type);
bool dso__sorted_by_name(const struct dso *self, enum map_type type);
@@ -207,13 +212,16 @@
int (*process_symbol)(void *arg, const char *name,
char type, u64 start));
+void machine__destroy_kernel_maps(struct machine *self);
int __machine__create_kernel_maps(struct machine *self, struct dso *kernel);
int machine__create_kernel_maps(struct machine *self);
int machines__create_kernel_maps(struct rb_root *self, pid_t pid);
int machines__create_guest_kernel_maps(struct rb_root *self);
+void machines__destroy_guest_kernel_maps(struct rb_root *self);
int symbol__init(void);
+void symbol__exit(void);
bool symbol_type__is_a(char symbol_type, enum map_type map_type);
size_t machine__fprintf_vmlinux_path(struct machine *self, FILE *fp);
diff --git a/tools/perf/util/thread.c b/tools/perf/util/thread.c
index 9a448b4..8c72d88 100644
--- a/tools/perf/util/thread.c
+++ b/tools/perf/util/thread.c
@@ -62,6 +62,13 @@
return self;
}
+void thread__delete(struct thread *self)
+{
+ map_groups__exit(&self->mg);
+ free(self->comm);
+ free(self);
+}
+
int thread__set_comm(struct thread *self, const char *comm)
{
int err;
diff --git a/tools/perf/util/thread.h b/tools/perf/util/thread.h
index ee6bbcf..688500f 100644
--- a/tools/perf/util/thread.h
+++ b/tools/perf/util/thread.h
@@ -20,6 +20,8 @@
struct perf_session;
+void thread__delete(struct thread *self);
+
int find_all_tid(int pid, pid_t ** all_tid);
int thread__set_comm(struct thread *self, const char *comm);
int thread__comm_len(struct thread *self);
diff --git a/tools/perf/util/util.h b/tools/perf/util/util.h
index 4e8b6b0..f380fed 100644
--- a/tools/perf/util/util.h
+++ b/tools/perf/util/util.h
@@ -89,6 +89,7 @@
extern const char *graph_line;
extern const char *graph_dotted_line;
+extern char buildid_dir[];
/* On most systems <limits.h> would have given us this, but
* not on some systems (e.g. GNU/Hurd).
@@ -152,6 +153,8 @@
extern void set_die_routine(void (*routine)(const char *err, va_list params) NORETURN);
extern int prefixcmp(const char *str, const char *prefix);
+extern void set_buildid_dir(void);
+extern void disable_buildid_cache(void);
static inline const char *skip_prefix(const char *str, const char *prefix)
{