Jeff Brown | 7a33c86 | 2011-02-02 14:00:44 -0800 | 1 | # |
| 2 | # Unit masks for the Intel "Westmere" microarchitecture |
| 3 | # |
| 4 | # See http://ark.intel.com/ for help in identifying Westmere-based CPUs |
| 5 | # |
| 6 | include:i386/arch_perfmon |
| 7 | |
| 8 | name:x01 type:mandatory default:0x01 |
| 9 | 0x01 No unit mask |
| 10 | name:x02 type:mandatory default:0x02 |
| 11 | 0x02 No unit mask |
| 12 | name:x07 type:mandatory default:0x07 |
| 13 | 0x07 No unit mask |
| 14 | name:x10 type:mandatory default:0x10 |
| 15 | 0x10 No unit mask |
| 16 | name:x20 type:mandatory default:0x20 |
| 17 | 0x20 No unit mask |
| 18 | name:arith type:bitmask default:0x01 |
| 19 | 0x01 cycles_div_busy Cycles the divider is busy |
| 20 | 0x02 mul Multiply operations executed |
| 21 | name:baclear type:bitmask default:0x01 |
| 22 | 0x01 clear BACLEAR asserted, regardless of cause |
| 23 | 0x02 bad_target BACLEAR asserted with bad target address |
| 24 | name:bpu_clears type:bitmask default:0x01 |
| 25 | 0x01 early Early Branch Prediction Unit clears |
| 26 | 0x02 late Late Branch Prediction Unit clears |
| 27 | name:br_inst_exec type:bitmask default:0x7f |
| 28 | 0x01 cond Conditional branch instructions executed |
| 29 | 0x02 direct Unconditional branches executed |
| 30 | 0x04 indirect_non_call Indirect non call branches executed |
| 31 | 0x07 non_calls All non call branches executed |
| 32 | 0x08 return_near Indirect return branches executed |
| 33 | 0x10 direct_near_call Unconditional call branches executed |
| 34 | 0x20 indirect_near_call Indirect call branches executed |
| 35 | 0x30 near_calls Call branches executed |
| 36 | 0x40 taken Taken branches executed |
| 37 | 0x7f any Branch instructions executed |
| 38 | name:br_inst_retired type:bitmask default:0x04 |
| 39 | 0x01 conditional Retired conditional branch instructions (Precise Event) |
| 40 | 0x02 near_call Retired near call instructions (Precise Event) |
| 41 | 0x04 all_branches Retired branch instructions (Precise Event) |
| 42 | name:br_misp_exec type:bitmask default:0x7f |
| 43 | 0x01 cond Mispredicted conditional branches executed |
| 44 | 0x02 direct Mispredicted unconditional branches executed |
| 45 | 0x04 indirect_non_call Mispredicted indirect non call branches executed |
| 46 | 0x07 non_calls Mispredicted non call branches executed |
| 47 | 0x08 return_near Mispredicted return branches executed |
| 48 | 0x10 direct_near_call Mispredicted unconditional call branches executed |
| 49 | 0x20 indirect_near_call Mispredicted indirect call branches executed |
| 50 | 0x30 near_calls Mispredicted call branches executed |
| 51 | 0x40 taken Mispredicted taken branches executed |
| 52 | 0x7f any Mispredicted branches executed |
| 53 | name:br_misp_retired type:bitmask default:0x04 |
| 54 | 0x01 conditional Mispredicted conditional retired branches (Precise Event) |
| 55 | 0x02 near_call Mispredicted near retired calls (Precise Event) |
| 56 | 0x04 all_branches Mispredicted retired branch instructions (Precise Event) |
| 57 | name:cache_lock_cycles type:bitmask default:0x01 |
| 58 | 0x01 l1d_l2 Cycles L1D and L2 locked |
| 59 | 0x02 l1d Cycles L1D locked |
| 60 | name:cpu_clk_unhalted type:bitmask default:0x00 |
| 61 | 0x00 thread_p Cycles when thread is not halted (programmable counter) |
| 62 | 0x01 ref_p Reference base clock (133 MHz) cycles when thread is not halted (programmable counter) |
| 63 | name:dtlb_load_misses type:bitmask default:0x01 |
| 64 | 0x01 any DTLB load misses |
| 65 | 0x02 walk_completed DTLB load miss page walks complete |
| 66 | 0x04 walk_cycles DTLB load miss page walk cycles |
| 67 | 0x10 stlb_hit DTLB second level hit |
| 68 | 0x20 pde_miss DTLB load miss caused by low part of address |
| 69 | 0x80 large_walk_completed DTLB load miss large page walks |
| 70 | name:dtlb_misses type:bitmask default:0x01 |
| 71 | 0x01 any DTLB misses |
| 72 | 0x02 walk_completed DTLB miss page walks |
| 73 | 0x04 walk_cycles DTLB miss page walk cycles |
| 74 | 0x10 stlb_hit DTLB first level misses but second level hit |
| 75 | 0x20 pde_miss DTLB misses caused by low part of address |
| 76 | 0x80 large_walk_completed DTLB miss large page walks |
| 77 | name:fp_assist type:bitmask default:0x01 |
| 78 | 0x01 all X87 Floating point assists (Precise Event) |
| 79 | 0x02 output X87 Floating point assists for invalid output value (Precise Event) |
| 80 | 0x04 input X87 Floating point assists for invalid input value (Precise Event) |
| 81 | name:fp_comp_ops_exe type:bitmask default:0x01 |
| 82 | 0x01 x87 Computational floating-point operations executed |
| 83 | 0x02 mmx MMX Uops |
| 84 | 0x04 sse_fp SSE and SSE2 FP Uops |
| 85 | 0x08 sse2_integer SSE2 integer Uops |
| 86 | 0x10 sse_fp_packed SSE FP packed Uops |
| 87 | 0x20 sse_fp_scalar SSE FP scalar Uops |
| 88 | 0x40 sse_single_precision SSE* FP single precision Uops |
| 89 | 0x80 sse_double_precision SSE* FP double precision Uops |
| 90 | name:fp_mmx_trans type:bitmask default:0x03 |
| 91 | 0x01 to_fp Transitions from MMX to Floating Point instructions |
| 92 | 0x02 to_mmx Transitions from Floating Point to MMX instructions |
| 93 | 0x03 any All Floating Point to and from MMX transitions |
| 94 | name:ild_stall type:bitmask default:0x0f |
| 95 | 0x01 lcp Length Change Prefix stall cycles |
| 96 | 0x02 mru Stall cycles due to BPU MRU bypass |
| 97 | 0x04 iq_full Instruction Queue full stall cycles |
| 98 | 0x08 regen Regen stall cycles |
| 99 | 0x0f any Any Instruction Length Decoder stall cycles |
| 100 | name:inst_retired type:bitmask default:0x01 |
| 101 | 0x01 any_p Instructions retired (Programmable counter and Precise Event) |
| 102 | 0x02 x87 Retired floating-point operations (Precise Event) |
| 103 | 0x04 mmx Retired MMX instructions (Precise Event) |
| 104 | name:itlb_misses type:bitmask default:0x01 |
| 105 | 0x01 any ITLB miss |
| 106 | 0x02 walk_completed ITLB miss page walks |
| 107 | 0x04 walk_cycles ITLB miss page walk cycles |
| 108 | 0x80 large_walk_completed ITLB miss large page walks |
| 109 | name:l1d type:bitmask default:0x01 |
| 110 | 0x01 repl L1 data cache lines allocated |
| 111 | 0x02 m_repl L1D cache lines allocated in the M state |
| 112 | 0x04 m_evict L1D cache lines replaced in M state |
| 113 | 0x08 m_snoop_evict L1D snoop eviction of cache lines in M state |
| 114 | name:l1d_prefetch type:bitmask default:0x01 |
| 115 | 0x01 requests L1D hardware prefetch requests |
| 116 | 0x02 miss L1D hardware prefetch misses |
| 117 | 0x04 triggers L1D hardware prefetch requests triggered |
| 118 | name:l1d_wb_l2 type:bitmask default:0x0f |
| 119 | 0x01 i_state L1 writebacks to L2 in I state (misses) |
| 120 | 0x02 s_state L1 writebacks to L2 in S state |
| 121 | 0x04 e_state L1 writebacks to L2 in E state |
| 122 | 0x08 m_state L1 writebacks to L2 in M state |
| 123 | 0x0f mesi All L1 writebacks to L2 |
| 124 | name:l1i type:bitmask default:0x01 |
| 125 | 0x01 hits L1I instruction fetch hits |
| 126 | 0x02 misses L1I instruction fetch misses |
| 127 | 0x03 reads L1I Instruction fetches |
| 128 | 0x04 cycles_stalled L1I instruction fetch stall cycles |
| 129 | name:l2_data_rqsts type:bitmask default:0xff |
| 130 | 0x01 demand_i_state L2 data demand loads in I state (misses) |
| 131 | 0x02 demand_s_state L2 data demand loads in S state |
| 132 | 0x04 demand_e_state L2 data demand loads in E state |
| 133 | 0x08 demand_m_state L2 data demand loads in M state |
| 134 | 0x0f demand_mesi L2 data demand requests |
| 135 | 0x10 prefetch_i_state L2 data prefetches in the I state (misses) |
| 136 | 0x20 prefetch_s_state L2 data prefetches in the S state |
| 137 | 0x40 prefetch_e_state L2 data prefetches in E state |
| 138 | 0x80 prefetch_m_state L2 data prefetches in M state |
| 139 | 0xf0 prefetch_mesi All L2 data prefetches |
| 140 | 0xff any All L2 data requests |
| 141 | name:l2_lines_in type:bitmask default:0x07 |
| 142 | 0x02 s_state L2 lines allocated in the S state |
| 143 | 0x04 e_state L2 lines allocated in the E state |
| 144 | 0x07 any L2 lines allocated |
| 145 | name:l2_lines_out type:bitmask default:0x0f |
| 146 | 0x01 demand_clean L2 lines evicted by a demand request |
| 147 | 0x02 demand_dirty L2 modified lines evicted by a demand request |
| 148 | 0x04 prefetch_clean L2 lines evicted by a prefetch request |
| 149 | 0x08 prefetch_dirty L2 modified lines evicted by a prefetch request |
| 150 | 0x0f any L2 lines evicted |
| 151 | name:l2_rqsts type:bitmask default:0x01 |
| 152 | 0x01 ld_hit L2 load hits |
| 153 | 0x02 ld_miss L2 load misses |
| 154 | 0x03 loads L2 load requests |
| 155 | 0x04 rfo_hit L2 RFO hits |
| 156 | 0x08 rfo_miss L2 RFO misses |
| 157 | 0x0c rfos L2 RFO requests |
| 158 | 0x10 ifetch_hit L2 instruction fetch hits |
| 159 | 0x20 ifetch_miss L2 instruction fetch misses |
| 160 | 0x30 ifetches L2 instruction fetches |
| 161 | 0x40 prefetch_hit L2 prefetch hits |
| 162 | 0x80 prefetch_miss L2 prefetch misses |
| 163 | 0xaa miss All L2 misses |
| 164 | 0xc0 prefetches All L2 prefetches |
| 165 | 0xff references All L2 requests |
| 166 | name:l2_transactions type:bitmask default:0x80 |
| 167 | 0x01 load L2 Load transactions |
| 168 | 0x02 rfo L2 RFO transactions |
| 169 | 0x04 ifetch L2 instruction fetch transactions |
| 170 | 0x08 prefetch L2 prefetch transactions |
| 171 | 0x10 l1d_wb L1D writeback to L2 transactions |
| 172 | 0x20 fill L2 fill transactions |
| 173 | 0x40 wb L2 writeback to LLC transactions |
| 174 | 0x80 any All L2 transactions |
| 175 | name:l2_write type:bitmask default:0x01 |
| 176 | 0x01 rfo_i_state L2 demand store RFOs in I state (misses) |
| 177 | 0x02 rfo_s_state L2 demand store RFOs in S state |
| 178 | 0x08 rfo_m_state L2 demand store RFOs in M state |
| 179 | 0x0e rfo_hit All L2 demand store RFOs that hit the cache |
| 180 | 0x0f rfo_mesi All L2 demand store RFOs |
| 181 | 0x10 lock_i_state L2 demand lock RFOs in I state (misses) |
| 182 | 0x20 lock_s_state L2 demand lock RFOs in S state |
| 183 | 0x40 lock_e_state L2 demand lock RFOs in E state |
| 184 | 0x80 lock_m_state L2 demand lock RFOs in M state |
| 185 | 0xe0 lock_hit All demand L2 lock RFOs that hit the cache |
| 186 | 0xf0 lock_mesi All demand L2 lock RFOs |
| 187 | name:load_dispatch type:bitmask default:0x07 |
| 188 | 0x01 rs Loads dispatched that bypass the MOB |
| 189 | 0x02 rs_delayed Loads dispatched from stage 305 |
| 190 | 0x04 mob Loads dispatched from the MOB |
| 191 | 0x07 any All loads dispatched |
| 192 | name:longest_lat_cache type:bitmask default:0x01 |
| 193 | 0x01 miss Longest latency cache miss |
| 194 | 0x02 reference Longest latency cache reference |
| 195 | name:machine_clears type:bitmask default:0x01 |
| 196 | 0x01 cycles Cycles machine clear asserted |
| 197 | 0x02 mem_order Execution pipeline restart due to Memory ordering conflicts |
| 198 | 0x04 smc Self-Modifying Code detected |
| 199 | name:mem_inst_retired type:bitmask default:0x01 |
| 200 | 0x01 loads Instructions retired which contain a load (Precise Event) |
| 201 | 0x02 stores Instructions retired which contain a store (Precise Event) |
| 202 | 0x10 latency_above_threshold_0 Memory instructions retired above 0 clocks (Precise Event) (MSR_INDEX: 0x03F6 MSR_VALUE: 0x0000) |
| 203 | name:mem_load_retired type:bitmask default:0x01 |
| 204 | 0x01 l1d_hit Retired loads that hit the L1 data cache (Precise Event) |
| 205 | 0x02 l2_hit Retired loads that hit the L2 cache (Precise Event) |
| 206 | 0x04 llc_unshared_hit Retired loads that hit valid versions in the LLC cache (Precise Event) |
| 207 | 0x08 other_core_l2_hit_hitm Retired loads that hit sibling core's L2 in modified or unmodified states (Precise Event) |
| 208 | 0x10 llc_miss Retired loads that miss the LLC cache (Precise Event) |
| 209 | 0x40 hit_lfb Retired loads that miss L1D and hit a previously allocated LFB (Precise Event) |
| 210 | 0x80 dtlb_miss Retired loads that miss the DTLB (Precise Event) |
| 211 | name:mem_uncore_retired type:bitmask default:0x02 |
| 212 | 0x02 local_hitm Load instructions retired that HIT modified data in sibling core (Precise Event) |
| 213 | 0x04 remote_hitm Retired loads that hit remote socket in modified state (Precise Event) |
| 214 | 0x08 local_dram_and_remote_cache_hit Load instructions retired from local DRAM and remote cache HIT data sources (Precise Event) |
| 215 | 0x10 remote_dram Load instructions retired remote DRAM and remote home-remote cache HITM (Precise Event) |
| 216 | 0x80 uncacheable Load instructions retired IO (Precise Event) |
| 217 | name:offcore_requests type:bitmask default:0x80 |
| 218 | 0x01 demand_read_data Offcore demand data read requests |
| 219 | 0x02 demand_read_code Offcore demand code read requests |
| 220 | 0x04 demand_rfo Offcore demand RFO requests |
| 221 | 0x08 any_read Offcore read requests |
| 222 | 0x10 any_rfo Offcore RFO requests |
| 223 | 0x40 l1d_writeback Offcore L1 data cache writebacks |
| 224 | 0x80 any All offcore requests |
| 225 | name:offcore_requests_outstanding type:bitmask default:0x08 |
| 226 | 0x01 demand_read_data Outstanding offcore demand data reads |
| 227 | 0x02 demand_read_code Outstanding offcore demand code reads |
| 228 | 0x04 demand_rfo Outstanding offcore demand RFOs |
| 229 | 0x08 any_read Outstanding offcore reads |
| 230 | name:rat_stalls type:bitmask default:0x0f |
| 231 | 0x01 flags Flag stall cycles |
| 232 | 0x02 registers Partial register stall cycles |
| 233 | 0x04 rob_read_port ROB read port stall cycles |
| 234 | 0x08 scoreboard Scoreboard stall cycles |
| 235 | 0x0f any All RAT stall cycles |
| 236 | name:resource_stalls type:bitmask default:0x01 |
| 237 | 0x01 any Resource related stall cycles |
| 238 | 0x02 load Load buffer stall cycles |
| 239 | 0x04 rs_full Reservation Station full stall cycles |
| 240 | 0x08 store Store buffer stall cycles |
| 241 | 0x10 rob_full ROB full stall cycles |
| 242 | 0x20 fpcw FPU control word write stall cycles |
| 243 | 0x40 mxcsr MXCSR rename stall cycles |
| 244 | 0x80 other Other Resource related stall cycles |
| 245 | name:simd_int_128 type:bitmask default:0x01 |
| 246 | 0x01 packed_mpy 128 bit SIMD integer multiply operations |
| 247 | 0x02 packed_shift 128 bit SIMD integer shift operations |
| 248 | 0x04 pack 128 bit SIMD integer pack operations |
| 249 | 0x08 unpack 128 bit SIMD integer unpack operations |
| 250 | 0x10 packed_logical 128 bit SIMD integer logical operations |
| 251 | 0x20 packed_arith 128 bit SIMD integer arithmetic operations |
| 252 | 0x40 shuffle_move 128 bit SIMD integer shuffle/move operations |
| 253 | name:simd_int_64 type:bitmask default:0x01 |
| 254 | 0x01 packed_mpy SIMD integer 64 bit packed multiply operations |
| 255 | 0x02 packed_shift SIMD integer 64 bit shift operations |
| 256 | 0x04 pack SIMD integer 64 bit pack operations |
| 257 | 0x08 unpack SIMD integer 64 bit unpack operations |
| 258 | 0x10 packed_logical SIMD integer 64 bit logical operations |
| 259 | 0x20 packed_arith SIMD integer 64 bit arithmetic operations |
| 260 | 0x40 shuffle_move SIMD integer 64 bit shuffle/move operations |
| 261 | name:snoopq_requests type:bitmask default:0x01 |
| 262 | 0x01 data Snoop data requests |
| 263 | 0x02 invalidate Snoop invalidate requests |
| 264 | 0x04 code Snoop code requests |
| 265 | name:snoopq_requests_outstanding type:bitmask default:0x01 |
| 266 | 0x01 data Outstanding snoop data requests |
| 267 | 0x02 invalidate Outstanding snoop invalidate requests |
| 268 | 0x04 code Outstanding snoop code requests |
| 269 | name:snoop_response type:bitmask default:0x01 |
| 270 | 0x01 hit Thread responded HIT to snoop |
| 271 | 0x02 hite Thread responded HITE to snoop |
| 272 | 0x04 hitm Thread responded HITM to snoop |
| 273 | name:sq_misc type:bitmask default:0x04 |
| 274 | 0x04 lru_hints Super Queue LRU hints sent to LLC |
| 275 | 0x10 split_lock Super Queue lock splits across a cache line |
| 276 | name:ssex_uops_retired type:bitmask default:0x01 |
| 277 | 0x01 packed_single SIMD Packed-Single Uops retired (Precise Event) |
| 278 | 0x02 scalar_single SIMD Scalar-Single Uops retired (Precise Event) |
| 279 | 0x04 packed_double SIMD Packed-Double Uops retired (Precise Event) |
| 280 | 0x08 scalar_double SIMD Scalar-Double Uops retired (Precise Event) |
| 281 | 0x10 vector_integer SIMD Vector Integer Uops retired (Precise Event) |
| 282 | name:store_blocks type:bitmask default:0x04 |
| 283 | 0x04 at_ret Loads delayed with at-Retirement block code |
| 284 | 0x08 l1d_block Cacheable loads delayed with L1D block code |
| 285 | name:uops_decoded type:bitmask default:0x01 |
| 286 | 0x01 stall_cycles Cycles no Uops are decoded |
| 287 | 0x02 ms_cycles_active Uops decoded by Microcode Sequencer |
| 288 | 0x04 esp_folding Stack pointer instructions decoded |
| 289 | 0x08 esp_sync Stack pointer sync operations |
| 290 | name:uops_executed type:bitmask default:0x3f |
| 291 | 0x01 port0 Uops executed on port 0 |
| 292 | 0x02 port1 Uops executed on port 1 |
| 293 | 0x04 port2_core Uops executed on port 2 (core count) |
| 294 | 0x08 port3_core Uops executed on port 3 (core count) |
| 295 | 0x10 port4_core Uops executed on port 4 (core count) |
| 296 | 0x1f core_active_cycles_no_port5 Cycles Uops executed on ports 0-4 (core count) |
| 297 | 0x20 port5 Uops executed on port 5 |
| 298 | 0x3f core_active_cycles Cycles Uops executed on any port (core count) |
| 299 | 0x40 port015 Uops issued on ports 0, 1 or 5 |
| 300 | 0x80 port234_core Uops issued on ports 2, 3 or 4 |
| 301 | name:uops_issued type:bitmask default:0x01 |
| 302 | 0x01 any Uops issued |
| 303 | 0x02 fused Fused Uops issued |
| 304 | name:uops_retired type:bitmask default:0x01 |
| 305 | 0x01 active_cycles Cycles Uops are being retired |
| 306 | 0x02 retire_slots Retirement slots used (Precise Event) |
| 307 | 0x04 macro_fused Macro-fused Uops retired (Precise Event) |
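
The listing above follows a simple pattern: a `name:... type:... default:...` header line, followed by one entry per unit mask value. For `bitmask` groups, several entries (for example `0x07 non_calls` under `br_inst_exec`) appear to be the bitwise OR of the single-bit entries before them. The following is a minimal Python sketch of reading that layout and combining named masks. It is not oprofile's own parser; the function names `parse_unit_masks` and `combine` are hypothetical, and the assumptions are that the blame-view line-number column has already been stripped, and that `mandatory` entries carry only a description while `bitmask` entries carry a name followed by a description.

```python
import re

# A small excerpt in the assumed on-disk layout (no line-number column).
SAMPLE = """\
# comment lines and include: directives are skipped
include:i386/arch_perfmon
name:x01 type:mandatory default:0x01
	0x01 No unit mask
name:br_inst_exec type:bitmask default:0x7f
	0x01 cond Conditional branch instructions executed
	0x02 direct Unconditional branches executed
	0x04 indirect_non_call Indirect non call branches executed
	0x07 non_calls All non call branches executed
"""

HEADER = re.compile(r"name:(\S+)\s+type:(\S+)\s+default:(0x[0-9a-fA-F]+)")
ENTRY = re.compile(r"(0x[0-9a-fA-F]+)\s+(.*)")


def parse_unit_masks(text):
    """Return {group_name: {"type", "default", "masks": {value: (name, desc)}}}."""
    groups = {}
    current = None
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or line.startswith("include:"):
            continue
        header = HEADER.match(line)
        if header:
            current = {
                "type": header.group(2),
                "default": int(header.group(3), 16),
                "masks": {},
            }
            groups[header.group(1)] = current
            continue
        entry = ENTRY.match(line)
        if entry and current is not None:
            value = int(entry.group(1), 16)
            rest = entry.group(2)
            if current["type"] == "bitmask":
                # Assumption: first token is the mask name, the rest is the description.
                name, _, desc = rest.partition(" ")
            else:
                # Assumption: mandatory entries have a description only.
                name, desc = None, rest
            current["masks"][value] = (name, desc)
    return groups


def combine(group, names):
    """OR together the values of the named single masks in one bitmask group."""
    by_name = {name: value for value, (name, _) in group["masks"].items() if name}
    result = 0
    for name in names:
        result |= by_name[name]
    return result


groups = parse_unit_masks(SAMPLE)
non_calls = combine(groups["br_inst_exec"], ["cond", "direct", "indirect_non_call"])
print(hex(non_calls))
```

Under these assumptions, combining `cond`, `direct` and `indirect_non_call` yields the same value as the `non_calls` entry in the listing, which is a quick way to sanity-check composite masks against their components.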