.\" Copyright (c) 2013 Hiren Panchasara <hiren.panchasara@gmail.com>
.\" All rights reserved.
.\"
.\" Redistribution and use in source and binary forms, with or without
.\" modification, are permitted provided that the following conditions
.\" are met:
.\" 1. Redistributions of source code must retain the above copyright
.\" notice, this list of conditions and the following disclaimer.
.\" 2. Redistributions in binary form must reproduce the above copyright
.\" notice, this list of conditions and the following disclaimer in the
.\" documentation and/or other materials provided with the distribution.
.\"
.\" THIS SOFTWARE IS PROVIDED BY THE AUTHOR AND CONTRIBUTORS ``AS IS'' AND
.\" ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
.\" IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
.\" ARE DISCLAIMED. IN NO EVENT SHALL THE AUTHOR OR CONTRIBUTORS BE LIABLE
.\" FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL
.\" DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS
.\" OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION)
.\" HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT
.\" LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY
.\" OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF
.\" SUCH DAMAGE.
.\"
.\" $FreeBSD$
.\"
.Dd March 22, 2013
.Dt PMC.HASWELL 3
.Os
.Sh NAME
.Nm pmc.haswell
.Nd measurement events for
.Tn Intel
.Tn Haswell
family CPUs
.Sh LIBRARY
.Lb libpmc
.Sh SYNOPSIS
.In pmc.h
.Sh DESCRIPTION
.Tn Intel
.Tn "Haswell"
CPUs contain PMCs conforming to version 2 of the
.Tn Intel
performance measurement architecture.
These CPUs may contain up to two classes of PMCs:
.Bl -tag -width "Li PMC_CLASS_IAP"
.It Li PMC_CLASS_IAF
Fixed-function counters that count only one hardware event per counter.
.It Li PMC_CLASS_IAP
Programmable counters that may be configured to count one of a defined
set of hardware events.
.El
.Pp
The number of PMCs available in each class and their widths need to be
determined at run time by calling
.Xr pmc_cpuinfo 3 .
.Pp
Intel Haswell PMCs are documented in
.Rs
.%B "Intel(R) 64 and IA-32 Architectures Software Developer's Manual"
.%T "Combined Volumes: 1, 2A, 2B, 2C, 3A, 3B and 3C"
.%N "Order Number: 325462-045US"
.%D January 2013
.%Q "Intel Corporation"
.Re
.Ss HASWELL FIXED FUNCTION PMCS
These PMCs and their supported events are documented in
.Xr pmc.iaf 3 .
.Ss HASWELL PROGRAMMABLE PMCS
The programmable PMCs support the following capabilities:
.Bl -column "PMC_CAP_INTERRUPT" "Support"
.It Em Capability Ta Em Support
.It PMC_CAP_CASCADE Ta \&No
.It PMC_CAP_EDGE Ta Yes
.It PMC_CAP_INTERRUPT Ta Yes
.It PMC_CAP_INVERT Ta Yes
.It PMC_CAP_READ Ta Yes
.It PMC_CAP_PRECISE Ta \&No
.It PMC_CAP_SYSTEM Ta Yes
.It PMC_CAP_TAGGING Ta \&No
.It PMC_CAP_THRESHOLD Ta Yes
.It PMC_CAP_USER Ta Yes
.It PMC_CAP_WRITE Ta Yes
.El
.Ss Event Qualifiers
Event specifiers for these PMCs support the following common
qualifiers:
.Bl -tag -width indent
.It Li rsp= Ns Ar value
Configure the Off-core Response bits.
.Bl -tag -width indent
.It Li DMND_DATA_RD
Counts the number of demand and DCU prefetch data reads of full
and partial cachelines as well as demand data page table entry
cacheline reads. Does not count L2 data read prefetches or
instruction fetches.
.It Li REQ_DMND_RFO
Counts the number of demand and DCU prefetch reads for ownership (RFO)
requests generated by a write to data cacheline. Does not count L2 RFO
prefetches.
.It Li REQ_DMND_IFETCH
Counts the number of demand and DCU prefetch instruction cacheline reads.
Does not count L2 code read prefetches.
.It Li REQ_WB
Counts the number of writeback (modified to exclusive) transactions.
.It Li REQ_PF_DATA_RD
Counts the number of data cacheline reads generated by L2 prefetchers.
.It Li REQ_PF_RFO
Counts the number of RFO requests generated by L2 prefetchers.
.It Li REQ_PF_IFETCH
Counts the number of code reads generated by L2 prefetchers.
.It Li REQ_PF_LLC_DATA_RD
L2 prefetcher to L3 for loads.
.It Li REQ_PF_LLC_RFO
RFO requests generated by L2 prefetcher.
.It Li REQ_PF_LLC_IFETCH
L2 prefetcher to L3 for instruction fetches.
.It Li REQ_BUS_LOCKS
Bus lock and split lock requests.
.It Li REQ_STRM_ST
Streaming store requests.
.It Li REQ_OTHER
Any other request that crosses IDI, including I/O.
.It Li RES_ANY
Catch-all value for any response type.
.It Li RES_SUPPLIER_NO_SUPP
No Supplier Information available.
.It Li RES_SUPPLIER_LLC_HITM
M-state initial lookup stat in L3.
.It Li RES_SUPPLIER_LLC_HITE
E-state.
.It Li RES_SUPPLIER_LLC_HITS
S-state.
.It Li RES_SUPPLIER_LLC_HITF
F-state.
.It Li RES_SUPPLIER_LOCAL
Local DRAM Controller.
.It Li RES_SNOOP_SNP_NONE
No details on snoop-related information.
.It Li RES_SNOOP_SNP_NO_NEEDED
No snoop was needed to satisfy the request.
.It Li RES_SNOOP_SNP_MISS
A snoop was needed and it missed all snooped caches:
-For LLC Hit, ReslHitl was returned by all cores
-For LLC Miss, Rspl was returned by all sockets and data was returned from
DRAM.
.It Li RES_SNOOP_HIT_NO_FWD
A snoop was needed and it hits in at least one snooped cache. Hit denotes a
cache-line was valid before snoop effect. This includes:
-Snoop Hit w/ Invalidation (LLC Hit, RFO)
-Snoop Hit, Left Shared (LLC Hit/Miss, IFetch/Data_RD)
-Snoop Hit w/ Invalidation and No Forward (LLC Miss, RFO Hit S)
In the LLC Miss case, data is returned from DRAM.
.It Li RES_SNOOP_HIT_FWD
A snoop was needed and data was forwarded from a remote socket.
This includes:
-Snoop Forward Clean, Left Shared (LLC Hit/Miss, IFetch/Data_RD/RFT).
.It Li RES_SNOOP_HITM
A snoop was needed and it HitM-ed in local or remote cache. HitM denotes a
cache-line was in modified state before effect as a result of snoop. This
includes:
-Snoop HitM w/ WB (LLC miss, IFetch/Data_RD)
-Snoop Forward Modified w/ Invalidation (LLC Hit/Miss, RFO)
-Snoop MtoS (LLC Hit, IFetch/Data_RD).
.It Li RES_NON_DRAM
Target was non-DRAM system address. This includes MMIO transactions.
.El
.It Li cmask= Ns Ar value
Configure the PMC to increment only if the number of configured
events measured in a cycle is greater than or equal to
.Ar value .
.It Li edge
Configure the PMC to count the number of de-asserted to asserted
transitions of the conditions expressed by the other qualifiers.
If specified, the counter will increment only once whenever a
condition becomes true, irrespective of the number of clocks during
which the condition remains true.
.It Li inv
Invert the sense of comparison when the
.Dq Li cmask
qualifier is present, making the counter increment when the number of
events per cycle is less than the value specified by the
.Dq Li cmask
qualifier.
.It Li os
Configure the PMC to count events happening at processor privilege
level 0.
.It Li usr
Configure the PMC to count events occurring at privilege levels 1, 2
or 3.
.El
.Pp
If neither of the
.Dq Li os
or
.Dq Li usr
qualifiers is specified, the default is to enable both.
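.Pp
As an illustration of how the qualifiers above combine, the following
Python sketch packs an event code, a unit mask and the qualifiers into
the architectural IA32_PERFEVTSELx layout given in the Intel manual.
It is not part of libpmc, and this page does not state how the library
builds the register value internally; the helper names
encode_perfevtsel and event_spec are hypothetical.
The comma-separated specifier string is assumed to follow the
convention described in pmc(3).
.Bd -literal -offset indent
```python
# Bit positions of the architectural IA32_PERFEVTSELx fields (Intel SDM).
USR_BIT = 16      # count at privilege levels 1, 2 and 3 ("usr")
OS_BIT = 17       # count at privilege level 0 ("os")
EDGE_BIT = 18     # edge detect ("edge")
INV_BIT = 23      # invert the cmask comparison ("inv")
CMASK_SHIFT = 24  # 8-bit counter mask ("cmask=value")


def encode_perfevtsel(event, umask, cmask=0, edge=False, inv=False,
                      os=False, usr=False):
    # Hypothetical helper: returns the event-select value for one
    # programmable counter. Enable and interrupt bits are left clear.
    for name, field in (("event", event), ("umask", umask),
                        ("cmask", cmask)):
        if not 0 <= field <= 0xFF:
            raise ValueError(name + " must fit in 8 bits")
    if not os and not usr:
        # Neither qualifier given: count in both modes.
        os = usr = True
    value = event | (umask << 8) | (cmask << CMASK_SHIFT)
    value |= (int(usr) << USR_BIT) | (int(os) << OS_BIT)
    value |= (int(edge) << EDGE_BIT) | (int(inv) << INV_BIT)
    return value


def event_spec(name, cmask=None, edge=False, inv=False, os=False,
               usr=False):
    # Hypothetical helper: joins an event name and its qualifiers
    # with commas (assumed specifier syntax).
    parts = [name]
    if cmask is not None:
        parts.append("cmask=%d" % cmask)
    for flag, text in ((edge, "edge"), (inv, "inv"),
                       (os, "os"), (usr, "usr")):
        if flag:
            parts.append(text)
    return ",".join(parts)


if __name__ == "__main__":
    # UOPS_ISSUED.ANY is Event 0EH, Umask 01H on this page.
    print(event_spec("UOPS_ISSUED.ANY", cmask=1, inv=True))
    print(hex(encode_perfevtsel(0x0E, 0x01, cmask=1, inv=True)))
```
.Ed
.Pp
With cmask=1 and inv set, the counter increments in cycles where fewer
than one event occurred, which is how the stalled-cycle variants
described on this page are expressed.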
.Ss Event Specifiers (Programmable PMCs)
Haswell programmable PMCs support the following events:
.Bl -tag -width indent
.It Li LD_BLOCKS.STORE_FORWARD
.Pq Event 03H , Umask 02H
Loads blocked by overlapping with store buffer that
cannot be forwarded.
.It Li MISALIGN_MEM_REF.LOADS
.Pq Event 05H , Umask 01H
Speculative cache-line split load uops dispatched to
L1D.
.It Li MISALIGN_MEM_REF.STORES
.Pq Event 05H , Umask 02H
Speculative cache-line split Store-address uops
dispatched to L1D.
.It Li LD_BLOCKS_PARTIAL.ADDRESS_ALIAS
.Pq Event 07H , Umask 01H
False dependencies in MOB due to partial compare
on address.
.It Li DTLB_LOAD_MISSES.MISS_CAUSES_A_WALK
.Pq Event 08H , Umask 01H
Misses in all TLB levels that cause a page walk of any
page size.
.It Li DTLB_LOAD_MISSES.WALK_COMPLETED_4K
.Pq Event 08H , Umask 02H
Completed page walks due to demand load misses
that caused 4K page walks in any TLB levels.
.It Li DTLB_LOAD_MISSES.WALK_COMPLETED_2M_4M
.Pq Event 08H , Umask 04H
Completed page walks due to demand load misses
that caused 2M/4M page walks in any TLB levels.
.It Li DTLB_LOAD_MISSES.WALK_COMPLETED
.Pq Event 08H , Umask 0EH
Completed page walks in any TLB of any page size
due to demand load misses.
.It Li DTLB_LOAD_MISSES.WALK_DURATION
.Pq Event 08H , Umask 10H
Cycles PMH is busy with a walk.
.It Li DTLB_LOAD_MISSES.STLB_HIT_4K
.Pq Event 08H , Umask 20H
Load misses that missed DTLB but hit STLB (4K).
.It Li DTLB_LOAD_MISSES.STLB_HIT_2M
.Pq Event 08H , Umask 40H
Load misses that missed DTLB but hit STLB (2M).
.It Li DTLB_LOAD_MISSES.STLB_HIT
.Pq Event 08H , Umask 60H
Number of cache load STLB hits. No page walk.
.It Li DTLB_LOAD_MISSES.PDE_CACHE_MISS
.Pq Event 08H , Umask 80H
DTLB demand load misses with low part of linear-to-physical
address translation missed.
.It Li INT_MISC.RECOVERY_CYCLES
.Pq Event 0DH , Umask 03H
Cycles waiting to recover after Machine Clears
except JEClear. Set Cmask = 1.
.It Li UOPS_ISSUED.ANY
.Pq Event 0EH , Umask 01H
Increments each cycle by the number of uops issued by the
RAT to the RS.
Set Cmask = 1, Inv = 1, Any = 1 to count stalled cycles
of this core.
.It Li UOPS_ISSUED.FLAGS_MERGE
.Pq Event 0EH , Umask 10H
Number of flags-merge uops allocated. Such uops
add delay.
.It Li UOPS_ISSUED.SLOW_LEA
.Pq Event 0EH , Umask 20H
Number of slow LEA or similar uops allocated. Such a
uop has 3 sources (e.g. 2 sources + immediate),
regardless of whether it results from an LEA instruction.
.It Li UOPS_ISSUED.SINGLE_MUL
.Pq Event 0EH , Umask 40H
Number of multiply packed/scalar single precision
uops allocated.
.It Li L2_RQSTS.DEMAND_DATA_RD_MISS
.Pq Event 24H , Umask 21H
Demand Data Read requests that missed L2, no
rejects.
.It Li L2_RQSTS.DEMAND_DATA_RD_HIT
.Pq Event 24H , Umask 41H
Demand Data Read requests that hit L2 cache.
.It Li L2_RQSTS.ALL_DEMAND_DATA_RD
.Pq Event 24H , Umask E1H
Counts any demand and L1 HW prefetch data load
requests to L2.
.It Li L2_RQSTS.RFO_HIT
.Pq Event 24H , Umask 42H
Counts the number of store RFO requests that hit
the L2 cache.
.It Li L2_RQSTS.RFO_MISS
.Pq Event 24H , Umask 22H
Counts the number of store RFO requests that miss
the L2 cache.
.It Li L2_RQSTS.ALL_RFO
.Pq Event 24H , Umask E2H
Counts all L2 store RFO requests.
.It Li L2_RQSTS.CODE_RD_HIT
.Pq Event 24H , Umask 44H
Number of instruction fetches that hit the L2 cache.
.It Li L2_RQSTS.CODE_RD_MISS
.Pq Event 24H , Umask 24H
Number of instruction fetches that missed the L2
cache.
.It Li L2_RQSTS.ALL_DEMAND_MISS
.Pq Event 24H , Umask 27H
Demand requests that miss L2 cache.
.It Li L2_RQSTS.ALL_DEMAND_REFERENCES
.Pq Event 24H , Umask E7H
Demand requests to L2 cache.
.It Li L2_RQSTS.ALL_CODE_RD
.Pq Event 24H , Umask E4H
Counts all L2 code requests.
.It Li L2_RQSTS.L2_PF_HIT
.Pq Event 24H , Umask 50H
Counts all L2 HW prefetcher requests that hit L2.
.It Li L2_RQSTS.L2_PF_MISS
.Pq Event 24H , Umask 30H
Counts all L2 HW prefetcher requests that missed
L2.
.It Li L2_RQSTS.ALL_PF
.Pq Event 24H , Umask F8H
Counts all L2 HW prefetcher requests.
.It Li L2_RQSTS.MISS
.Pq Event 24H , Umask 3FH
All requests that missed L2.
.It Li L2_RQSTS.REFERENCES
.Pq Event 24H , Umask FFH
All requests to L2 cache.
.It Li L2_DEMAND_RQSTS.WB_HIT
.Pq Event 27H , Umask 50H
Not rejected writebacks that hit L2 cache.
.It Li LONGEST_LAT_CACHE.REFERENCE
.Pq Event 2EH , Umask 4FH
This event counts requests originating from the core
that reference a cache line in the last level cache.
.It Li LONGEST_LAT_CACHE.MISS
.Pq Event 2EH , Umask 41H
This event counts each cache miss condition for
references to the last level cache.
.It Li CPU_CLK_UNHALTED.THREAD_P
.Pq Event 3CH , Umask 00H
Counts the number of thread cycles while the thread
is not in a halt state. The thread enters the halt state
when it is running the HLT instruction. The core
frequency may change from time to time due to
power or thermal throttling.
.It Li CPU_CLK_THREAD_UNHALTED.REF_XCLK
.Pq Event 3CH , Umask 01H
Increments at the frequency of XCLK (100 MHz)
when not halted.
.It Li L1D_PEND_MISS.PENDING
.Pq Event 48H , Umask 01H
Increments the number of outstanding L1D misses
every cycle. Set Cmask = 1 and Edge = 1 to count
occurrences.
.It Li DTLB_STORE_MISSES.MISS_CAUSES_A_WALK
.Pq Event 49H , Umask 01H
Misses in all TLB levels that cause a page walk of any
page size (4K/2M/4M/1G).
.It Li DTLB_STORE_MISSES.WALK_COMPLETED_4K
.Pq Event 49H , Umask 02H
Completed page walks due to store misses in one or
more TLB levels of 4K page structure.
.It Li DTLB_STORE_MISSES.WALK_COMPLETED_2M_4M
.Pq Event 49H , Umask 04H
Completed page walks due to store misses in one or
more TLB levels of 2M/4M page structure.
.It Li DTLB_STORE_MISSES.WALK_COMPLETED
.Pq Event 49H , Umask 0EH
Completed page walks due to store miss in any TLB
levels of any page size (4K/2M/4M/1G).
.It Li DTLB_STORE_MISSES.WALK_DURATION
.Pq Event 49H , Umask 10H
Cycles PMH is busy with this walk.
.It Li DTLB_STORE_MISSES.STLB_HIT_4K
.Pq Event 49H , Umask 20H
Store misses that missed DTLB but hit STLB (4K).
.It Li DTLB_STORE_MISSES.STLB_HIT_2M
.Pq Event 49H , Umask 40H
Store misses that missed DTLB but hit STLB (2M).
.It Li DTLB_STORE_MISSES.STLB_HIT
.Pq Event 49H , Umask 60H
Store operations that miss the first TLB level but hit
the second and do not cause page walks.
.It Li DTLB_STORE_MISSES.PDE_CACHE_MISS
.Pq Event 49H , Umask 80H
DTLB store misses with low part of linear-to-physical
address translation missed.
.It Li LOAD_HIT_PRE.SW_PF
.Pq Event 4CH , Umask 01H
Non-SW-prefetch load dispatches that hit fill buffer
allocated for S/W prefetch.
.It Li LOAD_HIT_PRE.HW_PF
.Pq Event 4CH , Umask 02H
Non-SW-prefetch load dispatches that hit fill buffer
allocated for H/W prefetch.
.It Li L1D.REPLACEMENT
.Pq Event 51H , Umask 01H
Counts the number of lines brought into the L1 data
cache.
.It Li MOVE_ELIMINATION.INT_NOT_ELIMINATED
.Pq Event 58H , Umask 04H
Number of integer Move Elimination candidate uops
that were not eliminated.
.It Li MOVE_ELIMINATION.SMID_NOT_ELIMINATED
.Pq Event 58H , Umask 08H
Number of SIMD Move Elimination candidate uops
that were not eliminated.
.It Li MOVE_ELIMINATION.INT_ELIMINATED
.Pq Event 58H , Umask 01H
Number of integer Move Elimination candidate uops
that were eliminated.
.It Li MOVE_ELIMINATION.SMID_ELIMINATED
.Pq Event 58H , Umask 02H
Number of SIMD Move Elimination candidate uops
that were eliminated.
.It Li CPL_CYCLES.RING0
.Pq Event 5CH , Umask 01H
Unhalted core cycles when the thread is in ring 0.
.It Li CPL_CYCLES.RING123
.Pq Event 5CH , Umask 02H
Unhalted core cycles when the thread is not in ring 0.
.It Li RS_EVENTS.EMPTY_CYCLES
.Pq Event 5EH , Umask 01H
Cycles the RS is empty for the thread.
.It Li OFFCORE_REQUESTS_OUTSTANDING.DEMAND_DATA_RD
.Pq Event 60H , Umask 01H
Offcore outstanding Demand Data Read transactions
in SQ to uncore. Set Cmask=1 to count cycles.
.It Li OFFCORE_REQUESTS_OUTSTANDING.DEMAND_CODE_RD
.Pq Event 60H , Umask 02H
Offcore outstanding Demand Code Read transactions
in SQ to uncore. Set Cmask=1 to count cycles.
.It Li OFFCORE_REQUESTS_OUTSTANDING.DEMAND_RFO
.Pq Event 60H , Umask 04H
Offcore outstanding RFO store transactions in SQ to
uncore. Set Cmask=1 to count cycles.
.It Li OFFCORE_REQUESTS_OUTSTANDING.ALL_DATA_RD
.Pq Event 60H , Umask 08H
Offcore outstanding cacheable data read
transactions in SQ to uncore. Set Cmask=1 to count
cycles.
.It Li LOCK_CYCLES.SPLIT_LOCK_UC_LOCK_DURATION
.Pq Event 63H , Umask 01H
Cycles in which the L1D and L2 are locked, due to a
UC lock or split lock.
.It Li LOCK_CYCLES.CACHE_LOCK_DURATION
.Pq Event 63H , Umask 02H
Cycles in which the L1D is locked.
.It Li IDQ.EMPTY
.Pq Event 79H , Umask 02H
Counts cycles the IDQ is empty.
.It Li IDQ.MITE_UOPS
.Pq Event 79H , Umask 04H
Increments each cycle by the number of uops delivered to the IDQ from
the MITE path.
Set Cmask = 1 to count cycles.
.It Li IDQ.DSB_UOPS
.Pq Event 79H , Umask 08H
Increments each cycle by the number of uops delivered to the IDQ
from the DSB path.
Set Cmask = 1 to count cycles.
.It Li IDQ.MS_DSB_UOPS
.Pq Event 79H , Umask 10H
Increments each cycle by the number of uops delivered to the IDQ
by the DSB while the MS is busy. Set Cmask = 1 to count
cycles. Add Edge = 1 to count the number of deliveries.
.It Li IDQ.MS_MITE_UOPS
.Pq Event 79H , Umask 20H
Increments each cycle by the number of uops delivered to the IDQ
by MITE while the MS is busy. Set Cmask = 1 to count
cycles.
.It Li IDQ.MS_UOPS
.Pq Event 79H , Umask 30H
Increments each cycle by the number of uops delivered to the IDQ from
the MS by either DSB or MITE. Set Cmask = 1 to count
cycles.
.It Li IDQ.ALL_DSB_CYCLES_ANY_UOPS
.Pq Event 79H , Umask 18H
Counts cycles in which the DSB delivers at least one uop. Set
Cmask = 1.
.It Li IDQ.ALL_DSB_CYCLES_4_UOPS
.Pq Event 79H , Umask 18H
Counts cycles in which the DSB delivers four uops. Set
Cmask = 4.
.It Li IDQ.ALL_MITE_CYCLES_ANY_UOPS
.Pq Event 79H , Umask 24H
Counts cycles in which MITE delivers at least one uop. Set
Cmask = 1.
.It Li IDQ.ALL_MITE_CYCLES_4_UOPS
.Pq Event 79H , Umask 24H
Counts cycles in which MITE delivers four uops. Set
Cmask = 4.
.It Li IDQ.MITE_ALL_UOPS
.Pq Event 79H , Umask 3CH
Number of uops delivered to the IDQ from any path.
.It Li ICACHE.MISSES
.Pq Event 80H , Umask 02H
Number of Instruction Cache, Streaming Buffer and
Victim Cache Misses. Includes UC accesses.
.It Li ITLB_MISSES.MISS_CAUSES_A_WALK
.Pq Event 85H , Umask 01H
Misses in ITLB that cause a page walk of any page
size.
.It Li ITLB_MISSES.WALK_COMPLETED_4K
.Pq Event 85H , Umask 02H
Completed page walks due to misses in ITLB 4K page
entries.
.It Li ITLB_MISSES.WALK_COMPLETED_2M_4M
.Pq Event 85H , Umask 04H
Completed page walks due to misses in ITLB 2M/4M
page entries.
.It Li ITLB_MISSES.WALK_COMPLETED
.Pq Event 85H , Umask 0EH
Completed page walks in ITLB of any page size.
.It Li ITLB_MISSES.WALK_DURATION
.Pq Event 85H , Umask 10H
Cycles PMH is busy with a walk.
.It Li ITLB_MISSES.STLB_HIT_4K
.Pq Event 85H , Umask 20H
ITLB misses that hit STLB (4K).
.It Li ITLB_MISSES.STLB_HIT_2M
.Pq Event 85H , Umask 40H
ITLB misses that hit STLB (2M).
.It Li ITLB_MISSES.STLB_HIT
.Pq Event 85H , Umask 60H
ITLB misses that hit STLB. No page walk.
.It Li ILD_STALL.LCP
.Pq Event 87H , Umask 01H
Stalls caused by changing prefix length of the
instruction.
.It Li ILD_STALL.IQ_FULL
.Pq Event 87H , Umask 04H
Stall cycles because the IQ is full.
.It Li BR_INST_EXEC.COND
.Pq Event 88H , Umask 01H
Qualify conditional near branch instructions
executed, but not necessarily retired.
.It Li BR_INST_EXEC.DIRECT_JMP
.Pq Event 88H , Umask 02H
Qualify all unconditional near branch instructions
excluding calls and indirect branches.
.It Li BR_INST_EXEC.INDIRECT_JMP_NON_CALL_RET
.Pq Event 88H , Umask 04H
Qualify executed indirect near branch instructions
that are not calls nor returns.
.It Li BR_INST_EXEC.RETURN_NEAR
.Pq Event 88H , Umask 08H
Qualify indirect near branches that have a return
mnemonic.
.It Li BR_INST_EXEC.DIRECT_NEAR_CALL
.Pq Event 88H , Umask 10H
Qualify unconditional near call branch instructions,
excluding non call branch, executed.
.It Li BR_INST_EXEC.INDIRECT_NEAR_CALL
.Pq Event 88H , Umask 20H
Qualify indirect near calls, including both register and
memory indirect, executed.
.It Li BR_INST_EXEC.NONTAKEN
.Pq Event 88H , Umask 40H
Qualify non-taken near branches executed.
.It Li BR_INST_EXEC.TAKEN
.Pq Event 88H , Umask 80H
Qualify taken near branches executed. Must combine
with 01H, 02H, 04H, 08H, 10H, 20H.
.It Li BR_INST_EXEC.ALL_BRANCHES
.Pq Event 88H , Umask FFH
Counts all near executed branches (not necessarily
retired).
.It Li BR_MISP_EXEC.COND
.Pq Event 89H , Umask 01H
Qualify conditional near branch instructions
mispredicted.
.It Li BR_MISP_EXEC.INDIRECT_JMP_NON_CALL_RET
.Pq Event 89H , Umask 04H
Qualify mispredicted indirect near branch
instructions that are not calls nor returns.
.It Li BR_MISP_EXEC.RETURN_NEAR
.Pq Event 89H , Umask 08H
Qualify mispredicted indirect near branches that
have a return mnemonic.
.It Li BR_MISP_EXEC.DIRECT_NEAR_CALL
.Pq Event 89H , Umask 10H
Qualify mispredicted unconditional near call branch
instructions, excluding non call branch, executed.
.It Li BR_MISP_EXEC.INDIRECT_NEAR_CALL
.Pq Event 89H , Umask 20H
Qualify mispredicted indirect near calls, including
both register and memory indirect, executed.
.It Li BR_MISP_EXEC.NONTAKEN
.Pq Event 89H , Umask 40H
Qualify mispredicted non-taken near branches
executed.
.It Li BR_MISP_EXEC.TAKEN
.Pq Event 89H , Umask 80H
Qualify mispredicted taken near branches executed.
Must combine with 01H, 02H, 04H, 08H, 10H, 20H.
.It Li BR_MISP_EXEC.ALL_BRANCHES
.Pq Event 89H , Umask FFH
Counts all near executed branches (not necessarily
retired).
.It Li IDQ_UOPS_NOT_DELIVERED.CORE
.Pq Event 9CH , Umask 01H
Count number of non-delivered uops to RAT per
thread.
.It Li UOPS_EXECUTED_PORT.PORT_0
.Pq Event A1H , Umask 01H
Cycles in which a uop is dispatched on port 0 in this
thread.
.It Li UOPS_EXECUTED_PORT.PORT_1
.Pq Event A1H , Umask 02H
Cycles in which a uop is dispatched on port 1 in this
thread.
.It Li UOPS_EXECUTED_PORT.PORT_2
.Pq Event A1H , Umask 04H
Cycles in which a uop is dispatched on port 2 in this
thread.
.It Li UOPS_EXECUTED_PORT.PORT_3
.Pq Event A1H , Umask 08H
Cycles in which a uop is dispatched on port 3 in this
thread.
.It Li UOPS_EXECUTED_PORT.PORT_4
.Pq Event A1H , Umask 10H
Cycles in which a uop is dispatched on port 4 in this
thread.
.It Li UOPS_EXECUTED_PORT.PORT_5
.Pq Event A1H , Umask 20H
Cycles in which a uop is dispatched on port 5 in this
thread.
.It Li UOPS_EXECUTED_PORT.PORT_6
.Pq Event A1H , Umask 40H
Cycles in which a uop is dispatched on port 6 in this
thread.
.It Li UOPS_EXECUTED_PORT.PORT_7
.Pq Event A1H , Umask 80H
Cycles in which a uop is dispatched on port 7 in this
thread.
.It Li RESOURCE_STALLS.ANY
.Pq Event A2H , Umask 01H
Cycles allocation is stalled due to a resource-related
reason.
.It Li RESOURCE_STALLS.RS
.Pq Event A2H , Umask 04H
Cycles stalled due to no eligible RS entry available.
.It Li RESOURCE_STALLS.SB
.Pq Event A2H , Umask 08H
Cycles stalled due to no store buffers available (not
including draining from sync).
.It Li RESOURCE_STALLS.ROB
.Pq Event A2H , Umask 10H
Cycles stalled due to re-order buffer full.
.It Li CYCLE_ACTIVITY.CYCLES_L2_PENDING
.Pq Event A3H , Umask 01H
Cycles with pending L2 miss loads. Set Cmask=2 to
count cycles.
.It Li CYCLE_ACTIVITY.CYCLES_LDM_PENDING
.Pq Event A3H , Umask 02H
Cycles with pending memory loads. Set Cmask=2 to
count cycles.
.It Li CYCLE_ACTIVITY.STALLS_L2_PENDING
.Pq Event A3H , Umask 05H
Number of loads missed L2.
.It Li CYCLE_ACTIVITY.CYCLES_L1D_PENDING
.Pq Event A3H , Umask 08H
Cycles with pending L1 cache miss loads. Set
Cmask=8 to count cycles.
.It Li ITLB.ITLB_FLUSH
.Pq Event AEH , Umask 01H
Counts the number of ITLB flushes, includes
4k/2M/4M pages.
.It Li OFFCORE_REQUESTS.DEMAND_DATA_RD
|
|
|
|
.Pq Event B0H , Umask 01H
|
|
|
|
Demand data read requests sent to uncore.
|
|
|
|
.It Li OFFCORE_REQUESTS.DEMAND_CODE_RD
|
|
|
|
.Pq Event B0H , Umask 02H
|
|
|
|
Demand code read requests sent to uncore.
|
|
|
|
.It Li OFFCORE_REQUESTS.DEMAND_RFO
|
|
|
|
.Pq Event B0H , Umask 04H
|
2013-03-29 08:32:49 +00:00
|
|
|
Demand RFO read requests sent to uncore, including
|
2013-03-28 19:15:54 +00:00
|
|
|
regular RFOs, locks, ItoM.
|
|
|
|
.It Li OFFCORE_REQUESTS.ALL_DATA_RD
|
|
|
|
.Pq Event B0H , Umask 08H
|
|
|
|
Data read requests sent to uncore (demand and
|
|
|
|
prefetch).
|
|
|
|
.It Li UOPS_EXECUTED.CORE
.Pq Event B1H , Umask 02H
Counts total number of uops to be executed per-core
each cycle.
.It Li OFF_CORE_RESPONSE_0
.Pq Event B7H , Umask 01H
Requires MSR 01A6H.
.It Li OFF_CORE_RESPONSE_1
.Pq Event BBH , Umask 01H
Requires MSR 01A7H.
.It Li PAGE_WALKER_LOADS.DTLB_L1
.Pq Event BCH , Umask 11H
Number of DTLB page walker loads that hit in the
L1+FB.
.It Li PAGE_WALKER_LOADS.ITLB_L1
.Pq Event BCH , Umask 21H
Number of ITLB page walker loads that hit in the
L1+FB.
.It Li PAGE_WALKER_LOADS.DTLB_L2
.Pq Event BCH , Umask 12H
Number of DTLB page walker loads that hit in the L2.
.It Li PAGE_WALKER_LOADS.ITLB_L2
.Pq Event BCH , Umask 22H
Number of ITLB page walker loads that hit in the L2.
.It Li PAGE_WALKER_LOADS.DTLB_L3
.Pq Event BCH , Umask 14H
Number of DTLB page walker loads that hit in the L3.
.It Li PAGE_WALKER_LOADS.ITLB_L3
.Pq Event BCH , Umask 24H
Number of ITLB page walker loads that hit in the L3.
.It Li PAGE_WALKER_LOADS.DTLB_MEMORY
.Pq Event BCH , Umask 18H
Number of DTLB page walker loads from memory.
.It Li PAGE_WALKER_LOADS.ITLB_MEMORY
.Pq Event BCH , Umask 28H
Number of ITLB page walker loads from memory.
.It Li TLB_FLUSH.DTLB_THREAD
.Pq Event BDH , Umask 01H
DTLB flush attempts of the thread-specific entries.
.It Li TLB_FLUSH.STLB_ANY
.Pq Event BDH , Umask 20H
Count number of STLB flush attempts.
.It Li INST_RETIRED.ANY_P
.Pq Event C0H , Umask 00H
Number of instructions at retirement.
.It Li INST_RETIRED.ALL
.Pq Event C0H , Umask 01H
Precise instruction retired event with HW to reduce
effect of PEBS shadow in IP distribution.
.It Li OTHER_ASSISTS.AVX_TO_SSE
.Pq Event C1H , Umask 08H
Number of transitions from AVX-256 to legacy SSE
when penalty applicable.
.It Li OTHER_ASSISTS.SSE_TO_AVX
.Pq Event C1H , Umask 10H
Number of transitions from SSE to AVX-256 when
penalty applicable.
.It Li OTHER_ASSISTS.ANY_WB_ASSIST
.Pq Event C1H , Umask 40H
Number of microcode assists invoked by HW upon
uop writeback.
.It Li UOPS_RETIRED.ALL
.Pq Event C2H , Umask 01H
Counts the number of micro-ops retired.
Use cmask=1 and invert to count active cycles or stalled
cycles.
.It Li UOPS_RETIRED.RETIRE_SLOTS
.Pq Event C2H , Umask 02H
Counts the number of retirement slots used each
cycle.
.It Li MACHINE_CLEARS.MEMORY_ORDERING
.Pq Event C3H , Umask 02H
Counts the number of machine clears due to memory
order conflicts.
.It Li MACHINE_CLEARS.SMC
.Pq Event C3H , Umask 04H
Number of self-modifying-code machine clears
detected.
.It Li MACHINE_CLEARS.MASKMOV
.Pq Event C3H , Umask 20H
Counts the number of executed AVX masked load
operations that refer to an illegal address range with
the mask bits set to 0.
.It Li BR_INST_RETIRED.ALL_BRANCHES
.Pq Event C4H , Umask 00H
Branch instructions at retirement.
.It Li BR_INST_RETIRED.CONDITIONAL
.Pq Event C4H , Umask 01H
Counts the number of conditional branch instructions
retired.
Supports PEBS.
.It Li BR_INST_RETIRED.NEAR_CALL
.Pq Event C4H , Umask 02H
Direct and indirect near call instructions retired.
.It Li BR_INST_RETIRED.ALL_BRANCHES
.Pq Event C4H , Umask 04H
Counts the number of branch instructions retired.
.It Li BR_INST_RETIRED.NEAR_RETURN
.Pq Event C4H , Umask 08H
Counts the number of near return instructions
retired.
.It Li BR_INST_RETIRED.NOT_TAKEN
.Pq Event C4H , Umask 10H
Counts the number of not taken branch instructions
retired.
.It Li BR_INST_RETIRED.NEAR_TAKEN
.Pq Event C4H , Umask 20H
Number of near taken branches retired.
.It Li BR_INST_RETIRED.FAR_BRANCH
.Pq Event C4H , Umask 40H
Number of far branches retired.
.It Li BR_MISP_RETIRED.ALL_BRANCHES
.Pq Event C5H , Umask 00H
Mispredicted branch instructions at retirement.
.It Li BR_MISP_RETIRED.CONDITIONAL
.Pq Event C5H , Umask 01H
Mispredicted conditional branch instructions retired.
.It Li BR_MISP_RETIRED.ALL_BRANCHES
.Pq Event C5H , Umask 04H
Mispredicted macro branch instructions retired.
.It Li FP_ASSIST.X87_OUTPUT
.Pq Event CAH , Umask 02H
Number of X87 FP assists due to output values.
.It Li FP_ASSIST.X87_INPUT
.Pq Event CAH , Umask 04H
Number of X87 FP assists due to input values.
.It Li FP_ASSIST.SIMD_OUTPUT
.Pq Event CAH , Umask 08H
Number of SIMD FP assists due to output values.
.It Li FP_ASSIST.SIMD_INPUT
.Pq Event CAH , Umask 10H
Number of SIMD FP assists due to input values.
.It Li FP_ASSIST.ANY
.Pq Event CAH , Umask 1EH
Cycles with any input/output SSE* or FP assists.
.It Li ROB_MISC_EVENTS.LBR_INSERTS
.Pq Event CCH , Umask 20H
Count cases of saving new LBR records by hardware.
.It Li MEM_TRANS_RETIRED.LOAD_LATENCY
.Pq Event CDH , Umask 01H
Randomly sampled loads whose latency is above a
user defined threshold.
A small fraction of the overall
loads are sampled due to randomization.
.It Li MEM_UOP_RETIRED.LOADS
.Pq Event D0H , Umask 01H
Qualify retired memory uops that are loads.
Combine with umask 10H, 20H, 40H, 80H.
Supports PEBS.
.It Li MEM_UOP_RETIRED.STORES
.Pq Event D0H , Umask 02H
Qualify retired memory uops that are stores.
Combine with umask 10H, 20H, 40H, 80H.
.It Li MEM_UOP_RETIRED.STLB_MISS
.Pq Event D0H , Umask 10H
Qualify retired memory uops with STLB miss.
Must combine with umask 01H, 02H, to produce counts.
.It Li MEM_UOP_RETIRED.LOCK
.Pq Event D0H , Umask 20H
Qualify retired memory uops with lock.
Must combine with umask 01H, 02H, to produce counts.
Supports PEBS.
.It Li MEM_UOP_RETIRED.SPLIT
.Pq Event D0H , Umask 40H
Qualify retired memory uops with line split.
Must combine with umask 01H, 02H, to produce counts.
.It Li MEM_UOP_RETIRED.ALL
.Pq Event D0H , Umask 80H
Qualify any retired memory uops.
Must combine with umask 01H, 02H, to produce counts.
Supports PEBS.
.It Li MEM_LOAD_UOPS_RETIRED.L1_HIT
.Pq Event D1H , Umask 01H
Retired load uops with L1 cache hits as data sources.
.It Li MEM_LOAD_UOPS_RETIRED.L2_HIT
.Pq Event D1H , Umask 02H
Retired load uops with L2 cache hits as data sources.
.It Li MEM_LOAD_UOPS_RETIRED.LLC_HIT
.Pq Event D1H , Umask 04H
Retired load uops with LLC cache hits as data
sources.
.It Li MEM_LOAD_UOPS_RETIRED.L2_MISS
.Pq Event D1H , Umask 10H
Retired load uops that missed L2.
Unknown data source excluded.
.It Li MEM_LOAD_UOPS_RETIRED.HIT_LFB
.Pq Event D1H , Umask 40H
Retired load uops whose data sources were loads that
missed L1 but hit the FB due to a preceding miss to the
same cache line with data not ready.
.It Li MEM_LOAD_UOPS_LLC_HIT_RETIRED.XSNP_MISS
.Pq Event D2H , Umask 01H
Retired load uops whose data sources were LLC hits
and cross-core snoop misses in on-pkg core cache.
.It Li MEM_LOAD_UOPS_LLC_HIT_RETIRED.XSNP_HIT
.Pq Event D2H , Umask 02H
Retired load uops whose data sources were LLC hits and
cross-core snoop hits in on-pkg core cache.
.It Li MEM_LOAD_UOPS_LLC_HIT_RETIRED.XSNP_HITM
.Pq Event D2H , Umask 04H
Retired load uops whose data sources were HitM
responses from shared LLC.
.It Li MEM_LOAD_UOPS_LLC_HIT_RETIRED.XSNP_NONE
.Pq Event D2H , Umask 08H
Retired load uops whose data sources were hits in
LLC without snoops required.
.It Li MEM_LOAD_UOPS_LLC_MISS_RETIRED.LOCAL_DRAM
.Pq Event D3H , Umask 01H
Retired load uops whose data sources missed LLC but
were serviced from local DRAM.
.It Li BACLEARS.ANY
.Pq Event E6H , Umask 1FH
Number of front end re-steers due to BPU
misprediction.
.It Li L2_TRANS.DEMAND_DATA_RD
.Pq Event F0H , Umask 01H
Demand Data Read requests that access L2 cache.
.It Li L2_TRANS.RFO
.Pq Event F0H , Umask 02H
RFO requests that access L2 cache.
.It Li L2_TRANS.CODE_RD
.Pq Event F0H , Umask 04H
L2 cache accesses when fetching instructions.
.It Li L2_TRANS.ALL_PF
.Pq Event F0H , Umask 08H
Any MLC or LLC HW prefetch accessing L2, including
rejects.
.It Li L2_TRANS.L1D_WB
.Pq Event F0H , Umask 10H
L1D writebacks that access L2 cache.
.It Li L2_TRANS.L2_FILL
.Pq Event F0H , Umask 20H
L2 fill requests that access L2 cache.
.It Li L2_TRANS.L2_WB
.Pq Event F0H , Umask 40H
L2 writebacks that access L2 cache.
.It Li L2_TRANS.ALL_REQUESTS
.Pq Event F0H , Umask 80H
Transactions accessing L2 pipe.
.It Li L2_LINES_IN.I
.Pq Event F1H , Umask 01H
L2 cache lines in I state filling L2.
.It Li L2_LINES_IN.S
.Pq Event F1H , Umask 02H
L2 cache lines in S state filling L2.
.It Li L2_LINES_IN.E
.Pq Event F1H , Umask 04H
L2 cache lines in E state filling L2.
.It Li L2_LINES_IN.ALL
.Pq Event F1H , Umask 07H
L2 cache lines filling L2.
.It Li L2_LINES_OUT.DEMAND_CLEAN
.Pq Event F2H , Umask 05H
Clean L2 cache lines evicted by demand.
.It Li L2_LINES_OUT.DEMAND_DIRTY
.Pq Event F2H , Umask 06H
Dirty L2 cache lines evicted by demand.
.El
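.Pp
Several entries in this list ask for a Cmask, an invert flag, or a
combination of umask values.
The following is a minimal sketch of how those fields fit together, not
code taken from libpmc.
It assumes the architectural IA32_PERFEVTSELx register layout (event
select in bits 0-7, umask in bits 8-15, invert in bit 23, cmask in bits
24-31); the helper
.Fn evtsel_encode
and the EVTSEL_* macros are hypothetical names used only for this
illustration.
.Bd -literal
```c
#include <assert.h>
#include <stdint.h>

/* Flag bits of IA32_PERFEVTSELx (assumed architectural layout). */
#define EVTSEL_USR	(1u << 16)	/* count while CPL > 0 */
#define EVTSEL_OS	(1u << 17)	/* count while CPL == 0 */
#define EVTSEL_EN	(1u << 22)	/* enable the counter */
#define EVTSEL_INV	(1u << 23)	/* invert the cmask comparison */

/*
 * Hypothetical helper: pack an event code, a umask and a cmask from
 * the list above, plus optional flag bits, into one 32-bit value.
 */
static uint32_t
evtsel_encode(unsigned event, unsigned umask, unsigned cmask, uint32_t flags)
{
	/* Each of the three fields is 8 bits wide. */
	assert(event <= 0xff && umask <= 0xff && cmask <= 0xff);

	return ((uint32_t)event |		/* bits 0-7 */
	    ((uint32_t)umask << 8) |		/* bits 8-15 */
	    ((uint32_t)cmask << 24) |		/* bits 24-31 */
	    flags);
}
```
.Ed
.Pp
Under these assumptions, and before the USR, OS and EN flags are added,
CYCLE_ACTIVITY.CYCLES_L2_PENDING with Cmask=2 packs to 0x020001A3, and
MEM_UOP_RETIRED.LOADS combined with MEM_UOP_RETIRED.ALL (umask 01H
OR 80H) packs to 0x000081D0.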
.Sh SEE ALSO
.Xr pmc 3 ,
.Xr pmc.atom 3 ,
.Xr pmc.core 3 ,
.Xr pmc.iaf 3 ,
.Xr pmc.ucf 3 ,
.Xr pmc.k7 3 ,
.Xr pmc.k8 3 ,
.Xr pmc.p4 3 ,
.Xr pmc.p5 3 ,
.Xr pmc.p6 3 ,
.Xr pmc.corei7 3 ,
.Xr pmc.corei7uc 3 ,
.Xr pmc.haswelluc 3 ,
.Xr pmc.ivybridge 3 ,
.Xr pmc.ivybridgexeon 3 ,
.Xr pmc.sandybridge 3 ,
.Xr pmc.sandybridgeuc 3 ,
.Xr pmc.sandybridgexeon 3 ,
.Xr pmc.westmere 3 ,
.Xr pmc.westmereuc 3 ,
.Xr pmc.soft 3 ,
.Xr pmc.tsc 3 ,
.Xr pmc_cpuinfo 3 ,
.Xr pmclog 3 ,
.Xr hwpmc 4
.Sh HISTORY
The
.Nm pmc
library first appeared in
.Fx 6.0 .
.Sh AUTHORS
The
.Lb libpmc
library was written by
.An "Joseph Koshy"
.Aq jkoshy@FreeBSD.org .
The support for the Haswell
microarchitecture was written by
.An "Hiren Panchasara"
.Aq hiren.panchasara@gmail.com .