BroadwellEP
Intel® Broadwell EP Performance groups
The input file for the events on Intel® Broadwell EP/EN/EX can be found here.
- Core-local counters
- Socket-wide counters
- Energy counters
- Home Agent counters
- Ring-to-ring interface counters
- QPI interface fixed-purpose counters
- QPI interface general-purpose counters
- Last Level cache counters
- Uncore management fixed-purpose counter
- Uncore management general-purpose counters
- Power control unit fixed-purpose counters
- Power control unit general-purpose counters
- Memory controller fixed-purpose counters
- Memory controller general-purpose counters
- Ring-to-QPI counters
- Ring-to-PCIe counters
- IRP box counters
Since the Core2 microarchitecture, Intel® provides a set of fixed-purpose counters. Each can measure only one specific event.
Counter name | Event name |
---|---|
FIXC0 | INSTR_RETIRED_ANY |
FIXC1 | CPU_CLK_UNHALTED_CORE |
FIXC2 | CPU_CLK_UNHALTED_REF |
Option | Argument | Description | Comment |
---|---|---|---|
anythread | N | Set bit 2+(index*4) in config register | |
kernel | N | Set bit (index*4) in config register |
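
The bit positions in the table above depend on the counter index. The following is a minimal sketch of that arithmetic only; the function name `fixed_ctr_option_bits` is hypothetical and not part of LIKWID, and the sketch ignores every other bit of the config register.

```python
def fixed_ctr_option_bits(index, anythread=False, kernel=False):
    """Return the option bits for FIXC<index> as described in the table.

    Hypothetical helper for illustration; covers only the two listed options.
    """
    if index not in (0, 1, 2):
        raise ValueError("FIXC index must be 0, 1 or 2")
    bits = 0
    if kernel:
        bits |= 1 << (index * 4)        # 'kernel': bit (index*4)
    if anythread:
        bits |= 1 << (2 + index * 4)    # 'anythread': bit 2+(index*4)
    return bits


if __name__ == "__main__":
    # FIXC1 with both options: bit 4 and bit 6
    print(hex(fixed_ctr_option_bits(1, anythread=True, kernel=True)))  # 0x50
```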
The Intel® Broadwell EP microarchitecture provides 4 general-purpose counters consisting of a config and a counter register.
Counter name | Event name |
---|---|
PMC0 | * |
PMC1 | * |
PMC2 | * |
PMC3 | * |
PMC4 | * (only available without HyperThreading) |
PMC5 | * (only available without HyperThreading) |
PMC6 | * (only available without HyperThreading) |
PMC7 | * (only available without HyperThreading) |
WARNING: Counters PMC4-7 can only be measured if the 'kernel' option is set. LIKWID sets it automatically for these counters. Be aware that these registers are incremented in user space as well as in kernel space and thus probably show much higher counts. For comparisons with an event in PMC0-3, it is recommended to add the 'kernel' option there as well. (src: Intel® Xeon® E7-8800/4800 v4 Processor Product Family Spec Update)
Option | Argument | Description | Comment |
---|---|---|---|
edgedetect | N | Set bit 18 in config register | |
kernel | N | Set bit 17 in config register | |
anythread | N | Set bit 21 in config register | |
threshold | 8 bit hex value | Set bits 24-31 in config register | |
invert | N | Set bit 23 in config register | |
in_transaction | N | Set bit 32 in config register | Only available if Intel® Transactional Synchronization Extensions are available |
in_transaction_aborted | N | Set bit 33 in config register | Only counter PMC2 and only if Intel® Transactional Synchronization Extensions are available |
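
To see how the options in the table combine, the sketch below ORs the listed bits into one value. It is an illustration under stated assumptions, not LIKWID's implementation: `pmc_option_bits` is a hypothetical name, and the event and umask fields of the config register are left out because the table does not describe them.

```python
def pmc_option_bits(edgedetect=False, kernel=False, anythread=False,
                    invert=False, threshold=0,
                    in_transaction=False, in_transaction_aborted=False):
    """Combine the PMC counter options from the table into one bit pattern."""
    if not 0 <= threshold <= 0xFF:
        raise ValueError("threshold must fit into 8 bits")
    bits = threshold << 24              # 'threshold': bits 24-31
    if kernel:
        bits |= 1 << 17                 # 'kernel': bit 17
    if edgedetect:
        bits |= 1 << 18                 # 'edgedetect': bit 18
    if anythread:
        bits |= 1 << 21                 # 'anythread': bit 21
    if invert:
        bits |= 1 << 23                 # 'invert': bit 23
    if in_transaction:
        bits |= 1 << 32                 # 'in_transaction': bit 32 (needs TSX)
    if in_transaction_aborted:
        bits |= 1 << 33                 # 'in_transaction_aborted': bit 33 (PMC2 only)
    return bits


if __name__ == "__main__":
    print(hex(pmc_option_bits(kernel=True, threshold=0x2)))  # 0x2020000
```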
The Intel® Broadwell microarchitecture supports measuring offcore events in the PMC counters. For this, the stream of offcore events must be filtered using the OFFCORE_RESPONSE registers, of which the Intel® Broadwell microarchitecture has two. LIKWID defines some events that perform the filtering according to the event name. Although many bitmasks are possible, LIKWID natively provides only the ones with response type ANY. Custom filtering can be applied with the OFFCORE_RESPONSE_0_OPTIONS and OFFCORE_RESPONSE_1_OPTIONS events. Two more counter options are available for those events only:
Option | Argument | Description | Comment |
---|---|---|---|
match0 | 16 bit hex value | Input value masked with 0x8FFF and written to bits 0-15 in the OFFCORE_RESPONSE register | Check the Intel® Software Developer System Programming Manual, Vol. 3, Chapter Performance Monitoring and https://download.01.org/perfmon/BDW. |
match1 | 22 bit hex value | Input value is written to bits 16-37 in the OFFCORE_RESPONSE register | Check the Intel® Software Developer System Programming Manual, Vol. 3, Chapter Performance Monitoring and https://download.01.org/perfmon/BDW. |
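
The two options map onto one register value as described in the table. The following sketch shows that mapping only; the function name `offcore_response_value` is hypothetical, and the meaning of the individual bits has to be taken from the Intel® documentation referenced above.

```python
def offcore_response_value(match0=0, match1=0):
    """Build an OFFCORE_RESPONSE register value from match0 and match1."""
    if not 0 <= match0 <= 0xFFFF:
        raise ValueError("match0 must fit into 16 bits")
    if not 0 <= match1 < (1 << 22):
        raise ValueError("match1 must fit into 22 bits")
    low = match0 & 0x8FFF               # match0 is masked with 0x8FFF, bits 0-15
    high = match1 << 16                 # match1 goes to bits 16-37
    return low | high


if __name__ == "__main__":
    # Bits 12-14 of match0 are dropped by the mask
    print(hex(offcore_response_value(match0=0xFFFF)))  # 0x8fff
```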
The Intel® Broadwell microarchitecture provides one register for the current core temperature.
Counter name | Event name |
---|---|
TMP0 | TEMP_CORE |
The Intel® Broadwell microarchitecture provides measurements of the current energy consumption through the RAPL interface.
Counter name | Event name |
---|---|
PWR0 | PWR_PKG_ENERGY |
PWR1 | PWR_PP0_ENERGY |
PWR2 | PWR_PP1_ENERGY |
PWR3 | PWR_DRAM_ENERGY |
The Intel® Broadwell EP/EN/EX microarchitecture provides measurements of the Home Agent (HA) in the uncore. The description from Intel®:
Each HA is responsible for the protocol side of memory interactions, including coherent and non-coherent home agent protocols (as defined in the Intel® QuickPath Interconnect Specification). Additionally, the HA is responsible for ordering memory reads/writes, coming in from the modular Ring, to a given address such that the IMC (memory controller).
The Home Agent performance counters are exposed to the operating system through PCI interfaces. There are two of those interfaces for the HA. On systems where each socket has 12 or more cores, both HAs are available. The name BBOX originates from the Nehalem EX uncore monitoring.
Counter name | Event name |
---|---|
BBOX<0,1>C0 | * |
BBOX<0,1>C1 | * |
BBOX<0,1>C2 | * |
BBOX<0,1>C3 | * |
Option | Argument | Description | Comment |
---|---|---|---|
edgedetect | N | Set bit 18 in config register | |
threshold | 8 bit hex value | Set bits 24-31 in config register | |
invert | N | Set bit 23 in config register | |
opcode | 6 bit hex value | Set bits 0-5 in PCI_UNC_HA_PMON_OPCODEMATCH register of PCI device | |
match0 | 46 bit hex address | Extract bits 6-31 and set bits 6-31 in PCI_UNC_HA_PMON_ADDRMATCH0 register of PCI device; extract bits 32-45 and set bits 0-13 in PCI_UNC_HA_PMON_ADDRMATCH1 register of PCI device | |
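
The match0 option of the Home Agent splits one address across two registers. A minimal sketch of that split follows; the helper name `ha_addr_match` is hypothetical and the code only mirrors the bit ranges named in the table.

```python
def ha_addr_match(addr):
    """Split a 46 bit address into the ADDRMATCH0 and ADDRMATCH1 values."""
    if not 0 <= addr < (1 << 46):
        raise ValueError("address must fit into 46 bits")
    addrmatch0 = addr & 0xFFFFFFC0          # bits 6-31 stay at bits 6-31
    addrmatch1 = (addr >> 32) & 0x3FFF      # bits 32-45 move to bits 0-13
    return addrmatch0, addrmatch1


if __name__ == "__main__":
    low, high = ha_addr_match(0x123456789ABC)
    print(hex(low), hex(high))  # 0x56789a80 0x1234
```

The lowest six bits are dropped, so matching works at a granularity of 64 bytes.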
The Intel® Broadwell EP/EN/EX microarchitecture manages the socket internal traffic through ring-based networks. Depending on the system's configuration, there are multiple rings in one socket. The SBOXes organize the traffic between the rings. The description from Intel®:
The SBox manages the interface between the two Rings.
The processor is composed of two independent rings connected via two sets of bi-directional buffered switches. Each set of bi-directional buffered switches is partitioned into two ingress/egress pairs. Further, each ingress/egress pair is associated with a ring stop on adjacent rings. This ring stop is termed an Sbo. The processor has up to 4 SBos depending on SKU. The Sbo can be simply thought of as a conduit for the ring, but must also help maintain ordering of traffic to ensure functional correctness in certain cases.
The SBOX hardware performance counters are exposed to the operating system through the MSR interface. There are at most four of those interfaces, but not all of them have to be present. The name SBOX originates from the Nehalem EX uncore monitoring, where the functional unit connecting to the QPI network was called SBOX but had a different duty.
Counter name | Event name |
---|---|
SBOX<0-3>C0 | * |
SBOX<0-3>C1 | * |
SBOX<0-3>C2 | * |
SBOX<0-3>C3 | * |
Option | Argument | Description | Comment |
---|---|---|---|
edgedetect | N | Set bit 18 in config register | |
threshold | 8 bit hex value | Set bits 24-31 in config register | |
invert | N | Set bit 23 in config register | |
tid | N | Set bit 19 in config register | This option has no real effect because TID filtering can be activated but there is no possibility to specify the TID somewhere. |
The Intel® Broadwell EP/EN/EX microarchitecture provides measurements of the QPI Link layer (QPI) in the uncore. The description from Intel®:
The Intel® QPI Link Layer is responsible for packetizing requests from the caching agent on the way out to the system interface. As such, it shares responsibility with the CBo(s) as the Intel® QPI caching agent(s). It is responsible for converting CBo requests to Intel® QPI messages (i.e. snoop generation and data response messages from the snoop response) as well as converting/forwarding ring messages to Intel® QPI packets and vice versa. On Intel® Xeon processor E5 v3 family, Intel® QPI is split into two separate layers. The Intel® QPI LL (link layer) is responsible for generating, transmitting, and receiving packets with the Intel® QPI link.
The QPI hardware performance counters are exposed to the operating system through PCI interfaces. There are two of those interfaces for the QPI. The actual number of QBOX counters depends on the CPU core count of one socket. If not all interfaces are present on your system and interface 0 does not work, try the other ones. The QBOX was introduced with the Haswell EP microarchitecture; on older uncore-aware architectures the QBOX and the SBOX are the same.
Counter name | Event name |
---|---|
QBOX<0,1>FIX0 | QPI_RATE |
QBOX<0,1>FIX1 | QPI_RX_IDLE |
QBOX<0,1>FIX2 | QPI_RX_LLR |
The Intel® Broadwell EP/EN/EX microarchitecture provides measurements of the QPI Link layer (QPI) in the uncore. The description from Intel®:
The Intel® QPI Link Layer is responsible for packetizing requests from the caching agent
on the way out to the system interface. As such, it shares responsibility with the CBo(s)
as the Intel QPI caching agent(s). It is responsible for converting CBo requests to Intel
QPI messages (i.e. snoop generation and data response messages from the snoop
response) as well as converting/forwarding ring messages to Intel QPI packets and vice
versa.
On Intel® Xeon® Processor E5 and E7 v4 Product Families, Intel® QPI is split into two
separate layers. The Intel® QPI LL (link layer) is responsible for generating,
transmitting, and receiving packets with the Intel® QPI link.
R3QPI (Section 2.11, “R3QPI Performance Monitoring”) provides the interface to the
Ring for the Link Layer. It is also the point where VNA/VN0 link credits are acquired.
There are two Intel® QPI agents that share a single ring stop and a third agent in the
EX part with its own ring stop. These links can be connected to a single destination
(such as in DP), but also can be connected to two separate destinations (4s Ring or
sDP). Therefore, it will be necessary to count Intel® QPI statistics for each agent
separately.
The Intel® QPI Link Layer processes one flits per cycle in each direction. In order to
accommodate this, many of the events in the Link Layer can increment by 0, 1, or 2 in
each cycle. It is not possible to monitor Rx (received) and Tx (transmitted) flit
information at the same time on the same counter.
The QPI hardware performance counters are exposed to the operating system through PCI interfaces. There are three of those interfaces for the QPI (Device 0: Link 0,1 and Device 1: Link 0). The actual number of QBOX counters depends on the CPU core count of one socket. If not all interfaces are present on your system and interface 0 does not work, try the other ones. The QBOX was introduced with the Haswell EP microarchitecture; on older uncore-aware architectures the QBOX and the SBOX are the same.
Counter name | Event name |
---|---|
QBOX<0-2>C0 | * |
QBOX<0-2>C1 | * |
QBOX<0-2>C2 | * |
QBOX<0-2>C3 | * |
Option | Argument | Description | Comment |
---|---|---|---|
edgedetect | N | Set bit 18 in config register | |
invert | N | Set bit 23 in config register | |
threshold | 8 bit hex value | Set bits 24-31 in config register | |
match0 | 32 bit hex address | Input value masked with 0x8003FFF8 and written to bits 0-31 in the PCI_UNC_V3_QPI_PMON_RX_MATCH_0 register of PCI device | This option matches the receive side. Check Intel® Xeon E5-2600 v3 uncore Manual for bit fields. |
match1 | 20 bit hex address | Input value masked with 0x000F000F and written to bits 0-19 in the PCI_UNC_V3_QPI_PMON_RX_MATCH_1 register of PCI device | This option matches the receive side. Check Intel® Xeon E5-2600 v3 uncore Manual for bit fields. |
match2 | 32 bit hex address | Input value masked with 0x8003FFF8 and written to bits 0-31 in the PCI_UNC_V3_QPI_PMON_TX_MATCH_0 register of PCI device | This option matches the transmit side. Check Intel® Xeon E5-2600 v3 uncore Manual for bit fields. |
match3 | 20 bit hex address | Input value masked with 0x000F000F and written to bits 0-19 in the PCI_UNC_V3_QPI_PMON_TX_MATCH_1 register of PCI device | This option matches the transmit side. Check Intel® Xeon E5-2600 v3 uncore Manual for bit fields. |
mask0 | 32 bit hex address | Input value masked with 0x8003FFF8 and written to bits 0-31 in the PCI_UNC_V3_QPI_PMON_RX_MASK_0 register of PCI device | This option masks the receive side. Check Intel® Xeon E5-2600 v3 uncore Manual for bit fields. |
mask1 | 20 bit hex address | Input value masked with 0x000F000F and written to bits 0-19 in the PCI_UNC_V3_QPI_PMON_RX_MASK_1 register of PCI device | This option masks the receive side. Check Intel® Xeon E5-2600 v3 uncore Manual for bit fields. |
mask2 | 32 bit hex address | Input value masked with 0x8003FFF8 and written to bits 0-31 in the PCI_UNC_V3_QPI_PMON_TX_MASK_0 register of PCI device | This option masks the transmit side. Check Intel® Xeon E5-2600 v3 uncore Manual for bit fields. |
mask3 | 20 bit hex address | Input value masked with 0x000F000F and written to bits 0-19 in the PCI_UNC_V3_QPI_PMON_TX_MASK_1 register of PCI device | This option masks the transmit side. Check Intel® Xeon E5-2600 v3 uncore Manual for bit fields. |
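
All QPI match and mask options pass through one of two input masks, depending on whether the target register name ends in `_0` or `_1`. The sketch below illustrates only that masking step; `qpi_filter_value` and the constant names are hypothetical.

```python
# Input masks taken from the table above
QPI_FILTER_MASK_0 = 0x8003FFF8   # for the *_MATCH_0 and *_MASK_0 registers
QPI_FILTER_MASK_1 = 0x000F000F   # for the *_MATCH_1 and *_MASK_1 registers


def qpi_filter_value(register_index, value):
    """Return the value that ends up in a QPI match/mask register.

    register_index is the trailing 0 or 1 of the register name.
    """
    if register_index == 0:
        return value & QPI_FILTER_MASK_0
    if register_index == 1:
        return value & QPI_FILTER_MASK_1
    raise ValueError("register_index must be 0 or 1")


if __name__ == "__main__":
    # The three lowest bits are not part of the first mask
    print(hex(qpi_filter_value(0, 0x7)))  # 0x0
```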
The Intel® Broadwell EP/EN/EX microarchitecture provides measurements of the LLC coherency engine in the uncore. The description from Intel®:
The LLC coherence engine (CBo) manages the interface between the core and the last
level cache (LLC). All core transactions that access the LLC are directed from the core
to a CBo via the ring interconnect. The CBo is responsible for managing data delivery
from the LLC to the requesting core. It is also responsible for maintaining coherence
between the cores within the socket that share the LLC; generating snoops and
collecting snoop responses from the local cores when the MESIF protocol requires it.
So, if the CBo fielding the core request indicates that a core within the socket owns the
line (for a coherent read), the request is snooped to that local core. That same CBo will
then snoop all peers which might have the address cached (other cores, remote
sockets, etc.) and send the request to the appropriate Home Agent for conflict
checking, memory requests and writebacks.
In the process of maintaining cache coherency within the socket, the CBo is the gate
keeper for all Intel® QuickPath Interconnect (Intel® QPI) messages that originate in
the core and is responsible for ensuring that all Intel® QPI messages that pass through
the socket’s LLC remain coherent.
The CBo manages local conflicts by ensuring that only one request is issued to the
system for a specific cacheline.
The Intel® Xeon® Processor E5 and E7 v4 Product Families uncore contains up to 24
instances of the CBo, each assigned to manage a (up to) distinct 2.5-MB slice of the
processor’s total LLC capacity. A slice that can be up to 20-way set associative. For
processors with fewer than 24 2.5-MB LLC slices, the CBo Boxes or missing slices will
still be active and track ring traffic caused by their co-located core even if they have no
LLC related traffic to track (that is, hits/misses/snoops).
Every physical memory address in the system is uniquely associated with a single CBo
instance via a proprietary hashing algorithm that is designed to keep the distribution of
traffic across the CBo instances relatively uniform for a wide range of possible address
patterns. This enables the individual CBo instances to operate independently, each
managing its slice of the physical address space without any CBo in a given socket ever
needing to communicate with the other CBos in that same socket.
The LLC hardware performance counters are exposed to the operating system through the MSR interface. The maximum number of coherency engines supported by the Intel® Broadwell EP/EN/EX microarchitecture is 24. Your system may not have all CBOXes; LIKWID skips the unavailable ones in the setup phase. The name CBOX originates from the Nehalem EX uncore monitoring.
Counter name | Event name |
---|---|
CBOX<0-23>C0 | * |
CBOX<0-23>C1 | * |
CBOX<0-23>C2 | * |
CBOX<0-23>C3 | * |
Option | Argument | Operation | Comment |
---|---|---|---|
edgedetect | N | Set bit 18 in config register | |
invert | N | Set bit 23 in config register | |
threshold | 5 bit hex value | Set bits 24-28 in config register | |
tid | 5 bit hex value | Set bits 0-4 in MSR_UNC_C<0-23>_PMON_BOX_FILTER register | |
state | 6 bit hex value | Set bits 17-22 in MSR_UNC_C<0-23>_PMON_BOX_FILTER register | M: 0x28, F: 0x10, M: 0x08, E: 0x04, S: 0x02, I: 0x01 |
nid | 16 bit hex value | Set bits 0-15 in MSR_UNC_C<0-23>_PMON_BOX_FILTER1 register | Note: Node 0 has value 0x0001 |
opcode | 9 bit hex value | Set bits 20-28 in MSR_UNC_C<0-23>_PMON_BOX_FILTER1 register | A list of valid opcodes can be found in the Intel® Xeon E5-2600 v3 uncore Manual. |
match0 | 2 bit hex address | Set bits 30-31 in MSR_UNC_C<0-23>_PMON_BOX_FILTER1 register | See the Intel® Xeon E5-2600 v3 uncore Manual for more information. |
The Intel® Broadwell EP/EN/EX microarchitecture provides an event LLC_LOOKUP which can be filtered with the 'state' option. If no 'state' is set, LIKWID sets the state to 0x1F, the default value to measure all lookups.
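
The CBOX filter options are spread over two filter registers. The following sketch packs them according to the bit ranges in the table and applies the 0x1F default for LLC_LOOKUP. It is an illustration only: the function names and the `llc_lookup` flag are hypothetical, and the opcode used in the example is an arbitrary 9 bit value, not a recommendation.

```python
def cbox_filter0(tid=0, state=None, llc_lookup=False):
    """Value for MSR_UNC_C<n>_PMON_BOX_FILTER: tid in bits 0-4, state in bits 17-22."""
    if state is None:
        # Without an explicit 'state', LLC_LOOKUP gets 0x1F to count all lookups
        state = 0x1F if llc_lookup else 0
    if not 0 <= tid <= 0x1F:
        raise ValueError("tid must fit into 5 bits")
    if not 0 <= state <= 0x3F:
        raise ValueError("state must fit into 6 bits")
    return tid | (state << 17)


def cbox_filter1(nid=0, opcode=0, match0=0):
    """Value for MSR_UNC_C<n>_PMON_BOX_FILTER1: nid 0-15, opcode 20-28, match0 30-31."""
    if not 0 <= nid <= 0xFFFF:
        raise ValueError("nid must fit into 16 bits")
    if not 0 <= opcode <= 0x1FF:
        raise ValueError("opcode must fit into 9 bits")
    if not 0 <= match0 <= 0x3:
        raise ValueError("match0 must fit into 2 bits")
    return nid | (opcode << 20) | (match0 << 30)


if __name__ == "__main__":
    print(hex(cbox_filter0(llc_lookup=True)))          # 0x3e0000
    print(hex(cbox_filter1(nid=0x1, opcode=0x182)))    # 0x18200001
```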
The Intel® Broadwell EP/EN/EX microarchitecture provides measurements of the management box in the uncore. The description from Intel®:
The UBox serves as the system configuration controller for the Intel® Xeon® Processor E5 and E7 v4 Product Families
In this capacity, the UBox acts as the central unit for a variety of functions:
- The master for reading and writing physically distributed registers across using the Message Channel.
- The UBox is the intermediary for interrupt traffic, receiving interrupts from the system and dispatching interrupts to the appropriate core.
- The UBox serves as the system lock master used when quiescing the platform (e.g., Intel® QPI bus lock).
The single fixed-purpose counter counts the clock ticks of the uncore's clock source. The uncore management performance counters are exposed to the operating system through the MSR interface. The name UBOX originates from the Nehalem EX uncore monitoring.
Counter name | Event name |
---|---|
UBOXFIX | UNCORE_CLOCK |
The Intel® Broadwell EP/EN/EX microarchitecture provides measurements of the management box in the uncore. The description from Intel®:
The UBox serves as the system configuration controller for the Intel® Xeon® Processor E5 and E7 v4 Product Families
In this capacity, the UBox acts as the central unit for a variety of functions:
- The master for reading and writing physically distributed registers across using the Message Channel.
- The UBox is the intermediary for interrupt traffic, receiving interrupts from the system and dispatching interrupts to the appropriate core.
- The UBox serves as the system lock master used when quiescing the platform (e.g., Intel® QPI bus lock).
The uncore management performance counters are exposed to the operating system through the MSR interface. The name UBOX originates from the Nehalem EX uncore monitoring.
Counter name | Event name |
---|---|
UBOX0 | * |
UBOX1 | * |
Option | Argument | Operation | Comment |
---|---|---|---|
edgedetect | N | Set bit 18 in config register | |
invert | N | Set bit 23 in config register | |
threshold | 5 bit hex value | Set bits 24-28 in config register |
The Intel® Broadwell EP/EN/EX microarchitecture provides measurements of the power control unit (PCU) in the uncore. The description from Intel®:
The PCU is the primary Power Controller for the Intel® Xeon® Processor E5 and E7 v4
Product Families.
The uncore implements a power control unit acting as a core/uncore power and thermal
manager. It runs its firmware on an internal micro-controller and coordinates the
socket’s power states.
The PCU algorithmically governs the P-state of the processor, C-state of the core and
the package C-state of the socket. It also enables the core to go to a higher
performance state (“turbo mode”) when the proper set of conditions are met.
Conversely, the PCU will throttle the processor to a lower performance state when a
thermal violation occurs.
Through specific events, the OS and the PCU will either promote or demote the C-State
of each core by altering the voltage and frequency. The system power state (S-state) of
all the sockets in the system is managed by the server legacy bridge in coordination
with all socket PCUs.
The PCU communicates to all the other units through multiple PMLink interfaces on-die
and Message Channel to access their registers. The OS and BIOS communicates to the
PCU through standardized MSR registers and ACPI.
The PCU also acts as the interface to external management controllers via PECI and
voltage regulators (NPTM). The DMI2 interface is the communication path from the
South Bridge for system power management.
Note: Many power saving features are tracked as events in their respective units. For
example, Intel® QPI Link Power saving states and Memory CKE statistics are captured
in the Intel® QPI Perfmon and IMC Perfmon respectively.
The PCU offers two fixed-purpose counters to retrieve the number of cycles the CPU cores stay in the states C3 and C6. The PCU performance counters are exposed to the operating system through the MSR interface. The name WBOX originates from the Nehalem EX uncore monitoring.
Counter name | Event name |
---|---|
WBOX0FIX | CORES_IN_C3 |
WBOX1FIX | CORES_IN_C6 |
The Intel® Broadwell EP/EN/EX microarchitecture provides measurements of the power control unit (PCU) in the uncore. The description from Intel®:
The PCU is the primary Power Controller for the Intel® Xeon® Processor E5 and E7 v4
Product Families.
The uncore implements a power control unit acting as a core/uncore power and thermal
manager. It runs its firmware on an internal micro-controller and coordinates the
socket’s power states.
The PCU algorithmically governs the P-state of the processor, C-state of the core and
the package C-state of the socket. It also enables the core to go to a higher
performance state (“turbo mode”) when the proper set of conditions are met.
Conversely, the PCU will throttle the processor to a lower performance state when a
thermal violation occurs.
Through specific events, the OS and the PCU will either promote or demote the C-State
of each core by altering the voltage and frequency. The system power state (S-state) of
all the sockets in the system is managed by the server legacy bridge in coordination
with all socket PCUs.
The PCU communicates to all the other units through multiple PMLink interfaces on-die
and Message Channel to access their registers. The OS and BIOS communicates to the
PCU through standardized MSR registers and ACPI.
The PCU also acts as the interface to external management controllers via PECI and
voltage regulators (NPTM). The DMI2 interface is the communication path from the
South Bridge for system power management.
Note: Many power saving features are tracked as events in their respective units. For
example, Intel® QPI Link Power saving states and Memory CKE statistics are captured
in the Intel® QPI Perfmon and IMC Perfmon respectively.
The PCU performance counters are exposed to the operating system through the MSR interface. The name WBOX originates from the Nehalem EX uncore monitoring.
Counter name | Event name |
---|---|
WBOX0 | * |
WBOX1 | * |
WBOX2 | * |
WBOX3 | * |
Option | Argument | Operation | Comment |
---|---|---|---|
edgedetect | N | Set bit 18 in config register | |
invert | N | Set bit 23 in config register | |
threshold | 5 bit hex value | Set bits 24-28 in config register | |
match0 | 32 bit hex value | Set bits 0-31 in MSR_UNC_PCU_PMON_BOX_FILTER register | Band0: bits 0-7, Band1: bits 8-15, Band2: bits 16-23, Band3: bits 24-31 |
occupancy | 2 bit hex value | Set bits 14-15 in config register | Cores in C0: 0x1, in C3: 0x2, in C6: 0x3 |
occ_edgedetect | N | Set bit 31 in config register | |
occ_invert | N | Set bit 30 in config register |
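
The match0 value for the PCU packs four 8 bit bands into one 32 bit filter value. A short sketch of that packing follows; the function name `pcu_band_filter` is hypothetical, and the band values in the example are arbitrary placeholders whose meaning must be taken from the Intel® uncore manual.

```python
def pcu_band_filter(band0=0, band1=0, band2=0, band3=0):
    """Pack four 8 bit band values into a match0 value for the PCU filter."""
    value = 0
    for index, band in enumerate((band0, band1, band2, band3)):
        if not 0 <= band <= 0xFF:
            raise ValueError(f"band{index} must fit into 8 bits")
        value |= band << (8 * index)    # Band<index> occupies bits 8*index .. 8*index+7
    return value


if __name__ == "__main__":
    print(hex(pcu_band_filter(0x0C, 0x14, 0x1E, 0x28)))  # 0x281e140c
```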
The Intel® Broadwell EP/EN/EX microarchitecture provides measurements of the integrated Memory Controllers (iMC) in the uncore. The description from Intel®:
The Intel® Xeon® Processor E5 and E7 v4 Product Families integrated Memory
Controller provides the interface to DRAM and communicates to the rest of the Uncore
through the Home Agent (i.e. the IMC does not connect to the Ring).
In conjunction with the HA, the memory controller also provides a variety of RAS
features, such as ECC, lockstep, memory access retry, memory scrubbing, thermal
throttling, mirroring, and rank sparing.
The integrated Memory Controllers performance counters are exposed to the operating system through PCI interfaces. There may be two memory controllers in the system. There are four different PCI devices per memory controller, each covering one memory channel. Each channel has one fixed counter for the DRAM clock. The four channels of the first memory controller are MBOX0-3, the four channels of the second memory controller (if available) are named MBOX4-7. The name MBOX originates from the Nehalem EX uncore monitoring.
Counter name | Event name |
---|---|
MBOX<0-7>FIX | DRAM_CLOCKTICKS |
The Intel® Broadwell EP/EN/EX microarchitecture provides measurements of the integrated Memory Controllers (iMC) in the uncore. The description from Intel®:
The Intel® Xeon® Processor E5 and E7 v4 Product Families integrated Memory
Controller provides the interface to DRAM and communicates to the rest of the Uncore
through the Home Agent (i.e. the IMC does not connect to the Ring).
In conjunction with the HA, the memory controller also provides a variety of RAS
features, such as ECC, lockstep, memory access retry, memory scrubbing, thermal
throttling, mirroring, and rank sparing.
The integrated Memory Controllers performance counters are exposed to the operating system through PCI interfaces. There may be two memory controllers in the system. There are four different PCI devices per memory controller, each covering one memory channel. Each channel has four different general-purpose counters. The four channels of the first memory controller are MBOX0-3, the four channels of the second memory controller (if available) are named MBOX4-7. The name MBOX originates from the Nehalem EX uncore monitoring.
Counter name | Event name |
---|---|
MBOX<0-7>C0 | * |
MBOX<0-7>C1 | * |
MBOX<0-7>C2 | * |
MBOX<0-7>C3 | * |
Option | Argument | Operation | Comment |
---|---|---|---|
edgedetect | N | Set bit 18 in config register | |
invert | N | Set bit 23 in config register | |
threshold | 8 bit hex value | Set bits 24-31 in config register |
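
The MBOX numbering follows from the layout described above: four channels per memory controller, with the second controller continuing at MBOX4. A minimal sketch of that mapping, assuming controllers and channels are counted from zero; `mbox_counter_name` is a hypothetical helper, not a LIKWID function.

```python
def mbox_counter_name(controller, channel, counter="C0"):
    """Return the counter name for a memory controller channel.

    counter is one of 'C0'..'C3' (general-purpose) or 'FIX' (DRAM clock).
    """
    if controller not in (0, 1):
        raise ValueError("controller must be 0 or 1")
    if channel not in (0, 1, 2, 3):
        raise ValueError("channel must be 0, 1, 2 or 3")
    if counter not in ("C0", "C1", "C2", "C3", "FIX"):
        raise ValueError("unknown counter suffix")
    return f"MBOX{controller * 4 + channel}{counter}"


if __name__ == "__main__":
    # Third channel of the second memory controller
    print(mbox_counter_name(1, 2))  # MBOX6C0
```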
The Intel® Broadwell EP/EN/EX microarchitecture provides measurements of the Ring-to-QPI (R3QPI) interface in the uncore. The description from Intel®:
R3QPI is the interface between the Intel® QPI Link Layer, which packetizes requests,
and the Ring.
R3QPI is the interface between the ring and the Intel® QPI Link Layer. It is responsible
for translating between ring protocol packets and flits that are used for transmitting
data across the Intel® QPI interface. It performs credit checking between the local
Intel® QPI LL, the remote Intel® QPI LL and other agents on the local ring.
The Ring-to-QPI performance counters are exposed to the operating system through PCI interfaces. Since the RBOXes manage the traffic between the LLC-connecting ring interface on the socket and the QPI interfaces (SBOXes), their number is similar to the number of SBOXes. See the SBOX description for how many are available for which system configuration. The name RBOX originates from the Nehalem EX uncore monitoring.
Counter name | Event name |
---|---|
RBOX<0-3>C0 | * |
RBOX<0-3>C1 | * |
RBOX<0-3>C2 | * |
Option | Argument | Operation | Comment |
---|---|---|---|
edgedetect | N | Set bit 18 in config register | |
invert | N | Set bit 23 in config register | |
threshold | 8 bit hex value | Set bits 24-31 in config register |
The Intel® Broadwell EP/EN/EX microarchitecture provides measurements of the Ring-to-PCIe (R2PCIe) interface in the uncore. The description from Intel®:
R2PCIe represents the interface between the Ring and IIO traffic to/from PCIe.
The Ring-to-PCIe performance counters are exposed to the operating system through a PCI interface. Independent of the system's configuration, there is only one Ring-to-PCIe interface per CPU socket.
Counter name | Event name |
---|---|
PBOX0 | * |
PBOX1 | * |
PBOX2 | * |
PBOX3 | * |
Option | Argument | Operation | Comment |
---|---|---|---|
edgedetect | N | Set bit 18 in config register | |
invert | N | Set bit 23 in config register | |
threshold | 8 bit hex value | Set bits 24-31 in config register |
The Intel® Broadwell EP/EN/EX microarchitecture provides measurements of the IRP box in the uncore. The description from Intel®:
IRP is responsible for maintaining coherency for IIO traffic that needs to be coherent (e.g. cross-socket P2P).
The IRP box counters are exposed to the operating system through the PCI interface. The IBOX was introduced with the Intel® IvyBridge EP/EN/EX microarchitecture.
Counter name | Event name |
---|---|
IBOX<0,1>C0 | * |
IBOX<0,1>C1 | * |
Option | Argument | Operation | Comment |
---|---|---|---|
edgedetect | N | Set bit 18 in config register | |
invert | N | Set bit 23 in config register | |
threshold | 8 bit hex value | Set bits 24-31 in config register |