perf: add qcom l2 cache perf events driver
The L2 cache perf driver is named 'l2cache_counters' and can be used
with the perf tool to profile the following L2 cache events:

=> DDR read (Read-Shared, Read-Unique, Read-Clean and
   Read-Not-Shared-Dirty transactions on GNOC Interface)
=> DDR write (Write-Back, Write-Clean and Write-Evict transactions on
   GNOC Interface)
=> SNOOP Read (Read-Once, Read-Shared, Read-Unique, Read-Clean and
   Read-Not-Shared-Dirty transactions from GNOC to Cluster interface)
=> ACP Write (Write-Back, Write-Clean and Write-Evict transactions to
   ACP port of Collapsed Cluster)
=> Tenure counter (counts the L2 Low-Power mode tenure as a number of
   XO clock cycles at 19.2 MHz)
=> Low/Mid/High occurrence counters: the current tenure count is
   compared against the thresholds set for the low and mid tenure
   counters, and the occurrence counter of the category it falls into
   is incremented, e.g.:

   1. 0 < Current tenure <= Low-tenure threshold: Low-Tenure
   2. Low-tenure threshold < Current tenure <= Mid-tenure threshold: Mid-Tenure
   3. Mid-tenure threshold < Current tenure: High-Tenure

Change-Id: I9f8aedd21a92cbd6908deb5a8e4c7e32220bea74
Signed-off-by: Mukesh Ojha <mojha@codeaurora.org>
parent 3081488d36
commit 5745954a5a
4 changed files with 1283 additions and 0 deletions
63
Documentation/perf/qcom_l2_counters.txt
Normal file
@@ -0,0 +1,63 @@
Qualcomm Technologies, Inc. L2 Cache Counters
=============================================

This driver supports the L2 cache cluster counters found in
Qualcomm Technologies, Inc. processors.

There are multiple physical L2 cache clusters, each with its
own counters. Each cluster has one or more CPUs associated with it.

There is one logical L2 PMU exposed, which aggregates the results from
the physical PMUs (counters).

The driver provides a description of its available events and configuration
options in sysfs, see /sys/devices/l2cache_counters.

The "format" directory describes the format of the events.

The event format has the form 0xXXX.
Its bits are allocated as follows:

  1 bit (LSB) for the group (either the txn or the tenure counter group).
  4 bits for the serial number of the counter, ranging from 0 to 8.
  5 bits for the position of the counter enable bit in a register.
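
The field layout above can be sketched in C as follows. This is only an illustration, not the driver's code: the text states that the group bit is the LSB, but placing the serial number in bits 1-4 and the enable-bit position in bits 5-9 is an assumption, and the macro and function names (`L2_EVT_*`, `l2_event_fields`, `l2_event_encode`, `l2_event_decode`) are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed layout: only "group is the LSB" is stated in the text. */
#define L2_EVT_GROUP_SHIFT   0      /* 1 bit: txn or tenure group */
#define L2_EVT_GROUP_MASK    0x1u
#define L2_EVT_SERIAL_SHIFT  1      /* 4 bits: counter serial number (assumed position) */
#define L2_EVT_SERIAL_MASK   0xFu
#define L2_EVT_SERIAL_MAX    8u     /* serial numbers range from 0 to 8 */
#define L2_EVT_ENBIT_SHIFT   5      /* 5 bits: enable bit position (assumed position) */
#define L2_EVT_ENBIT_MASK    0x1Fu

struct l2_event_fields {
	uint32_t group;      /* 0 or 1 */
	uint32_t serial;     /* 0..8 */
	uint32_t enable_bit; /* 0..31, bit position in the enable register */
};

/* Pack the three fields into a 10-bit event code. */
static inline uint32_t l2_event_encode(struct l2_event_fields f)
{
	assert(f.group <= L2_EVT_GROUP_MASK);
	assert(f.serial <= L2_EVT_SERIAL_MAX);
	assert(f.enable_bit <= L2_EVT_ENBIT_MASK);

	return (f.group << L2_EVT_GROUP_SHIFT) |
	       (f.serial << L2_EVT_SERIAL_SHIFT) |
	       (f.enable_bit << L2_EVT_ENBIT_SHIFT);
}

/* Split an event code back into its three fields. */
static inline struct l2_event_fields l2_event_decode(uint32_t config)
{
	struct l2_event_fields f = {
		.group      = (config >> L2_EVT_GROUP_SHIFT) & L2_EVT_GROUP_MASK,
		.serial     = (config >> L2_EVT_SERIAL_SHIFT) & L2_EVT_SERIAL_MASK,
		.enable_bit = (config >> L2_EVT_ENBIT_SHIFT) & L2_EVT_ENBIT_MASK,
	};

	return f;
}
```

Under this assumed layout, group 1 with serial number 8 and enable bit 20 encodes to 0x291, which fits in the three hex digits of 0xXXX. The authoritative bit ranges are the ones reported by the "format" directory in sysfs.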

The driver provides a "cpumask" sysfs attribute which contains a mask
consisting of one CPU per cluster which will be used to handle all the PMU
events on that cluster.

Examples for use with perf:

  perf stat -e l2cache_counters/ddr_read/,l2cache_counters/ddr_write/ -a sleep 1

  perf stat -e l2cache_counters/cycles/ -C 2 sleep 1

Limitation: The driver does not support sampling, therefore "perf record" will
not work. Per-task perf sessions are not supported.

Transaction counters do not need any configuration
before monitoring.

For the tenure counter use case, the low and mid range occurrence
thresholds must first be set in sysfs for the cluster of interest, since
these occurrence counters exist for each cluster.

  echo 1 > /sys/bus/event_source/devices/l2cache_counters/which_cluster_tenure
  echo X > /sys/bus/event_source/devices/l2cache_counters/low_tenure_threshold
  echo Y > /sys/bus/event_source/devices/l2cache_counters/mid_tenure_threshold
  Here, X < Y

e.g.:

  perf stat -e l2cache_counters/low_range_occur/ \
            -e l2cache_counters/mid_range_occur/ \
            -e l2cache_counters/high_range_occur/ -C 4 sleep 10

  Performance counter stats for 'CPU(s) 4':

               7      l2cache_counters/low_range_occur/
               5      l2cache_counters/mid_range_occur/
               7      l2cache_counters/high_range_occur/

        10.204140400 seconds time elapsed
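
The way a tenure value maps to one of the three occurrence counters can be sketched in C as follows, using the rule given in the commit message. This is a minimal illustration and not the driver or hardware logic: the names `tenure_range`, `classify_tenure` and `us_to_xo_ticks` are hypothetical, and the assumption that thresholds are expressed in XO ticks (19.2 MHz) is inferred from the tenure being counted in XO cycles rather than stated outright.

```c
#include <assert.h>
#include <stdint.h>

enum tenure_range {
	TENURE_NONE, /* tenure of 0: no occurrence counter applies */
	TENURE_LOW,
	TENURE_MID,
	TENURE_HIGH,
};

/*
 * Convert microseconds to XO ticks. The XO clock runs at 19.2 MHz,
 * i.e. 19.2 ticks per microsecond, written here as 96/5 to stay in
 * integer arithmetic. Very large inputs would overflow; this sketch
 * does not guard against that.
 */
static inline uint64_t us_to_xo_ticks(uint64_t us)
{
	return us * 96u / 5u;
}

/*
 * Pick the occurrence counter that would be incremented:
 *   0 < tenure <= low_thr        -> low
 *   low_thr < tenure <= mid_thr  -> mid
 *   mid_thr < tenure             -> high
 */
static inline enum tenure_range classify_tenure(uint64_t tenure,
						uint64_t low_thr,
						uint64_t mid_thr)
{
	assert(low_thr < mid_thr); /* the documented X < Y requirement */

	if (tenure == 0)
		return TENURE_NONE;
	if (tenure <= low_thr)
		return TENURE_LOW;
	if (tenure <= mid_thr)
		return TENURE_MID;
	return TENURE_HIGH;
}
```

If the thresholds are indeed in XO ticks, `us_to_xo_ticks(1000)` gives 19200, which would be the value of X for a 1 ms low-tenure boundary.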

@@ -77,6 +77,15 @@ config QCOM_L2_PMU
	   Adds the L2 cache PMU into the perf events subsystem for
	   monitoring L2 cache events.

config QCOM_L2_COUNTERS
	bool "Qualcomm Technologies L2-cache counters (PMU)"
	depends on ARCH_QCOM && ARM64
	help
	  Provides support for the L2 cache counters
	  in Qualcomm Technologies processors.
	  Adds the L2 cache counters support into the perf events subsystem for
	  monitoring L2 cache events.

config QCOM_L3_PMU
	bool "Qualcomm Technologies L3-cache PMU"
	depends on ARCH_QCOM && ARM64 && ACPI

@@ -6,6 +6,7 @@ obj-$(CONFIG_ARM_PMU) += arm_pmu.o arm_pmu_platform.o
obj-$(CONFIG_ARM_PMU_ACPI) += arm_pmu_acpi.o
obj-$(CONFIG_HISI_PMU) += hisilicon/
obj-$(CONFIG_QCOM_L2_PMU) += qcom_l2_pmu.o
obj-$(CONFIG_QCOM_L2_COUNTERS) += qcom_l2_counters.o
obj-$(CONFIG_QCOM_L3_PMU) += qcom_l3_pmu.o
obj-$(CONFIG_QCOM_LLCC_PMU) += qcom_llcc_pmu.o
obj-$(CONFIG_XGENE_PMU) += xgene_pmu.o

1210
drivers/perf/qcom_l2_counters.c
Normal file
File diff suppressed because it is too large