Commit graph

834 commits

Author SHA1 Message Date
Yicong Yang
e480898e76 drivers/perf: hisi: Support PMUs with no interrupt
We'll have PMUs don't have an interrupt to indicate the counter
overflow, but the Uncore PMU core assume all the PMUs have
interrupt. So handle this case in the core. The existing PMUs
won't be affected.

Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Signed-off-by: Yicong Yang <yangyicong@hisilicon.com>
Link: https://lore.kernel.org/r/20250619125557.57372-7-yangyicong@huawei.com
Signed-off-by: Will Deacon <will@kernel.org>
2025-07-14 15:42:16 +01:00
Junhao He
35f5b36e8c drivers/perf: hisi: Relax the event number check of v2 PMUs
The supported event number range of each Uncore PMUs is provided by
each driver in hisi_pmu::check_event and out of range events
will be rejected. A later version with expanded event number range
needs to register the PMU with updated hisi_pmu::check_event
even if it's the only update, which means the expanded events
cannot be used unless the driver's updated. However the unsupported
events won't be counted by the hardware so we can relax the event
number check to allow the use the expanded events.

Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Signed-off-by: Junhao He <hejunhao3@huawei.com>
Signed-off-by: Yicong Yang <yangyicong@hisilicon.com>
Link: https://lore.kernel.org/r/20250619125557.57372-6-yangyicong@huawei.com
Signed-off-by: Will Deacon <will@kernel.org>
2025-07-14 15:42:16 +01:00
Junhao He
1fd20ba0a1 drivers/perf: hisi: Add support for HiSilicon SLLC v3 PMU driver
SLLC v3 PMU has the following changes compared to previous version:
a) update the register layout
b) update the definition of SRCID_CTRL and TGTID_CTRL registers.
   To be compatible with v2, we use maximum width (11 bits)
   and mask the extra length for themselves.
c) remove latency events (driver does not need to be adapted).

SLLC v3 PMU is identified with HID HISI0264.

Signed-off-by: Junhao He <hejunhao3@huawei.com>
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Signed-off-by: Yicong Yang <yangyicong@hisilicon.com>
Link: https://lore.kernel.org/r/20250619125557.57372-5-yangyicong@huawei.com
Signed-off-by: Will Deacon <will@kernel.org>
2025-07-14 15:42:16 +01:00
Junhao He
29614c55fe drivers/perf: hisi: Use ACPI driver_data to retrieve SLLC PMU information
Make use of struct acpi_device_id::driver_data for version specific
information rather than judge the version register. This will help
to simplify the probe process and also a bit easier for extension.

Factor out SLLC register definition to struct hisi_sllc_pmu_regs.
No functional changes intended.

Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Signed-off-by: Junhao He <hejunhao3@huawei.com>
Signed-off-by: Yicong Yang <yangyicong@hisilicon.com>
Link: https://lore.kernel.org/r/20250619125557.57372-4-yangyicong@huawei.com
Signed-off-by: Will Deacon <will@kernel.org>
2025-07-14 15:42:16 +01:00
Junhao He
17aa34e869 drivers/perf: hisi: Add support for HiSilicon DDRC v3 PMU driver
HiSilicon DDRC v3 PMU has the different interrupt register offset
compared to the v2. Add device information of v3 PMU with ACPI
HID HISI0235.

Signed-off-by: Junhao He <hejunhao3@huawei.com>
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Signed-off-by: Yicong Yang <yangyicong@hisilicon.com>
Link: https://lore.kernel.org/r/20250619125557.57372-3-yangyicong@huawei.com
Signed-off-by: Will Deacon <will@kernel.org>
2025-07-14 15:42:16 +01:00
Junhao He
dc86791ff6 drivers/perf: hisi: Simplify the probe process for each DDRC version
Version 1 and 2 of DDRC PMU also use different HID. Make use of
struct acpi_device_id::driver_data for version specific information
rather than judge the version register. This will help to
simplify the probe process and also a bit easier for extension.

In order to support this extend struct hisi_pmu_dev_info for version
specific counter bits and event range.

Signed-off-by: Junhao He <hejunhao3@huawei.com>
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Signed-off-by: Yicong Yang <yangyicong@hisilicon.com>
Link: https://lore.kernel.org/r/20250619125557.57372-2-yangyicong@huawei.com
Signed-off-by: Will Deacon <will@kernel.org>
2025-07-14 15:42:16 +01:00
Shouping Wang
89f0b9ccd3 perf/arm-ni: Support sharing IRQs within an NI instance
NI-700 has a distinct PMU interrupt output for each Clock Domain,
however some integrations may still combine these together externally.
The initial driver didn't attempt to support this, in anticipation of a
more general solution for IRQ sharing between system PMU instances, but
that's still a way off, so let's make this intermediate step for now to
at least allow sharing IRQs within an individual NI instance.

Now that CPU affinity and migration are cleaned up, it's fairly
straightforward to adopt similar logic to arm-cmn, to identify CDs with
a common interrupt and loop over them directly in the handler.

Signed-off-by: Shouping Wang <allen.wang@hj-micro.com>
[ rm: Rework for affinity handling, cosmetics, new commit message ]
Signed-off-by: Robin Murphy <robin.murphy@arm.com>
Link: https://lore.kernel.org/r/f62db639d3b54c959ec477db7b8ccecbef1ca310.1752256072.git.robin.murphy@arm.com
Signed-off-by: Will Deacon <will@kernel.org>
2025-07-14 15:07:51 +01:00
Robin Murphy
6a5dc6c753 perf/arm-ni: Consolidate CPU affinity handling
Since overflow interrupts from the individual PMUs are infrequent and
unlikely to coincide, and we make no attempt to balance them across
CPUs anyway, there's really not much point tracking a separate CPU
affinity per PMU. Move the CPU affinity and hotplug migration up to
the NI instance level.

Tested-by: Shouping Wang <allen.wang@hj-micro.com>
Signed-off-by: Robin Murphy <robin.murphy@arm.com>
Link: https://lore.kernel.org/r/00b622872006c2f0c89485e343b1cb8caaa79c47.1752256072.git.robin.murphy@arm.com
Signed-off-by: Will Deacon <will@kernel.org>
2025-07-14 15:07:51 +01:00
Alok Tiwari
0259de6331 perf/cxlpmu: Fix typos in cxl_pmu.c comments and documentation
Fix several minor typo errors in comments:
- Remove duplicated word "a" in "a a VID / GroupID".
- Correct "Opcopdes" to "Opcodes" in CXL spec reference.
- Fix spelling of "implemnted" to "implemented".

Improves code readability and documentation consistency.

Signed-off-by: Alok Tiwari <alok.a.tiwari@oracle.com>
Reviewed-by: Jonathan Cameron <jonathan.cameron@huawei.com>
Link: https://lore.kernel.org/r/20250624194350.109790-4-alok.a.tiwari@oracle.com
Signed-off-by: Will Deacon <will@kernel.org>
2025-07-14 13:36:27 +01:00
Alok Tiwari
3e870815cc perf/cxlpmu: Remove unintended newline from IRQ name format string
The IRQ name format string used in devm_kasprintf() mistakenly included
a newline character "\n".
This could lead to confusing log output or misformatted names in sysfs
or debug messages.

This fix removes the newline to ensure proper IRQ naming.

Signed-off-by: Alok Tiwari <alok.a.tiwari@oracle.com>
Reviewed-by: Jonathan Cameron <jonathan.cameron@huawei.com>
Link: https://lore.kernel.org/r/20250624194350.109790-3-alok.a.tiwari@oracle.com
Signed-off-by: Will Deacon <will@kernel.org>
2025-07-14 13:36:27 +01:00
Alok Tiwari
6ae58c74e7 perf/cxlpmu: Fix devm_kcalloc() argument order in cxl_pmu_probe()
The previous code mistakenly swapped the count and size parameters.
This fix corrects the argument order in devm_kcalloc() to follow the
conventional count, size form, avoiding potential confusion or bugs.

Signed-off-by: Alok Tiwari <alok.a.tiwari@oracle.com>
Reviewed-by: Jonathan Cameron <jonathan.cameron@huawei.com>
Link: https://lore.kernel.org/r/20250624194350.109790-2-alok.a.tiwari@oracle.com
Signed-off-by: Will Deacon <will@kernel.org>
2025-07-14 13:36:27 +01:00
Leo Yan
ba2ff3e1b6 perf: arm_spe: Relax period restriction
The minimum interval specified the PMSIDR_EL1.Interval field is a
hardware recommendation. However, this value is set by hardware designer
before the production. It is not actual hardware limitation but tools
currently have no way to test shorter periods.

This change relaxes the limitation by allowing any non-zero periods,
with simplifying code with clamp_t().

The downside is that small periods may increase the risk of AUX ring
buffer overruns. When an overrun occurs, the perf core layer will
trigger an irq work to disable the event and wake up the tool in user
space to read the trace data. After the tool finishes reading, it will
re-enable the AUX event.

Signed-off-by: Leo Yan <leo.yan@arm.com>
Reviewed-by: Anshuman Khandual <anshuman.khandual@arm.com>
Link: https://lore.kernel.org/r/20250627163028.3503122-1-leo.yan@arm.com
Signed-off-by: Will Deacon <will@kernel.org>
2025-07-08 18:04:03 +01:00
Rob Herring (Arm)
58074a0fce perf: arm_pmuv3: Add support for the Branch Record Buffer Extension (BRBE)
The ARMv9.2 architecture introduces the optional Branch Record Buffer
Extension (BRBE), which records information about branches as they are
executed into set of branch record registers. BRBE is similar to x86's
Last Branch Record (LBR) and PowerPC's Branch History Rolling Buffer
(BHRB).

BRBE supports filtering by exception level and can filter just the
source or target address if excluded to avoid leaking privileged
addresses. The h/w filter would be sufficient except when there are
multiple events with disjoint filtering requirements. In this case, BRBE
is configured with a union of all the events' desired branches, and then
the recorded branches are filtered based on each event's filter. For
example, with one event capturing kernel events and another event
capturing user events, BRBE will be configured to capture both kernel
and user branches. When handling event overflow, the branch records have
to be filtered by software to only include kernel or user branch
addresses for that event. In contrast, x86 simply configures LBR using
the last installed event which seems broken.

It is possible on x86 to configure branch filter such that no branches
are ever recorded (e.g. -j save_type). For BRBE, events with a
configuration that will result in no samples are rejected.

Recording branches in KVM guests is not supported like x86. However,
perf on x86 allows requesting branch recording in guests. The guest
events are recorded, but the resulting branches are all from the host.
For BRBE, events with branch recording and "exclude_host" set are
rejected. Requiring "exclude_guest" to be set did not work. The default
for the perf tool does set "exclude_guest" if no exception level
options are specified. However, specifying kernel or user events
defaults to including both host and guest. In this case, only host
branches are recorded.

BRBE can support some additional exception branch types compared to
x86. On x86, all exceptions other than syscalls are recorded as IRQ.
With BRBE, it is possible to better categorize these exceptions. One
limitation relative to x86 is we cannot distinguish a syscall return
from other exception returns. So all exception returns are recorded as
ERET type. The FIQ branch type is omitted as the only FIQ user is Apple
platforms which don't support BRBE. The debug branch types are omitted
as there is no clear need for them.

BRBE records are invalidated whenever events are reconfigured, a new
task is scheduled in, or after recording is paused (and the records
have been recorded for the event). The architecture allows branch
records to be invalidated by the PE under implementation defined
conditions. It is expected that these conditions are rare.

Cc: Catalin Marinas <catalin.marinas@arm.com>
Co-developed-by: Anshuman Khandual <anshuman.khandual@arm.com>
Signed-off-by: Anshuman Khandual <anshuman.khandual@arm.com>
Co-developed-by: Mark Rutland <mark.rutland@arm.com>
Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Tested-by: James Clark <james.clark@linaro.org>
Signed-off-by: Rob Herring (Arm) <robh@kernel.org>
tested-by: Adam Young <admiyo@os.amperecomputing.com>
Acked-by: Mark Rutland <mark.rutland@arm.com>
Link: https://lore.kernel.org/r/20250611-arm-brbe-v19-v23-4-e7775563036e@kernel.org
[will: Fix sparse warnings about mixed declarations and code.
       Fix C99 comment syntax.]
Signed-off-by: Will Deacon <will@kernel.org>
2025-07-08 17:58:49 +01:00
Robin Murphy
860a831de1 perf/arm: Add missing .suppress_bind_attrs
PMU drivers should set .suppress_bind_attrs so that userspace is denied
the opportunity to pull the driver out from underneath an in-use PMU
(with predictably unpleasant consequences). Somehow both the CMN and NI
drivers have managed to miss this; put that right.

Signed-off-by: Robin Murphy <robin.murphy@arm.com>
Reviewed-by: Leo Yan <leo.yan@arm.com>
Link: https://lore.kernel.org/r/acd48c341b33b96804a3969ee00b355d40c546e2.1751465293.git.robin.murphy@arm.com
Signed-off-by: Will Deacon <will@kernel.org>
2025-07-04 18:07:26 +01:00
Robin Murphy
a7bfae2145 perf/arm-cmn: Reduce stack usage during discovery
Arnd reports that Clang's aggressive inlining of arm_cmn_discover() can
lead to stack frame size warnings, and while we could simply prevent
such inlining to hide the issue, it seems more productive to actually
heed the warning and do something about the overall stack footprint.
The xp_region array is already rather large, and CMN_MAX_XPS might only
grow larger in future, however it only serves as a convenience to save
repeating the first level's worth of register reads in the second pass
of discovery. There's no performance concern here, and it only takes a
small tweak to the flow to re-extract the offsets instead of stashing
them, so let's just do that and save several hundred bytes of stack.

Reported-by: Arnd Bergmann <arnd@kernel.org>
Signed-off-by: Robin Murphy <robin.murphy@arm.com>
Reviewed-and-tested-by: Ilkka Koskinen <ilkka@os.amperecomputing.com>
Link: https://lore.kernel.org/r/e7dd41bf0f1b098e2e4b01ef91318a4b272abff8.1751046159.git.robin.murphy@arm.com
Signed-off-by: Will Deacon <will@kernel.org>
2025-07-04 18:06:55 +01:00
Colin Ian King
b6e37b27bf perf: imx9_perf: make the read-only array mask static const
Don't populate the read-only array mask on the stack at run time,
instead make it static const.

Signed-off-by: Colin Ian King <colin.i.king@gmail.com>
Reviewed-by: Frank Li <Frank.Li@nxp.com>
Link: https://lore.kernel.org/r/20250611133917.170888-1-colin.i.king@gmail.com
Signed-off-by: Will Deacon <will@kernel.org>
2025-07-04 18:05:47 +01:00
Zhiyuan Dai
8b177e9a4e perf/arm-cmn: Broaden module description for wider interconnect support
The current MODULE_DESCRIPTION only mentions CMN-600, but this driver
now supports several Arm mesh interconnects including CMN-650, CMN-700,
CI-700, and CMN-S3.

Update the MODULE_DESCRIPTION to reflect the expanded scope.

Signed-off-by: Zhiyuan Dai <daizhiyuan@phytium.com.cn>
Link: https://lore.kernel.org/r/20250522032122.949373-1-daizhiyuan@phytium.com.cn
Signed-off-by: Will Deacon <will@kernel.org>
2025-07-04 18:05:07 +01:00
Robin Murphy
c872d7c837 perf/arm-ni: Set initial IRQ affinity
While we do request our IRQs with the right flags to stop their affinity
changing unexpectedly, we forgot to actually set it to start with. Oops.

Cc: stable@vger.kernel.org
Fixes: 4d5a7680f2 ("perf: Add driver for Arm NI-700 interconnect PMU")
Signed-off-by: Robin Murphy <robin.murphy@arm.com>
Tested-by: Shouping Wang <allen.wang@hj-micro.com>
Link: https://lore.kernel.org/r/614ced9149ee8324e58930862bd82cbf46228d27.1747149165.git.robin.murphy@arm.com
Signed-off-by: Will Deacon <will@kernel.org>
2025-07-04 18:03:55 +01:00
Linus Torvalds
47cf96fbe3 arm64 updates for 6.16
ACPI, EFI and PSCI:
  - Decouple Arm's "Software Delegated Exception Interface" (SDEI)
    support from the ACPI GHES code so that it can be used by platforms
    booted with device-tree.
 
  - Remove unnecessary per-CPU tracking of the FPSIMD state across EFI
    runtime calls.
 
  - Fix a node refcount imbalance in the PSCI device-tree code.
 
 CPU Features:
  - Ensure register sanitisation is applied to fields in ID_AA64MMFR4.
 
  - Expose AIDR_EL1 to userspace via sysfs, primarily so that KVM guests
    can reliably query the underlying CPU types from the VMM.
 
  - Re-enabling of SME support (CONFIG_ARM64_SME) as a result of fixes
    to our context-switching, signal handling and ptrace code.
 
 Entry code:
  - Hook up TIF_NEED_RESCHED_LAZY so that CONFIG_PREEMPT_LAZY can be
    selected.
 
 Memory management:
  - Prevent BSS exports from being used by the early PI code.
 
  - Propagate level and stride information to the low-level TLB
    invalidation routines when operating on hugetlb entries.
 
  - Use the page-table contiguous hint for vmap() mappings with
    VM_ALLOW_HUGE_VMAP where possible.
 
  - Optimise vmalloc()/vmap() page-table updates to use "lazy MMU mode"
    and hook this up on arm64 so that the trailing DSB (used to publish
    the updates to the hardware walker) can be deferred until the end of
    the mapping operation.
 
  - Extend mmap() randomisation for 52-bit virtual addresses (on par with
    48-bit addressing) and remove limited support for randomisation of
    the linear map.
 
 Perf and PMUs:
  - Add support for probing the CMN-S3 driver using ACPI.
 
  - Minor driver fixes to the CMN, Arm-NI and amlogic PMU drivers.
 
 Selftests:
  - Fix FPSIMD and SME tests to align with the freshly re-enabled SME
    support.
 
  - Fix default setting of the OUTPUT variable so that tests are
    installed in the right location.
 
 vDSO:
  - Replace raw counter access from inline assembly code with a call to
    the the __arch_counter_get_cntvct() helper function.
 
 Miscellaneous:
  - Add some missing header inclusions to the CCA headers.
 
  - Rework rendering of /proc/cpuinfo to follow the x86-approach and
    avoid repeated buffer expansion (the user-visible format remains
    identical).
 
  - Remove redundant selection of CONFIG_CRC32
 
  - Extend early error message when failing to map the device-tree blob.
 -----BEGIN PGP SIGNATURE-----
 
 iQFEBAABCgAuFiEEPxTL6PPUbjXGY88ct6xw3ITBYzQFAmg1uTgQHHdpbGxAa2Vy
 bmVsLm9yZwAKCRC3rHDchMFjNFv2CAC9S5OW0btOAo7V/LFBpLhJM3hdIV6Sn6N1
 d/K5znuqPBG6VPfBrshaZltEl/C3U8KG4H8xrlX5cSo7CRuf3DgVBw3kiZ6ERZj6
 1gnKR54juA1oWhcroPl0s76ETWj3N4gO036u2qOhWNAYflDunh1+bCIGJkG4H/yP
 wqtWn974YUbad/zQJSbG3IMO1yvxZ/PsNpVF8HjyQ0/ZPWsYTscrhNQ0hWro17sR
 CTcUaGxH4GrXW24EGNgkLB9aq67X2rtGGtaIlp5oFl8FuLklc7TYbPwJp8cPCTNm
 0Sp0mpuR9M675pYIKoCI9m5twc46znRIKmbXi5LvPd77418y3jTf
 =03N4
 -----END PGP SIGNATURE-----

Merge tag 'arm64-upstream' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux

Pull arm64 updates from Will Deacon:
 "The headline feature is the re-enablement of support for Arm's
  Scalable Matrix Extension (SME) thanks to a bumper crop of fixes
  from Mark Rutland.

  If matrices aren't your thing, then Ryan's page-table optimisation
  work is much more interesting.

  Summary:

  ACPI, EFI and PSCI:

   - Decouple Arm's "Software Delegated Exception Interface" (SDEI)
     support from the ACPI GHES code so that it can be used by platforms
     booted with device-tree

   - Remove unnecessary per-CPU tracking of the FPSIMD state across EFI
     runtime calls

   - Fix a node refcount imbalance in the PSCI device-tree code

  CPU Features:

   - Ensure register sanitisation is applied to fields in ID_AA64MMFR4

   - Expose AIDR_EL1 to userspace via sysfs, primarily so that KVM
     guests can reliably query the underlying CPU types from the VMM

   - Re-enabling of SME support (CONFIG_ARM64_SME) as a result of fixes
     to our context-switching, signal handling and ptrace code

  Entry code:

   - Hook up TIF_NEED_RESCHED_LAZY so that CONFIG_PREEMPT_LAZY can be
     selected

  Memory management:

   - Prevent BSS exports from being used by the early PI code

   - Propagate level and stride information to the low-level TLB
     invalidation routines when operating on hugetlb entries

   - Use the page-table contiguous hint for vmap() mappings with
     VM_ALLOW_HUGE_VMAP where possible

   - Optimise vmalloc()/vmap() page-table updates to use "lazy MMU mode"
     and hook this up on arm64 so that the trailing DSB (used to publish
     the updates to the hardware walker) can be deferred until the end
     of the mapping operation

   - Extend mmap() randomisation for 52-bit virtual addresses (on par
     with 48-bit addressing) and remove limited support for
     randomisation of the linear map

  Perf and PMUs:

   - Add support for probing the CMN-S3 driver using ACPI

   - Minor driver fixes to the CMN, Arm-NI and amlogic PMU drivers

  Selftests:

   - Fix FPSIMD and SME tests to align with the freshly re-enabled SME
     support

   - Fix default setting of the OUTPUT variable so that tests are
     installed in the right location

  vDSO:

   - Replace raw counter access from inline assembly code with a call to
     the the __arch_counter_get_cntvct() helper function

  Miscellaneous:

   - Add some missing header inclusions to the CCA headers

   - Rework rendering of /proc/cpuinfo to follow the x86-approach and
     avoid repeated buffer expansion (the user-visible format remains
     identical)

   - Remove redundant selection of CONFIG_CRC32

   - Extend early error message when failing to map the device-tree
     blob"

* tag 'arm64-upstream' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux: (83 commits)
  arm64: cputype: Add cputype definition for HIP12
  arm64: el2_setup.h: Make __init_el2_fgt labels consistent, again
  perf/arm-cmn: Add CMN S3 ACPI binding
  arm64/boot: Disallow BSS exports to startup code
  arm64/boot: Move global CPU override variables out of BSS
  arm64/boot: Move init_pgdir[] and init_idmap_pgdir[] into __pi_ namespace
  perf/arm-cmn: Initialise cmn->cpu earlier
  kselftest/arm64: Set default OUTPUT path when undefined
  arm64: Update comment regarding values in __boot_cpu_mode
  arm64: mm: Drop redundant check in pmd_trans_huge()
  arm64/mm: Re-organise setting up FEAT_S1PIE registers PIRE0_EL1 and PIR_EL1
  arm64/mm: Permit lazy_mmu_mode to be nested
  arm64/mm: Disable barrier batching in interrupt contexts
  arm64/cpuinfo: only show one cpu's info in c_show()
  arm64/mm: Batch barriers when updating kernel mappings
  mm/vmalloc: Enter lazy mmu mode while manipulating vmalloc ptes
  arm64/mm: Support huge pte-mapped pages in vmap
  mm/vmalloc: Gracefully unmap huge ptes
  mm/vmalloc: Warn on improper use of vunmap_range()
  arm64/mm: Hoist barriers out of set_ptes_anysz() loop
  ...
2025-05-28 14:55:35 -07:00
Kan Liang
f1a6fe2ab1 perf/apple_m1: Remove driver-specific throttle support
The throttle support has been added in the generic code. Remove
the driver-specific throttle support.

Besides the throttle, perf_event_overflow may return true because of
event_limit. It already does an inatomic event disable. The pmu->stop
is not required either.

Signed-off-by: Kan Liang <kan.liang@linux.intel.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Link: https://lore.kernel.org/r/20250520181644.2673067-10-kan.liang@linux.intel.com
2025-05-21 13:57:44 +02:00
Kan Liang
1507376528 perf/arm: Remove driver-specific throttle support
The throttle support has been added in the generic code. Remove
the driver-specific throttle support.

Besides the throttle, perf_event_overflow may return true because of
event_limit. It already does an inatomic event disable. The pmu->stop
is not required either.

Signed-off-by: Kan Liang <kan.liang@linux.intel.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Tested-by: Leo Yan <leo.yan@arm.com>
Link: https://lore.kernel.org/r/20250520181644.2673067-9-kan.liang@linux.intel.com
2025-05-21 13:57:44 +02:00
Robin Murphy
8c138a189f perf/arm-cmn: Add CMN S3 ACPI binding
An ACPI binding for CMN S3 was not yet finalised when the driver support
was originally written, but v1.2 of DEN0093 "ACPI for Arm Components"
has at last been published; support ACPI systems using the proper HID.

Cc: stable@vger.kernel.org
Fixes: 0dc2f4963f ("perf/arm-cmn: Support CMN S3")
Signed-off-by: Robin Murphy <robin.murphy@arm.com>
Link: https://lore.kernel.org/r/7dafe147f186423020af49d7037552ee59c60e97.1747652164.git.robin.murphy@arm.com
Signed-off-by: Will Deacon <will@kernel.org>
2025-05-19 12:31:54 +01:00
Robin Murphy
597704e201 perf/arm-cmn: Initialise cmn->cpu earlier
For all the complexity of handling affinity for CPU hotplug, what we've
apparently managed to overlook is that arm_cmn_init_irqs() has in fact
always been setting the *initial* affinity of all IRQs to CPU 0, not the
CPU we subsequently choose for event scheduling. Oh dear.

Cc: stable@vger.kernel.org
Fixes: 0ba64770a2 ("perf: Add Arm CMN-600 PMU driver")
Signed-off-by: Robin Murphy <robin.murphy@arm.com>
Reviewed-by: Ilkka Koskinen <ilkka@os.amperecomputing.com>
Link: https://lore.kernel.org/r/b12fccba6b5b4d2674944f59e4daad91cd63420b.1747069914.git.robin.murphy@arm.com
Signed-off-by: Will Deacon <will@kernel.org>
2025-05-16 15:18:58 +01:00
Anand Moon
097469a2b0 perf/amlogic: Replace smp_processor_id() with raw_smp_processor_id() in meson_ddr_pmu_create()
The Amlogic DDR PMU driver meson_ddr_pmu_create() function incorrectly uses
smp_processor_id(), which assumes disabled preemption. This leads to kernel
warnings during module loading because meson_ddr_pmu_create() can be called
in a preemptible context.

Following kernel warning and stack trace:
[   31.745138] [   T2289] BUG: using smp_processor_id() in preemptible [00000000] code: (udev-worker)/2289
[   31.745154] [   T2289] caller is debug_smp_processor_id+0x28/0x38
[   31.745172] [   T2289] CPU: 4 UID: 0 PID: 2289 Comm: (udev-worker) Tainted: GW 6.14.0-0-MANJARO-ARM #1 59519addcbca6ba8de735e151fd7b9e97aac7ff0
[   31.745181] [   T2289] Tainted: [W]=WARN
[   31.745183] [   T2289] Hardware name: Hardkernel ODROID-N2Plus (DT)
[   31.745188] [   T2289] Call trace:
[   31.745191] [   T2289]  show_stack+0x28/0x40 (C)
[   31.745199] [   T2289]  dump_stack_lvl+0x4c/0x198
[   31.745205] [   T2289]  dump_stack+0x20/0x50
[   31.745209] [   T2289]  check_preemption_disabled+0xec/0xf0
[   31.745213] [   T2289]  debug_smp_processor_id+0x28/0x38
[   31.745216] [   T2289]  meson_ddr_pmu_create+0x200/0x560 [meson_ddr_pmu_g12 8095101c49676ad138d9961e3eddaee10acca7bd]
[   31.745237] [   T2289]  g12_ddr_pmu_probe+0x20/0x38 [meson_ddr_pmu_g12 8095101c49676ad138d9961e3eddaee10acca7bd]
[   31.745246] [   T2289]  platform_probe+0x98/0xe0
[   31.745254] [   T2289]  really_probe+0x144/0x3f8
[   31.745258] [   T2289]  __driver_probe_device+0xb8/0x180
[   31.745261] [   T2289]  driver_probe_device+0x54/0x268
[   31.745264] [   T2289]  __driver_attach+0x11c/0x288
[   31.745267] [   T2289]  bus_for_each_dev+0xfc/0x160
[   31.745274] [   T2289]  driver_attach+0x34/0x50
[   31.745277] [   T2289]  bus_add_driver+0x160/0x2b0
[   31.745281] [   T2289]  driver_register+0x78/0x120
[   31.745285] [   T2289]  __platform_driver_register+0x30/0x48
[   31.745288] [   T2289]  init_module+0x30/0xfe0 [meson_ddr_pmu_g12 8095101c49676ad138d9961e3eddaee10acca7bd]
[   31.745298] [   T2289]  do_one_initcall+0x11c/0x438
[   31.745303] [   T2289]  do_init_module+0x68/0x228
[   31.745311] [   T2289]  load_module+0x118c/0x13a8
[   31.745315] [   T2289]  __arm64_sys_finit_module+0x274/0x390
[   31.745320] [   T2289]  invoke_syscall+0x74/0x108
[   31.745326] [   T2289]  el0_svc_common+0x90/0xf8
[   31.745330] [   T2289]  do_el0_svc+0x2c/0x48
[   31.745333] [   T2289]  el0_svc+0x60/0x150
[   31.745337] [   T2289]  el0t_64_sync_handler+0x80/0x118
[   31.745341] [   T2289]  el0t_64_sync+0x1b8/0x1c0

Changes replaces smp_processor_id() with raw_smp_processor_id() to
ensure safe CPU ID retrieval in preemptible contexts.

Cc: Jiucheng Xu <jiucheng.xu@amlogic.com>
Fixes: 2016e2113d ("perf/amlogic: Add support for Amlogic meson G12 SoC DDR PMU driver")
Signed-off-by: Anand Moon <linux.amoon@gmail.com>
Link: https://lore.kernel.org/r/20250407063206.5211-1-linux.amoon@gmail.com
Signed-off-by: Will Deacon <will@kernel.org>
2025-05-09 12:16:31 +01:00
Robin Murphy
11b0f576e0 perf/arm-cmn: Fix REQ2/SNP2 mixup
Somehow the encodings for REQ2/SNP2 channels in XP events
got mixed up... Unmix them.

CC: stable@vger.kernel.org
Fixes: 23760a0144 ("perf/arm-cmn: Add CMN-700 support")
Signed-off-by: Robin Murphy <robin.murphy@arm.com>
Link: https://lore.kernel.org/r/087023e9737ac93d7ec7a841da904758c254cb01.1746717400.git.robin.murphy@arm.com
Signed-off-by: Will Deacon <will@kernel.org>
2025-05-09 12:15:31 +01:00
Krzysztof Kozlowski
70cbcb2850 perf: Do not enable by default during compile testing
Enabling the compile test should not cause automatic enabling of all
drivers, but only allow to choose to compile them.

Signed-off-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org>
Link: https://lore.kernel.org/r/20250417074650.81561-1-krzysztof.kozlowski@linaro.org
Signed-off-by: Will Deacon <will@kernel.org>
2025-04-17 14:28:56 +01:00
Hongbo Yao
fc5106088d perf: arm-ni: Fix missing platform_set_drvdata()
Add missing platform_set_drvdata in arm_ni_probe(), otherwise
calling platform_get_drvdata() in remove returns NULL.

Fixes: 4d5a7680f2 ("perf: Add driver for Arm NI-700 interconnect PMU")
Signed-off-by: Hongbo Yao <andy.xu@hj-micro.com>
Reviewed-by: Robin Murphy <robin.murphy@arm.com>
Link: https://lore.kernel.org/r/20250401054248.3985814-1-andy.xu@hj-micro.com
Signed-off-by: Will Deacon <will@kernel.org>
2025-04-17 14:27:34 +01:00
Hongbo Yao
7f57afde6a perf: arm-ni: Unregister PMUs on probe failure
When a resource allocation fails in one clock domain of an NI device,
we need to properly roll back all previously registered perf PMUs in
other clock domains of the same device.

Otherwise, it can lead to kernel panics.

Calling arm_ni_init+0x0/0xff8 [arm_ni] @ 2374
arm-ni ARMHCB70:00: Failed to request PMU region 0x1f3c13000
arm-ni ARMHCB70:00: probe with driver arm-ni failed with error -16
list_add corruption: next->prev should be prev (fffffd01e9698a18),
but was 0000000000000000. (next=ffff10001a0decc8).
pstate: 6340009 (nZCv daif +PAN -UAO +TCO +DIT -SSBS BTYPE=--)
pc : list_add_valid_or_report+0x7c/0xb8
lr : list_add_valid_or_report+0x7c/0xb8
Call trace:
 __list_add_valid_or_report+0x7c/0xb8
 perf_pmu_register+0x22c/0x3a0
 arm_ni_probe+0x554/0x70c [arm_ni]
 platform_probe+0x70/0xe8
 really_probe+0xc6/0x4d8
 driver_probe_device+0x48/0x170
 __driver_attach+0x8e/0x1c0
 bus_for_each_dev+0x64/0xf0
 driver_add+0x138/0x260
 bus_add_driver+0x68/0x138
 __platform_driver_register+0x2c/0x40
 arm_ni_init+0x14/0x2a [arm_ni]
 do_init_module+0x36/0x298
---[ end trace 0000000000000000 ]---
Kernel panic - not syncing: Oops - BUG: Fatal exception
SMP: stopping secondary CPUs

Fixes: 4d5a7680f2 ("perf: Add driver for Arm NI-700 interconnect PMU")
Signed-off-by: Hongbo Yao <andy.xu@hj-micro.com>
Reviewed-by: Robin Murphy <robin.murphy@arm.com>
Link: https://lore.kernel.org/r/20250403070918.4153839-1-andy.xu@hj-micro.com
Signed-off-by: Will Deacon <will@kernel.org>
2025-04-17 14:21:22 +01:00
Robin Murphy
674cd77402 perf/arm-cmn: Remove CMN-600 DTC domain special case
The special case for trying to infer the DTC domain for DTC-adjacent
nodes on CMN-600 is fragile and buggy - currently resulting in subtly
messed up DTC counter allocation - and the theoretical benefit it
offers to a tiny minority of use-cases arguably doesn't outweigh the
inconsistency it offers to others anyway. Just get rid of it.

Fixes: ab33c66fd8 ("perf/arm-cmn: Enable per-DTC counter allocation")
Signed-off-by: Robin Murphy <robin.murphy@arm.com>
Link: https://lore.kernel.org/r/67985e39f53b56385d79a4f1264cf7f9cacedb58.1742308248.git.robin.murphy@arm.com
Signed-off-by: Will Deacon <will@kernel.org>
2025-04-17 14:08:10 +01:00
Linus Torvalds
7d06015d93 pci-v6.15-changes
-----BEGIN PGP SIGNATURE-----
 
 iQJIBAABCgAyFiEEgMe7l+5h9hnxdsnuWYigwDrT+vwFAmfmt1AUHGJoZWxnYWFz
 QGdvb2dsZS5jb20ACgkQWYigwDrT+vxqNw//cC1BlVe4UUVR5r9nfpoFeGAZeDJz
 32naGvZKjwL0tR6dStO/BEZx4QrBp+smVfJfuxtQxfzLHLgMigM2jVhfa7XUmaun
 7yZlJZu4Jmydc57sPf56CFOYOFP6zyPzSaE8u1Eb4IIqpvuoYpvDayDt6PSsLmFS
 PDzqmicT3nuNbbcfE4rYLyL6JsXooKCR1h+NNcDjy7Run9DvQbE6N2PPvXCu6O97
 aC3+kYUydEpgn9DfjBDghe+GBQCkBPldwnWqXBxKDFmYj5bKFujNccS9/IDSEuuX
 oWntDRAXgLWg048sBC1AuJQajF3UaqffRGJkzUBaZWbU/jB9t5N/Z3GpYlXzizRx
 CAqnt/ciGUKVbaESKwoeeIqgK+wG1bnrmoEaJHXFGqjr6sjm2A2T5EzyBMJ1hFwE
 wUq6SDnkp5igG7rWtsBPo/lGa5h/pNlaXng11570ikD2ZfHVfRgwy2MpXYxChrkt
 X2q/lRYU7yyfNJQ8O5LQJ6bYztatjxT0TxXNFv+cxfVrdI7vnQMuaeBE352jn+Lo
 aZ8fTJDFbRXtrZolcoetZjBdHwPIS42wYQtvxo/ylUl64xKEzZEzN09XODjwn74v
 nuOzDtbM0TAjZWxi6bwRFPnemTEAQxPuv1i4VdPQ7yblvC0at2agNBsz6Fdq60GS
 s80YMJVUU0UcVoc=
 =gM1i
 -----END PGP SIGNATURE-----

Merge tag 'pci-v6.15-changes' of git://git.kernel.org/pub/scm/linux/kernel/git/pci/pci

Pull pci updates from Bjorn Helgaas:
 "Enumeration:

   - Enable Configuration RRS SV, which makes device readiness visible,
     early instead of during child bus scanning (Bjorn Helgaas)

   - Log debug messages about reset methods being used (Bjorn Helgaas)

   - Avoid reset when it has been disabled via sysfs (Nishanth
     Aravamudan)

   - Add common pci-ep-bus.yaml schema for exporting several peripherals
     of a single PCI function via devicetree (Andrea della Porta)

   - Create DT nodes for PCI host bridges to enable loading device tree
     overlays to create platform devices for PCI devices that have
     several features that require multiple drivers (Herve Codina)

  Resource management:

   - Enlarge devres table[] to accommodate bridge windows, ROM, IOV
     BARs, etc., and validate BAR index in devres interfaces (Philipp
     Stanner)

   - Fix typo that repeatedly distributed resources to a bridge instead
     of iterating over subordinate bridges, which resulted in too little
     space to assign some BARs (Kai-Heng Feng)

   - Relax bridge window tail sizing for optional resources, e.g., IOV
     BARs, to avoid failures when removing and re-adding devices (Ilpo
     Järvinen)

   - Allow drivers to enable devices even if we haven't assigned
     optional IOV resources to them (Ilpo Järvinen)

   - Rework handling of optional resources (IOV BARs, ROMs) to reduce
     failures if we can't allocate them (Ilpo Järvinen)

   - Fix a NULL dereference in the SR-IOV VF creation error path (Shay
     Drory)

   - Fix s390 mmio_read/write syscalls, which didn't cause page faults
     in some cases, which broke vfio-pci lazy mapping on first access
     (Niklas Schnelle)

   - Add pdev->non_mappable_bars to replace CONFIG_VFIO_PCI_MMAP, which
     was disabled only for s390 (Niklas Schnelle)

   - Support mmap of PCI resources on s390 except for ISM devices
     (Niklas Schnelle)

  ASPM:

   - Delay pcie_link_state deallocation to avoid dangling pointers that
     cause invalid references during hot-unplug (Daniel Stodden)

  Power management:

   - Allow PCI bridges to go to D3Hot when suspending on all non-x86
     systems (Manivannan Sadhasivam)

  Power control:

   - Create pwrctrl devices in pci_scan_device() to make it more
     symmetric with pci_pwrctrl_unregister() and make pwrctrl devices
     for PCI bridges possible (Manivannan Sadhasivam)

   - Unregister pwrctrl devices in pci_destroy_dev() so DOE, ASPM, etc.
     can still access devices after pci_stop_dev() (Manivannan
     Sadhasivam)

   - If there's a pwrctrl device for a PCI device, skip scanning it
     because the pwrctrl core will rescan the bus after the device is
     powered on (Manivannan Sadhasivam)

   - Add a pwrctrl driver for PCI slots based on voltage regulators
     described via devicetree (Manivannan Sadhasivam)

  Bandwidth control:

   - Add set_pcie_speed.sh to TEST_PROGS to fix issue when executing the
     set_pcie_cooling_state.sh test case (Yi Lai)

   - Avoid a NULL pointer dereference when we run out of bus numbers to
     assign for a bridge secondary bus (Lukas Wunner)

  Hotplug:

   - Drop superfluous pci_hotplug_slot_list, try_module_get() calls, and
     NULL pointer checks (Lukas Wunner)

   - Drop shpchp module init/exit logging, replace shpchp dbg() with
     ctrl_dbg(), and remove unused dbg(), err(), info(), warn() wrappers
     (Ilpo Järvinen)

   - Drop 'shpchp_debug' module parameter in favor of standard dynamic
     debugging (Ilpo Järvinen)

   - Drop unused cpcihp .get_power(), .set_power() function pointers
     (Guilherme Giacomo Simoes)

   - Disable hotplug interrupts in portdrv only when pciehp is not
     enabled to avoid issuing two hotplug commands too close together
     (Feng Tang)

   - Skip pciehp 'device replaced' check if the device has been removed
     to address a deadlock when resuming after a device was removed
     during system sleep (Lukas Wunner)

   - Don't enable pciehp hotplug interupt when resuming in poll mode
     (Ilpo Järvinen)

  Virtualization:

   - Fix bugs in 'pci=config_acs=' kernel command line parameter (Tushar
     Dave)

  DOE:

   - Expose supported DOE features via sysfs (Alistair Francis)

   - Allow DOE support to be enabled even if CXL isn't enabled (Alistair
     Francis)

  Endpoint framework:

   - Convert PCI device data so pci-epf-test works correctly on
     big-endian endpoint systems (Niklas Cassel)

   - Add BAR_RESIZABLE type to endpoint framework and add DWC core
     support for EPF drivers to set BAR_RESIZABLE type and size (Niklas
     Cassel)

   - Fix pci-epf-test double free that causes an oops if the host
     reboots and PERST# deassertion restarts endpoint BAR allocation
     (Christian Bruel)

   - Fix endpoint BAR testing so tests can skip disabled BARs instead of
     reporting them as failures (Niklas Cassel)

   - Widen endpoint test BAR size variable to accommodate BARs larger
     than INT_MAX (Niklas Cassel)

   - Remove unused tools 'pci' build target left over after moving tests
     to tools/testing/selftests/pci_endpoint (Jianfeng Liu)

  Altera PCIe controller driver:

   - Add DT binding and driver support for Agilex family (P-Tile,
     F-Tile, R-Tile) (Matthew Gerlach and D M, Sharath Kumar)

  AMD MDB PCIe controller driver:

   - Add DT binding and driver for AMD MDB (Multimedia DMA Bridge)
     (Thippeswamy Havalige)

  Broadcom STB PCIe controller driver:

   - Add BCM2712 MSI-X DT binding and interrupt controller drivers and
     add softdep on irq_bcm2712_mip driver to ensure that it is loaded
     first (Stanimir Varbanov)

   - Expand inbound window map to 64GB so it can accommodate BCM2712
     (Stanimir Varbanov)

   - Add BCM2712 support and DT updates (Stanimir Varbanov)

   - Apply link speed restriction before bringing link up, not after
     (Jim Quinlan)

   - Update Max Link Speed in Link Capabilities via the internal
     writable register, not the read-only config register (Jim Quinlan)

   - Handle regulator_bulk_get() error to avoid panic when we call
     regulator_bulk_free() later (Jim Quinlan)

   - Disable regulators only when removing the bus immediately below a
     Root Port because we don't support regulators deeper in the
     hierarchy (Jim Quinlan)

   - Make const read-only arrays static (Colin Ian King)

  Cadence PCIe endpoint driver:

   - Correct MSG TLP generation so endpoints can generate INTx messages
     (Hans Zhang)

  Freescale i.MX6 PCIe controller driver:

   - Identify the second controller on i.MX8MQ based on devicetree
     'linux,pci-domain' instead of DBI 'reg' address (Richard Zhu)

   - Remove imx_pcie_cpu_addr_fixup() since dwc core can now derive the
     ATU input address (using parent_bus_offset) from devicetree (Frank
     Li)

  Freescale Layerscape PCIe controller driver:

   - Drop deprecated 'num-ib-windows' and 'num-ob-windows' and
     unnecessary 'status' from example (Krzysztof Kozlowski)

   - Correct the syscon_regmap_lookup_by_phandle_args("fsl,pcie-scfg")
     arg_count to fix probe failure on LS1043A (Ioana Ciornei)

  HiSilicon STB PCIe controller driver:

   - Call phy_exit() to clean up if histb_pcie_probe() fails (Christophe
     JAILLET)

  Intel Gateway PCIe controller driver:

   - Remove intel_pcie_cpu_addr() since dwc core can now derive the ATU
     input address (using parent_bus_offset) from devicetree (Frank Li)

  Intel VMD host bridge driver:

   - Convert vmd_dev.cfg_lock from spinlock_t to raw_spinlock_t so
     pci_ops.read() will never sleep, even on PREEMPT_RT where
     spinlock_t becomes a sleepable lock, to avoid calling a sleeping
     function from invalid context (Ryo Takakura)

  MediaTek PCIe Gen3 controller driver:

   - Remove leftover mac_reset assert for Airoha EN7581 SoC (Lorenzo
     Bianconi)

   - Add EN7581 PBUS controller 'mediatek,pbus-csr' DT property and
     program host bridge memory aperture to this syscon node (Lorenzo
     Bianconi)

  Qualcomm PCIe controller driver:

   - Add qcom,pcie-ipq5332 binding (Varadarajan Narayanan)

   - Add qcom i.MX8QM and i.MX8QXP/DXP optional DMA interrupt (Alexander
     Stein)

   - Add optional dma-coherent DT property for Qualcomm SA8775P (Dmitry
     Baryshkov)

   - Make DT iommu property required for SA8775P and prohibited for
     SDX55 (Dmitry Baryshkov)

   - Add DT IOMMU and DMA-related properties for Qualcomm SM8450 (Dmitry
     Baryshkov)

   - Add endpoint DT properties for SAR2130P and enable endpoint mode in
     driver (Dmitry Baryshkov)

   - Describe endpoint BAR0 and BAR2 as 64-bit only and BAR1 and BAR3 as
     RESERVED (Manivannan Sadhasivam)

  Rockchip DesignWare PCIe controller driver:

   - Describe rk3568 and rk3588 BARs as Resizable, not Fixed (Niklas
     Cassel)

  Synopsys DesignWare PCIe controller driver:

   - Add debugfs-based Silicon Debug, Error Injection, Statistical
     Counter support for DWC (Shradha Todi)

   - Add debugfs property to expose LTSSM status of DWC PCIe link (Hans
     Zhang)

   - Add Rockchip support for DWC debugfs features (Niklas Cassel)

   - Add dw_pcie_parent_bus_offset() to look up the parent bus address
     of a specified 'reg' property and return the offset from the CPU
     physical address (Frank Li)

   - Use dw_pcie_parent_bus_offset() to derive CPU -> ATU addr offset
     via 'reg[config]' for host controllers and 'reg[addr_space]' for
     endpoint controllers (Frank Li)

   - Apply struct dw_pcie.parent_bus_offset in ATU users to remove use
     of .cpu_addr_fixup() when programming ATU (Frank Li)

  TI J721E PCIe driver:

   - Correct the 'link down' interrupt bit for J784S4 (Siddharth
     Vadapalli)

  TI Keystone PCIe controller driver:

   - Describe AM65x BARs 2 and 5 as Resizable (not Fixed) and reduce
     alignment requirement from 1MB to 64KB (Niklas Cassel)

  Xilinx Versal CPM PCIe controller driver:

   - Free IRQ domain in probe error path to avoid leaking it
     (Thippeswamy Havalige)

   - Add DT .compatible "xlnx,versal-cpm5nc-host" and driver support for
     Versal Net CPM5NC Root Port controller (Thippeswamy Havalige)

   - Add driver support for CPM5_HOST1 (Thippeswamy Havalige)

  Miscellaneous:

   - Convert fsl,mpc83xx-pcie binding to YAML (J. Neuschäfer)

   - Use for_each_available_child_of_node_scoped() to simplify apple,
     kirin, mediatek, mt7621, tegra drivers (Zhang Zekun)"

* tag 'pci-v6.15-changes' of git://git.kernel.org/pub/scm/linux/kernel/git/pci/pci: (197 commits)
  PCI: layerscape: Fix arg_count to syscon_regmap_lookup_by_phandle_args()
  PCI: j721e: Fix the value of .linkdown_irq_regfield for J784S4
  misc: pci_endpoint_test: Add support for PCITEST_IRQ_TYPE_AUTO
  PCI: endpoint: pci-epf-test: Expose supported IRQ types in CAPS register
  PCI: dw-rockchip: Endpoint mode cannot raise INTx interrupts
  PCI: endpoint: Add intx_capable to epc_features struct
  dt-bindings: PCI: Add common schema for devices accessible through PCI BARs
  PCI: intel-gw: Remove intel_pcie_cpu_addr()
  PCI: imx6: Remove imx_pcie_cpu_addr_fixup()
  PCI: dwc: Use parent_bus_offset to remove need for .cpu_addr_fixup()
  PCI: dwc: ep: Ensure proper iteration over outbound map windows
  PCI: dwc: ep: Use devicetree 'reg[addr_space]' to derive CPU -> ATU addr offset
  PCI: dwc: ep: Consolidate devicetree handling in dw_pcie_ep_get_resources()
  PCI: dwc: ep: Call epc_create() early in dw_pcie_ep_init()
  PCI: dwc: Use devicetree 'reg[config]' to derive CPU -> ATU addr offset
  PCI: dwc: Add dw_pcie_parent_bus_offset() checking and debug
  PCI: dwc: Add dw_pcie_parent_bus_offset()
  PCI/bwctrl: Fix NULL pointer dereference on bus number exhaustion
  PCI: xilinx-cpm: Add cpm_csr register mapping for CPM5_HOST1 variant
  PCI: brcmstb: Make const read-only arrays static
  ...
2025-03-28 19:36:53 -07:00
Linus Torvalds
054570267d lsm/stable-6.15 PR 20250323
-----BEGIN PGP SIGNATURE-----
 
 iQJIBAABCAAyFiEES0KozwfymdVUl37v6iDy2pc3iXMFAmfgWgMUHHBhdWxAcGF1
 bC1tb29yZS5jb20ACgkQ6iDy2pc3iXNW5RAAvCDq5gBtY0aTNlULe637EVLSh+t8
 PkSzHzu/NlzU6BfjtwSm2fuML8welTGxSwUPxUzMCI91gPdkGeFktefavT3xa+QI
 BHWROn7fEJ/KmRZvngPeIkgLr5xhF5nBJmc/Jw71qem20zRzNgJnpzMX16d10Phx
 dxd2xOO1qM3bv6Z9RcIssZRGaN+PHngpWWg+0B69XuaBUso87S6NDyKNn1XPmvoz
 as96k+Wk/xAZGVEeCbs/+H5rBx6DLg+FfTRa06Oh4BFsqedpkDPxLrTgCJGJkA0H
 dsK6O/993zvjx0Jn4ZPoJ9n35S82BmkCsz4bGq1xVl6FYUiMcm3/8yO41wllS+w4
 j+RlTU/RIdB7n8EKyMMl1hj1stTvt3Bi9F5Cbf7ZEv0snfR00K4KVpi17jnFjUHv
 kpOiEtXZb/NGQip7UAuUq0PisfqbiO4jJurYHRetDgv1WCy6+C8ufM5t6I+cnvmG
 VG+dlxcW+rDIn6bLRVuGi9TJRsQ6eox9ipa+qEKNNiOXgftELcgT7m74nAS5m0uv
 n5rDa221nPXecEB0X7d6YUFk711lly90dbelNeLrmv1w6jl8L1PpS1oBaW+UzGu9
 46eGBd6pzu9otvK9WVyDEdotDOCrgH0sd7pTetqDhLJZ7KrGwyyqO2gD/JroUKcC
 lnxBQwPnat86iI8=
 =oxfV
 -----END PGP SIGNATURE-----

Merge tag 'lsm-pr-20250323' of git://git.kernel.org/pub/scm/linux/kernel/git/pcmoore/lsm

Pull lsm updates from Paul Moore:

 - Various minor updates to the LSM Rust bindings

   Changes include marking trivial Rust bindings as inlines and comment
   tweaks to better reflect the LSM hooks.

 - Add LSM/SELinux access controls to io_uring_allowed()

   Similar to the io_uring_disabled sysctl, add a LSM hook to
   io_uring_allowed() to enable LSMs a simple way to enforce security
   policy on the use of io_uring. This pull request includes SELinux
   support for this new control using the io_uring/allowed permission.

 - Remove an unused parameter from the security_perf_event_open() hook

   The perf_event_attr struct parameter was not used by any currently
   supported LSMs, remove it from the hook.

 - Add an explicit MAINTAINERS entry for the credentials code

   We've seen problems in the past where patches to the credentials code
   sent by non-maintainers would often languish on the lists for
   multiple months as there was no one explicitly tasked with the
   responsibility of reviewing and/or merging credentials related code.

   Considering that most of the code under security/ has a vested
   interest in ensuring that the credentials code is well maintained,
   I'm volunteering to look after the credentials code and Serge Hallyn
   has also volunteered to step up as an official reviewer. I posted the
   MAINTAINERS update as a RFC to LKML in hopes that someone else would
   jump up with an "I'll do it!", but beyond Serge it was all crickets.

 - Update Stephen Smalley's old email address to prevent confusion

   This includes a corresponding update to the mailmap file.

* tag 'lsm-pr-20250323' of git://git.kernel.org/pub/scm/linux/kernel/git/pcmoore/lsm:
  mailmap: map Stephen Smalley's old email addresses
  lsm: remove old email address for Stephen Smalley
  MAINTAINERS: add Serge Hallyn as a credentials reviewer
  MAINTAINERS: add an explicit credentials entry
  cred,rust: mark Credential methods inline
  lsm,rust: reword "destroy" -> "release" in SecurityCtx
  lsm,rust: mark SecurityCtx methods inline
  perf: Remove unnecessary parameter of security check
  lsm: fix a missing security_uring_allowed() prototype
  io_uring,lsm,selinux: add LSM hooks for io_uring_setup()
  io_uring: refactor io_uring_allowed()
2025-03-25 15:44:19 -07:00
Linus Torvalds
edb0e8f6e2 ARM:
* Nested virtualization support for VGICv3, giving the nested
 hypervisor control of the VGIC hardware when running an L2 VM
 
 * Removal of 'late' nested virtualization feature register masking,
   making the supported feature set directly visible to userspace
 
 * Support for emulating FEAT_PMUv3 on Apple silicon, taking advantage
   of an IMPLEMENTATION DEFINED trap that covers all PMUv3 registers
 
 * Paravirtual interface for discovering the set of CPU implementations
   where a VM may run, addressing a longstanding issue of guest CPU
   errata awareness in big-little systems and cross-implementation VM
   migration
 
 * Userspace control of the registers responsible for identifying a
   particular CPU implementation (MIDR_EL1, REVIDR_EL1, AIDR_EL1),
   allowing VMs to be migrated cross-implementation
 
 * pKVM updates, including support for tracking stage-2 page table
   allocations in the protected hypervisor in the 'SecPageTable' stat
 
 * Fixes to vPMU, ensuring that userspace updates to the vPMU after
   KVM_RUN are reflected into the backing perf events
 
 LoongArch:
 
 * Remove unnecessary header include path
 
 * Assume constant PGD during VM context switch
 
 * Add perf events support for guest VM
 
 RISC-V:
 
 * Disable the kernel perf counter during configure
 
 * KVM selftests improvements for PMU
 
 * Fix warning at the time of KVM module removal
 
 x86:
 
 * Add support for aging of SPTEs without holding mmu_lock.  Not taking mmu_lock
   allows multiple aging actions to run in parallel, and more importantly avoids
   stalling vCPUs.  This includes an implementation of per-rmap-entry locking;
   aging the gfn is done with only a per-rmap single-bin spinlock taken, whereas
   locking an rmap for write requires taking both the per-rmap spinlock and
   the mmu_lock.
 
   Note that this decreases slightly the accuracy of accessed-page information,
   because changes to the SPTE outside aging might not use atomic operations
   even if they could race against a clear of the Accessed bit.  This is
   deliberate because KVM and mm/ tolerate false positives/negatives for
   accessed information, and testing has shown that reducing the latency of
   aging is far more beneficial to overall system performance than providing
   "perfect" young/old information.
 
 * Defer runtime CPUID updates until KVM emulates a CPUID instruction, to
   coalesce updates when multiple pieces of vCPU state are changing, e.g. as
   part of a nested transition.
 
 * Fix a variety of nested emulation bugs, and add VMX support for synthesizing
   nested VM-Exit on interception (instead of injecting #UD into L2).
 
 * Drop "support" for async page faults for protected guests that do not set
   SEND_ALWAYS (i.e. that only want async page faults at CPL3)
 
 * Bring a bit of sanity to x86's VM teardown code, which has accumulated
   a lot of cruft over the years.  Particularly, destroy vCPUs before
   the MMU, despite the latter being a VM-wide operation.
 
 * Add common secure TSC infrastructure for use within SNP and in the
   future TDX
 
 * Block KVM_CAP_SYNC_REGS if guest state is protected.  It does not make
   sense to use the capability if the relevant registers are not
   available for reading or writing.
 
 * Don't take kvm->lock when iterating over vCPUs in the suspend notifier to
   fix a largely theoretical deadlock.
 
 * Use the vCPU's actual Xen PV clock information when starting the Xen timer,
   as the cached state in arch.hv_clock can be stale/bogus.
 
 * Fix a bug where KVM could bleed PVCLOCK_GUEST_STOPPED across different
   PV clocks; restrict PVCLOCK_GUEST_STOPPED to kvmclock, as KVM's suspend
   notifier only accounts for kvmclock, and there's no evidence that the
   flag is actually supported by Xen guests.
 
 * Clean up the per-vCPU "cache" of its reference pvclock, and instead only
   track the vCPU's TSC scaling (multipler+shift) metadata (which is moderately
   expensive to compute, and rarely changes for modern setups).
 
 * Don't write to the Xen hypercall page on MSR writes that are initiated by
   the host (userspace or KVM) to fix a class of bugs where KVM can write to
   guest memory at unexpected times, e.g. during vCPU creation if userspace has
   set the Xen hypercall MSR index to collide with an MSR that KVM emulates.
 
 * Restrict the Xen hypercall MSR index to the unofficial synthetic range to
   reduce the set of possible collisions with MSRs that are emulated by KVM
   (collisions can still happen as KVM emulates Hyper-V MSRs, which also reside
   in the synthetic range).
 
 * Clean up and optimize KVM's handling of Xen MSR writes and xen_hvm_config.
 
 * Update Xen TSC leaves during CPUID emulation instead of modifying the CPUID
   entries when updating PV clocks; there is no guarantee PV clocks will be
   updated between TSC frequency changes and CPUID emulation, and guest reads
   of the TSC leaves should be rare, i.e. are not a hot path.
 
 x86 (Intel):
 
 * Fix a bug where KVM unnecessarily reads XFD_ERR from hardware and thus
   modifies the vCPU's XFD_ERR on a #NM due to CR0.TS=1.
 
 * Pass XFD_ERR as the payload when injecting #NM, as a preparatory step
   for upcoming FRED virtualization support.
 
 * Decouple the EPT entry RWX protection bit macros from the EPT Violation
   bits, both as a general cleanup and in anticipation of adding support for
   emulating Mode-Based Execution Control (MBEC).
 
 * Reject KVM_RUN if userspace manages to gain control and stuff invalid guest
   state while KVM is in the middle of emulating nested VM-Enter.
 
 * Add a macro to handle KVM's sanity checks on entry/exit VMCS control pairs
   in anticipation of adding sanity checks for secondary exit controls (the
   primary field is out of bits).
 
 x86 (AMD):
 
 * Ensure the PSP driver is initialized when both the PSP and KVM modules are
   built-in (the initcall framework doesn't handle dependencies).
 
 * Use long-term pins when registering encrypted memory regions, so that the
   pages are migrated out of MIGRATE_CMA/ZONE_MOVABLE and don't lead to
   excessive fragmentation.
 
 * Add macros and helpers for setting GHCB return/error codes.
 
 * Add support for Idle HLT interception, which elides interception if the vCPU
   has a pending, unmasked virtual IRQ when HLT is executed.
 
 * Fix a bug in INVPCID emulation where KVM fails to check for a non-canonical
   address.
 
 * Don't attempt VMRUN for SEV-ES+ guests if the vCPU's VMSA is invalid, e.g.
   because the vCPU was "destroyed" via SNP's AP Creation hypercall.
 
 * Reject SNP AP Creation if the requested SEV features for the vCPU don't
   match the VM's configured set of features.
 
 Selftests:
 
 * Fix again the Intel PMU counters test; add a data load and do CLFLUSH{OPT} on the data
   instead of executing code.  The theory is that modern Intel CPUs have
   learned new code prefetching tricks that bypass the PMU counters.
 
 * Fix a flaw in the Intel PMU counters test where it asserts that an event is
   counting correctly without actually knowing what the event counts on the
   underlying hardware.
 
 * Fix a variety of flaws, bugs, and false failures/passes dirty_log_test, and
   improve its coverage by collecting all dirty entries on each iteration.
 
 * Fix a few minor bugs related to handling of stats FDs.
 
 * Add infrastructure to make vCPU and VM stats FDs available to tests by
   default (open the FDs during VM/vCPU creation).
 
 * Relax an assertion on the number of HLT exits in the xAPIC IPI test when
   running on a CPU that supports AMD's Idle HLT (which elides interception of
   HLT if a virtual IRQ is pending and unmasked).
 -----BEGIN PGP SIGNATURE-----
 
 iQFIBAABCAAyFiEE8TM4V0tmI4mGbHaCv/vSX3jHroMFAmfcTkEUHHBib256aW5p
 QHJlZGhhdC5jb20ACgkQv/vSX3jHroMnQAf/cPx72hJOdNy4Qrm8M33YLXVRVV00
 yEZ8eN8TWdOclr0ltE/w/ELGh/qS4CU8pjURAk0A6lPioU+mdcTn3dPEqMDMVYom
 uOQ2lusEHw0UuSnGZSEjvZJsE/Ro2NSAsHIB6PWRqig1ZBPJzyu0frce34pMpeQH
 diwriJL9lKPAhBWXnUQ9BKoi1R0P5OLW9ahX4SOWk7cAFg4DLlDE66Nqf6nKqViw
 DwEucTiUEg5+a3d93gihdD4JNl+fb3vI2erxrMxjFjkacl0qgqRu3ei3DG0MfdHU
 wNcFSG5B1n0OECKxr80lr1Ip1KTVNNij0Ks+w6Gc6lSg9c4PptnNkfLK3A==
 =nnCN
 -----END PGP SIGNATURE-----

Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm

Pull kvm updates from Paolo Bonzini:
 "ARM:

   - Nested virtualization support for VGICv3, giving the nested
     hypervisor control of the VGIC hardware when running an L2 VM

   - Removal of 'late' nested virtualization feature register masking,
     making the supported feature set directly visible to userspace

   - Support for emulating FEAT_PMUv3 on Apple silicon, taking advantage
     of an IMPLEMENTATION DEFINED trap that covers all PMUv3 registers

   - Paravirtual interface for discovering the set of CPU
     implementations where a VM may run, addressing a longstanding issue
     of guest CPU errata awareness in big-little systems and
     cross-implementation VM migration

   - Userspace control of the registers responsible for identifying a
     particular CPU implementation (MIDR_EL1, REVIDR_EL1, AIDR_EL1),
     allowing VMs to be migrated cross-implementation

   - pKVM updates, including support for tracking stage-2 page table
     allocations in the protected hypervisor in the 'SecPageTable' stat

   - Fixes to vPMU, ensuring that userspace updates to the vPMU after
     KVM_RUN are reflected into the backing perf events

  LoongArch:

   - Remove unnecessary header include path

   - Assume constant PGD during VM context switch

   - Add perf events support for guest VM

  RISC-V:

   - Disable the kernel perf counter during configure

   - KVM selftests improvements for PMU

   - Fix warning at the time of KVM module removal

  x86:

   - Add support for aging of SPTEs without holding mmu_lock.

     Not taking mmu_lock allows multiple aging actions to run in
     parallel, and more importantly avoids stalling vCPUs. This includes
     an implementation of per-rmap-entry locking; aging the gfn is done
     with only a per-rmap single-bin spinlock taken, whereas locking an
     rmap for write requires taking both the per-rmap spinlock and the
     mmu_lock.

     Note that this decreases slightly the accuracy of accessed-page
     information, because changes to the SPTE outside aging might not
     use atomic operations even if they could race against a clear of
     the Accessed bit.

     This is deliberate because KVM and mm/ tolerate false
     positives/negatives for accessed information, and testing has shown
     that reducing the latency of aging is far more beneficial to
     overall system performance than providing "perfect" young/old
     information.

   - Defer runtime CPUID updates until KVM emulates a CPUID instruction,
     to coalesce updates when multiple pieces of vCPU state are
     changing, e.g. as part of a nested transition

   - Fix a variety of nested emulation bugs, and add VMX support for
     synthesizing nested VM-Exit on interception (instead of injecting
     #UD into L2)

   - Drop "support" for async page faults for protected guests that do
     not set SEND_ALWAYS (i.e. that only want async page faults at CPL3)

   - Bring a bit of sanity to x86's VM teardown code, which has
     accumulated a lot of cruft over the years. Particularly, destroy
     vCPUs before the MMU, despite the latter being a VM-wide operation

   - Add common secure TSC infrastructure for use within SNP and in the
     future TDX

   - Block KVM_CAP_SYNC_REGS if guest state is protected. It does not
     make sense to use the capability if the relevant registers are not
     available for reading or writing

   - Don't take kvm->lock when iterating over vCPUs in the suspend
     notifier to fix a largely theoretical deadlock

   - Use the vCPU's actual Xen PV clock information when starting the
     Xen timer, as the cached state in arch.hv_clock can be stale/bogus

   - Fix a bug where KVM could bleed PVCLOCK_GUEST_STOPPED across
     different PV clocks; restrict PVCLOCK_GUEST_STOPPED to kvmclock, as
     KVM's suspend notifier only accounts for kvmclock, and there's no
     evidence that the flag is actually supported by Xen guests

   - Clean up the per-vCPU "cache" of its reference pvclock, and instead
     only track the vCPU's TSC scaling (multipler+shift) metadata (which
     is moderately expensive to compute, and rarely changes for modern
     setups)

   - Don't write to the Xen hypercall page on MSR writes that are
     initiated by the host (userspace or KVM) to fix a class of bugs
     where KVM can write to guest memory at unexpected times, e.g.
     during vCPU creation if userspace has set the Xen hypercall MSR
     index to collide with an MSR that KVM emulates

   - Restrict the Xen hypercall MSR index to the unofficial synthetic
     range to reduce the set of possible collisions with MSRs that are
     emulated by KVM (collisions can still happen as KVM emulates
     Hyper-V MSRs, which also reside in the synthetic range)

   - Clean up and optimize KVM's handling of Xen MSR writes and
     xen_hvm_config

   - Update Xen TSC leaves during CPUID emulation instead of modifying
     the CPUID entries when updating PV clocks; there is no guarantee PV
     clocks will be updated between TSC frequency changes and CPUID
     emulation, and guest reads of the TSC leaves should be rare, i.e.
     are not a hot path

  x86 (Intel):

   - Fix a bug where KVM unnecessarily reads XFD_ERR from hardware and
     thus modifies the vCPU's XFD_ERR on a #NM due to CR0.TS=1

   - Pass XFD_ERR as the payload when injecting #NM, as a preparatory
     step for upcoming FRED virtualization support

   - Decouple the EPT entry RWX protection bit macros from the EPT
     Violation bits, both as a general cleanup and in anticipation of
     adding support for emulating Mode-Based Execution Control (MBEC)

   - Reject KVM_RUN if userspace manages to gain control and stuff
     invalid guest state while KVM is in the middle of emulating nested
     VM-Enter

   - Add a macro to handle KVM's sanity checks on entry/exit VMCS
     control pairs in anticipation of adding sanity checks for secondary
     exit controls (the primary field is out of bits)

  x86 (AMD):

   - Ensure the PSP driver is initialized when both the PSP and KVM
     modules are built-in (the initcall framework doesn't handle
     dependencies)

   - Use long-term pins when registering encrypted memory regions, so
     that the pages are migrated out of MIGRATE_CMA/ZONE_MOVABLE and
     don't lead to excessive fragmentation

   - Add macros and helpers for setting GHCB return/error codes

   - Add support for Idle HLT interception, which elides interception if
     the vCPU has a pending, unmasked virtual IRQ when HLT is executed

   - Fix a bug in INVPCID emulation where KVM fails to check for a
     non-canonical address

   - Don't attempt VMRUN for SEV-ES+ guests if the vCPU's VMSA is
     invalid, e.g. because the vCPU was "destroyed" via SNP's AP
     Creation hypercall

   - Reject SNP AP Creation if the requested SEV features for the vCPU
     don't match the VM's configured set of features

  Selftests:

   - Fix again the Intel PMU counters test; add a data load and do
     CLFLUSH{OPT} on the data instead of executing code. The theory is
     that modern Intel CPUs have learned new code prefetching tricks
     that bypass the PMU counters

   - Fix a flaw in the Intel PMU counters test where it asserts that an
     event is counting correctly without actually knowing what the event
     counts on the underlying hardware

   - Fix a variety of flaws, bugs, and false failures/passes
     dirty_log_test, and improve its coverage by collecting all dirty
     entries on each iteration

   - Fix a few minor bugs related to handling of stats FDs

   - Add infrastructure to make vCPU and VM stats FDs available to tests
     by default (open the FDs during VM/vCPU creation)

   - Relax an assertion on the number of HLT exits in the xAPIC IPI test
     when running on a CPU that supports AMD's Idle HLT (which elides
     interception of HLT if a virtual IRQ is pending and unmasked)"

* tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm: (216 commits)
  RISC-V: KVM: Optimize comments in kvm_riscv_vcpu_isa_disable_allowed
  RISC-V: KVM: Teardown riscv specific bits after kvm_exit
  LoongArch: KVM: Register perf callbacks for guest
  LoongArch: KVM: Implement arch-specific functions for guest perf
  LoongArch: KVM: Add stub for kvm_arch_vcpu_preempted_in_kernel()
  LoongArch: KVM: Remove PGD saving during VM context switch
  LoongArch: KVM: Remove unnecessary header include path
  KVM: arm64: Tear down vGIC on failed vCPU creation
  KVM: arm64: PMU: Reload when resetting
  KVM: arm64: PMU: Reload when user modifies registers
  KVM: arm64: PMU: Fix SET_ONE_REG for vPMC regs
  KVM: arm64: PMU: Assume PMU presence in pmu-emul.c
  KVM: arm64: PMU: Set raw values from user to PM{C,I}NTEN{SET,CLR}, PMOVS{SET,CLR}
  KVM: arm64: Create each pKVM hyp vcpu after its corresponding host vcpu
  KVM: arm64: Factor out pKVM hyp vcpu creation to separate function
  KVM: arm64: Initialize HCRX_EL2 traps in pKVM
  KVM: arm64: Factor out setting HCRX_EL2 traps into separate function
  KVM: x86: block KVM_CAP_SYNC_REGS if guest state is protected
  KVM: x86: Add infrastructure for secure TSC
  KVM: x86: Push down setting vcpu.arch.user_set_tsc
  ...
2025-03-25 14:22:07 -07:00
Linus Torvalds
2d09a9449e arm64 updates for 6.15:
Perf and PMUs:
 
  - Support for the "Rainier" CPU PMU from Arm
 
  - Preparatory driver changes and cleanups that pave the way for BRBE
    support
 
  - Support for partial virtualisation of the Apple-M1 PMU
 
  - Support for the second event filter in Arm CSPMU designs
 
  - Minor fixes and cleanups (CMN and DWC PMUs)
 
  - Enable EL2 requirements for FEAT_PMUv3p9
 
 Power, CPU topology:
 
  - Support for AMUv1-based average CPU frequency
 
  - Run-time SMT control wired up for arm64 (CONFIG_HOTPLUG_SMT). It adds
    a generic topology_is_primary_thread() function overridden by x86 and
    powerpc
 
 New(ish) features:
 
  - MOPS (memcpy/memset) support for the uaccess routines
 
 Security/confidential compute:
 
  - Fix the DMA address for devices used in Realms with Arm CCA. The
    CCA architecture uses the address bit to differentiate between shared
    and private addresses
 
  - Spectre-BHB: assume CPUs Linux doesn't know about vulnerable by
    default
 
 Memory management clean-ups:
 
  - Drop the P*D_TABLE_BIT definition in preparation for 128-bit PTEs
 
  - Some minor page table accessor clean-ups
 
  - PIE/POE (permission indirection/overlay) helpers clean-up
 
 Kselftests:
 
  - MTE: skip hugetlb tests if MTE is not supported on such mappings and
    user correct naming for sync/async tag checking modes
 
 Miscellaneous:
 
  - Add a PKEY_UNRESTRICTED definition as 0 to uapi (toolchain people
    request)
 
  - Sysreg updates for new register fields
 
  - CPU type info for some Qualcomm Kryo cores
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCgAdFiEE5RElWfyWxS+3PLO2a9axLQDIXvEFAmfjB2QACgkQa9axLQDI
 XvGrfg//W3Bx9+jw1G/XHHEQqGEVFmvltvxZUkvgV0Qki0rPSMnappJhZRL9n0Nm
 V6PvGd2KoKHZuL3g5ViZb3cs2R9BiD2JB6PncwBKuxumHGh3vz3kk1JMkDVfWdHv
 qAceOckFJD9rXjPZn+PDsfYiEi2i3RRWIP5VglZ14ue8j3prHQ6DJXLUQF2GYvzE
 /bgLSq44wp5N59ddy23+qH9rxrHzz3bgpbVv/F56W/LErvE873mRmyFwiuGJm+M0
 Pn8ra572rI6a4sgSwrMTeNPBU+F9o5AbqwauVhkz428RdMvgfEuW6qHUBnGWJDmt
 HotXmu+4Eb2KJks/iQkDo4OTJ38yUqvvZZJtP171ms3E4yqESSJngWP6O2A6LF+y
 xhe0sESF/Ew6jLhM6/hvOmBcE2AyB14JE3ymqLkXbWub4NXddBn2AF1WXFjF4CBw
 F8KSUhNLekrCYKv1k9M3nhvkcpoS9FkTF/TI+zEg546alI/GLPih6uDRkgMAODh1
 RDJYixHsf2NDDRQbfwvt9Xua/KKpDF6qNkHLA4OiqqVUwh1hkas24Lrnp8vmce4o
 wIpWCLqYWey8Rl3XWuWgWz2Xu58fHH4Dl2k72Z8I0pwp3abCDa9xEj79G0Svk7Si
 Q+FCYrNlpKee1RXBC+1MUD/Gl5r/28dEUFkAzPD80F7AgafXPd0=
 =Kc9c
 -----END PGP SIGNATURE-----

Merge tag 'arm64-upstream' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux

Pull arm64 updates from Catalin Marinas:
 "Nothing major this time around.

  Apart from the usual perf/PMU updates, some page table cleanups, the
  notable features are average CPU frequency based on the AMUv1
  counters, CONFIG_HOTPLUG_SMT and MOPS instructions (memcpy/memset) in
  the uaccess routines.

  Perf and PMUs:

   - Support for the 'Rainier' CPU PMU from Arm

   - Preparatory driver changes and cleanups that pave the way for BRBE
     support

   - Support for partial virtualisation of the Apple-M1 PMU

   - Support for the second event filter in Arm CSPMU designs

   - Minor fixes and cleanups (CMN and DWC PMUs)

   - Enable EL2 requirements for FEAT_PMUv3p9

  Power, CPU topology:

   - Support for AMUv1-based average CPU frequency

   - Run-time SMT control wired up for arm64 (CONFIG_HOTPLUG_SMT). It
     adds a generic topology_is_primary_thread() function overridden by
     x86 and powerpc

  New(ish) features:

   - MOPS (memcpy/memset) support for the uaccess routines

  Security/confidential compute:

   - Fix the DMA address for devices used in Realms with Arm CCA. The
     CCA architecture uses the address bit to differentiate between
     shared and private addresses

   - Spectre-BHB: assume CPUs Linux doesn't know about vulnerable by
     default

  Memory management clean-ups:

   - Drop the P*D_TABLE_BIT definition in preparation for 128-bit PTEs

   - Some minor page table accessor clean-ups

   - PIE/POE (permission indirection/overlay) helpers clean-up

  Kselftests:

   - MTE: skip hugetlb tests if MTE is not supported on such mappings
     and user correct naming for sync/async tag checking modes

  Miscellaneous:

   - Add a PKEY_UNRESTRICTED definition as 0 to uapi (toolchain people
     request)

   - Sysreg updates for new register fields

   - CPU type info for some Qualcomm Kryo cores"

* tag 'arm64-upstream' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux: (72 commits)
  arm64: mm: Don't use %pK through printk
  perf/arm_cspmu: Fix missing io.h include
  arm64: errata: Add newer ARM cores to the spectre_bhb_loop_affected() lists
  arm64: cputype: Add MIDR_CORTEX_A76AE
  arm64: errata: Add KRYO 2XX/3XX/4XX silver cores to Spectre BHB safe list
  arm64: errata: Assume that unknown CPUs _are_ vulnerable to Spectre BHB
  arm64: errata: Add QCOM_KRYO_4XX_GOLD to the spectre_bhb_k24_list
  arm64/sysreg: Enforce whole word match for open/close tokens
  arm64/sysreg: Fix unbalanced closing block
  arm64: Kconfig: Enable HOTPLUG_SMT
  arm64: topology: Support SMT control on ACPI based system
  arch_topology: Support SMT control for OF based system
  cpu/SMT: Provide a default topology_is_primary_thread()
  arm64/mm: Define PTDESC_ORDER
  perf/arm_cspmu: Add PMEVFILT2R support
  perf/arm_cspmu: Generalise event filtering
  perf/arm_cspmu: Move register definitons to header
  arm64/kernel: Always use level 2 or higher for early mappings
  arm64/mm: Drop PXD_TABLE_BIT
  arm64/mm: Check pmd_table() in pmd_trans_huge()
  ...
2025-03-25 13:16:16 -07:00
Robin Murphy
9651f7899c perf/arm_cspmu: Fix missing io.h include
Adding the writel() calls needs io.h, which apparently gets
transiently included somewhere on arm64, but not elsewhere.

Fixes: 6de0298a39 ("perf/arm_cspmu: Generalise event filtering")
Reported-by: kernel test robot <lkp@intel.com>
Closes: https://lore.kernel.org/oe-kbuild-all/202503150649.Dol8RBSh-lkp@intel.com/
Closes: https://lore.kernel.org/oe-kbuild-all/202503152245.cAG4FMfi-lkp@intel.com/
Signed-off-by: Robin Murphy <robin.murphy@arm.com>
Link: https://lore.kernel.org/r/657935ca177024ad08d5ec6f85e8faf75f82cf65.1742212833.git.robin.murphy@arm.com
Signed-off-by: Will Deacon <will@kernel.org>
2025-03-17 14:53:03 +00:00
Robin Murphy
a28f3cbfd1 perf/arm_cspmu: Add PMEVFILT2R support
Architecturally we have two filters for each regular event counter,
so add generic support for the second one too.

Signed-off-by: Robin Murphy <robin.murphy@arm.com>
Reviewed-by: James Clark <james.clark@linaro.org>
Reviewed-by: Ilkka Koskinen <ilkka@os.amperecomputing.com>
Link: https://lore.kernel.org/r/b11be3f23a72bc27088b115099c8fe865b70babc.1741190362.git.robin.murphy@arm.com
Signed-off-by: Will Deacon <will@kernel.org>
2025-03-13 21:35:04 +00:00
Robin Murphy
6de0298a39 perf/arm_cspmu: Generalise event filtering
The notion of a single u32 filter value for any event doesn't scale well
when the potential architectural scope is already two 64-bit values, and
implementations may add custom stuff on the side too. Rather than try to
thread arbitrary filter data through the common path, let's just make
the set_ev_filter op self-contained in terms of parsing and configuring
any and all filtering for the given event - splitting out a distinct op
for cycles events which inherently differ - and let implementations
override the whole thing if they want to do something different. This
already allows the Ampere code to stop looking a bit hacky.

Signed-off-by: Robin Murphy <robin.murphy@arm.com>
Reviewed-by: James Clark <james.clark@linaro.org>
Reviewed-by: Ilkka Koskinen <ilkka@os.amperecomputing.com>
Link: https://lore.kernel.org/r/c0cd4d4c12566dbf1b062ccd60241b3e0639f4cc.1741190362.git.robin.murphy@arm.com
Signed-off-by: Will Deacon <will@kernel.org>
2025-03-13 21:35:04 +00:00
Robin Murphy
862f7ad4d7 perf/arm_cspmu: Move register definitons to header
Implementations may occasionally want to refer to register offsets, so
for the sake of consistency move all of the register definitions to join
the PMIIDR fields in the private header where they can be shared. As an
example nicety, we can then define Ampere's imp-def filters in terms of
the architectural PMIMPDEF range rather than open-coded offsets.

Signed-off-by: Robin Murphy <robin.murphy@arm.com>
Reviewed-by: James Clark <james.clark@linaro.org>
Reviewed-by: Ilkka Koskinen <ilkka@os.amperecomputing.com>
Link: https://lore.kernel.org/r/5a3c796560665b51cb63fec0d473afd8f8d0a836.1741190362.git.robin.murphy@arm.com
Signed-off-by: Will Deacon <will@kernel.org>
2025-03-13 21:35:04 +00:00
Will Deacon
823437ed29 Merge branch 'perf/m1-guest-events' of git://git.kernel.org/pub/scm/linux/kernel/git/oupton/linux into for-next/perf
Pull Apple-M1 PMU driver changes from Oliver Upton, which form a prefix
of the series in the KVM/Arm tree that allows the PMU to be virtualised.
Sort of, anyway.

* 'perf/m1-guest-events' of git://git.kernel.org/pub/scm/linux/kernel/git/oupton/linux:
  drivers/perf: apple_m1: Support host/guest event filtering
  drivers/perf: apple_m1: Refactor event select/filter configuration
2025-03-13 21:15:28 +00:00
Oliver Upton
2d00cab849 drivers/perf: apple_m1: Provide helper for mapping PMUv3 events
Apple M* parts carry some IMP DEF traps for guest accesses to PMUv3
registers, even though the underlying hardware doesn't implement PMUv3.
This means it is possible to virtualize PMUv3 for KVM guests.

Add a helper for mapping common PMUv3 event IDs onto hardware event IDs,
keeping the implementation-specific crud in the PMU driver rather than
KVM proper. Populate the pmceid_bitmap based on the supported events so
KVM can provide synthetic PMCEID* values to the guest.

Tested-by: Janne Grunau <j@jannau.net>
Reviewed-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20250305202641.428114-13-oliver.upton@linux.dev
Signed-off-by: Oliver Upton <oliver.upton@linux.dev>
2025-03-11 12:54:29 -07:00
Oliver Upton
46573d944f drivers/perf: apple_m1: Support host/guest event filtering
The PMU appears to have a separate register for filtering 'guest'
exception levels (i.e. EL1 and !ELIsInHost(EL0)) which has the same
layout as PMCR1_EL1. Conveniently, there exists a VHE register alias
(PMCR1_EL12) that can be used to configure it.

Support guest events by programming the EL12 register with the intended
guest kernel/userspace filters. Limit support for guest events to VHE
(i.e. kernel running at EL2), as it avoids involving KVM to context
switch PMU registers. VHE is the only supported mode on M* parts anyway,
so this isn't an actual feature limitation.

Tested-by: Janne Grunau <j@jannau.net>
Reviewed-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20250305202641.428114-3-oliver.upton@linux.dev
Signed-off-by: Oliver Upton <oliver.upton@linux.dev>
2025-03-11 12:52:32 -07:00
Oliver Upton
75ecffc361 drivers/perf: apple_m1: Refactor event select/filter configuration
Supporting guest mode events will necessitate programming two event
filters. Prepare by splitting up the programming of the event selector +
event filter into separate headers.

Opportunistically replace RMW patterns with sysreg_clear_set_s().

Tested-by: Janne Grunau <j@jannau.net>
Reviewed-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20250305202641.428114-2-oliver.upton@linux.dev
Signed-off-by: Oliver Upton <oliver.upton@linux.dev>
2025-03-11 12:52:32 -07:00
Manivannan Sadhasivam
5d2b978ff9
perf/dwc_pcie: Move common DWC struct definitions to 'pcie-dwc.h'
Move the common DWC struct definitions, which are shared across all the
DesginWare PCIe IPs, to a new header file called 'pcie-dwc.h', so that
other users e.g., debugfs, perf and sysfs can make use of them.

Signed-off-by: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org>
Signed-off-by: Shradha Todi <shradha.t@samsung.com>
Reviewed-by: Shuai Xue <xueshuai@linux.alibaba.com>
Reviewed-by: Fan Ni <fan.ni@samsung.com>
Tested-by: Hrishikesh Deleep <hrishikesh.d@samsung.com>
Link: https://lore.kernel.org/r/20250221131548.59616-2-shradha.t@samsung.com
[kwilczynski: commit log, tidy up the new header file]
Signed-off-by: Krzysztof Wilczyński <kwilczynski@kernel.org>
2025-03-03 20:02:19 +00:00
Yunhui Cui
7f35b42980 perf/dwc_pcie: fix duplicate pci_dev devices
During platform_device_register, wrongly using struct device
pci_dev as platform_data caused a kmemdup copy of pci_dev. Worse
still, accessing the duplicated device leads to list corruption as its
mutex content (e.g., list, magic) remains the same as the original.

Signed-off-by: Yunhui Cui <cuiyunhui@bytedance.com>
Reviewed-by: Shuai Xue <xueshuai@linux.alibaba.com>
Link: https://lore.kernel.org/r/20250220121716.50324-3-cuiyunhui@bytedance.com
Signed-off-by: Will Deacon <will@kernel.org>
2025-03-01 06:12:37 +00:00
Yunhui Cui
6eb1e8ef58 perf/dwc_pcie: fix some unreleased resources
Release leaked resources, such as plat_dev and dev_info.

Signed-off-by: Yunhui Cui <cuiyunhui@bytedance.com>
Reviewed-by: Shuai Xue <xueshuai@linux.alibaba.com>
Link: https://lore.kernel.org/r/20250220121716.50324-2-cuiyunhui@bytedance.com
Signed-off-by: Will Deacon <will@kernel.org>
2025-03-01 06:12:36 +00:00
Robin Murphy
678a5d3d6d perf/arm-cmn: Minor event type housekeeping
While handling RN-D nodes under the functionally-identical RN-I type
works fine for perf tool users using the "rnid_" event aliases, and that
is the documented and expected ABI, there's little reason not to be
permissive and accept the actual RN-D type as an additional encoding for
the same events as well. This may be convenient for other tooling
generating event configs directly from its own topology data.

In the RN-I event mood, it also seems as good a time as any to clean up
a forgotten macro for CCLA_RNI events which ended up being unnecessary.

Signed-off-by: Robin Murphy <robin.murphy@arm.com>
Reviewed-by: Ilkka Koskinen <ilkka@os.amperecomputing.com>
Link: https://lore.kernel.org/r/ef46a47fc4ab909093f14b2b4289a4835836ab6c.1738851844.git.robin.murphy@arm.com
Signed-off-by: Will Deacon <will@kernel.org>
2025-03-01 05:18:40 +00:00
Rob Herring (Arm)
c2e793da59 perf: apple_m1: Don't disable counter in m1_pmu_enable_event()
Currently m1_pmu_enable_event() starts by disabling the event counter
it has been asked to enable. This should not be necessary as the
counter (and the PMU as a whole) should not be active when
m1_pmu_enable_event() is called.

Cc: Marc Zyngier <maz@kernel.org>
Signed-off-by: Rob Herring (Arm) <robh@kernel.org>
Reviewed-by: Anshuman Khandual <anshuman.khandual@arm.com>
Tested-by: James Clark <james.clark@linaro.org>
Link: https://lore.kernel.org/r/20250218-arm-brbe-v19-v20-6-4e9922fc2e8e@kernel.org
Signed-off-by: Will Deacon <will@kernel.org>
2025-03-01 05:08:10 +00:00
Rob Herring (Arm)
7bf1001e0d perf: arm_v7_pmu: Don't disable counter in (armv7|krait_|scorpion_)pmu_enable_event()
Currently (armv7|krait_|scorpion_)pmu_enable_event() start by disabling
the event counter it has been asked to enable. This should not be
necessary as the counter (and the PMU as a whole) should not be active
when *_enable_event() is called.

Signed-off-by: Rob Herring (Arm) <robh@kernel.org>
Reviewed-by: Anshuman Khandual <anshuman.khandual@arm.com>
Tested-by: James Clark <james.clark@linaro.org>
Link: https://lore.kernel.org/r/20250218-arm-brbe-v19-v20-5-4e9922fc2e8e@kernel.org
Signed-off-by: Will Deacon <will@kernel.org>
2025-03-01 05:08:10 +00:00
Rob Herring (Arm)
7a53877482 perf: arm_v7_pmu: Drop obvious comments for enabling/disabling counters and interrupts
The function calls for enabling/disabling counters and interrupts are
pretty obvious as to what they are doing, and the comments don't add
any additional value.

Signed-off-by: Rob Herring (Arm) <robh@kernel.org>
Reviewed-by: Anshuman Khandual <anshuman.khandual@arm.com>
Tested-by: James Clark <james.clark@linaro.org>
Link: https://lore.kernel.org/r/20250218-arm-brbe-v19-v20-4-4e9922fc2e8e@kernel.org
Signed-off-by: Will Deacon <will@kernel.org>
2025-03-01 05:08:10 +00:00
Mark Rutland
4b0567ad0b perf: arm_pmuv3: Don't disable counter in armv8pmu_enable_event()
Currently armv8pmu_enable_event() starts by disabling the event counter
it has been asked to enable. This should not be necessary as the counter
(and the PMU as a whole) should not be active when
armv8pmu_enable_event() is called.

Remove the redundant call to armv8pmu_disable_event_counter(). At the
same time, remove the comment immeditately above as everything it says
is obvious from the function names below.

Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Signed-off-by: Rob Herring (Arm) <robh@kernel.org>
Reviewed-by: Anshuman Khandual <anshuman.khandual@arm.com>
Tested-by: James Clark <james.clark@linaro.org>
Link: https://lore.kernel.org/r/20250218-arm-brbe-v19-v20-3-4e9922fc2e8e@kernel.org
Signed-off-by: Will Deacon <will@kernel.org>
2025-03-01 05:08:10 +00:00
Mark Rutland
dcca27bc1e perf: arm_pmu: Don't disable counter in armpmu_add()
Currently armpmu_add() tries to handle a newly-allocated counter having
a stale associated event, but this should not be possible, and if this
were to happen the current mitigation is insufficient and potentially
expensive. It would be better to warn if we encounter the impossible
case.

Calls to pmu::add() and pmu::del() are serialized by the core perf code,
and armpmu_del() clears the relevant slot in pmu_hw_events::events[]
before clearing the bit in pmu_hw_events::used_mask such that the
counter can be reallocated. Thus when armpmu_add() allocates a counter
index from pmu_hw_events::used_mask, it should not be possible to observe
a stale even in pmu_hw_events::events[] unless either
pmu_hw_events::used_mask or pmu_hw_events::events[] have been corrupted.

If this were to happen, we'd end up with two events with the same
event->hw.idx, which would clash with each other during reprogramming,
deletion, etc, and produce bogus results. Add a WARN_ON_ONCE() for this
case so that we can detect if this ever occurs in practice.

That possiblity aside, there's no need to call arm_pmu::disable(event)
for the new event. The PMU reset code initialises the counter in a
disabled state, and armpmu_del() will disable the counter before it can
be reused. Remove the redundant disable.

Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Signed-off-by: Rob Herring (Arm) <robh@kernel.org>
Reviewed-by: Anshuman Khandual <anshuman.khandual@arm.com>
Tested-by: James Clark <james.clark@linaro.org>
Link: https://lore.kernel.org/r/20250218-arm-brbe-v19-v20-2-4e9922fc2e8e@kernel.org
Signed-off-by: Will Deacon <will@kernel.org>
2025-03-01 05:08:10 +00:00