linux

mirror of git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git synced 2025-08-31 23:27:20 +00:00

Author	SHA1	Message	Date
Robert Richter	74bf125abd	cxl/port: Replace put_cxl_root() by a cleanup helper Function put_cxl_root() is only used by its cleanup helper. Remove the function entirely and only use the helper. Signed-off-by: Robert Richter <rrichter@amd.com> Reviewed-by: Gregory Price <gourry@gourry.net> Reviewed-by: Dave Jiang <dave.jiang@intel.com> Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Reviewed-by: "Fabio M. De Francesco" <fabio.m.de.francesco@linux.intel.com> Acked-by: Dan Williams <dan.j.williams@intel.com> Link: https://patch.msgid.link/20250509150700.2817697-9-rrichter@amd.com Signed-off-by: Dave Jiang <dave.jiang@intel.com>	2025-05-09 09:48:26 -07:00
Robert Richter	5ed826fc4b	cxl/region: Move find_cxl_root() to cxl_add_to_region() When adding an endpoint to a region, the root port is determined first. Move this directly into cxl_add_to_region(). This is in preparation of the initialization of endpoints that iterates the port hierarchy from the endpoint up to the root port. As a side-effect the root argument is removed from the argument lists of cxl_add_to_region() and related functions. Now, the endpoint is the only parameter to add a region. This simplifies the function interface. Signed-off-by: Robert Richter <rrichter@amd.com> Reviewed-by: Gregory Price <gourry@gourry.net> Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Reviewed-by: Dave Jiang <dave.jiang@intel.com> Reviewed-by: Dan Williams <dan.j.williams@intel.com> Reviewed-by: Alison Schofield <alison.schofield@intel.com> Reviewed-by: "Fabio M. De Francesco" <fabio.m.de.francesco@linux.intel.com> Tested-by: Gregory Price <gourry@gourry.net> Acked-by: Dan Williams <dan.j.williams@intel.com> Link: https://patch.msgid.link/20250509150700.2817697-8-rrichter@amd.com Signed-off-by: Dave Jiang <dave.jiang@intel.com>	2025-05-09 09:48:26 -07:00
Robert Richter	0ee2d97810	cxl/region: Avoid duplicate call of cxl_port_pick_region_decoder() Function cxl_port_pick_region_decoder() is called twice, in alloc_region_ref() and cxl_rr_alloc_decoder(). Both functions are subsequently called from cxl_port_attach_region(). Make the decoder a function argument to both which avoids a duplicate call of cxl_port_pick_region_decoder(). Now, cxl_rr_alloc_decoder() no longer allocates the decoder. Instead, the previously picked decoder is assigned to the region reference. Hence, rename the function to cxl_rr_assign_decoder(). Moving the call out of alloc_region_ref() also moves it out of the xa_for_each() loop in there. Now, cxld is determined no longer only for each auto-generated region, but now once for all regions regardless of auto-generated or not. This is fine as the cxld argument is needed for all regions in cxl_rr_assign_decoder() and an error would be returned otherwise anyway. So it is better to determine the decoder in front of all this and fail early if missing instead of running through all that code with multiple calls of cxl_port_pick_region_decoder(). Signed-off-by: Robert Richter <rrichter@amd.com> Reviewed-by: Gregory Price <gourry@gourry.net> Reviewed-by: Dave Jiang <dave.jiang@intel.com> Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Reviewed-by: "Fabio M. De Francesco" <fabio.m.de.francesco@linux.intel.com> Tested-by: Gregory Price <gourry@gourry.net> Acked-by: Dan Williams <dan.j.williams@intel.com> Link: https://patch.msgid.link/20250509150700.2817697-7-rrichter@amd.com Signed-off-by: Dave Jiang <dave.jiang@intel.com>	2025-05-09 09:48:26 -07:00
Robert Richter	a3a96873b2	cxl/region: Rename function to cxl_port_pick_region_decoder() Current function cxl_region_find_decoder() is used to find a port's decoder during region setup. In the region creation path the function is an allocator to find a free port. In the region assembly path, it is recalling the decoder that platform firmware picked for validation purposes. Rename function to cxl_port_pick_region_decoder() that better describes its use and update the function's description. The result of cxl_port_pick_region_decoder() is recorded in a 'struct cxl_region_ref' in @port for later recall when other endpoints might also be targets of the picked decoder. Signed-off-by: Robert Richter <rrichter@amd.com> Reviewed-by: Gregory Price <gourry@gourry.net> Reviewed-by: Dave Jiang <dave.jiang@intel.com> Reviewed-by: "Fabio M. De Francesco" <fabio.m.de.francesco@linux.intel.com> Tested-by: Gregory Price <gourry@gourry.net> Acked-by: Dan Williams <dan.j.williams@intel.com> Link: https://patch.msgid.link/20250509150700.2817697-6-rrichter@amd.com Signed-off-by: Dave Jiang <dave.jiang@intel.com>	2025-05-09 09:48:26 -07:00
Robert Richter	99ff9060b2	cxl: Introduce parent_port_of() helper Often a parent port must be determined. Introduce the parent_port_of() helper function to avoid open coding of determination of a parent port. Signed-off-by: Robert Richter <rrichter@amd.com> Reviewed-by: Gregory Price <gourry@gourry.net> Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Reviewed-by: Dave Jiang <dave.jiang@intel.com> Reviewed-by: Alison Schofield <alison.schofield@intel.com> Reviewed-by: "Fabio M. De Francesco" <fabio.m.de.francesco@linux.intel.com> Tested-by: Gregory Price <gourry@gourry.net> Acked-by: Dan Williams <dan.j.williams@intel.com> Link: https://patch.msgid.link/20250509150700.2817697-5-rrichter@amd.com Signed-off-by: Dave Jiang <dave.jiang@intel.com>	2025-05-09 09:48:25 -07:00
Robert Richter	88bc0503c4	cxl/pci: Add comments to cxl_hdm_decode_init() There are various configuration cases of HDM decoder registers causing different code paths. Add comments to cxl_hdm_decode_init() to better explain them. Signed-off-by: Robert Richter <rrichter@amd.com> Reviewed-by: Gregory Price <gourry@gourry.net> Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Reviewed-by: Alison Schofield <alison.schofield@intel.com> Reviewed-by: "Fabio M. De Francesco" <fabio.m.de.francesco@linux.intel.com> Reviewed-by: Dave Jiang <dave.jiang@intel.com> Tested-by: Gregory Price <gourry@gourry.net> Acked-by: Dan Williams <dan.j.williams@intel.com> Link: https://patch.msgid.link/20250509150700.2817697-4-rrichter@amd.com Signed-off-by: Dave Jiang <dave.jiang@intel.com>	2025-05-09 09:48:25 -07:00
Robert Richter	d858631b1c	cxl/pci: Moving code in cxl_hdm_decode_init() Commit `3f9e075317` ("cxl/pci: simplify the check of mem_enabled in cxl_hdm_decode_init()") changed the code flow in this function. The root port is determined before a check to leave the function. Since the root port is not used by the check it can be moved to run the check first. This improves code readability and avoids unnesessary code execution. Signed-off-by: Robert Richter <rrichter@amd.com> Reviewed-by: Gregory Price <gourry@gourry.net> Reviewed-by: Davidlohr Bueso <dave@stgolabs.net> Reviewed-by: Alison Schofield <alison.schofield@intel.com> Reviewed-by: "Fabio M. De Francesco" <fabio.m.de.francesco@linux.intel.com> Reviewed-by: Dave Jiang <dave.jiang@intel.com> Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Tested-by: Gregory Price <gourry@gourry.net> Acked-by: Dan Williams <dan.j.williams@intel.com> Link: https://patch.msgid.link/20250509150700.2817697-3-rrichter@amd.com Signed-off-by: Dave Jiang <dave.jiang@intel.com>	2025-05-09 09:48:25 -07:00
Robert Richter	21339b30f0	cxl: Remove else after return Remove unnecessary 'else' after return. Improves readability of code. It is easier to place comments. Check and fix all occurrences under drivers/cxl/. Signed-off-by: Robert Richter <rrichter@amd.com> Reviewed-by: Gregory Price <gourry@gourry.net> Reviewed-by: Dave Jiang <dave.jiang@intel.com> Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Reviewed-by: Davidlohr Bueso <dave@stgolabs.net> Reviewed-by: Alison Schofield <alison.schofield@intel.com> Reviewed-by: "Fabio M. De Francesco" <fabio.m.de.francesco@linux.intel.com> Tested-by: Gregory Price <gourry@gourry.net> Acked-by: Dan Williams <dan.j.williams@intel.com> Link: https://patch.msgid.link/20250509150700.2817697-2-rrichter@amd.com Signed-off-by: Dave Jiang <dave.jiang@intel.com>	2025-05-09 09:48:25 -07:00
Gregory Price	ce32b0c9c5	cxl: core/region - ignore interleave granularity when ways=1 When validating decoder IW/IG when setting up regions, the granularity is irrelevant when iw=1 - all accesses will always route to the only target anyway - so all ig values are "correct". Loosen the requirement that `ig = (parent_iw * parent_ig)` when iw=1. On some Zen5 platforms, the platform BIOS specifies a 256-byte interleave granularity window for host bridges when there is only one target downstream. This leads to Linux rejecting the configuration of a region with a x2 root with two x1 hostbridges. Decoder Programming: root - iw:2 ig:256 hb1 - iw:1 ig:256 (Linux expects 512) hb2 - iw:1 ig:256 (Linux expects 512) ep1 - iw:2 ig:256 ep2 - iw:2 ig:256 This change allows all decoders downstream of a passthrough decoder to also be configured as passthrough (iw:1 ig:X), but still disallows downstream decoders from applying subsequent interleaves. e.g. in the above example if there was another decoder south of hb1 attempting to interleave 2 endpoints - Linux would enforce hb1.ig=512 because the southern decoder would have iw:2 and require ig=pig*piw. [DJ: Fixed up against 6.15-rc1] Signed-off-by: Gregory Price <gourry@gourry.net> Reviewed-by: Dave Jiang <dave.jiang@intel.com> Tested-by: Li Zhijian <lizhijian@fujitsu.com> Reviewed-by: Davidlohr Bueso <dave@stgolabs.net> Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Reviewed-by: Dan Williams <dan.j.williams@intel.com> Link: https://patch.msgid.link/20250402232552.999634-1-gourry@gourry.net Signed-off-by: Dave Jiang <dave.jiang@intel.com>	2025-04-28 08:48:30 -07:00
Dave Jiang	cdafa67c02	cxl: Remove always true condition for cxlctl_validate_hw_command() smatch warnings: drivers/cxl/core/features.c:441 cxlctl_validate_hw_command() warn: always true condition '(scope >= 0) => (0-u32max >= 0)' Remove the check entirely as it has no effect. Expectation is both of these operations should be allowed for all check levels. They are read only and have no change effects. Reported-by: kernel test robot <lkp@intel.com> Closes: https://lore.kernel.org/oe-kbuild-all/202504041033.2HBboAZR-lkp@intel.com/ Reviewed-by: Li Ming <ming.li@zohomail.com> Reviewed-by: Ira Weiny <ira.weiny@intel.com> Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Link: https://patch.msgid.link/20250404165418.3032414-1-dave.jiang@intel.com Signed-off-by: Dave Jiang <dave.jiang@intel.com>	2025-04-28 08:48:30 -07:00
Smita Koralahalli	078d3ee7c1	cxl/core/regs.c: Skip Memory Space Enable check for RCD and RCH Ports According to CXL r3.2 section 8.2.1.2, the PCI_COMMAND register fields, including Memory Space Enable bit, have no effect on the behavior of an RCD Upstream Port. Retaining this check may incorrectly cause cxl_pci_probe() to fail on a valid RCD upstream Port. While the specification is explicit only for RCD Upstream Ports, this check is solely for accessing the RCRB, which is always mapped through memory space. Therefore, its safe to remove the check entirely. In practice, firmware reliably enables the Memory Space Enable bit for RCH Downstream Ports and no failures have been observed. Removing the check simplifies the code and avoids unnecessary special-casing, while relying on BIOS/firmware to configure devices correctly. Moreover, any failures due to inaccessible RCRB regions will still be caught either in __rcrb_to_component() or while parsing the component register block. The following failure was observed in dmesg when the check was present: cxl_pci 0000:7f:00.0: No component registers (-6) Fixes: `d5b1a27143` ("cxl/acpi: Extract component registers of restricted hosts from RCRB") Signed-off-by: Smita Koralahalli <Smita.KoralahalliChannabasappa@amd.com> Cc: <stable@vger.kernel.org> Reviewed-by: Ira Weiny <ira.weiny@intel.com> Reviewed-by: Terry Bowman <terry.bowman@amd.com> Reviewed-by: Dave Jiang <dave.jiang@intel.com> Reviewed-by: Robert Richter <rrichter@amd.com> Link: https://patch.msgid.link/20250407192734.70631-1-Smita.KoralahalliChannabasappa@amd.com Signed-off-by: Dave Jiang <dave.jiang@intel.com>	2025-04-21 08:30:13 -07:00
Li Ming	25174d5cd2	cxl/feature: Update out_len in set feature failure case CXL subsystem supports userspace to configure features via fwctl interface, it will configure features by using Set Feature command. Whatever Set Feature succeeds or fails, CXL driver always needs to return a structure fwctl_rpc_cxl_out to caller, and returned size is updated in a out_len parameter. The out_len should be updated not only when the set feature succeeds, but also when the set feature fails. Fixes: `eb5dfcb9e3` ("cxl: Add support to handle user feature commands for set feature") Signed-off-by: Li Ming <ming.li@zohomail.com> Reviewed-by: Dave Jiang <dave.jiang@intel.com> Reviewed-by: Ira Weiny <ira.weiny@intel.com> Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Link: https://patch.msgid.link/20250410024521.514095-1-ming.li@zohomail.com Signed-off-by: Dave Jiang <dave.jiang@intel.com>	2025-04-18 09:33:56 -07:00
Dave Jiang	dc915672f9	cxl: Fix devm host device for CXL fwctl initialization Testing revealed the following error message for a CXL memdev that has Feature support: [ 56.690430] cxl mem0: Resources present before probing Attach the allocation of cxl_fwctl to the parent device of cxl_memdev. devm_add_* calls for cxl_memdev should not happen before the memdev probe function or outside the scope of the memdev driver. cxl_test missed this bug because cxl_test always arranges for the cxl_mem driver to be loaded before cxl_mock_mem runs. So the driver core always finds the devres list idle in that case. [DJ: Updated subject title and added commit log suggestion from djbw] Fixes: `858ce2f56b` ("cxl: Add FWCTL support to CXL") Reviewed-by: Dan Williams <dan.j.williams@intel.com> Reviewed-by: Alison Schofield <alison.schofield@intel.com> Link: https://lore.kernel.org/linux-cxl/6801aea053466_71fe2944c@dwillia2-xfh.jf.intel.com.notmuch/ Link: https://patch.msgid.link/20250418002933.406439-1-dave.jiang@intel.com Signed-off-by: Dave Jiang <dave.jiang@intel.com>	2025-04-18 09:33:04 -07:00
Li Ming	36aace15d9	cxl/pci: Drop the parameter is_port of cxl_gpf_get_dvsec() The first parameter of cxl_gpf_get_dvsec() is a struct device, can be used to distinguish if the device is a cxl dport or a cxl pci device by checking the PCIe type of it, so the parameter is_port is unnecessary to cxl_gpf_get_dvsec(), using parameter struct device is enough. Signed-off-by: Li Ming <ming.li@zohomail.com> Reviewed-by: Davidlohr Bueso <dave@stgolabs.net> Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Reviewed-by: Dan Williams <dan.j.williams@intel.com> Tested-by: Davidlohr Bueso <dave@stgolabs.net> Link: https://patch.msgid.link/20250323093110.233040-4-ming.li@zohomail.com Signed-off-by: Dave Jiang <dave.jiang@intel.com>	2025-04-09 12:48:18 -07:00
Li Ming	6af941db6a	cxl/pci: Update Port GPF timeout only when the first EP attaching update_gpf_port_dvsec() is used to update GPF Phase timeout, if a CXL switch is under a CXL root port, update_gpf_port_dvsec() will be invoked on the CXL root port when each cxl memory device under the CXL switch is attaching. It is enough to be invoked once, others are redundant. When the first EP attaching, it always triggers its ancestor dports to locate their own Port GPF DVSEC. The change is that invoking update_gpf_port_dvsec() on these ancestor dports after ancestor dport locating a Port GPF DVSEC. It guarantees that update_gpf_port_dvsec() is invoked on a dport only happens during the first EP attaching. Signed-off-by: Li Ming <ming.li@zohomail.com> Reviewed-by: Davidlohr Bueso <dave@stgolabs.net> Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Reviewed-by: Dan Williams <dan.j.williams@intel.com> Tested-by: Davidlohr Bueso <dave@stgolabs.net> Link: https://patch.msgid.link/20250323093110.233040-3-ming.li@zohomail.com Signed-off-by: Dave Jiang <dave.jiang@intel.com>	2025-04-09 12:48:18 -07:00
Li Ming	87d2de042c	cxl/core: Fix caching dport GPF DVSEC issue Per Table 8-2 in CXL r3.2 section 8.1.1 and CXL r3.2 section 8.1.6, only CXL Downstream switch ports and CXL root ports have GPF DVSEC for CXL Port(DVSEC ID 04h). CXL subsystem has a gpf_dvsec in struct cxl_port which is used to cache the offset of a GPF DVSEC in PCIe configuration space. It will be updated during the first EP attaching to the cxl_port, so the gpf_dvsec can only cache the GPF DVSEC offset of the dport which the first EP is under. Will not have chance to update it during other EPs attaching. That means CXL subsystem will use the same GPF DVSEC offset for all dports under the port, it will be a problem if the GPF DVSEC offset cached in cxl_port is not the right offset for a dport. Moving gpf_dvsec from struct cxl_port to struct cxl_dport, make every cxl dport has their own GPF DVSEC offset caching, and each cxl dport uses its own GPF DVSEC offset for GPF DVSEC accessing. Fixes: `a52b6a2c1c` ("cxl/pci: Support Global Persistent Flush (GPF)") Signed-off-by: Li Ming <ming.li@zohomail.com> Reviewed-by: Davidlohr Bueso <dave@stgolabs.net> Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Reviewed-by: Dan Williams <dan.j.williams@intel.com> Tested-by: Davidlohr Bueso <dave@stgolabs.net> Link: https://patch.msgid.link/20250323093110.233040-2-ming.li@zohomail.com Signed-off-by: Dave Jiang <dave.jiang@intel.com>	2025-04-09 12:48:18 -07:00
Linus Torvalds	01ecadbe09	cxl for v6.15 - Add support for Global Persistent Flush (GPF) - Cleanup of DPA partition metadata handling - Remove the CXL_DECODER_MIXED enum that's not needed anymore - Introduce helpers to access resource and perf meta data - Introduce 'struct cxl_dpa_partition' and 'struct cxl_range_info' - Make cxl_dpa_alloc() DPA partition number agnostic - Remove cxl_decoder_mode - Cleanup partition size and perf helpers - Remove unused CXL partition values - Add logging support for CXL CPER endpoint and port protocol errors - Prefix protocol error struct and function names with cxl_ - Move protocol error definitions and structures to a common location - Remove drivers/firmware/efi/cper_cxl.h to include/linux/cper.h - Add support in GHES to process CXL CPER protocol errors - Process CXL CPER protocol errors - Add trace logging for CXL PCIe port RAS errors - Remove redundant gp_port init - Add validation of cxl device serial number - CXL ABI documentation updates/fixups - A series that uses guard() to clean up open coded mutex lockings and remove gotos for error handling. - Some followup patches to support dirty shutdown accounting - Add helper to retrieve DVSEC offset for dirty shutdown registers - Rename cxl_get_dirty_shutdown() to cxl_arm_dirty_shutdown() - Add support for dirty shutdown count via sysfs - cxl_test support for dirty shutdown - A series to support CXL mailbox Features commands. Mostly in preparation for CXL EDAC code to utilize the Features commands. It's also in preparation for CXL fwctl support to utilize the CXL Features. The commands include "Get Supported Features", "Get Feature", and "Set Feature". - A series to support extended linear cache support described by the ACPI HMAT table. The addition helps enumerate the cache and also provides additional RAS reporting support for configuration with extended linear cache. (and related fixes for the series). - An update to cxl_test to support a 3-way capable CFMWS. - A documentation fix to remove unused "mixed mode". -----BEGIN PGP SIGNATURE----- iQIzBAABCAAdFiEE5DAy15EJMCV1R6v9YGjFFmlTOEoFAmfqtP4ACgkQYGjFFmlT OEqx9A//UsCWf1CH8bvjKXxSTlQmtPlNpcXe+gVR0sc5cL2VFxKf93AY8Zo1Br5A b40gtZJz9QwjwGwIvDiki9U2bopOyX3aMOyBJMYmLuL/irY8ENx2ra7ODbxe7uGn oZwpwG2sEGQxIAG2bCpVuCDIt8JjNvsTJo45TICs07w9TWTmH4Swpbz1g8VGpDz/ kCQcXXHSHZleR5BzqVRKxjjqGEUFj2xDMzAI8VSL+7izMMoPLbjwnl2c1fwaLBPd iJTMboTXDj7eVMta/qqGkG7pshM81SnkSzy8cxImj3r4SRgRTZg9U8vhrR3K1kdH F05Ozd12tljtNXLWthENZPUbfcovy9oTxzMt/gVut7j6C7H3s3KCSbV7zhz5BmfD XcapOX4Cu7ptn88KLqE5a98oLuq2DXrLOcX5vKPYBfAO+68rC+gSAPSbzfZlSHa0 1/TsxVvzDQUBVZWL94DeHvemyQb58GQBOypeNZbH8P4gAhWJqk3hZEO+wlSxpfd+ R7wgabfKJUJ82KusCZHIW1Wg3/IrXb4yC+UyiObS5RgIJWpRmOkuJEHDvEUje+Dj aOWw/H3vZgeZnpW87FRxzvDJx1/0jZI1vsxH65m2wrvz6n5aGIA/Q6pgqCdU/m6c I231bl1bmZzJ8u3+vOZL4tFHcYHh4XCwQp+ZQt1uDa0fA5LbLhc= =ZME1 -----END PGP SIGNATURE----- Merge tag 'cxl-for-6.15' of git://git.kernel.org/pub/scm/linux/kernel/git/cxl/cxl Pull Compute Express Link (CXL) updates from Dave Jiang: - Add support for Global Persistent Flush (GPF) - Cleanup of DPA partition metadata handling: - Remove the CXL_DECODER_MIXED enum that's not needed anymore - Introduce helpers to access resource and perf meta data - Introduce 'struct cxl_dpa_partition' and 'struct cxl_range_info' - Make cxl_dpa_alloc() DPA partition number agnostic - Remove cxl_decoder_mode - Cleanup partition size and perf helpers - Remove unused CXL partition values - Add logging support for CXL CPER endpoint and port protocol errors: - Prefix protocol error struct and function names with cxl_ - Move protocol error definitions and structures to a common location - Remove drivers/firmware/efi/cper_cxl.h to include/linux/cper.h - Add support in GHES to process CXL CPER protocol errors - Process CXL CPER protocol errors - Add trace logging for CXL PCIe port RAS errors - Remove redundant gp_port init - Add validation of cxl device serial number - CXL ABI documentation updates/fixups - A series that uses guard() to clean up open coded mutex lockings and remove gotos for error handling. - Some followup patches to support dirty shutdown accounting: - Add helper to retrieve DVSEC offset for dirty shutdown registers - Rename cxl_get_dirty_shutdown() to cxl_arm_dirty_shutdown() - Add support for dirty shutdown count via sysfs - cxl_test support for dirty shutdown - A series to support CXL mailbox Features commands. Mostly in preparation for CXL EDAC code to utilize the Features commands. It's also in preparation for CXL fwctl support to utilize the CXL Features. The commands include "Get Supported Features", "Get Feature", and "Set Feature". - A series to support extended linear cache support described by the ACPI HMAT table. The addition helps enumerate the cache and also provides additional RAS reporting support for configuration with extended linear cache. (and related fixes for the series). - An update to cxl_test to support a 3-way capable CFMWS - A documentation fix to remove unused "mixed mode" * tag 'cxl-for-6.15' of git://git.kernel.org/pub/scm/linux/kernel/git/cxl/cxl: (39 commits) cxl/region: Fix the first aliased address miscalculation cxl/region: Quiet some dev_warn()s in extended linear cache setup cxl/Documentation: Remove 'mixed' from sysfs mode doc cxl: Fix warning from emitting resource_size_t as long long int on 32bit systems cxl/test: Define a CFMWS capable of a 3 way HB interleave cxl/mem: Do not return error if CONFIG_CXL_MCE unset tools/testing/cxl: Set Shutdown State support cxl/pmem: Export dirty shutdown count via sysfs cxl/pmem: Rename cxl_dirty_shutdown_state() cxl/pci: Introduce cxl_gpf_get_dvsec() cxl/pci: Support Global Persistent Flush (GPF) cxl: Document missing sysfs files cxl: Plug typos in ABI doc cxl/pmem: debug invalid serial number data cxl/cdat: Remove redundant gp_port initialization cxl/memdev: Remove unused partition values cxl/region: Drop goto pattern of construct_region() cxl/region: Drop goto pattern in cxl_dax_region_alloc() cxl/core: Use guard() to drop goto pattern of cxl_dpa_alloc() cxl/core: Use guard() to drop the goto pattern of cxl_dpa_free() ...	2025-04-02 20:04:43 -07:00
Li Ming	aae0594a70	cxl/region: Fix the first aliased address miscalculation In extended linear cache(ELC) case, cxl_port_get_spa_cache_alias() helps to get the aliased address of a SPA, it considers the first address in CXL memory range is "region start + region cache size + 1", but it should be "region start + region cache size". So if a SPA is equal to "region start + region cache size", its aliased address should be "SPA - region cache size". Signed-off-by: Li Ming <ming.li@zohomail.com> Reviewed-by: Alison Schofield <alison.schofield@intel.com> Reviewed-by: Ira Weiny <ira.weiny@intel.com> Link: https://patch.msgid.link/20250317070124.815028-1-ming.li@zohomail.com Signed-off-by: Dave Jiang <dave.jiang@intel.com>	2025-03-20 11:28:45 -07:00
Dave Jiang	eb5dfcb9e3	cxl: Add support to handle user feature commands for set feature Add helper function to parse the user data from fwctl RPC ioctl and send the parsed input parameters to cxl_set_feature() call. Link: https://patch.msgid.link/r/20250307205648.1021626-6-dave.jiang@intel.com Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Reviewed-by: Dan Williams <dan.j.williams@intel.com> Reviewed-by: Li Ming <ming.li@zohomail.com> Signed-off-by: Dave Jiang <dave.jiang@intel.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>	2025-03-17 14:41:37 -03:00
Dave Jiang	5908f3ed6d	cxl: Add support to handle user feature commands for get feature Add helper function to parse the user data from fwctl RPC ioctl and send the parsed input parameters to cxl_get_feature() call. Link: https://patch.msgid.link/r/20250307205648.1021626-5-dave.jiang@intel.com Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Reviewed-by: Dan Williams <dan.j.williams@intel.com> Reviewed-by: Li Ming <ming.li@zohomail.com> Signed-off-by: Dave Jiang <dave.jiang@intel.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>	2025-03-17 14:41:37 -03:00
Dave Jiang	4d1c09cef2	cxl: Add support for fwctl RPC command to enable CXL feature commands fwctl provides a fwctl_ops->fw_rpc() callback in order to issue ioctls to a device. The cxl fwctl driver will start by supporting the CXL Feature commands: Get Supported Features, Get Feature, and Set Feature. The fw_rpc() callback provides 'enum fwctl_rpc_scope' parameter where it indicates the security scope of the call. The Get Supported Features and Get Feature calls can be executed with the scope of FWCTL_RPC_CONFIGRATION. The Set Feature call is gated by the effects of the Feature reported by Get Supported Features call for the specific Feature. Only "Get Supported Features" is supported in this patch. Additional commands will be added in follow on patches. "Get Supported Features" will filter the Features that are exclusive to the kernel. The flag field of the Feature details will be cleared of the "Changeable" field and the "set feat size" will be set to 0 to indicate that the feature is not changeable. Link: https://patch.msgid.link/r/20250307205648.1021626-4-dave.jiang@intel.com Reviewed-by: Dan Williams <dan.j.williams@intel.com> Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Reviewed-by: Li Ming <ming.li@zohomail.com> Signed-off-by: Dave Jiang <dave.jiang@intel.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>	2025-03-17 14:41:37 -03:00
Dave Jiang	858ce2f56b	cxl: Add FWCTL support to CXL Add fwctl support code to allow sending of CXL feature commands from userspace through as ioctls via FWCTL. Provide initial setup bits. The CXL PCI probe function will call devm_cxl_setup_fwctl() after the cxl_memdev has been enumerated in order to setup FWCTL char device under the cxl_memdev like the existing memdev char device for issuing CXL raw mailbox commands from userspace via ioctls. Link: https://patch.msgid.link/r/20250307205648.1021626-2-dave.jiang@intel.com Signed-off-by: Dave Jiang <dave.jiang@intel.com> Reviewed-by: Li Ming <ming.li@zohomail.com> Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>	2025-03-17 14:41:36 -03:00
Dave Jiang	3b5d43245f	Merge branch 'for-6.15/features' into cxl-for-next Add CXL Features support. Setup code for enabling in kernel usage of CXL Features. Expecting EDAC/RAS to utilize CXL Features in kernel for things such as memory sparing. Also prepartion for enabling of CXL FWCTL support to issue allowed Features from user space.	2025-03-17 09:22:59 -07:00
Alison Schofield	74d9c59658	cxl/region: Quiet some dev_warn()s in extended linear cache setup Extended Linear Cache (ELC) setup code emits a dev_warn(), "Extended linear cache calculation failed." for issues found while setting up the ELC. For platforms without CONFIG_ACPI_HMAT, every auto region setup will emit the warning because the default !ACPI_HMAT return value is EOPNOTSUPP. Suppress it by skipping the warn for EOPNOTSUPP. Change the EOPNOTSUPP in the actual ELC failure path to ENXIO. Remove the check and enusing dev_warn() when region resource size is NULL. The endpoint decoders hpa_range used to create the resource is checked in init_hdm_decoder(), so it cannot be NULL here. For good measure, add the rc value to the dev_warn(). It will either be the -ENOENT returned by HMAT if the mem target is not found, or the -ENXIO from the region driver calculation. Reviewed-by: Li Ming <ming.li@zohomail.com> Signed-off-by: Alison Schofield <alison.schofield@intel.com> Link: https://patch.msgid.link/20250306213700.2606304-1-alison.schofield@intel.com Signed-off-by: Dave Jiang <dave.jiang@intel.com>	2025-03-14 16:28:15 -07:00
Dave Jiang	3d3e3b9444	cxl: Fix warning from emitting resource_size_t as long long int on 32bit systems Reported by kernel test bot from an ARM build: drivers/cxl/core/region.c:3263:26: warning: format '%lld' expects argument of type 'long long int', but argument 3 has type 'resource_size_t' {aka 'unsigned int'} [-Wformat=] On a 32bit system, resource_size_t is defined as 32bit long vs on a 64bit system it is defined as 64bit long. Use %pa format to deal with resource_size_t. Reported-by: kernel test robot <lkp@intel.com> Closes: https://lore.kernel.org/oe-kbuild-all/202503010252.mIDhZ5kY-lkp@intel.com/ Fixes: `0ec9849b63` ("acpi/hmat / cxl: Add extended linear cache support for CXL") Reviewed-by: Alison Schofield <alison.schofield@intel.com> Link: https://patch.msgid.link/20250228204739.3849309-1-dave.jiang@intel.com Signed-off-by: Dave Jiang <dave.jiang@intel.com>	2025-03-14 16:27:54 -07:00
Li Ming	84f8b6e242	cxl/mem: Do not return error if CONFIG_CXL_MCE unset CONFIG_CXL_MCE depends on CONFIG_MEMORY_FAILURE, if CONFIG_CXL_MCE is not set, devm_cxl_register_mce_notifier() will return an -EOPNOTSUPP, it will cause cxl_mem_state_create() failure , and then cxl pci device probing failed. In this case, it should not break cxl pci device probing. Add a checking in cxl_mem_state_create() to check if the returned value of devm_cxl_register_mce_notifier() is -EOPNOTSUPP, if yes, just output a warning log, do not let cxl_mem_state_create() return an error. Signed-off-by: Li Ming <ming.li@zohomail.com> Reviewed-by: Dave Jiang <dave.jiang@intel.com> Reviewed-by: Ira Weiny <ira.weiny@intel.com> Link: https://patch.msgid.link/20250227101848.388595-1-ming.li@zohomail.com Signed-off-by: Dave Jiang <dave.jiang@intel.com>	2025-03-14 16:27:28 -07:00
Dave Jiang	763e15d047	Merge branch 'for-6.15/extended-linear-cache' into cxl-for-next2 Add support for Extended Linear Cache for CXL. Add enumeration support of the cache. Add MCE notification of the aliased memory address.	2025-03-14 16:22:34 -07:00
Dave Jiang	d781a45270	Merge branch 'for-6.15/dirty-shutdown' into cxl-for-next2 Add support for Global Persistent Flush (GPF) and dirty shutdown accounting.	2025-03-14 16:11:42 -07:00
Dave Jiang	b6faa9c613	Merge branch 'for-6.15/guard_cleanups' into cxl-for-next2 A series of CXL refactoring using scope based resource management to remove goto patterns on the cleanup paths.	2025-03-14 16:11:06 -07:00
Davidlohr Bueso	7d0ecc0bd8	cxl/pmem: Export dirty shutdown count via sysfs Similar to how the acpi_nfit driver exports Optane dirty shutdown count, introduce: /sys/bus/cxl/devices/nvdimm-bridge0/ndbusX/nmemY/cxl/dirty_shutdown Under the conditions that 1) dirty shutdown can be set, 2) Device GPF DVSEC exists, and 3) the count itself can be retrieved. Suggested-by: Dan Williams <dan.j.williams@intel.com> Signed-off-by: Davidlohr Bueso <dave@stgolabs.net> Reviewed-by: Dave Jiang <dave.jiang@intel.com> Reviewed-by: Ira Weiny <ira.weiny@intel.com> Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Link: https://patch.msgid.link/20250220220235.276831-4-dave@stgolabs.net Signed-off-by: Dave Jiang <dave.jiang@intel.com>	2025-03-14 15:55:26 -07:00
Davidlohr Bueso	86349aaaea	cxl/pmem: Rename cxl_dirty_shutdown_state() ... to a better suited 'cxl_arm_dirty_shutdown()'. Signed-off-by: Davidlohr Bueso <dave@stgolabs.net> Reviewed-by: Dave Jiang <dave.jiang@intel.com> Reviewed-by: Li Ming <ming.li@zohomail.com> Reviewed-by: Ira Weiny <ira.weiny@intel.com> Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Link: https://patch.msgid.link/20250220220235.276831-3-dave@stgolabs.net Signed-off-by: Dave Jiang <dave.jiang@intel.com>	2025-03-14 15:55:25 -07:00
Davidlohr Bueso	021b7e42fa	cxl/pci: Introduce cxl_gpf_get_dvsec() Add a helper to fetch the port/device GPF dvsecs. This is currently only used for ports, but a later patch to export dirty count to users will make use of the device one. Signed-off-by: Davidlohr Bueso <dave@stgolabs.net> Reviewed-by: Li Ming <ming.li@zohomail.com> Reviewed-by: Dave Jiang <dave.jiang@intel.com> Reviewed-by: Ira Weiny <ira.weiny@intel.com> Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Link: https://patch.msgid.link/20250220220235.276831-2-dave@stgolabs.net Signed-off-by: Dave Jiang <dave.jiang@intel.com>	2025-03-14 15:54:59 -07:00
Davidlohr Bueso	a52b6a2c1c	cxl/pci: Support Global Persistent Flush (GPF) Add support for GPF flows. It is found that the CXL specification around this to be a bit too involved from the driver side. And while this should really all handled by the hardware, this patch takes things with a grain of salt. Upon respective port enumeration, both phase timeouts are set to a max of 20 seconds, which is the NMI watchdog default for lockup detection. The premise is that the kernel does not have enough information to set anything better than a max across the board and hope devices finish their GPF flows within the platform energy budget. Timeout detection is based on dirty Shutdown semantics. The driver will mark it as dirty, expecting that the device clear it upon a successful GPF event. The admin may consult the device Health and check the dirty shutdown counter to see if there was a problem with data integrity. [ davej: Explicitly set return to 0 in update_gpf_port_dvsec() ] [ davej: Add spec reference for 'struct cxl_mbox_set_shutdown_state_in ] [ davej: Fix 0-day reported issue ] Signed-off-by: Davidlohr Bueso <dave@stgolabs.net> Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Reviewed-by: Dan Williams <dan.j.williams@intel.com> Link: https://patch.msgid.link/20250124233533.910535-1-dave@stgolabs.net Signed-off-by: Dave Jiang <dave.jiang@intel.com>	2025-03-14 15:50:22 -07:00
Li Ming	e0feac20d1	cxl/cdat: Remove redundant gp_port initialization gp_port is already pointed to the grandparent port during its definition, remove a redundant code to let gp_port point to the grandparent port again. Signed-off-by: Li Ming <ming.li@zohomail.com> Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Reviewed-by: Davidlohr Bueso <dave@stgolabs.net> Link: https://patch.msgid.link/20250211062054.300108-1-ming.li@zohomail.com Signed-off-by: Dave Jiang <dave.jiang@intel.com>	2025-03-14 15:00:55 -07:00
Ira Weiny	16ca2f5431	cxl/memdev: Remove unused partition values The next volatile and next persistent values are unused and are cluttering the cxl_memdev_state. Remove these values. Reviewed-by: Davidlohr Bueso <dave@stgolabs.net> Reviewed-by: Dave Jiang <dave.jiang@intel.com> Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Reviewed-by: Dan Williams <dan.j.williams@intel.com> Reviewed-by: Fan Ni <fan.ni@samsung.com> Signed-off-by: Ira Weiny <ira.weiny@intel.com> Link: https://patch.msgid.link/20250206-cxl-cleanup-v1-1-9ddf26dd8433@intel.com Signed-off-by: Dave Jiang <dave.jiang@intel.com>	2025-03-14 15:00:45 -07:00
Li Ming	5ec67596e3	cxl/region: Drop goto pattern of construct_region() Some operations need to be protected by the cxl_region_rwsem in construct_region(). Currently, construct_region() uses down_write() and up_write() for the cxl_region_rwsem locking, so there is a goto pattern after down_write() invoked to release cxl_region_rwsem. construct region() can be optimized to remove the goto pattern. The changes are creating a new function called __construct_region() which will include all checking and operations protected by the cxl_region_rwsem, and using guard(rwsem_write) to replace down_write() and up_write() in __construct_region(). Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Reviewed-by: Dave Jiang <dave.jiang@intel.com> Reviewed-by: Alison Schofield <alison.schofield@intel.com> Reviewed-by: Dan Williams <dan.j.williams@intel.com> Reviewed-by: Ira Weiny <ira.weiny@intel.com> Acked-by: Davidlohr Bueso <dave@stgolabs.net> Signed-off-by: Li Ming <ming.li@zohomail.com> Link: https://patch.msgid.link/20250221013205.126419-1-ming.li@zohomail.com Signed-off-by: Dave Jiang <dave.jiang@intel.com>	2025-03-14 14:59:34 -07:00
Li Ming	9e7b7ab5af	cxl/region: Drop goto pattern in cxl_dax_region_alloc() In cxl_dax_region_alloc(), there is a goto pattern to release the rwsem cxl_region_rwsem when the function returns, the down_read() and up_read can be replaced by a guard(rwsem_read) then the goto pattern can be removed. Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Reviewed-by: Dave Jiang <dave.jiang@intel.com> Reviewed-by: Alison Schofield <alison.schofield@intel.com> Reviewed-by: Ira Weiny <ira.weiny@intel.com> Acked-by: Davidlohr Bueso <dave@stgolabs.net> Signed-off-by: Li Ming <ming.li@zohomail.com> Link: https://patch.msgid.link/20250221012453.126366-7-ming.li@zohomail.com Signed-off-by: Dave Jiang <dave.jiang@intel.com>	2025-03-14 14:46:32 -07:00
Li Ming	a81ebe7d19	cxl/core: Use guard() to drop goto pattern of cxl_dpa_alloc() In cxl_dpa_alloc(), some checking and operations need to be protected by a rwsem called cxl_dpa_rwsem, so there is a goto pattern in cxl_dpa_alloc() to release the rwsem. The goto pattern can be optimized by using guard() to hold the rwsem. Creating a new function called __cxl_dpa_alloc() to include all checking and operations needed to be protected by cxl_dpa_rwsem. Using guard(rwsem_write()) to hold cxl_dpa_rwsem at the beginning of the new function. Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Reviewed-by: Dave Jiang <dave.jiang@intel.com> Reviewed-by: Alison Schofield <alison.schofield@intel.com> Reviewed-by: Dan Williams <dan.j.williams@intel.com> Reviewed-by: Ira Weiny <ira.weiny@intel.com> Acked-by: Davidlohr Bueso <dave@stgolabs.net> Signed-off-by: Li Ming <ming.li@zohomail.com> Link: https://patch.msgid.link/20250221012453.126366-6-ming.li@zohomail.com Signed-off-by: Dave Jiang <dave.jiang@intel.com>	2025-03-14 14:45:04 -07:00
Li Ming	16fe6ec4ac	cxl/core: Use guard() to drop the goto pattern of cxl_dpa_free() cxl_dpa_free() has a goto pattern to call up_write() for cxl_dpa_rwsem, it can be removed by using a guard() to replace the down_write() and up_write(). Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Reviewed-by: Dave Jiang <dave.jiang@intel.com> Reviewed-by: Alison Schofield <alison.schofield@intel.com> Reviewed-by: Ira Weiny <ira.weiny@intel.com> Acked-by: Davidlohr Bueso <dave@stgolabs.net> Signed-off-by: Li Ming <ming.li@zohomail.com> Link: https://patch.msgid.link/20250221012453.126366-5-ming.li@zohomail.com Signed-off-by: Dave Jiang <dave.jiang@intel.com>	2025-03-14 14:37:54 -07:00
Li Ming	a58afda8bf	cxl/memdev: cxl_memdev_ioctl() cleanup In cxl_memdev_ioctl(), the down_read(&cxl_memdev_rwsem) and up_read(&cxl_memdev_rwsem) can be replaced by a guard(rwsem_read)(&cxl_memdev_rwsem), it helps to remove the open-coded up_read(&cxl_memdev_rwsem). Besides, the local var 'rc' can be also removed to make the code more cleaner. Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Reviewed-by: Dave Jiang <dave.jiang@intel.com> Reviewed-by: Alison Schofield <alison.schofield@intel.com> Reviewed-by: Ira Weiny <ira.weiny@intel.com> Acked-by: Davidlohr Bueso <dave@stgolabs.net> Signed-off-by: Li Ming <ming.li@zohomail.com> Link: https://patch.msgid.link/20250221012453.126366-4-ming.li@zohomail.com Signed-off-by: Dave Jiang <dave.jiang@intel.com>	2025-03-14 14:37:47 -07:00
Li Ming	3ad4f59f38	cxl/core: cxl_mem_sanitize() cleanup In cxl_mem_sanitize(), the down_read() and up_read() for cxl_region_rwsem can be simply replaced by a guard(rwsem_read), and the local variable 'rc' can be removed. Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Reviewed-by: Dave Jiang <dave.jiang@intel.com> Reviewed-by: Alison Schofield <alison.schofield@intel.com> Reviewed-by: Ira Weiny <ira.weiny@intel.com> Acked-by: Davidlohr Bueso <dave@stgolabs.net> Signed-off-by: Li Ming <ming.li@zohomail.com> Link: https://patch.msgid.link/20250221012453.126366-3-ming.li@zohomail.com Signed-off-by: Dave Jiang <dave.jiang@intel.com>	2025-03-14 14:37:42 -07:00
Li Ming	eeba74747a	cxl/core: Use guard() to replace open-coded down_read/write() Some down/up_read() and down/up_write() cases can be replaced by a guard() simply to drop explicit unlock invoked. It helps to align coding style with current CXL subsystem's. Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Reviewed-by: Dave Jiang <dave.jiang@intel.com> Reviewed-by: Alison Schofield <alison.schofield@intel.com> Reviewed-by: Ira Weiny <ira.weiny@intel.com> Acked-by: Davidlohr Bueso <dave@stgolabs.net> Signed-off-by: Li Ming <ming.li@zohomail.com> Link: https://patch.msgid.link/20250221012453.126366-2-ming.li@zohomail.com Signed-off-by: Dave Jiang <dave.jiang@intel.com>	2025-03-14 14:37:01 -07:00
Dave Jiang	9387c6aec0	Merge branch 'for-6.15/fw-first-error-logging' into cxl-for-next2 Add logging support for CXL CPER endpoint and port protocol errors. Including the 2 patches that was completed later. Link: https://lore.kernel.org/linux-cxl/20250123084421.127697-1-Smita.KoralahalliChannabasappa@amd.com/ Link: https://lore.kernel.org/linux-cxl/20250310223839.31342-1-Smita.KoralahalliChannabasappa@amd.com/	2025-03-14 14:27:17 -07:00
Smita Koralahalli	02f4f0177d	cxl/pci: Add trace logging for CXL PCIe Port RAS errors The CXL drivers use kernel trace functions for logging endpoint and Restricted CXL host (RCH) Downstream Port RAS errors. Similar functionality is required for CXL Root Ports, CXL Downstream Switch Ports, and CXL Upstream Switch Ports. Introduce trace logging functions for both RAS correctable and uncorrectable errors specific to CXL PCIe Ports. Use them to trace FW-First Protocol errors. Co-developed-by: Terry Bowman <terry.bowman@amd.com> Signed-off-by: Terry Bowman <terry.bowman@amd.com> Signed-off-by: Smita Koralahalli <Smita.KoralahalliChannabasappa@amd.com> Reviewed-by: Ira Weiny <ira.weiny@intel.com> Reviewed-by: Li Ming <ming.li@zohomail.com> Reviewed-by: Alison Schofield <alison.schofield@intel.com> Reviewed-by: Tony Luck <tony.luck@intel.com> Link: https://patch.msgid.link/20250310223839.31342-3-Smita.KoralahalliChannabasappa@amd.com Signed-off-by: Dave Jiang <dave.jiang@intel.com>	2025-03-14 14:22:08 -07:00
Smita Koralahalli	36f257e3b0	acpi/ghes, cxl/pci: Process CXL CPER Protocol Errors When PCIe AER is in FW-First, OS should process CXL Protocol errors from CPER records. Introduce support for handling and logging CXL Protocol errors. The defined trace events cxl_aer_uncorrectable_error and cxl_aer_correctable_error trace native CXL AER endpoint errors. Reuse them to trace FW-First Protocol errors. Since the CXL code is required to be called from process context and GHES is in interrupt context, use workqueues for processing. Similar to CXL CPER event handling, use kfifo to handle errors as it simplifies queue processing by providing lock free fifo operations. Add the ability for the CXL sub-system to register a workqueue to process CXL CPER protocol errors. [DJ: return cxl_cper_register_prot_err_work() directly in cxl_ras_init()] Signed-off-by: Smita Koralahalli <Smita.KoralahalliChannabasappa@amd.com> Reviewed-by: Li Ming <ming.li@zohomail.com> Reviewed-by: Alison Schofield <alison.schofield@intel.com> Reviewed-by: Ira Weiny <ira.weiny@intel.com> Reviewed-by: Tony Luck <tony.luck@intel.com> Link: https://patch.msgid.link/20250310223839.31342-2-Smita.KoralahalliChannabasappa@amd.com Signed-off-by: Dave Jiang <dave.jiang@intel.com>	2025-03-14 14:21:45 -07:00
Dave Jiang	516e5bd0b6	cxl: Add mce notifier to emit aliased address for extended linear cache Below is a setup with extended linear cache configuration with an example layout of memory region shown below presented as a single memory region consists of 256G memory where there's 128G of DRAM and 128G of CXL memory. The kernel sees a region of total 256G of system memory. 128G DRAM 128G CXL memory \|-----------------------------------\|-------------------------------------\| Data resides in either DRAM or far memory (FM) with no replication. Hot data is swapped into DRAM by the hardware behind the scenes. When error is detected in one location, it is possible that error also resides in the aliased location. Therefore when a memory location that is flagged by MCE is part of the special region, the aliased memory location needs to be offlined as well. Add an mce notify callback to identify if the MCE address location is part of an extended linear cache region and handle accordingly. Added symbol export to set_mce_nospec() in x86 code in order to call set_mce_nospec() from the CXL MCE notify callback. Link: https://lore.kernel.org/linux-cxl/668333b17e4b2_5639294fd@dwillia2-xfh.jf.intel.com.notmuch/ Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Reviewed-by: Li Ming <ming.li@zohomail.com> Reviewed-by: Alison Schofield <alison.schofield@intel.com> Link: https://patch.msgid.link/20250226162224.3633792-5-dave.jiang@intel.com Signed-off-by: Dave Jiang <dave.jiang@intel.com>	2025-02-26 14:13:49 -07:00
Dave Jiang	8c520c5f1e	cxl: Add extended linear cache address alias emission for cxl events Add the aliased address of extended linear cache when emitting event trace for poison, DRAM and general media of CXL events. Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Reviewed-by: Li Ming <ming.li@zohomail.com> Reviewed-by: Alison Schofield <alison.schofield@intel.com> Link: https://patch.msgid.link/20250226162224.3633792-4-dave.jiang@intel.com Signed-off-by: Dave Jiang <dave.jiang@intel.com>	2025-02-26 14:07:52 -07:00
Dave Jiang	0ec9849b63	acpi/hmat / cxl: Add extended linear cache support for CXL The current cxl region size only indicates the size of the CXL memory region without accounting for the extended linear cache size. Retrieve the cache size from HMAT and append that to the cxl region size for the cxl region range that matches the SRAT range that has extended linear cache enabled. The SRAT defines the whole memory range that includes the extended linear cache and the CXL memory region. The new HMAT ECN/ECR to the Memory Side Cache Information Structure defines the size of the extended linear cache size and matches to the SRAT Memory Affinity Structure by the memory proxmity domain. Add a helper to match the cxl range to the SRAT memory range in order to retrieve the cache size. There are several places that checks the cxl region range against the decoder range. Use new helper to check between the two ranges and address the new cache size. Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Reviewed-by: Li Ming <ming.li@zohomail.com> Reviewed-by: Alison Schofield <alison.schofield@intel.com> Link: https://patch.msgid.link/20250226162224.3633792-3-dave.jiang@intel.com Signed-off-by: Dave Jiang <dave.jiang@intel.com>	2025-02-26 13:45:22 -07:00
Dave Jiang	a8b773f242	cxl: Setup exclusive CXL features that are reserved for the kernel Certain features will be exclusively used by components such as in kernel RAS driver. Setup an exclusion list that can be used to detect if a feature is exclusive to the kernel. Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Reviewed-by: Dan Williams <dan.j.williams@intel.com> Reviewed-by: Li Ming <ming.li@zohomail.com> Tested-by: Shiju Jose <shiju.jose@huawei.com> Link: https://patch.msgid.link/20250220194438.2281088-7-dave.jiang@intel.com Signed-off-by: Dave Jiang <dave.jiang@intel.com>	2025-02-26 08:51:32 -07:00
Shiju Jose	14d502cc27	cxl/mbox: Add SET_FEATURE mailbox command Add support for SET_FEATURE mailbox command. CXL spec r3.2 section 8.2.9.6 describes optional device specific features. CXL devices supports features with changeable attributes. The settings of a feature can be optionally modified using Set Feature command. CXL spec r3.2 section 8.2.9.6.3 describes Set Feature command. Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Reviewed-by: Dan Williams <dan.j.williams@intel.com> Reviewed-by: Li Ming <ming.li@zohomail.com> Signed-off-by: Shiju Jose <shiju.jose@huawei.com> Link: https://patch.msgid.link/20250220194438.2281088-6-dave.jiang@intel.com Signed-off-by: Dave Jiang <dave.jiang@intel.com>	2025-02-26 08:51:32 -07:00

1 2 3 4 5 ...

637 commits