linux/Documentation/ABI/testing/sysfs-edac-ecs
Shiju Jose bcbd069b11 EDAC: Add a Error Check Scrub control feature
Add an Error Check Scrub (ECS) control to manage a memory device's ECS
feature.

ECS is a feature defined in the JEDEC DDR5 SDRAM Specification (JESD79-5). It
allows the DRAM to internally read data, correct single-bit errors, and write
the corrected data bits back to the DRAM array while providing transparency to
error counts.

A DDR5 device contains a number of memory media Field Replaceable Units
(FRUs). The DDR5 ECS feature, and thus the ECS control driver, supports
configuring the ECS parameters per FRU.

A memory device that supports the ECS feature registers with the EDAC device
driver, which retrieves the ECS descriptor from the EDAC ECS driver. The ECS
driver exposes sysfs ECS control attributes to userspace via

  /sys/bus/edac/devices/<dev-name>/ecs_fruX/.

The common sysfs ECS control interface abstracts the control of arbitrary
ECS functionality into a common set of functions.

Support for the ECS feature is added separately because the control attributes
of the DDR5 ECS feature differ from those of the scrub feature.

The sysfs ECS attribute nodes are only present if the client driver has
implemented the corresponding attribute callback function and passed the
necessary operations to the EDAC RAS feature driver during registration.

  [ bp: Massage, fixup edac_dev_register() retvals. ]

Co-developed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Signed-off-by: Shiju Jose <shiju.jose@huawei.com>
Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de>
Reviewed-by: Fan Ni <fan.ni@samsung.com>
Tested-by: Fan Ni <fan.ni@samsung.com>
Link: https://lore.kernel.org/r/20250212143654.1893-4-shiju.jose@huawei.com
2025-02-25 15:42:32 +01:00

What: /sys/bus/edac/devices/<dev-name>/ecs_fruX
Date: March 2025
KernelVersion: 6.15
Contact: linux-edac@vger.kernel.org
Description:
The /sys/bus/edac/devices/<dev-name>/ecs_fruX subdirectory
holds the controls for the memory media ECS (Error Check Scrub)
feature. The <dev-name> directory corresponds to a device
registered with the EDAC device driver for the ECS feature,
and each ecs_fruX directory corresponds to one media FRU
(Field Replaceable Unit) under that memory device.

The sysfs ECS attribute nodes are only present if the parent
driver has implemented the corresponding attribute callback
function and provided the necessary operations to the EDAC
device driver during registration.
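Because any of the attribute nodes may be absent, userspace should probe for
them before use. The following is a minimal sketch of such a probe, not part
of the ABI: the function name list_ecs_frus() and the ECS_ATTRS tuple are
hypothetical, and the attribute names are simply the ones documented in this
file.

```python
import glob
import os

# Attribute names documented in this file; a driver may expose any subset.
ECS_ATTRS = ("log_entry_type", "mode", "reset", "threshold")


def list_ecs_frus(root="/sys/bus/edac/devices"):
    """Map each <dev-name>/ecs_fruX directory under root to the ECS
    attribute nodes that are actually present in it."""
    found = {}
    for path in sorted(glob.glob(os.path.join(root, "*", "ecs_fru*"))):
        if os.path.isdir(path):
            found[path] = [
                name for name in ECS_ATTRS
                if os.path.exists(os.path.join(path, name))
            ]
    return found
```

Calling list_ecs_frus() with no argument scans the real sysfs tree; on a
system without ECS-capable devices it returns an empty dictionary.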
What: /sys/bus/edac/devices/<dev-name>/ecs_fruX/log_entry_type
Date: March 2025
KernelVersion: 6.15
Contact: linux-edac@vger.kernel.org
Description:
(RW) The log entry type, which selects how the DDR5 ECS log
is reported.
- 0 - per DRAM.
- 1 - per memory media FRU.
- All other values are reserved.
What: /sys/bus/edac/devices/<dev-name>/ecs_fruX/mode
Date: March 2025
KernelVersion: 6.15
Contact: linux-edac@vger.kernel.org
Description:
(RW) The mode in which the DDR5 ECS counts errors.
The DDR5 ECS Control Feature selects one of two counting
modes: Codeword mode and Row Count mode. In Codeword mode,
the error count increments each time a codeword with check
bit errors is detected. In Row Count mode, the error count
increments each time a row with check bit errors is
detected.
- 0 - ECS counts rows in the memory media that have ECC errors.
- 1 - ECS counts codewords with errors, that is, the number of
ECC-detected errors in the memory media.
- All other values are reserved.
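As an illustration of the value encoding above, the sketch below reads the
mode attribute and decodes it. The names EcsMode and read_ecs_mode() are
hypothetical, and the code assumes the attribute reads back as a decimal
integer followed by a newline.

```python
from enum import IntEnum


class EcsMode(IntEnum):
    """Documented values of the mode attribute; all others are reserved."""
    ROW_COUNT = 0  # counts rows that have ECC errors
    CODEWORD = 1   # counts codewords with errors


def read_ecs_mode(fru_dir):
    """Read <fru_dir>/mode and return it as an EcsMode.

    Raises ValueError if the attribute holds a reserved value.
    Assumption: the attribute content is a decimal integer.
    """
    with open(f"{fru_dir}/mode") as f:
        return EcsMode(int(f.read().strip()))
```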
What: /sys/bus/edac/devices/<dev-name>/ecs_fruX/reset
Date: March 2025
KernelVersion: 6.15
Contact: linux-edac@vger.kernel.org
Description:
(WO) Reset the ECS ECC counter.
- 1 - reset the ECC counter to its default value.
- All other values are reserved.
What: /sys/bus/edac/devices/<dev-name>/ecs_fruX/threshold
Date: March 2025
KernelVersion: 6.15
Contact: linux-edac@vger.kernel.org
Description:
(RW) DDR5 ECS threshold count per gigabit of memory cells.
The ECS error count is subject to the ECS threshold count
per Gbit: error counts below the threshold are masked.
Supported values are 256, 1024 and 4096.
All other values are reserved.
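Since only three threshold values are supported, a userspace tool may want to
reject anything else before writing. A minimal sketch follows, assuming the
attribute accepts the value as decimal text; set_ecs_threshold() and
SUPPORTED_THRESHOLDS are hypothetical names, not kernel interfaces.

```python
# Threshold counts per Gbit listed as supported in this file.
SUPPORTED_THRESHOLDS = (256, 1024, 4096)


def set_ecs_threshold(fru_dir, count_per_gbit):
    """Write a supported threshold to <fru_dir>/threshold.

    Raises ValueError for reserved values so they never reach the
    kernel. Assumption: the attribute accepts a decimal integer.
    """
    if count_per_gbit not in SUPPORTED_THRESHOLDS:
        raise ValueError(f"unsupported ECS threshold: {count_per_gbit}")
    with open(f"{fru_dir}/threshold", "w") as f:
        f.write(str(count_per_gbit))
```

Writing to the real attribute normally requires root privileges, and the
write can still fail if the underlying device rejects the value.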