linux/drivers/scsi/mpt3sas
Damien Le Moal 15592a11d5 scsi: mpt3sas: Correctly handle ATA device errors
With the ATA error model, an NCQ command failure always triggers an abort
(termination) of all NCQ commands queued on the device. In such a case, the
SAT or the host must handle the failed command according to the command
sense data and immediately retry all other NCQ commands that were aborted
due to the failed NCQ command.

For SAS HBAs controlled by the mpt3sas driver, NCQ command aborts are not
handled by the HBA SAT. They are instead sent back to the host with an IOC
log information value of 0x31080000 (IOC_LOGINFO_PREFIX_PL with the PL code
PL_LOGINFO_CODE_SATA_NCQ_FAIL_ALL_CMDS_AFTR_ERR). The function
_scsih_io_done() always forces a retry of commands terminated with the
status MPI2_IOCSTATUS_SCSI_IOC_TERMINATED using the SCSI result
DID_SOFT_ERROR, regardless of the log_info for the command. This
correctly forces a retry of the collaterally aborted NCQ commands, but it
also increments the retry counter of each such command. If a command to an
ATA device is retried too many times because other NCQ commands keep
failing (e.g. read commands trying to access unreadable sectors), the
collaterally aborted command may itself be terminated with an error once it
runs out of retries. This violates the SAT specification and causes
hard-to-debug command errors.
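
The retry exhaustion described above can be illustrated with a toy user-space
model. This is only a sketch under stated assumptions: the enum, the function
name and the "allowed" budget are hypothetical and are not taken from the
driver or the SCSI midlayer. It only assumes that a counted retry consumes one
unit of a per-command budget while an immediate retry consumes none.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-ins for the two SCSI results discussed in the text. */
enum retry_kind {
	RETRY_COUNTED,   /* models DID_SOFT_ERROR: consumes one retry */
	RETRY_IMMEDIATE  /* models DID_IMM_RETRY: consumes no retry */
};

/*
 * Toy model: a healthy command is collaterally aborted `aborts` times
 * because other NCQ commands on the same device keep failing.
 * Returns true if the command is still retried after the last abort,
 * false if it ran out of retries and would be failed with an error.
 */
static bool survives_collateral_aborts(int allowed, int aborts,
				       enum retry_kind kind)
{
	int retries = 0;

	assert(allowed >= 0 && aborts >= 0);

	for (int i = 0; i < aborts; i++) {
		if (kind == RETRY_IMMEDIATE)
			continue;	/* requeued, counter untouched */
		if (++retries > allowed)
			return false;	/* retry budget exhausted */
	}
	return true;
}
```

In this model, a command with a budget of 5 counted retries survives 5
collateral aborts but fails on the 6th, while the same command survives any
number of aborts when each one is handled as an immediate retry.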

Solve this issue by modifying the handling of the
MPI2_IOCSTATUS_SCSI_IOC_TERMINATED status to check if a command is for an
ATA device and if the command loginfo indicates an NCQ collateral
abort. If that is the case, force the command retry using the SCSI result
DID_IMM_RETRY to avoid incrementing the command retry count.
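
The decision described above might be sketched as follows. This is a
stand-alone user-space illustration, not the code of the patch: the function
name, the `is_ata` parameter, the NCQ_COLLATERAL_ABORT_LOGINFO macro and the -1
return value are hypothetical, and the numeric constants are assumed to match
the kernel headers. Only the combined value 0x31080000 is stated in the commit
message itself.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumed values; check them against the kernel headers before reuse. */
#define MPI2_IOCSTATUS_SCSI_IOC_TERMINATED			0x004Bu
#define IOC_LOGINFO_PREFIX_PL					0x31000000u
#define PL_LOGINFO_CODE_SATA_NCQ_FAIL_ALL_CMDS_AFTR_ERR	0x00080000u
#define DID_SOFT_ERROR						0x0b
#define DID_IMM_RETRY						0x0c

/* Hypothetical name for the combined log_info value given in the text. */
#define NCQ_COLLATERAL_ABORT_LOGINFO \
	(IOC_LOGINFO_PREFIX_PL | PL_LOGINFO_CODE_SATA_NCQ_FAIL_ALL_CMDS_AFTR_ERR)

/* The commit message states that the combined value is 0x31080000. */
static_assert(NCQ_COLLATERAL_ABORT_LOGINFO == 0x31080000u,
	      "log_info must match the value quoted in the commit message");

/*
 * Pick the SCSI host result for a completed command.
 * Returns -1 (hypothetical "not handled here") for any IOC status
 * other than MPI2_IOCSTATUS_SCSI_IOC_TERMINATED.
 */
static int host_result_for(uint16_t ioc_status, bool is_ata,
			   uint32_t log_info)
{
	if (ioc_status != MPI2_IOCSTATUS_SCSI_IOC_TERMINATED)
		return -1;

	/* NCQ collateral abort on an ATA device: retry without counting. */
	if (is_ata && log_info == NCQ_COLLATERAL_ABORT_LOGINFO)
		return DID_IMM_RETRY;

	/* Any other IOC-terminated command keeps the counted retry. */
	return DID_SOFT_ERROR;
}
```

Keeping DID_SOFT_ERROR as the fallback preserves the previous behavior for
non-ATA devices and for terminations with a different log_info, so only the
collateral abort case bypasses the retry counter.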

Signed-off-by: Damien Le Moal <dlemoal@kernel.org>
Link: https://lore.kernel.org/r/20250606052747.742998-3-dlemoal@kernel.org
Tested-by: Yafang Shao <laoar.shao@gmail.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2025-06-19 23:00:03 -04:00
mpi SCSI misc on 20250326 2025-03-26 19:57:34 -07:00
Kconfig
Makefile
mpt3sas_base.c Merge patch series "mpt3sas driver udpates" 2025-02-20 21:47:48 -05:00
mpt3sas_base.h Merge patch series "mpt3sas driver udpates" 2025-02-20 21:47:48 -05:00
mpt3sas_config.c scsi: mpt3sas: Remove unused config functions 2025-02-03 18:00:11 -05:00
mpt3sas_ctl.c scsi: mpt3sas: Drop unused variable in mpt3sas_send_mctp_passthru_req() 2025-06-16 14:23:50 -04:00
mpt3sas_ctl.h scsi: mpt3sas: Report driver capability as part of IOCINFO command 2025-02-20 21:46:35 -05:00
mpt3sas_debug.h
mpt3sas_debugfs.c
mpt3sas_scsih.c scsi: mpt3sas: Correctly handle ATA device errors 2025-06-19 23:00:03 -04:00
mpt3sas_transport.c scsi: mpt3sas: Mark device strings as nonstring 2025-02-28 11:51:32 -08:00
mpt3sas_trigger_diag.c
mpt3sas_trigger_diag.h
mpt3sas_trigger_pages.h scsi: mpt3sas: Fix typo of "TRIGGER" 2023-11-15 08:52:02 -05:00
mpt3sas_warpdrive.c move asm/unaligned.h to linux/unaligned.h 2024-10-02 17:23:23 -04:00