linux/drivers/infiniband/hw
Mike Marciniszyn f6a3cfec3c IB/hfi1: Fix early init panic
The following trace can be observed after an init failure, such as a
firmware load failure:

  BUG: unable to handle kernel NULL pointer dereference at 0000000000000000
  PGD 0 P4D 0
  Oops: 0010 [#1] SMP PTI
  CPU: 0 PID: 537 Comm: kworker/0:3 Tainted: G           OE    --------- -  - 4.18.0-240.el8.x86_64 #1
  Workqueue: events work_for_cpu_fn
  RIP: 0010:0x0
  Code: Bad RIP value.
  RSP: 0000:ffffae5f878a3c98 EFLAGS: 00010046
  RAX: 0000000000000000 RBX: ffff95e48e025c00 RCX: 0000000000000000
  RDX: 0000000000000001 RSI: 0000000000000000 RDI: ffff95e48e025c00
  RBP: ffff95e4bf3660a4 R08: 0000000000000000 R09: ffffffff86d5e100
  R10: ffff95e49e1de600 R11: 0000000000000001 R12: ffff95e4bf366180
  R13: ffff95e48e025c00 R14: ffff95e4bf366028 R15: ffff95e4bf366000
  FS:  0000000000000000(0000) GS:ffff95e4df200000(0000) knlGS:0000000000000000
  CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
  CR2: ffffffffffffffd6 CR3: 0000000f86a0a003 CR4: 00000000001606f0
  Call Trace:
   receive_context_interrupt+0x1f/0x40 [hfi1]
   __free_irq+0x201/0x300
   free_irq+0x2e/0x60
   pci_free_irq+0x18/0x30
   msix_free_irq.part.2+0x46/0x80 [hfi1]
   msix_clean_up_interrupts+0x2b/0x70 [hfi1]
   hfi1_init_dd+0x640/0x1a90 [hfi1]
   do_init_one.isra.19+0x34d/0x680 [hfi1]
   local_pci_probe+0x41/0x90
   work_for_cpu_fn+0x16/0x20
   process_one_work+0x1a7/0x360
   worker_thread+0x1cf/0x390
   ? create_worker+0x1a0/0x1a0
   kthread+0x112/0x130
   ? kthread_flush_work_fn+0x10/0x10
   ret_from_fork+0x35/0x40

The free_irq() results in a callback to the registered interrupt handler,
and rcd->do_interrupt is NULL because the receive context data structures
are not fully initialized.

Fix by ensuring that do_interrupt is always assigned and by adding
guards in the slow path handler that detect a partially initialized
receive context and turn the receive into a no-op.
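
The approach can be illustrated with a minimal userspace sketch. It is not
the driver code: struct rcv_ctxt, its initialized flag, rcv_ctxt_init_early()
and the signatures below are hypothetical simplifications. Only the names
do_interrupt and receive_context_interrupt come from the description and
trace above. The handler pointer is assigned as early as possible, and the
slow path handler returns without touching receive state when the context
is not fully set up.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified stand-in for a receive context. */
struct rcv_ctxt {
	int initialized;     /* assumed flag: set once setup has completed */
	int packets_handled; /* stands in for real receive processing */
	int (*do_interrupt)(struct rcv_ctxt *rcd, int thread);
};

/* Slow path handler: guards against a partially initialized context. */
static int handle_receive_slowpath(struct rcv_ctxt *rcd, int thread)
{
	(void)thread;

	if (!rcd->initialized)
		return 0; /* no-op: nothing is safe to process yet */

	rcd->packets_handled++;
	return 1;
}

/* Assign the handler up front so the pointer is never NULL. */
static void rcv_ctxt_init_early(struct rcv_ctxt *rcd)
{
	rcd->initialized = 0;
	rcd->packets_handled = 0;
	rcd->do_interrupt = handle_receive_slowpath;
}

/* Models the IRQ handler, which may be invoked during teardown. */
static int receive_context_interrupt(struct rcv_ctxt *rcd)
{
	assert(rcd->do_interrupt != NULL);
	return rcd->do_interrupt(rcd, 0);
}
```

With this arrangement, a handler invocation that arrives before setup has
finished goes through the guarded slow path instead of jumping through a
NULL function pointer.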

Link: https://lore.kernel.org/r/20211129192003.101968.33612.stgit@awfm-01.cornelisnetworks.com
Cc: stable@vger.kernel.org
Fixes: b0ba3c18d6 ("IB/hfi1: Move normal functions from hfi1_devdata to const array")
Signed-off-by: Mike Marciniszyn <mike.marciniszyn@cornelisnetworks.com>
Signed-off-by: Dennis Dalessandro <dennis.dalessandro@cornelisnetworks.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2021-12-07 13:22:54 -04:00
..
bnxt_re RDMA/bnxt_re: Remove unsupported bnxt_re_modify_ah callback 2021-11-03 09:06:36 -03:00
cxgb4 RDMA: Remove redundant 'flush_workqueue()' calls 2021-10-12 13:21:23 -03:00
efa RDMA/efa: Add support for dmabuf memory regions 2021-10-28 08:58:26 -03:00
hfi1 IB/hfi1: Fix early init panic 2021-12-07 13:22:54 -04:00
hns RDMA/hns: Do not destroy QP resources in the hw resetting phase 2021-11-25 13:20:24 -04:00
irdma Linux 5.15 2021-11-01 14:49:20 -03:00
mlx4 RDMA/mlx4: Do not fail the registration on port stats 2021-11-17 16:45:16 -04:00
mlx5 RDMA/mlx5: Fix releasing unallocated memory in dereg MR flow 2021-11-25 13:16:39 -04:00
mthca RDMA: switch from 'pci_' to 'dma_' API 2021-08-23 13:43:54 -03:00
ocrdma RDMA: Globally allocate and release QP memory 2021-08-03 13:44:27 -03:00
qedr RDMA v5.16 merge window pull request 2021-11-03 08:05:59 -07:00
qib Linux 5.15 2021-11-01 14:49:20 -03:00
usnic RDMA: Constify netdev->dev_addr accesses 2021-10-25 14:33:09 -03:00
vmw_pvrdma RDMA: switch from 'pci_' to 'dma_' API 2021-08-23 13:43:54 -03:00
Makefile RDMA/irdma: Add irdma Kconfig/Makefile and remove i40iw 2021-06-02 20:06:36 -03:00