linux/arch/x86
Peter Newman fe1f071438 x86/resctrl: Fix task CLOSID/RMID update race
When the user moves a running task to a new rdtgroup using the task's
file interface or by deleting its rdtgroup, the resulting change in
CLOSID/RMID must be immediately propagated to the PQR_ASSOC MSR on the
task(s) CPUs.

x86 allows reordering loads with prior stores, so if the task starts
running between a task_curr() check that the CPU hoisted before the
stores in the CLOSID/RMID update then it can start running with the old
CLOSID/RMID until it is switched again because __rdtgroup_move_task()
failed to determine that it needs to be interrupted to obtain the new
CLOSID/RMID.

Refer to the diagram below:

CPU 0                                   CPU 1
-----                                   -----
__rdtgroup_move_task():
  curr <- t1->cpu->rq->curr
                                        __schedule():
                                          rq->curr <- t1
                                        resctrl_sched_in():
                                          t1->{closid,rmid} -> {1,1}
  t1->{closid,rmid} <- {2,2}
  if (curr == t1) // false
   IPI(t1->cpu)

A similar race impacts rdt_move_group_tasks(), which updates tasks in a
deleted rdtgroup.

In both cases, use smp_mb() to order the task_struct::{closid,rmid}
stores before the loads in task_curr().  In particular, in the
rdt_move_group_tasks() case, simply execute an smp_mb() on every
iteration with a matching task.

It is possible to use a single smp_mb() in rdt_move_group_tasks(), but
this would require two passes and a means of remembering which
task_structs were updated in the first loop. However, benchmarking
results below showed too little performance impact in the simple
approach to justify implementing the two-pass approach.

Times below were collected using `perf stat` to measure the time to
remove a group containing a 1600-task, parallel workload.

CPU: Intel(R) Xeon(R) Platinum P-8136 CPU @ 2.00GHz (112 threads)

  # mkdir /sys/fs/resctrl/test
  # echo $$ > /sys/fs/resctrl/test/tasks
  # perf bench sched messaging -g 40 -l 100000

task-clock time ranges collected using:

  # perf stat rmdir /sys/fs/resctrl/test

Baseline:                     1.54 - 1.60 ms
smp_mb() every matching task: 1.57 - 1.67 ms

  [ bp: Massage commit message. ]

Fixes: ae28d1aae4 ("x86/resctrl: Use an IPI instead of task_work_add() to update PQR_ASSOC MSR")
Fixes: 0efc89be94 ("x86/intel_rdt: Update task closid immediately on CPU in rmdir and unmount")
Signed-off-by: Peter Newman <peternewman@google.com>
Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de>
Reviewed-by: Reinette Chatre <reinette.chatre@intel.com>
Reviewed-by: Babu Moger <babu.moger@amd.com>
Cc: <stable@kernel.org>
Link: https://lore.kernel.org/r/20221220161123.432120-1-peternewman@google.com
2023-01-10 19:47:30 +01:00
..
boot x86/boot: Avoid using Intel mnemonics in AT&T syntax asm 2023-01-10 13:03:23 +01:00
coco x86/insn: Avoid namespace clash by separating instruction decoder MMIO type from MMIO trace type 2023-01-03 18:46:06 +01:00
configs
crypto - Add the call depth tracking mitigation for Retbleed which has 2022-12-14 15:03:00 -08:00
entry - Add the call depth tracking mitigation for Retbleed which has 2022-12-14 15:03:00 -08:00
events perf/x86/rapl: Add support for Intel Emerald Rapids 2023-01-04 21:00:29 +01:00
hyperv x86/hyperv: Remove unregister syscore call from Hyper-V cleanup 2022-11-29 17:55:29 +00:00
ia32
include x86/insn: Avoid namespace clash by separating instruction decoder MMIO type from MMIO trace type 2023-01-03 18:46:06 +01:00
kernel x86/resctrl: Fix task CLOSID/RMID update race 2023-01-10 19:47:30 +01:00
kvm Merge branch 'kvm-late-6.1-fixes' into HEAD 2022-12-28 07:19:14 -05:00
lib x86/insn: Avoid namespace clash by separating instruction decoder MMIO type from MMIO trace type 2023-01-03 18:46:06 +01:00
math-emu
mm x86/pat: Fix pat_x_mtrr_type() for MTRR disabled case 2023-01-10 17:21:53 +01:00
net - Add the call depth tracking mitigation for Retbleed which has 2022-12-14 15:03:00 -08:00
pci x86/PCI: Use pr_info() when possible 2022-12-10 10:33:18 -06:00
platform pci-v6.2-changes 2022-12-14 09:54:10 -08:00
power - Add the call depth tracking mitigation for Retbleed which has 2022-12-14 15:03:00 -08:00
purgatory
ras
realmode x86/boot: Skip realmode init code when running as Xen PV guest 2022-11-25 12:05:22 +01:00
tools
um [elf][non-regset] uninline elf_core_copy_task_fpregs() (and lose pt_regs argument) 2022-11-24 23:24:23 -05:00
video
virt/vmx/tdx
xen - Add the call depth tracking mitigation for Retbleed which has 2022-12-14 15:03:00 -08:00
.gitignore
Kbuild
Kconfig powerpc updates for 6.2 2022-12-19 07:13:33 -06:00
Kconfig.assembler
Kconfig.cpu
Kconfig.debug
Makefile Kbuild updates for v6.2 2022-12-19 12:33:32 -06:00
Makefile.um
Makefile_32.cpu