Among the scheduler's statuses, the only one that indicates an error is
DRM_GPU_SCHED_STAT_ENODEV. Any status other than DRM_GPU_SCHED_STAT_ENODEV
signifies that the operation succeeded and the GPU is in a nominal state.
However, to provide more information about the GPU's status, it is needed
to convey more information than just "OK".
Therefore, rename DRM_GPU_SCHED_STAT_NOMINAL to
DRM_GPU_SCHED_STAT_RESET, which better communicates the meaning of this
status. The status DRM_GPU_SCHED_STAT_RESET indicates that the GPU has
hung, but it has been successfully reset and is now in a nominal state
again.
Reviewed-by: Philipp Stanner <phasta@kernel.org>
Link: https://lore.kernel.org/r/20250714-sched-skip-reset-v6-1-5c5ba4f55039@igalia.com
Signed-off-by: Maíra Canal <mcanal@igalia.com>
The device bo is allocated from the device heap memory. (a trunk of
memory dedicated to device)
Rename amdxdna_gem_insert_node_locked to amdxdna_gem_heap_alloc
and move related sanity checks into it.
Add amdxdna_gem_dev_obj_free and move device bo free code into it.
Calculate the kernel virtual address of device bo by the device
heap memory address and offset.
Reviewed-by: Jacek Lawrynowicz <jacek.lawrynowicz@linux.intel.com>
Signed-off-by: Lizhi Hou <lizhi.hou@amd.com>
Link: https://lore.kernel.org/r/20250616091418.2605476-1-lizhi.hou@amd.com
This will be used in a later commit to trace the drm client_id in
some of the gpu_scheduler trace events.
This requires changing all the users of drm_sched_job_init to
add an extra parameter.
The newly added drm_client_id field in the drm_sched_fence is a bit
of a duplicate of the owner one. One suggestion I received was to
merge those 2 fields - this can't be done right now as amdgpu uses
some special values (AMDGPU_FENCE_OWNER_*) that can't really be
translated into a client id. Christian is working on getting rid of
those; when it's done we should be able to squash owner/drm_client_id
together.
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Signed-off-by: Philipp Stanner <phasta@kernel.org>
Link: https://lore.kernel.org/r/20250526125505.2360-3-pierre-eric.pelloux-prayer@amd.com
Add amdxdna_gem_prime_export() and amdxdna_gem_prime_import() for BO
import and export. Register mmu notifier for imported BO as well. When
MMU_NOTIFIER_UNMAP event is received, queue work to remove the notifier.
The same BO could be mapped multiple times if it is exported and imported
by an application. Use a link list to track VMAs the BO been mapped.
v2: Rebased and call get_dma_buf() before dma_buf_attach()
v3: Removed import_attach usage
Reviewed-by: Jeff Hugo <jeff.hugo@oss.qualcomm.com>
Signed-off-by: Lizhi Hou <lizhi.hou@amd.com>
Link: https://lore.kernel.org/r/20250325200105.2744079-1-lizhi.hou@amd.com
It is required by firmware to wait up to 2 seconds for pending commands
before sending the destroy hardware context command. After 2 seconds
wait, if there are still pending commands, driver needs to cancel them.
So the context destroy steps need to be:
1. Stop drm scheduler. (drm_sched_entity_destroy)
2. Wait up to 2 seconds for pending commands.
3. Destroy hardware context and cancel the rest pending requests.
4. Wait all jobs associated with the hwctx are freed.
5. Free job resources.
Signed-off-by: Lizhi Hou <lizhi.hou@amd.com>
Reviewed-by: Jeffrey Hugo <quic_jhugo@quicinc.com>
Signed-off-by: Jeffrey Hugo <quic_jhugo@quicinc.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20250124173536.148676-1-lizhi.hou@amd.com
drm_sched_init() has a great many parameters and upcoming new
functionality for the scheduler might add even more. Generally, the
great number of parameters reduces readability and has already caused
one missnaming, addressed in:
commit 6f1cacf4eb ("drm/nouveau: Improve variable name in
nouveau_sched_init()").
Introduce a new struct for the scheduler init parameters and port all
users.
Reviewed-by: Liviu Dudau <liviu.dudau@arm.com>
Acked-by: Matthew Brost <matthew.brost@intel.com> # for Xe
Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com> # for Panfrost and Panthor
Reviewed-by: Christian Gmeiner <cgmeiner@igalia.com> # for Etnaviv
Reviewed-by: Frank Binns <frank.binns@imgtec.com> # for Imagination
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@igalia.com> # for Sched
Reviewed-by: Maíra Canal <mcanal@igalia.com> # for v3d
Reviewed-by: Danilo Krummrich <dakr@kernel.org>
Reviewed-by: Lizhi Hou <lizhi.hou@amd.com> # for amdxdna
Signed-off-by: Philipp Stanner <phasta@kernel.org>
Link: https://patchwork.freedesktop.org/patch/msgid/20250211111422.21235-2-phasta@kernel.org
Fix sparse warning:
symbol 'sched_ops' was not declared. Should it be static?
Fixes: aac243092b ("accel/amdxdna: Add command execution")
Signed-off-by: Lizhi Hou <lizhi.hou@amd.com>
Reviewed-by: Mario Limonciello <mario.limonciello@amd.com>
Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20250113182617.1256094-2-lizhi.hou@amd.com
For input ioctl structures, it is better to check if the pad is zero.
Thus, the pad bytes might be usable in the future.
Suggested-by: Jeffrey Hugo <quic_jhugo@quicinc.com>
Signed-off-by: Lizhi Hou <lizhi.hou@amd.com>
Reviewed-by: Mario Limonciello <mario.limonciello@amd.com>
Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20241217165446.2607585-1-lizhi.hou@amd.com
Add SET_STATE ioctl to configure device power mode for aie2 device.
Three modes are supported initially.
POWER_MODE_DEFAULT: Enable clock gating and set DPM (Dynamic Power
Management) level to value which has been set by resource solver or
maximum DPM level the device supports.
POWER_MODE_HIGH: Enable clock gating and set DPM level to maximum DPM
level the device supports.
POWER_MODE_TURBO: Disable clock gating and set DPM level to maximum DPM
level the device supports.
Disabling clock gating means all clocks always run on full speed. And
the different clock frequency are used based on DPM level been set.
Initially, the driver set the power mode to default mode.
Co-developed-by: Narendra Gutta <VenkataNarendraKumar.Gutta@amd.com>
Signed-off-by: Narendra Gutta <VenkataNarendraKumar.Gutta@amd.com>
Co-developed-by: George Yang <George.Yang@amd.com>
Signed-off-by: George Yang <George.Yang@amd.com>
Reviewed-by: Mario Limonciello <mario.limonciello@amd.com>
Signed-off-by: Lizhi Hou <lizhi.hou@amd.com>
Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20241213232933.1545388-4-lizhi.hou@amd.com
Switch mailbox message id and hardware context id management over from
the idr api to the xarray api.
Signed-off-by: Lizhi Hou <lizhi.hou@amd.com>
Reviewed-by: Mario Limonciello <mario.limonciello@amd.com>
Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20241213232933.1545388-3-lizhi.hou@amd.com
Hardware mailbox message receiving handler calls mmput to release the
process mm. If the process has already exited, the mmput here may call mmu
notifier handler, amdxdna_hmm_invalidate, which will cause a dead lock.
Using mmput_async instead prevents this dead lock.
Fixes: aac243092b ("accel/amdxdna: Add command execution")
Signed-off-by: Lizhi Hou <lizhi.hou@amd.com>
Reviewed-by: Jeffrey Hugo <quic_jhugo@quicinc.com>
Signed-off-by: Jeffrey Hugo <quic_jhugo@quicinc.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20241206220001.164049-3-lizhi.hou@amd.com
Add interfaces for user application to submit command and wait for its
completion.
Co-developed-by: Min Ma <min.ma@amd.com>
Signed-off-by: Min Ma <min.ma@amd.com>
Reviewed-by: Jeffrey Hugo <quic_jhugo@quicinc.com>
Signed-off-by: Lizhi Hou <lizhi.hou@amd.com>
Signed-off-by: Jeffrey Hugo <quic_jhugo@quicinc.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20241118172942.2014541-8-lizhi.hou@amd.com
There different types of BOs are supported:
- shmem
A user application uses shmem BOs as input/output for its workload running
on NPU.
- device memory heap
The fixed size buffer dedicated to the device.
- device buffer
The buffer object allocated from device memory heap.
- command buffer
The buffer object created for delivering commands. The command buffer
object is small and pinned on creation.
New IOCTLs are added: CREATE_BO, GET_BO_INFO, SYNC_BO. SYNC_BO is used
to explicitly flush CPU cache for BO memory.
Co-developed-by: Min Ma <min.ma@amd.com>
Signed-off-by: Min Ma <min.ma@amd.com>
Reviewed-by: Jeffrey Hugo <quic_jhugo@quicinc.com>
Signed-off-by: Lizhi Hou <lizhi.hou@amd.com>
Signed-off-by: Jeffrey Hugo <quic_jhugo@quicinc.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20241118172942.2014541-7-lizhi.hou@amd.com
The hardware can be shared among multiple user applications. The
hardware resources are allocated/freed based on the request from
user application via driver IOCTLs.
DRM_IOCTL_AMDXDNA_CREATE_HWCTX
Allocate tile columns and create a hardware context structure to track the
usage and status of the resources. A hardware context ID is returned for
XDNA command execution.
DRM_IOCTL_AMDXDNA_DESTROY_HWCTX
Release hardware context based on its ID. The tile columns belong to
this hardware context will be reclaimed.
DRM_IOCTL_AMDXDNA_CONFIG_HWCTX
Config hardware context. Bind the hardware context to the required
resources.
Co-developed-by: Min Ma <min.ma@amd.com>
Signed-off-by: Min Ma <min.ma@amd.com>
Reviewed-by: Jeffrey Hugo <quic_jhugo@quicinc.com>
Signed-off-by: Lizhi Hou <lizhi.hou@amd.com>
Signed-off-by: Jeffrey Hugo <quic_jhugo@quicinc.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20241118172942.2014541-6-lizhi.hou@amd.com