linux

mirror of git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git synced 2025-08-05 16:54:27 +00:00

Author	SHA1	Message	Date
Dennis Li	a891d239f9	drm/amdgpu: set error query ready after all IPs late init If set error query ready in amdgpu_ras_late_init, which will cause some IP blocks aren't initialized, but their error query is ready. Signed-off-by: Dennis Li <Dennis.Li@amd.com> Reviewed-by: Guchun Chen <guchun.chen@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-04-22 18:11:49 -04:00
Evan Quan	7dd8c205ea	drm/amdgpu: code cleanup around gpu reset Make code more readable. Signed-off-by: Evan Quan <evan.quan@amd.com> Reviewed-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-04-22 18:11:49 -04:00
Evan Quan	9e94d22c00	drm/amdgpu: optimize the gpu reset for XGMI setup V2 This is basically just some code cosmetic. The current design for XGMI setup gput reset is to operate on current device(adev) first and then on other devices from the hive(by another 'for' loop). But actually we can do some sort to the device list(to put current device 1st position) and handle all the devices in a single 'for' loop. V2: added missing hive->hive_lock protection Signed-off-by: Evan Quan <evan.quan@amd.com> Reviewed-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-04-22 18:11:49 -04:00
Evan Quan	52fb44cf30	drm/amdgpu: correct cancel_delayed_work_sync on gpu reset As for XGMI setup, it should be performed on other devices from the hive also. Signed-off-by: Evan Quan <evan.quan@amd.com> Reviewed-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-04-22 18:11:49 -04:00
Evan Quan	a2f63ee8b5	drm/amdgpu: correct fbdev suspend on gpu reset As for XGMI setup, it needs to be performed on all the devices from the same hive. Signed-off-by: Evan Quan <evan.quan@amd.com> Acked-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-04-22 18:11:49 -04:00
Bernard Zhao	10f39758b8	drm/amdgpu: cleanup coding style in amdkfd a bit Make the code a bit more readable by using a common error handling pattern. Signed-off-by: Bernard Zhao <bernard@vivo.com> Reviewed-by: Christian König <christian.koenig@amd.com>. Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-04-22 18:11:49 -04:00
Kevin Wang	e05185b341	drm/amdgpu: clean up unused variable about ring lru clean up unused variable: 1. ring_lru_list 2. ring_lru_list_lock related-commit: drm/amdgpu: remove ring lru handling Signed-off-by: Kevin Wang <kevin1.wang@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-04-22 18:11:49 -04:00
Dennis Li	4cc1178e16	drm/amdgpu: replace DRM prefix with PCI device info for gfx/mmhub Prefix RAS message printing in gfx/mmhub with PCI device info, which assists the debug in multiple GPU case. Reviewed-by: Guchun Chen <guchun.chen@amd.com> Signed-off-by: Dennis Li <Dennis.Li@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-04-22 18:11:49 -04:00
Jiawei	7aba19182e	drm/amdgpu: disble vblank when unloading sriov driver disble vblank in dce_vitual_crtc_commit(), which is skipped under sriov before Reviewed-by: Emily Deng <Emily.Deng@amd.com> Signed-off-by: Jiawei <Jiawei.Gu@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-04-22 18:11:49 -04:00
Yong Zhao	d69b8971e5	drm/amdgpu: Print CU information by default during initialization This is convenient for multiple teams to obtain the information. Also, add device info by using dev_info(). Signed-off-by: Yong Zhao <Yong.Zhao@amd.com> Reviewed-by: Dennis Li <Dennis.Li@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-04-22 18:11:49 -04:00
Yong Zhao	e1046a1f70	drm/amdgpu: Adjust the SDMA doorbell info printing Turn off the printing by default because it is not very useful, while adding more details. Signed-off-by: Yong Zhao <Yong.Zhao@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-04-22 18:11:49 -04:00
Jonathan Kim	d84a430d9f	drm/amdgpu: fix race between pstate and remote buffer map Vega20 arbitrates pstate at hive level and not device level. Last peer to remote buffer unmap could drop P-State while another process is still remote buffer mapped. With this fix, P-States still needs to be disabled for now as SMU bug was discovered on synchronous P2P transfers. This should be fixed in the next FW update. Signed-off-by: Jonathan Kim <Jonathan.Kim@amd.com> Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-04-22 18:11:46 -04:00
James Zhu	4f610503f0	Revert "drm/amdgpu: Disable gfx off if VCN is busy" This reverts commit `3fded222f4` This is work around for vcn1 only. Currently vcn1 has separate begin_use and idle work handle. Signed-off-by: James Zhu <James.Zhu@amd.com> Tested-by: changzhu <Changfeng.Zhu@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-04-22 18:11:46 -04:00
Guchun Chen	12c17b9d62	drm/amdgpu: fix kernel page fault issue by ras recovery on sGPU When running ras uncorrectable error injection and triggering GPU reset on sGPU, below issue is observed. It's caused by the list uninitialized when accessing. [ 80.047227] BUG: unable to handle page fault for address: ffffffffc0f4f750 [ 80.047300] #PF: supervisor write access in kernel mode [ 80.047351] #PF: error_code(0x0003) - permissions violation [ 80.047404] PGD 12c20e067 P4D 12c20e067 PUD 12c210067 PMD 41c4ee067 PTE 404316061 [ 80.047477] Oops: 0003 [#1] SMP PTI [ 80.047516] CPU: 7 PID: 377 Comm: kworker/7:2 Tainted: G OE 5.4.0-rc7-guchchen #1 [ 80.047594] Hardware name: System manufacturer System Product Name/TUF Z370-PLUS GAMING II, BIOS 0411 09/21/2018 [ 80.047888] Workqueue: events amdgpu_ras_do_recovery [amdgpu] Signed-off-by: Guchun Chen <guchun.chen@amd.com> Reviewed-by: John Clements <John.Clements@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-04-22 18:11:46 -04:00
Kent Russell	69d0c18dda	drm/amdgpu: Disable FRU read on Arcturus Update the list with supported Arcturus chips, but disable for now until final list is confirmed. Ideally we can poll atombios for FRU support, instead of maintaining this list of chips, but this will enable serial number reading for supported ASICs for the time-being. Signed-off-by: Kent Russell <kent.russell@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-04-22 18:11:46 -04:00
Rajneesh Bhardwaj	53c9c89ac1	drm/amdgpu/gmc: Fix spelling mistake. Fixes a minor typo in the file. Reviewed-by: Christian König <christian.koenig@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Rajneesh Bhardwaj <rajneesh.bhardwaj@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-04-22 18:11:46 -04:00
Kent Russell	fdd21e62b0	Revert "drm/amdgpu: use the BAR if possible in amdgpu_device_vram_access v2" This reverts commit `c12b84d6e0`. The original patch causes a RAS event and subsequent kernel hard-hang when running the KFDMemoryTest.PtraceAccessInvisibleVram on VG20 and Arcturus dmesg output at hang time: [drm] RAS event of type ERREVENT_ATHUB_INTERRUPT detected! amdgpu 0000:67:00.0: GPU reset begin! Evicting PASID 0x8000 queues Started evicting pasid 0x8000 qcm fence wait loop timeout expired The cp might be in an unrecoverable state due to an unsuccessful queues preemption Failed to evict process queues Failed to suspend process 0x8000 Finished evicting pasid 0x8000 Started restoring pasid 0x8000 Finished restoring pasid 0x8000 [drm] UVD VCPU state may lost due to RAS ERREVENT_ATHUB_INTERRUPT amdgpu: [powerplay] Failed to send message 0x26, response 0x0 amdgpu: [powerplay] Failed to set soft min gfxclk ! amdgpu: [powerplay] Failed to upload DPM Bootup Levels! amdgpu: [powerplay] Failed to send message 0x7, response 0x0 amdgpu: [powerplay] [DisableAllSMUFeatures] Failed to disable all smu features! amdgpu: [powerplay] [DisableDpmTasks] Failed to disable all smu features! amdgpu: [powerplay] [PowerOffAsic] Failed to disable DPM! [drm:amdgpu_device_ip_suspend_phase2 [amdgpu]] ERROR suspend of IP block <powerplay> failed -5 Signed-off-by: Kent Russell <kent.russell@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-04-22 18:11:46 -04:00
Alex Deucher	079c72ad3a	drm/amdgpu/gfx9: add gfxoff quirk Fix screen corruption with firefox. Bug: https://bugzilla.kernel.org/show_bug.cgi?id=207171 Reviewed-by: Huang Rui <ray.huang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-04-22 18:11:46 -04:00
John Clements	7f70443fd8	drm/amdgpu: set mp1 state before reload Set MP1 state to prepare for unload before reloading SMU FW Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: John Clements <john.clements@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-04-22 18:11:46 -04:00
John Clements	40e611bdd1	drm/amdgpu: update psp fw loading sequence Added dedicated function to check if particular fw should be skipped from loading. Added dedicated function for SMU FW loading via PSP Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: John Clements <john.clements@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-04-22 18:11:45 -04:00
Prike Liang	ced1ba9761	drm/amdgpu: fix the hw hang during perform system reboot and reset The system reboot failed as some IP blocks enter power gate before perform hw resource destory. Meanwhile use unify interface to set device CGPG to ungate state can simplify the amdgpu poweroff or reset ungate guard. Fixes: `487eca11a3` ("drm/amdgpu: fix gfx hang during suspend with video playback (v2)") Signed-off-by: Prike Liang <Prike.Liang@amd.com> Tested-by: Mengbing Wang <Mengbing.Wang@amd.com> Tested-by: Paul Menzel <pmenzel@molgen.mpg.de> Acked-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-04-22 18:11:45 -04:00
Jason Yan	8e2f842063	drm/amdgpu: remove dead code in si_dpm.c This code is dead, let's remove it. Reported-by: Hulk Robot <hulkci@huawei.com> Signed-off-by: Jason Yan <yanaijie@huawei.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-04-13 12:02:42 -04:00
Aurabindo Pillai	dd4fa6c1b8	drm/amd/amdgpu: remove hardcoded module name in prints Let format prefixes take care of printing the module name through pr_fmt and dev_fmt definitions. Signed-off-by: Aurabindo Pillai <mail@aurabindo.in> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-04-13 12:02:40 -04:00
Aurabindo Pillai	539489fc91	drm/amd/amdgpu: add print prefix for dev_* variants Define dev_fmt macro for informative print messages Signed-off-by: Aurabindo Pillai <mail@aurabindo.in> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-04-13 12:02:37 -04:00
Aurabindo Pillai	d57229b1da	drm/amd/amdgpu: add prefix for pr_* prints amdgpu uses lots of pr_* calls for printing error messages. With this prefix, errors shall be more obvious to the end use regarding its origin, and may help debugging. Prefix format: [xxx.xxxxx] amdgpu: ... Signed-off-by: Aurabindo Pillai <mail@aurabindo.in> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-04-13 12:02:35 -04:00
Alex Deucher	a4c2468027	drm/amdgpu/ring: simplify scheduler setup logic Set up a GPU scheduler based on the ring flag rather than the ring type. Reviewed-by: Christian König <christian.koenig@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-04-13 12:02:26 -04:00
Alex Deucher	a783910d5c	drm/amdgpu/kiq: add no_scheduler flag to KIQ We don't want a GPU scheduler for this ring. Reviewed-by: Christian König <christian.koenig@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-04-13 12:02:24 -04:00
Alex Deucher	cb3d108501	drm/amdgpu/ring: add no_scheduler flag This allows IPs to flag whether a specific ring requires a GPU scheduler or not. E.g., sometimes instances of an IP are asymmetric and have different capabilities. Reviewed-by: Christian König <christian.koenig@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-04-13 12:02:19 -04:00
Evan Quan	dadce777e0	drm/amdgpu: fix wrong vram lost counter increment V2 Vram lost counter is wrongly increased by two during baco reset. V2: assumed vram lost for mode1 reset on all ASICs Signed-off-by: Evan Quan <evan.quan@amd.com> Acked-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-04-13 12:02:08 -04:00
Guchun Chen	ed72aa21c7	drm/amdgpu: replace DRM prefix with PCI device info for GFX RAS Prefix RAS message printing in GFX IP with PCI device info, which assists the debug in multiple GPU case. Signed-off-by: Guchun Chen <guchun.chen@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-04-13 12:02:03 -04:00
Yintian Tao	d32709dac6	drm/amdgpu: resume kiq access debugfs If there is no GPU hang, user still can access debugfs through kiq. Signed-off-by: Yintian Tao <yttao@amd.com> Reviewed-by: Monk Liu <Monk.liu@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-04-13 12:01:56 -04:00
Guchun Chen	6952e99cfd	drm/amdgpu: refine ras related message print Prefix ras related kernel message logging with PCI device info by replacing DRM_INFO/WARN/ERROR with dev_info/warn/err. This can clearly tell user about GPU device information where ras is. And add some other ras message printing to make it more clear and friendly as well. Suggested-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Guchun Chen <guchun.chen@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-04-13 12:01:50 -04:00
Guchun Chen	1f3ef0efba	drm/amdgpu: add uncorrectable error count print in UMC ecc irq cb Uncorrectable error count printing is missed when issuing UMC UE injection. When going to the error count log function in GPU recover work thread, there is no chance to get correct error count value by last error injection and print, because the error status register is automatically cleared after reading in UMC ecc irq callback. So add such message printing in UMC ecc irq cb to be consistent with other RAS error interrupt cases. Signed-off-by: Guchun Chen <guchun.chen@amd.com> Reviewed-by: Tao Zhou <tao.zhou1@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-04-13 12:01:44 -04:00
Yintian Tao	95a2f91738	drm/amdgpu: restrict debugfs register access under SR-IOV Under bare metal, there is no more else to take care of the GPU register access through MMIO. Under Virtualization, to access GPU register is implemented through KIQ during run-time due to world-switch. Therefore, under SR-IOV user can only access debugfs to r/w GPU registers when meets all three conditions below. - amdgpu_gpu_recovery=0 - TDR happened - in_gpu_reset=0 v2: merge amdgpu_virt_can_access_debugfs() into amdgpu_virt_enable_access_debugfs() v3: drop ret variable in amdgpu_virt_enable_access_debugfs() and directly return result Signed-off-by: Yintian Tao <yttao@amd.com> Acked-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-04-13 12:01:04 -04:00
John Clements	9a785c7ad1	drm/amdgpu: increased atom cmd timeout added macro to define timeout Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: John Clements <john.clements@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-04-09 10:43:33 -04:00
Aurabindo Pillai	ad36d71b3f	amdgpu_kms: Remove unnecessary condition check Execution will only reach here if the asserted condition is true. Hence there is no need for the additional check. Signed-off-by: Aurabindo Pillai <mail@aurabindo.in> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-04-09 10:43:18 -04:00
Aaron Liu	ba714a56fc	drm/amdgpu: unify fw_write_wait for new gfx9 asics Make the fw_write_wait default case true since presumably all new gfx9 asics will have updated firmware. That is using unique WAIT_REG_MEM packet with opration=1. Signed-off-by: Aaron Liu <aaron.liu@amd.com> Tested-by: Aaron Liu <aaron.liu@amd.com> Tested-by: Yuxian Dai <Yuxian.Dai@amd.com> Acked-by: Alex Deucher <alexander.deucher@amd.com> Acked-by: Huang Rui <ray.huang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-04-09 10:43:18 -04:00
Hawking Zhang	2eee0229f6	drm/amdgpu: support access regs outside of mmio bar add indirect access support to registers outside of mmio bar. Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-04-09 10:43:18 -04:00
Hawking Zhang	f384ff95f6	drm/amdgpu: retire AMDGPU_REGS_KIQ flag all the register access through kiq is redirected to amdgpu_kiq_rreg/amdgpu_kiq_wreg Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-04-09 10:43:18 -04:00
Hawking Zhang	ec59847e74	drm/amdgpu: retire RREG32_IDX/WREG32_IDX those are not needed anymore Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-04-09 10:43:18 -04:00
Hawking Zhang	3c888c1635	drm/amdgpu: retire indirect mmio reg support from cgs not needed anymore Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-04-09 10:43:18 -04:00
Hawking Zhang	46e840ed10	drm/amdgpu: replace indirect mmio access in non-dc code path all the mmCUR_CONTROL instances are in mmr range and can be accessd directly by using RREG32/WREG32 Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com> Acked-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-04-09 10:43:18 -04:00
Hawking Zhang	dec0520aff	drm/amdgpu: remove inproper workaround for vega10 the workaround is not needed for soc15 ASICs except for vega10. it is even not needed with latest vega10 vbios. Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com> Acked-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-04-09 10:43:18 -04:00
Prike Liang	a23ca7f76d	drm/amdgpu: fix gfx hang during suspend with video playback (v2) The system will be hang up during S3 suspend because of SMU is pending for GC not respose the register CP_HQD_ACTIVE access request.This issue root cause of accessing the GC register under enter GFX CGGPG and can be fixed by disable GFX CGPG before perform suspend. v2: Use disable the GFX CGPG instead of RLC safe mode guard. Signed-off-by: Prike Liang <Prike.Liang@amd.com> Tested-by: Mengbing Wang <Mengbing.Wang@amd.com> Reviewed-by: Huang Rui <ray.huang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-04-09 10:43:17 -04:00
Kent Russell	1ea2b260eb	drm/amdgpu: Re-enable FRU check for most models v5 There is at least 1 VG20 DID that does not have an FRU, and trying to read that will cause a hang. For now, explicitly support reading the FRU for Arcturus and for the WKS VG20 DIDs, and skip for everything else. This re-enables serial number reporting for server cards v2: Add ASIC check v3: Don't default to true for pre-VG20 v4: Use DID instead of parsing the VBIOS v5: Sqaush in overflow warning fix Signed-off-by: Kent Russell <kent.russell@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-04-09 10:43:17 -04:00
Jack Zhang	b639c22c98	drm/amdgpu/sriov add amdgpu_amdkfd_pre_reset in gpu reset [PATCH 2/2] kfd_pre_reset will free mem_objs allocated by kfd_gtt_sa_allocate Without this change, sriov tdr code path will never free those allocated memories and get memory leak. Signed-off-by: Jack Zhang <Jack.Zhang1@amd.com> Reviewed-by: Monk Liu <monk.liu@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-04-09 10:43:15 -04:00
Jack Zhang	fe9824d15e	drm/amdkfd Avoid destroy hqd when GPU is on reset This reverts commit 5161bba4311f in order to split it into two different patches, and this will make it easier to understand. [PATCH 1/2] porting to gfx10 from commit `1b0bfcff46` ("drm/amdgpu: Avoid destroy hqd when GPU is on reset") Originally, MEC is touched without GPU initialized first. Signed-off-by: Jack Zhang <Jack.Zhang1@amd.com> Reviewed-by: Monk Liu <monk.liu@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-04-09 10:43:15 -04:00
John Clements	4a06686b94	drm/amdgpu: update RAS related dmesg print prefix RAS error related dmesg print with pci device info Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: John Clements <john.clements@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-04-09 10:43:15 -04:00
John Clements	b3dbd6d3ec	drm/amdgpu: resolve mGPU RAS query instability upon receiving uncorrectable error, query every GPU node for ras errors Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: John Clements <john.clements@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-04-09 10:43:15 -04:00
Chengming Gui	c419bdf5b8	drm/amd/amdgpu: Correct gfx10's CG sequence Incorrect CG sequence will cause gfx timedout, if we keep switching power profile mode (enter profile mod such as PEAK will disable CG, exit profile mode EXIT will enable CG) when run Vulkan test case(case used for test: vkexample). Signed-off-by: Chengming Gui <Jack.Gui@amd.com> Reviewed-by: Kenneth Feng <kenneth.feng@amd.com> Acked-by: Evan Quan <evan.quan@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-04-09 10:43:15 -04:00

1 2 3 4 5 ...

7194 commits