drm/amdgpu:fix world switch hang
for SR-IOV, we must keep the pipeline-sync in the protection
of COND_EXEC, otherwise the command consumed by CPG is not
consistent when world switch triggerd, e.g.:
world switch hit and the IB frame is skipped so the fence
won't signal, thus CP will jump to the next DMAframe's pipeline-sync
command, and it will make CP hang foever.
after pipelin-sync moved into COND_EXEC the consistency can be
guaranteed
Signed-off-by: Monk Liu <Monk.Liu@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ib.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_ib.c
index 1b30d2a..659997b 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ib.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ib.c
@@ -130,6 +130,7 @@ int amdgpu_ib_schedule(struct amdgpu_ring *ring, unsigned num_ibs,
unsigned i;
int r = 0;
+ bool need_pipe_sync = false;
if (num_ibs == 0)
return -EINVAL;
@@ -165,7 +166,7 @@ int amdgpu_ib_schedule(struct amdgpu_ring *ring, unsigned num_ibs,
if (ring->funcs->emit_pipeline_sync && job &&
((tmp = amdgpu_sync_get_fence(&job->sched_sync)) ||
amdgpu_vm_need_pipeline_sync(ring, job))) {
- amdgpu_ring_emit_pipeline_sync(ring);
+ need_pipe_sync = true;
dma_fence_put(tmp);
}
@@ -173,7 +174,7 @@ int amdgpu_ib_schedule(struct amdgpu_ring *ring, unsigned num_ibs,
ring->funcs->insert_start(ring);
if (job) {
- r = amdgpu_vm_flush(ring, job);
+ r = amdgpu_vm_flush(ring, job, need_pipe_sync);
if (r) {
amdgpu_ring_undo(ring);
return r;