winsys/amdgpu: add a new winsys for the new kernel driver

v2: - lots of changes according to Emil Velikov's comments
    - implemented radeon_winsys::read_registers

v3: - a lot of new work, many of them adapt to libdrm interface changes
Squashed patches:
winsys/amdgpu: implement radeon_winsys context support
winsys/amdgpu: add reference counting for contexts
winsys/amdgpu: add userptr support
winsys/amdgpu: allocate IBs like normal buffers
winsys/amdgpu: add IBs to the buffer list, adapt to interface changes
winsys/amdgpu: don't use KMS handles as reloc hash keys
winsys/amdgpu: sync buffer accesses to different rings
winsys/amdgpu: use dependencies instead of waiting for last fence v2
gallium/radeon: unify buffer_wait and buffer_is_busy in the winsys interface (amdgpu part)
winsys/amdgpu: track fences per ring and be thread-safe
winsys/amdgpu: simplify waiting on a variable in amdgpu_fence_wait
gallium/radeon: allow the winsys to choose the IB size (amdgpu part)
winsys/amdgpu: switch to new amdgpu_cs_query_fence_status interface
winsys/amdgpu: handle fence and dependencies merge
winsys/amdgpu follow libdrm change to move user fence into UMD
winsys/amdgpu: use amdgpu_bo_va_op for va map/unmap v2
winsys/amdgpu: use the new tiling flags
winsys/amdgpu: switch to new GTT_USWC definition
winsys/amdgpu: expose amdgpu_cs_query_reset_state to drivers
winsys/amdgpu: fix valgrind warnings
winsys/amdgpu: don't use VRAM with APUs that don't have much of it
winsys/amdgpu: require LLVM 3.6.1 for VI because of bug fixes there
winsys/amdgpu: remove amdgpu_winsys::num_cpus
winsys/amdgpu: align BO size to page size
winsys/amdgpu: reduce BO cache timeout
winsys/amdgpu: remove useless flushing and waiting in amdgpu_bo_set_tiling
winsys/amdgpu: use amdgpu_device_handle as a unique device ID instead of fd
winsys/amdgpu: use safer access to amdgpu_fence_wait::signalled
winsys/amdgpu: allow maximum IB size of 4 MB
winsys/amdgpu: add ip_instance into amdgpu_fence
gallium/radeon: add RING_COMPUTE instead of RADEON_FLUSH_COMPUTE
winsys/amdgpu: set the ring type at CS initilization
winsys/amdgpu: query the GART page size from the kernel
winsys/amdgpu: correctly wait for shared buffers to become idle
winsys/amdgpu: set the amdgpu_cs_fence structure only once at fence creation
winsys/amdgpu: add a specific error message for cs_submit -> -ENOMEM
winsys/amdgpu: check num_active_ioctls before calling amdgpu_bo_wait_for_idle
winsys/amdgpu: clear user fence BO after allocating it
winsys/amdgpu: fix user fences
winsys/amdgpu: make amdgpu_winsys_create public
winsys/amdgpu: remove thread offloading
winsys/amdgpu: flatten the amdgpu_cs_context structure and simplify more

v4: require libdrm 2.4.63
diff --git a/configure.ac b/configure.ac
index 7ddb47a..4b76246 100644
--- a/configure.ac
+++ b/configure.ac
@@ -70,6 +70,7 @@
 dnl Versions for external dependencies
 LIBDRM_REQUIRED=2.4.60
 LIBDRM_RADEON_REQUIRED=2.4.56
+LIBDRM_AMDGPU_REQUIRED=2.4.63
 LIBDRM_INTEL_REQUIRED=2.4.61
 LIBDRM_NVVIEUX_REQUIRED=2.4.33
 LIBDRM_NOUVEAU_REQUIRED=2.4.62
@@ -2105,6 +2106,7 @@
         xradeonsi)
             HAVE_GALLIUM_RADEONSI=yes
             PKG_CHECK_MODULES([RADEON], [libdrm_radeon >= $LIBDRM_RADEON_REQUIRED])
+            PKG_CHECK_MODULES([AMDGPU], [libdrm_amdgpu >= $LIBDRM_AMDGPU_REQUIRED])
             gallium_require_drm "radeonsi"
             gallium_require_drm_loader
             radeon_llvm_check "radeonsi"
@@ -2357,6 +2359,7 @@
 		src/gallium/winsys/intel/drm/Makefile
 		src/gallium/winsys/nouveau/drm/Makefile
 		src/gallium/winsys/radeon/drm/Makefile
+		src/gallium/winsys/amdgpu/drm/Makefile
 		src/gallium/winsys/svga/drm/Makefile
 		src/gallium/winsys/sw/dri/Makefile
 		src/gallium/winsys/sw/kms-dri/Makefile