drm/radeon: allocate page tables on demand v4

Based on Dmitry's work, but splitting the code into page
directory and page table handling makes it far more
readable and (hopefully) more reliable.

Page tables are allocated from the SA on demand; that
should still work fine since all page tables are of the
same size.
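
As a rough illustration of the on-demand path (a simplified sketch,
not the exact code from this patch; RADEON_VM_PTE_COUNT stands for
the number of entries per page table):

	/* allocate the page table for this directory entry only when
	 * a mapping first touches its range */
	if (vm->page_tables[pt_idx] == NULL) {
		r = radeon_sa_bo_new(rdev, &rdev->vm_manager.sa_manager,
				     &vm->page_tables[pt_idx],
				     RADEON_VM_PTE_COUNT * 8,
				     RADEON_GPU_PAGE_SIZE, false);
		if (r)
			return r;
	}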

Also use the fact that allocations from the SA are mostly
contiguous (except for end-of-buffer wraps and under very
high memory pressure) to group the updates sent to the
chipset specific code into larger chunks.
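
Roughly, the grouping works like this (again a simplified sketch, not
the exact code from this patch; 'pd_addr' is the GPU address of the
page directory and 'incr' the size of one page table in bytes):

	/* walk the PDEs touched by the update and only call into the
	 * chipset specific code once per run of contiguous page tables */
	for (pt_idx = start; pt_idx <= end; ++pt_idx) {
		uint64_t pde = pd_addr + pt_idx * 8;
		uint64_t pt = radeon_sa_bo_gpu_addr(vm->page_tables[pt_idx]);

		if (count && ((last_pde + 8 * count) != pde ||
			      (last_pt + incr * count) != pt)) {
			/* no longer contiguous, flush the current chunk */
			radeon_asic_vm_set_page(rdev, last_pde, last_pt,
						count, incr,
						RADEON_VM_PAGE_VALID);
			count = 0;
		}
		if (count == 0) {
			last_pde = pde;
			last_pt = pt;
		}
		++count;
	}
	if (count)
		radeon_asic_vm_set_page(rdev, last_pde, last_pt, count,
					incr, RADEON_VM_PAGE_VALID);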

v3: mostly a rewrite of Dmitry's previous patch.
v4: fix some typos and coding style

Signed-off-by: Dmitry Cherkasov <Dmitrii.Cherkasov@amd.com>
Signed-off-by: Christian König <deathsimple@vodafone.de>
Tested-by: Michel Dänzer <michel.daenzer@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
diff --git a/drivers/gpu/drm/radeon/radeon.h b/drivers/gpu/drm/radeon/radeon.h
index b04c064..bc6b56b 100644
--- a/drivers/gpu/drm/radeon/radeon.h
+++ b/drivers/gpu/drm/radeon/radeon.h
@@ -663,9 +663,14 @@
 	struct list_head		list;
 	struct list_head		va;
 	unsigned			id;
-	unsigned			last_pfn;
-	u64				pd_gpu_addr;
-	struct radeon_sa_bo		*sa_bo;
+
+	/* contains the page directory */
+	struct radeon_sa_bo		*page_directory;
+	uint64_t			pd_gpu_addr;
+
+	/* array of page tables, one for each page directory entry */
+	struct radeon_sa_bo		**page_tables;
+
 	struct mutex			mutex;
 	/* last fence for cs using this vm */
 	struct radeon_fence		*fence;