89bd323f4b9c4c2382e5361c52f25f86db330583 - kernel/msm

commit	89bd323f4b9c4c2382e5361c52f25f86db330583
author	Jordan Crouse <jcrouse@codeaurora.org>	Mon Jul 02 17:50:15 2012 -0600
committer	Jordan Crouse <jcrouse@codeaurora.org>	Mon Jul 09 13:00:53 2012 -0600
tree	241d22535ec8065106153e3c816b21ceb06d8d54
parent	9aa5a9fbfbc7c4e4940269c34797648fb8d3ae1d

msm: kgsl: Optimize page_alloc allocations

User memory needs to be zeroed out before it is sent to the user.
To do this, the kernel maps the page, memsets it to zero and then
unmaps it.  By virtue of mapping it, this forces us to flush the
dcache to ensure cache coherency between kernel and user mappings.
Originally, the page_alloc loop used __GFP_ZERO (which does a
map, memset, and unmap for each individual page) and then also
called flush_dcache_page() for each page, which hurt performance
badly.  It is far more efficient, especially for large
allocations (> 1MB), to allocate the pages without __GFP_ZERO and
then to vmap the entire allocation, memset it to zero, flush the
cache and then unmap. This process is slightly slower for very
small allocations, but only by a few microseconds, and is well
within the margin of acceptability. In all, the new scheme is
faster than the default for all sizes greater than 16k, and is
almost 4X faster for 2MB and 4MB allocations which are common for
textures and very large buffer objects.
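
The difference between the two schemes can be sketched with a small
userspace model. This is not the driver code: the helper names
(zero_per_page, zero_bulk, compare_schemes) and the 4 KiB page size are
assumptions made for illustration, and the counters merely stand in for
the kernel's map/unmap and cache-flush operations (vmap()/vunmap() and
flush_dcache_page() in the real driver). It only counts operations; it
does not measure or reproduce the timings quoted above.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

#define MODEL_PAGE_SIZE 4096u   /* assumed page size for this model */

/* Counters standing in for the expensive kernel operations. */
struct op_counts {
    size_t maps;     /* map/unmap pairs */
    size_t flushes;  /* cache flush calls */
};

/* Old scheme: each page is mapped, zeroed, unmapped and flushed alone. */
static void zero_per_page(unsigned char **pages, size_t npages,
                          struct op_counts *c)
{
    for (size_t i = 0; i < npages; i++) {
        c->maps++;
        memset(pages[i], 0, MODEL_PAGE_SIZE);
        c->flushes++;
    }
}

/* New scheme: one mapping over the whole allocation, one memset, one flush. */
static void zero_bulk(unsigned char *region, size_t npages,
                      struct op_counts *c)
{
    c->maps++;
    memset(region, 0, npages * MODEL_PAGE_SIZE);
    c->flushes++;
}

/* Returns 1 if every byte in [p, p + len) is zero. */
static int all_zero(const unsigned char *p, size_t len)
{
    for (size_t i = 0; i < len; i++)
        if (p[i] != 0)
            return 0;
    return 1;
}

/*
 * Runs both schemes over the same npages-sized buffer and records the
 * operation counts. Returns 0 if both schemes zeroed the buffer, -1 on
 * allocation failure or if any byte was left non-zero.
 */
static int compare_schemes(size_t npages, struct op_counts *per_page,
                           struct op_counts *bulk)
{
    assert(npages > 0);
    size_t len = npages * MODEL_PAGE_SIZE;
    unsigned char *region = malloc(len);
    unsigned char **pages = malloc(npages * sizeof *pages);
    int ok = -1;

    if (region != NULL && pages != NULL) {
        for (size_t i = 0; i < npages; i++)
            pages[i] = region + i * MODEL_PAGE_SIZE;

        memset(region, 0xAA, len);          /* dirty the buffer */
        zero_per_page(pages, npages, per_page);
        int first = all_zero(region, len);

        memset(region, 0xAA, len);          /* dirty it again */
        zero_bulk(region, npages, bulk);
        ok = (first && all_zero(region, len)) ? 0 : -1;
    }
    free(pages);
    free(region);
    return ok;
}
```

For a 2MB allocation (512 pages of 4 KiB), the per-page path in this
model performs 512 map and 512 flush operations, while the bulk path
performs one of each.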

The downside is that if there isn't enough vmalloc room for the
allocation, we are forced to fall back to a slow page-by-page
memset/flush, but this should happen rarely (if at all) and
is only included for completeness.
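
The fallback described above can be sketched in the same userspace
style. Again, this is a model under stated assumptions rather than the
driver code: model_map_all() is a hypothetical stand-in for vmap()
returning NULL when the vmalloc area is too small, and space_left is an
invented parameter representing the remaining mapping space.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define MODEL_PAGE_SIZE 4096u   /* assumed page size for this model */

enum zero_path { ZERO_BULK, ZERO_PER_PAGE };

/*
 * Hypothetical stand-in for vmap(): succeeds only when the whole
 * allocation fits in the remaining mapping space.
 */
static unsigned char *model_map_all(unsigned char *backing, size_t npages,
                                    size_t space_left)
{
    if (npages * MODEL_PAGE_SIZE > space_left)
        return NULL;
    return backing;
}

/* Zeroes the allocation and reports which path was taken. */
static enum zero_path zero_allocation(unsigned char *backing, size_t npages,
                                      size_t space_left)
{
    assert(backing != NULL);

    unsigned char *va = model_map_all(backing, npages, space_left);
    if (va != NULL) {
        /* Fast path: one memset over the whole mapping
         * (a single cache flush and unmap would follow). */
        memset(va, 0, npages * MODEL_PAGE_SIZE);
        return ZERO_BULK;
    }

    /* Slow path: zero page by page
     * (each page would be mapped and flushed individually). */
    for (size_t i = 0; i < npages; i++)
        memset(backing + i * MODEL_PAGE_SIZE, 0, MODEL_PAGE_SIZE);
    return ZERO_PER_PAGE;
}
```

Either path leaves the buffer fully zeroed; only the cost differs, which
is why the slow path is acceptable as a rarely used safety net.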

CRs-Fixed: 372638
Change-Id: Ic0dedbadf3e27dcddf0f068594a40c00d64b495e
Signed-off-by: Jordan Crouse <jcrouse@codeaurora.org>

drivers/gpu/msm/kgsl_sharedmem.c

1 file changed
