Implement a GPU path for matrix convolution. Note that when not convolving
alpha, premultiplication is done less efficiently than in the raster path: it
is done on each texture access, rather than once as a pre-processing pass.
This let me implement the filter as a single custom stage; I will try the
optimization separately.
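To illustrate the tradeoff described above, here is a minimal CPU-side sketch in C++, not the actual Skia code. It assumes a simplified 1-D image and a hypothetical `Pixel` type with one color channel; `premul`, `convolvePerAccess`, and `convolvePrePass` are names invented for this example. Both variants should produce the same output, but the per-access variant repeats the alpha multiply at every kernel tap.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical pixel: one unpremultiplied color channel plus alpha, both in [0, 1].
struct Pixel {
    float c;
    float a;
};

// Counts alpha multiplies so the two variants can be compared.
static std::size_t gConversions = 0;

// Illustrative per-pixel premultiply (assumption: color * alpha).
static float premul(const Pixel& p) {
    ++gConversions;
    return p.c * p.a;
}

// Per-access variant: premultiply at every kernel tap, as a single-stage
// filter would do on each texture read.
std::vector<float> convolvePerAccess(const std::vector<Pixel>& src,
                                     const std::vector<float>& kernel) {
    assert(kernel.size() % 2 == 1);  // odd-sized kernel, centered on the pixel
    const int n = static_cast<int>(src.size());
    const int r = static_cast<int>(kernel.size()) / 2;
    std::vector<float> dst(src.size(), 0.0f);
    for (int x = 0; x < n; ++x) {
        for (int k = -r; k <= r; ++k) {
            const int i = x + k;
            if (i < 0 || i >= n) continue;  // out-of-bounds taps contribute zero
            dst[x] += kernel[k + r] * premul(src[i]);
        }
    }
    return dst;
}

// Pre-pass variant: premultiply each pixel once, then convolve the result.
std::vector<float> convolvePrePass(const std::vector<Pixel>& src,
                                   const std::vector<float>& kernel) {
    assert(kernel.size() % 2 == 1);
    std::vector<float> tmp;
    tmp.reserve(src.size());
    for (const Pixel& p : src) tmp.push_back(premul(p));

    const int n = static_cast<int>(src.size());
    const int r = static_cast<int>(kernel.size()) / 2;
    std::vector<float> dst(src.size(), 0.0f);
    for (int x = 0; x < n; ++x) {
        for (int k = -r; k <= r; ++k) {
            const int i = x + k;
            if (i < 0 || i >= n) continue;
            dst[x] += kernel[k + r] * tmp[i];
        }
    }
    return dst;
}
```

In this sketch the pre-pass costs one multiply per source pixel, while the per-access variant costs roughly one per pixel per kernel tap. The price of the pre-pass is an extra intermediate buffer, which on the GPU would presumably mean an additional pass rather than a single stage.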
This implementation gives a ~30X speedup on the GPU results for the
matrixconvolution bench (~10X from running on the GPU, and ~3X from removing
texture uploads/readbacks).
Note: this changes the matrixconvolution test for the software path as well,
so it will likely break the bots until that test is rebaselined.
Review URL: https://codereview.appspot.com/6585069/
git-svn-id: http://skia.googlecode.com/svn/trunk@5809 2bbb7eff-a529-9590-31e7-b0007b416f81
3 files changed