rework plus blend mode

The most interesting part of this is how plus interacts with partial
coverage.  Plus needs its clamp to happen after the lerp.
Luckily, some of its math folds away:

  d' = clamp[ d*(1-c) + (s+d)*c ] ==
       clamp[ d - dc  + sc + dc ] ==
       clamp[ d       + sc      ]

What's nice there is that coverage can be folded into the src term.
This suggests that we can re-write the plus stage to clamp internally
(and thus, be viable for 8-bit) if we always pre-scale with coverage.
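
The folding above can be checked numerically.  This is only a scalar
sketch, not the pipeline code: clamp01, lerp_scalar, and the plus_*
helpers are hypothetical names made up for illustration, and values are
assumed to be floats in [0,1].

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>

// Hypothetical scalar helpers, for illustration only.
static float clamp01(float v) { return std::min(std::max(v, 0.0f), 1.0f); }
static float lerp_scalar(float from, float to, float t) { return from*(1 - t) + to*t; }

// Plus, then lerp with coverage c, then clamp: clamp[ d*(1-c) + (s+d)*c ].
static float plus_post_lerp(float s, float d, float c) {
    return clamp01(lerp_scalar(d, s + d, c));
}

// Pre-scale src by coverage, then a plus that clamps internally: clamp[ d + sc ].
static float plus_pre_scaled(float s, float d, float c) {
    return clamp01(d + s*c);
}

// The ordering to avoid: clamp inside plus first, lerp afterwards.
static float plus_clamp_then_lerp(float s, float d, float c) {
    return lerp_scalar(d, clamp01(s + d), c);
}

// Walk a coarse grid of s, d, c and confirm the two correct forms agree.
static void check_plus_folding() {
    for (int si = 0; si <= 8; si++)
    for (int di = 0; di <= 8; di++)
    for (int ci = 0; ci <= 8; ci++) {
        float s = si/8.0f, d = di/8.0f, c = ci/8.0f;
        assert(std::fabs(plus_post_lerp(s, d, c) - plus_pre_scaled(s, d, c)) < 1e-6f);
    }
}
```

With s = 0.75, d = 0.5, c = 0.5, both correct forms give 0.875, while
clamping before the lerp gives 0.75.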

We didn't have a way to pre-scale with 565 coverage until now, but
it's only a step or two away from there.  We can use the alternate
formulation we derived for alpha for lerp_565, calculating the alpha
coverage from red, green, and blue coverages _and_ the values of src
and dst alpha.

While we already pre-scale srcover today for 8-bit or constant coverage,
we cannot do the same for 565.  When evaluating the expression

   d' = s + (1-a)d

we need the a term to be pre-scaled with red's coverage when calculating
dr', with blue's when calculating db', etc.  Essentially we need to
carry around a bunch of extra values, and we've got no way to do that.
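
To see why, expand the post-lerp for one channel with coverage c:
d*(1-c) + (s + (1-a)d)*c == sc + (1 - ac)d.  The alpha that multiplies
dst picks up that channel's own coverage.  A scalar sketch of this
follows; the helper names and sample values are hypothetical, and only
the color channels are shown.

```cpp
#include <cassert>
#include <cmath>

// Hypothetical scalar helper, for illustration only.
static float lerp_scalar(float from, float to, float t) { return from*(1 - t) + to*t; }

// srcover for one channel, then lerp with that channel's coverage c.
static float srcover_post_lerp(float s, float a, float d, float c) {
    return lerp_scalar(d, s + (1 - a)*d, c);
}

// srcover with src and alpha both pre-scaled by the same coverage c.
static float srcover_pre_scaled(float s, float a, float d, float c) {
    float s_scaled = s*c;
    float a_scaled = a*c;
    return s_scaled + (1 - a_scaled)*d;
}

static void check_srcover_565() {
    const float s = 0.5f, a = 0.5f, d = 0.5f;
    const float cr = 1.0f, cg = 0.5f, cb = 0.0f;  // per-channel (565-style) coverage

    // Scaling alpha by each channel's own coverage matches the post-lerp.
    assert(std::fabs(srcover_pre_scaled(s, a, d, cr) - srcover_post_lerp(s, a, d, cr)) < 1e-6f);
    assert(std::fabs(srcover_pre_scaled(s, a, d, cg) - srcover_post_lerp(s, a, d, cg)) < 1e-6f);
    assert(std::fabs(srcover_pre_scaled(s, a, d, cb) - srcover_post_lerp(s, a, d, cb)) < 1e-6f);

    // A single shared alpha (here scaled by red's coverage) is wrong for green.
    float shared_a     = a*cr;
    float green_shared = s*cg + (1 - shared_a)*d;
    assert(std::fabs(green_shared - srcover_post_lerp(s, a, d, cg)) > 0.1f);
}
```

Here green comes out as 0.625 with its own scaled alpha but 0.5 with the
shared one, so three differently scaled alphas would have to travel
alongside the color.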

So instead, we'll just carefully pre-scale plus with any coverage, and
keep post-lerping srcover when we have 565 coverage.

Change-Id: I7a7a52eec7d482e1b98bb8a01ea0a3d5e67bef65
Reviewed-on: https://skia-review.googlesource.com/38300
Commit-Queue: Mike Klein <mtklein@chromium.org>
Reviewed-by: Florin Malita <fmalita@chromium.org>
10 files changed