Optimize concat44 on canvas

- allow clients to pass in raw-array at public level
- rewrite m44 concat to take raw-array for 2nd input

Extended canvas_matrix bench to also time 44 concats:

Before
    207.51  	canvas_matrix_4x4	8888
    100.75 ?	canvas_matrix_3x3	8888
    100.79  	canvas_matrix_2x3	8888
    140.93  	canvas_matrix_scale	8888
    133.35  	canvas_matrix_trans	8888

After
    120.80  	canvas_matrix_4x4	8888
    ...

bug: skia:9768

Change-Id: Iaa361b9897a183d930fd31aa67327caed25cd51d
Reviewed-on: https://skia-review.googlesource.com/c/skia/+/263209
Reviewed-by: Florin Malita <fmalita@chromium.org>
Commit-Queue: Mike Reed <reed@google.com>
7 files changed