draw: split off all the extra functionality from the vertex shader

This will at least let us make the initial gains needed for decent
vertex performance much more quickly, and with higher confidence of
getting it right.

At some later point we can look again at code-generating all the
fetch/cliptest/viewport extras in the same block as the vertex shader.
For now, we just need to get some decent baseline performance.
13 files changed