Interpreter: Change all Load/Store ops to use immediate indices

Avoids the need for a PushImmediate (which usually also required
a Nop for alignment) on every load/store. And it makes the codegen
for local vs. global identical, which simplifies the byte-code
generator in a few spots.

This means that places previously using DupDown now need something that
dupes the *top* N elements of the stack. Just use Vector + Dup, and
remove the (now unused) DupDown instruction.

Also add/implement kStoreSwizzleGlobal, which was the last missing
load/store op (and add a test case).

Change-Id: Id450272c11f4f1842c9d57bc5114e7f81e9e926e
Reviewed-on: https://skia-review.googlesource.com/c/skia/+/214062
Reviewed-by: Ethan Nicholas <ethannicholas@google.com>
Commit-Queue: Brian Osman <brianosman@google.com>
diff --git a/src/sksl/SkSLByteCodeGenerator.h b/src/sksl/SkSLByteCodeGenerator.h
index ccea57b..596c639 100644
--- a/src/sksl/SkSLByteCodeGenerator.h
+++ b/src/sksl/SkSLByteCodeGenerator.h
@@ -100,11 +100,6 @@
     void writeTypedInstruction(const Type& type, ByteCodeInstruction s, ByteCodeInstruction u,
                                ByteCodeInstruction f);
 
-    /**
-     * Pushes the storage location of an lvalue to the stack.
-     */
-    void writeTarget(const Expression& expr);
-
 private:
     // reserves 16 bits in the output code, to be filled in later with an address once we determine
     // it
@@ -272,6 +267,7 @@
 
     friend class DeferredLocation;
     friend class ByteCodeVariableLValue;
+    friend class ByteCodeSwizzleLValue;
 
     typedef CodeGenerator INHERITED;
 };