Add a "nop filler" pass to SPU.

No-op filling runs just before assembly emission,
when the instruction stream is final. No-ops are inserted
to align instructions so that the pipeline's dual-issue
capability is utilized. This speeds up generated code by at
least 1% on a select set of algorithms.
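
The alignment idea can be illustrated with a small standalone sketch. It is
not the code of this pass: it assumes the usual description of SPU dual
issue (4-byte instructions, an even-pipeline instruction in the even slot
of an 8-byte-aligned pair followed by an odd-pipeline instruction), a
stream that starts on an 8-byte boundary, and a deliberately naive policy
that pads on every mismatch. The names Pipe, Inst and fillNops are
hypothetical.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical model: every instruction occupies one 4-byte slot and
// belongs to exactly one of the two pipelines.
enum class Pipe { Even, Odd };

struct Inst {
  std::string name;
  Pipe pipe;
};

// Naive filler (an assumption, not the pass's real heuristic): whenever the
// next instruction would land in a slot of the wrong parity, pad that slot
// with a no-op of the matching pipeline ("nop" for even, "lnop" for odd).
std::vector<Inst> fillNops(const std::vector<Inst> &in) {
  std::vector<Inst> out;
  for (const Inst &i : in) {
    bool slotIsEven = out.size() % 2 == 0;
    bool instIsEven = i.pipe == Pipe::Even;
    if (slotIsEven != instIsEven)
      out.push_back(slotIsEven ? Inst{"nop", Pipe::Even}
                               : Inst{"lnop", Pipe::Odd});
    out.push_back(i);
  }
  return out;
}
```

With this policy every even-pipeline instruction ends up in an even slot and
every odd-pipeline instruction in an odd slot. A real pass would likely be
more selective, since padding is not always a win.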

This pass may become redundant if the instruction scheduler and
all subsequent passes that modify the instruction stream
(prolog/epilog inserter, register scavenger, are there others?)
are made aware of instruction alignment.

llvm-svn: 123226