NVPTX: Replace uses of cuda.syncthreads with nvvm.barrier0

Everywhere where cuda.syncthreads or __syncthreads is used, use the
properly namespaced nvvm.barrier0 instead.

llvm-svn: 274664
11 files changed