[CUDA] Don't pass top-level -march down to device cc1 or ptxas.

Summary:
Previously, if you ran e.g.

  $ clang -march=haswell -x cuda foo.cu

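we would pass "-march=haswell -march=sm_20" down to the ptxas tool.
This causes it to assert, and rightly so!  Now we erase any inherited
-march from the device-side args before adding the one for the bound
GPU arch.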
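
As a rough illustration of the difference, here is a minimal standalone
sketch.  It does not use clang's driver classes: ArgList, isMarch,
translateOld and translateNew are hypothetical stand-ins that model the
derived argument list as a plain vector of strings.

```cpp
#include <algorithm>
#include <string>
#include <vector>

// Hypothetical stand-in for the driver's derived arg list; not clang's API.
using ArgList = std::vector<std::string>;

// True if Arg is spelled like a joined -march= option.
static bool isMarch(const std::string &Arg) {
  return Arg.rfind("-march=", 0) == 0;
}

// Models the old behavior: the device arch is appended, and any -march
// inherited from the top-level command line is left in place.
ArgList translateOld(ArgList Args, const std::string &BoundArch) {
  if (!BoundArch.empty())
    Args.push_back("-march=" + BoundArch);
  return Args;
}

// Models the new behavior: inherited -march options are erased first, so
// only the device arch survives.
ArgList translateNew(ArgList Args, const std::string &BoundArch) {
  if (!BoundArch.empty()) {
    Args.erase(std::remove_if(Args.begin(), Args.end(), isMarch), Args.end());
    Args.push_back("-march=" + BoundArch);
  }
  return Args;
}
```

With {"-march=haswell"} as input and "sm_20" as the bound arch,
translateOld keeps both -march options, while translateNew leaves only
"-march=sm_20".  With an empty bound arch, both leave the list untouched.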

Reviewers: tra

Subscribers: cfe-commits, echristo

Differential Revision: http://reviews.llvm.org/D21419

llvm-svn: 272857
diff --git a/clang/lib/Driver/ToolChains.cpp b/clang/lib/Driver/ToolChains.cpp
index fcc6dfe..5043e53 100644
--- a/clang/lib/Driver/ToolChains.cpp
+++ b/clang/lib/Driver/ToolChains.cpp
@@ -4676,8 +4676,10 @@
     DAL->append(A);
   }
 
-  if (BoundArch)
+  if (BoundArch) {
+    DAL->eraseArg(options::OPT_march_EQ);
     DAL->AddJoinedArg(nullptr, Opts.getOption(options::OPT_march_EQ), BoundArch);
+  }
   return DAL;
 }