[CUDA] Add -fcuda-approx-transcendentals flag.

Summary:
This lets us emit e.g. sin.approx.f32.  See
http://docs.nvidia.com/cuda/parallel-thread-execution/#floating-point-instructions-sin

Reviewers: rnk

Subscribers: tra, cfe-commits

Differential Revision: http://reviews.llvm.org/D20493

llvm-svn: 270484
diff --git a/clang/lib/Frontend/InitPreprocessor.cpp b/clang/lib/Frontend/InitPreprocessor.cpp
index 5d38d5f..f8b407b 100644
--- a/clang/lib/Frontend/InitPreprocessor.cpp
+++ b/clang/lib/Frontend/InitPreprocessor.cpp
@@ -938,6 +938,12 @@
     Builder.defineMacro("__CUDA_ARCH__");
   }
 
+  // We need to communicate this to our CUDA header wrapper, which in turn
+  // informs the proper CUDA headers of this choice.
+  if (LangOpts.CUDADeviceApproxTranscendentals || LangOpts.FastMath) {
+    Builder.defineMacro("__CLANG_CUDA_APPROX_TRANSCENDENTALS__");
+  }
+
   // OpenCL definitions.
   if (LangOpts.OpenCL) {
 #define OPENCLEXT(Ext) \