[CUDA] Fixed false error reporting in case of calling H->G->HD->D.

Launching a kernel from the host code does not generate code for the
kernel itself. This fixes an issue with clang erroneously reporting
an error for a HD->D call from within the kernel.

Differential Revision: https://reviews.llvm.org/D44837

llvm-svn: 328362

commit: e2ae8b55108c08aa0718a020355de83e10897103
author: Artem Belevich <tra@google.com> Fri Mar 23 19:49:03 2018 +0000
committer: Artem Belevich <tra@google.com> Fri Mar 23 19:49:03 2018 +0000
tree: 09c89597b7f1d0041c7e8ddcb00615060008b2b6
parent: 0a20cefffd3140666e29934df970561824e4c87e
diff --git a/clang/lib/Sema/SemaCUDA.cpp b/clang/lib/Sema/SemaCUDA.cpp
index cac5f68..ccd93fa 100644
--- a/clang/lib/Sema/SemaCUDA.cpp
+++ b/clang/lib/Sema/SemaCUDA.cpp

@@ -790,9 +790,12 @@
   // If the caller is known-emitted, mark the callee as known-emitted.
   // Otherwise, mark the call in our call graph so we can traverse it later.
   bool CallerKnownEmitted = IsKnownEmitted(*this, Caller);
-  if (CallerKnownEmitted)
-    MarkKnownEmitted(*this, Caller, Callee, Loc);
-  else {
+  if (CallerKnownEmitted) {
+    // Host-side references to a __global__ function refer to the stub, so the
+    // function itself is never emitted and therefore should not be marked.
+    if (getLangOpts().CUDAIsDevice || IdentifyCUDATarget(Callee) != CFT_Global)
+      MarkKnownEmitted(*this, Caller, Callee, Loc);
+  } else {
     // If we have
     //   host fn calls kernel fn calls host+device,
     // the HD function does not get instantiated on the host.  We model this by
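
The guard added in this hunk can be read as a small predicate. The following is a minimal sketch, assuming a simplified stand-in for clang's target classification; `Target` and `shouldMarkCallee` are hypothetical names used only for illustration and are not part of clang.

```cpp
// Simplified stand-in for clang's CUDA function targets (hypothetical enum,
// not the real CUDAFunctionTarget).
enum class Target { Host, Device, Global, HostDevice };

// Mirrors the patched condition: a known-emitted caller marks its callee as
// known-emitted, except in a host-side compilation where the callee is a
// __global__ kernel, because the host only references the kernel's stub.
constexpr bool shouldMarkCallee(bool callerKnownEmitted,
                                bool compilingForDevice, Target callee) {
  return callerKnownEmitted &&
         (compilingForDevice || callee != Target::Global);
}

// Host compilation, H->G: the kernel is not marked, so diagnostics that are
// reachable only through it (such as the HD->D call) stay deferred.
static_assert(!shouldMarkCallee(true, false, Target::Global), "");
// Device compilation: the condition lets the marking proceed for a kernel.
static_assert(shouldMarkCallee(true, true, Target::Global), "");
// Callees that are not kernels are marked exactly as before the patch.
static_assert(shouldMarkCallee(true, false, Target::HostDevice), "");
```

When the caller is not known-emitted, the sketch returns false; in the real code that case falls through to the `else` branch, which records the call in the call graph for later traversal.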