[CUDA] Fixed false error reporting in case of calling H->G->HD->D.
Launching a kernel from the host code does not generate code for the
kernel itself. This fixes an issue with clang erroneously reporting
an error for a HD->D call from within the kernel.
Differential Revision: https://reviews.llvm.org/D44837
llvm-svn: 328362
diff --git a/clang/lib/Sema/SemaCUDA.cpp b/clang/lib/Sema/SemaCUDA.cpp
index cac5f68..ccd93fa 100644
--- a/clang/lib/Sema/SemaCUDA.cpp
+++ b/clang/lib/Sema/SemaCUDA.cpp
@@ -790,9 +790,12 @@
// If the caller is known-emitted, mark the callee as known-emitted.
// Otherwise, mark the call in our call graph so we can traverse it later.
bool CallerKnownEmitted = IsKnownEmitted(*this, Caller);
- if (CallerKnownEmitted)
- MarkKnownEmitted(*this, Caller, Callee, Loc);
- else {
+ if (CallerKnownEmitted) {
+ // Host-side references to a __global__ function refer to the stub, so the
+ // function itself is never emitted and therefore should not be marked.
+ if (getLangOpts().CUDAIsDevice || IdentifyCUDATarget(Callee) != CFT_Global)
+ MarkKnownEmitted(*this, Caller, Callee, Loc);
+ } else {
// If we have
// host fn calls kernel fn calls host+device,
// the HD function does not get instantiated on the host. We model this by