[CUDA] Do a better job at detecting wrong-side calls.

Summary:
Move CheckCUDACall from ActOnCallExpr and BuildDeclRefExpr to
DiagnoseUseOfDecl.  This lets us catch some edge cases we were missing,
specifically around class operators.

This necessitates a few other changes:

 - Avoid emitting duplicate deferred diags in CheckCUDACall.

   Previously we'd carefully placed our call to CheckCUDACall such that
   it would only ever run once for a particular callsite.  Now that it
   is called from DiagnoseUseOfDecl, this is no longer the case.

 - Emit the deferred diagnostics attached to the primary template of a
   template specialization/instantiation, in addition to those attached
   to the specialization/instantiation itself.  DiagnoseUseOfDecl ends
   up putting the deferred diagnostics on the template, rather than the
   specialization, so we need to check both.

Reviewers: rsmith

Subscribers: cfe-commits, tra

Differential Revision: https://reviews.llvm.org/D24573

llvm-svn: 283637
diff --git a/clang/lib/Sema/SemaCUDA.cpp b/clang/lib/Sema/SemaCUDA.cpp
index 8223041..cb70192 100644
--- a/clang/lib/Sema/SemaCUDA.cpp
+++ b/clang/lib/Sema/SemaCUDA.cpp
@@ -495,7 +495,13 @@
     Diag(Callee->getLocation(), diag::note_previous_decl) << Callee;
     return false;
   }
-  if (Pref == Sema::CFP_WrongSide) {
+
+  // Insert into LocsWithCUDACallDeferredDiags to avoid emitting duplicate
+  // deferred diagnostics for the same location.  Duplicate deferred diags are
+  // otherwise tricky to avoid, because, unlike with regular errors, sema
+  // checking proceeds unhindered when we emit a deferred diagnostic.
+  if (Pref == Sema::CFP_WrongSide &&
+      LocsWithCUDACallDeferredDiags.insert(Loc.getRawEncoding()).second) {
     // We have to do this odd dance to create our PartialDiagnostic because we
     // want its storage to be allocated with operator new, not in an arena.
     PartialDiagnostic ErrPD{PartialDiagnostic::NullDiagnostic()};