Generally, each write operation (Write(), WritesDone()) implies a syscall. gRPC will try to batch together separate write operations from different threads, but currently cannot automatically infer batching in a single stream.
If message k+1 in a stream does not depend on the response to message k, you can enable write batching by passing a WriteOptions argument with buffer_hint set to Write():
stream_writer->Write(message, WriteOptions().set_buffer_hint());
The write will be buffered until one of the following is true:
Right now, the best performance trade-off is to run as many threads as there are CPU cores, with one completion queue per thread.