commit    c47d948a472936c54693dd49cac17d66a87420c6
author    Carl Mastrangelo <notcarl@google.com>  Fri Aug 12 15:19:22 2016 -0700
committer Carl Mastrangelo <notcarl@google.com>  Wed Aug 17 15:45:21 2016 -0700
tree      62ec8a6d086e43ef21c7ceb35ea5f6560364129a
parent    09d663faf18b41b0a4b736fe9c0908cea13eea24
protobuf: copy input data before decoding

CodedInputStream is risk averse in ways that hurt performance when parsing large messages. gRPC knows how large the input is as it is read from the wire, and only tries to parse it once the entire message has been received. The message is represented as chunks of memory strung together in a CompositeReadableBuffer, then wrapped in a custom BufferInputStream. When this is passed to Protobuf, CodedInputStream attempts to read data out of the InputStream into CIS's internal 4K buffer. For messages much larger than that, CIS copies from the input in 4K chunks, which are saved in an ArrayList. Once the entire message has been read in, it is re-copied into one large byte array and passed back up. This only happens for ByteStrings and ByteBuffers that are read out of CIS. (See CIS.readRawBytesSlowPath for the implementation.)

gRPC doesn't need this overhead, since we already have the entire message in memory, albeit in chunks. This change copies the composite buffer into a single heap byte buffer and passes it (via UnsafeByteOperations) into CodedInputStream. This pays one copy to build the heap buffer, but avoids the two copies in CIS.

This also ensures that the buffer is considered "immutable" from CIS's point of view. Because CIS does not have ByteString aliasing turned on, this large buffer will not accidentally be kept in memory even if only tiny fields from the proto are still referenced. Instead, reading ByteStrings out of CIS will always copy. (This copy, and the problems it avoids, can be turned off by calling CIS.enableAliasing.)

Benchmark results will come shortly, but initial testing shows a significant speedup in throughput tests. Profiling has shown that copying memory was a large time consumer for messages of size 1MB.
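The single-copy strategy described above can be sketched in plain Java. This is an illustrative stand-in, not gRPC's actual marshaller code (the class and method names here are hypothetical): the chunks of a composite buffer are gathered into one contiguous array in a single pass, instead of letting CodedInputStream buffer 4K slices into an ArrayList and then re-copy them (two copies).

```java
import java.util.Arrays;
import java.util.List;

public class ChunkConsolidation {
    // Copy a list of chunks into one contiguous array in a single pass.
    // This pays one copy up front; the result could then be handed to
    // CodedInputStream (via UnsafeByteOperations) without further copying.
    static byte[] consolidate(List<byte[]> chunks) {
        int total = 0;
        for (byte[] chunk : chunks) {
            total += chunk.length;
        }
        byte[] out = new byte[total];
        int offset = 0;
        for (byte[] chunk : chunks) {
            System.arraycopy(chunk, 0, out, offset, chunk.length);
            offset += chunk.length;
        }
        return out;
    }

    public static void main(String[] args) {
        List<byte[]> chunks = Arrays.asList(
            new byte[] {1, 2, 3},
            new byte[] {4, 5},
            new byte[] {6});
        // prints [1, 2, 3, 4, 5, 6]
        System.out.println(Arrays.toString(consolidate(chunks)));
    }
}
```

The total length is known before allocation, so exactly one array is created and each input byte is copied exactly once.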
gRPC-Java works with JDK 6. TLS usage typically requires using Java 8, or Play Services Dynamic Security Provider on Android. Please see the Security Readme.
Download the JARs. Or for Maven with non-Android, add to your pom.xml:

```xml
<dependency>
  <groupId>io.grpc</groupId>
  <artifactId>grpc-netty</artifactId>
  <version>0.15.0</version>
</dependency>
<dependency>
  <groupId>io.grpc</groupId>
  <artifactId>grpc-protobuf</artifactId>
  <version>0.15.0</version>
</dependency>
<dependency>
  <groupId>io.grpc</groupId>
  <artifactId>grpc-stub</artifactId>
  <version>0.15.0</version>
</dependency>
```
Or for Gradle with non-Android, add to your dependencies:

```groovy
compile 'io.grpc:grpc-netty:0.15.0'
compile 'io.grpc:grpc-protobuf:0.15.0'
compile 'io.grpc:grpc-stub:0.15.0'
```
For Android client, use grpc-okhttp instead of grpc-netty, and grpc-protobuf-lite or grpc-protobuf-nano instead of grpc-protobuf:

```groovy
compile 'io.grpc:grpc-okhttp:0.15.0'
compile 'io.grpc:grpc-protobuf-lite:0.15.0'
compile 'io.grpc:grpc-stub:0.15.0'
```
Development snapshots are available in Sonatype's snapshot repository.
For protobuf-based codegen, you can put your proto files in the src/main/proto and src/test/proto directories along with an appropriate plugin.
For protobuf-based codegen integrated with the Maven build system, you can use protobuf-maven-plugin:
```xml
<build>
  <extensions>
    <extension>
      <groupId>kr.motd.maven</groupId>
      <artifactId>os-maven-plugin</artifactId>
      <version>1.4.1.Final</version>
    </extension>
  </extensions>
  <plugins>
    <plugin>
      <groupId>org.xolstice.maven.plugins</groupId>
      <artifactId>protobuf-maven-plugin</artifactId>
      <version>0.5.0</version>
      <configuration>
        <!--
          The version of protoc must match protobuf-java. If you don't depend on
          protobuf-java directly, you will be transitively depending on the
          protobuf-java version that grpc depends on.
        -->
        <protocArtifact>com.google.protobuf:protoc:3.0.0-beta-3:exe:${os.detected.classifier}</protocArtifact>
        <pluginId>grpc-java</pluginId>
        <pluginArtifact>io.grpc:protoc-gen-grpc-java:0.15.0:exe:${os.detected.classifier}</pluginArtifact>
      </configuration>
      <executions>
        <execution>
          <goals>
            <goal>compile</goal>
            <goal>compile-custom</goal>
          </goals>
        </execution>
      </executions>
    </plugin>
  </plugins>
</build>
```
For protobuf-based codegen integrated with the Gradle build system, you can use protobuf-gradle-plugin:
```groovy
apply plugin: 'java'
apply plugin: 'com.google.protobuf'

buildscript {
  repositories {
    mavenCentral()
  }
  dependencies {
    // ASSUMES GRADLE 2.12 OR HIGHER. Use plugin version 0.7.5 with earlier
    // gradle versions
    classpath 'com.google.protobuf:protobuf-gradle-plugin:0.7.7'
  }
}

protobuf {
  protoc {
    // The version of protoc must match protobuf-java. If you don't depend on
    // protobuf-java directly, you will be transitively depending on the
    // protobuf-java version that grpc depends on.
    artifact = "com.google.protobuf:protoc:3.0.0-beta-3"
  }
  plugins {
    grpc {
      artifact = 'io.grpc:protoc-gen-grpc-java:0.15.0'
    }
  }
  generateProtoTasks {
    all()*.plugins {
      grpc {}
    }
  }
}
```
If you are making changes to gRPC-Java, see the compiling instructions.
Here's a quick readers' guide to the code to help folks get started. At a high level there are three distinct layers to the library: Stub, Channel & Transport.
The Stub layer is what is exposed to most developers and provides type-safe bindings to whatever datamodel/IDL/interface you are adapting. gRPC comes with a plugin to the protocol-buffers compiler that generates Stub interfaces out of .proto files, but bindings to other datamodels/IDLs should be trivial to add and are welcome.
The Channel layer is an abstraction over Transport handling that is suitable for interception/decoration and exposes more behavior to the application than the Stub layer. It is intended to be easy for application frameworks to use this layer to address cross-cutting concerns such as logging, monitoring, auth etc. Flow-control is also exposed at this layer to allow more sophisticated applications to interact with it directly.
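The interception/decoration idea behind the Channel layer can be illustrated with a minimal sketch. This is not gRPC's actual Channel or ClientInterceptor API; the Call interface and logging wrapper below are hypothetical stand-ins (Java 8 syntax for brevity) that show how a call path can be wrapped to add a cross-cutting concern such as logging without touching the underlying transport:

```java
public class InterceptDemo {
    // Hypothetical stand-in for a channel: turns a request into a response.
    interface Call {
        String execute(String request);
    }

    // An interceptor decorates a Call, adding behavior before and after
    // delegating to the wrapped Call. Interceptors compose: the result is
    // itself a Call, so further decorators can be stacked on top.
    static Call logging(Call next, StringBuilder log) {
        return request -> {
            log.append("request:").append(request).append(';');
            String response = next.execute(request);
            log.append("response:").append(response).append(';');
            return response;
        };
    }

    public static void main(String[] args) {
        StringBuilder log = new StringBuilder();
        Call base = request -> "echo " + request;
        Call intercepted = logging(base, log);
        System.out.println(intercepted.execute("hi")); // prints echo hi
        System.out.println(log); // prints request:hi;response:echo hi;
    }
}
```

Monitoring and auth interceptors follow the same shape: wrap, observe or modify, then delegate.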
The Transport layer does the heavy lifting of putting and taking bytes off the wire. The interfaces to it are abstract just enough to allow plugging in different implementations. Transports are modeled as Stream factories. The variation in interface between a server Stream and a client Stream exists to codify their differing semantics for cancellation and error reporting.
Note that the transport layer API is considered internal to gRPC and has weaker API guarantees than the core API under the io.grpc package.
gRPC comes with three Transport implementations: the Netty-based transport (for both client and server), the OkHttp-based transport (a lightweight client transport, mainly for Android), and an in-process transport for when the server is in the same process as the client (useful for testing).
Tests showing how these layers are composed to execute calls using protobuf messages can be found at https://github.com/google/grpc-java/tree/master/interop-testing/src/main/java/io/grpc/testing/integration.