all: support enforce_utf8 override
In 2014, while proto3 was being developed, the new syntax already had a
number of early adopters. Before proto3 was finalized and released as
open source in July 2016, a decision was made to strictly validate UTF-8
in proto3 string fields. However, some of the early adopters were already
storing invalid UTF-8 in string fields.
The google.protobuf.FieldOptions.enforce_utf8 option exists only to support
those grandfathered users by letting them opt out of the validation logic.
Practical use of that option in open source is impossible, even if a user
specifies the proto1_legacy build tag, since it requires a hacked
variant of descriptor.proto that is not externally available.
This CL supports enforce_utf8 by modifying internal/filedesc to
expose the flag if it detects it in the raw descriptor.
We add a strs.EnforceUTF8 function as a centralized place to determine
whether to perform validation. Validation opt-out is supported
only in builds with legacy support.
We implement support for validating UTF-8 in all proto3 string fields,
even if they are backed by a Go []byte.
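As a rough illustration of the decision described above, the following is a
minimal, self-contained sketch. It is not the code in internal/strs: the
field type, its members, and the legacy parameter are hypothetical stand-ins
for the real descriptor and the legacy build flag, and the logic assumes that
the opt-out is honored only when legacy support is enabled.

```go
package main

import (
	"errors"
	"unicode/utf8"
)

// Syntax is a hypothetical stand-in for protoreflect.Syntax.
type Syntax int

const (
	Proto2 Syntax = 2
	Proto3 Syntax = 3
)

// field is a hypothetical, simplified field descriptor. hasOption reports
// whether enforce_utf8 was found in the raw descriptor, and option holds
// the value that was found.
type field struct {
	syntax    Syntax
	hasOption bool
	option    bool
}

// enforceUTF8 reports whether strings in f should be validated. The
// explicit option is consulted only when legacy is true (assumed here to
// mirror a build with legacy support); otherwise the syntax alone decides.
func enforceUTF8(f field, legacy bool) bool {
	if legacy && f.hasOption {
		return f.option
	}
	return f.syntax == Proto3
}

var errInvalidUTF8 = errors.New("string field contains invalid UTF-8")

// checkString validates a string field that is backed by a Go string.
func checkString(f field, legacy bool, s string) error {
	if enforceUTF8(f, legacy) && !utf8.ValidString(s) {
		return errInvalidUTF8
	}
	return nil
}

// checkBytes validates a string field that is backed by a Go []byte.
func checkBytes(f field, legacy bool, b []byte) error {
	if enforceUTF8(f, legacy) && !utf8.Valid(b) {
		return errInvalidUTF8
	}
	return nil
}
```

In this sketch a proto3 field is always validated in a non-legacy build,
whether it is backed by a string or a []byte, and an explicit opt-out takes
effect only when legacy is true.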
Change-Id: I9c0628b84909bc7181125f09db730c80d490e485
Reviewed-on: https://go-review.googlesource.com/c/protobuf/+/186002
Reviewed-by: Damien Neil <dneil@google.com>
diff --git a/proto/encode_gen.go b/proto/encode_gen.go
index fe977e3..77b6511 100644
--- a/proto/encode_gen.go
+++ b/proto/encode_gen.go
@@ -12,6 +12,7 @@
"google.golang.org/protobuf/internal/encoding/wire"
"google.golang.org/protobuf/internal/errors"
+ "google.golang.org/protobuf/internal/strs"
"google.golang.org/protobuf/reflect/protoreflect"
)
@@ -67,7 +68,7 @@
case protoreflect.DoubleKind:
b = wire.AppendFixed64(b, math.Float64bits(v.Float()))
case protoreflect.StringKind:
- if fd.Syntax() == protoreflect.Proto3 && !utf8.ValidString(v.String()) {
+ if strs.EnforceUTF8(fd) && !utf8.ValidString(v.String()) {
return b, errors.InvalidUTF8(string(fd.FullName()))
}
b = wire.AppendString(b, v.String())