Adding StringPiece read/write support to pickle.

This is a re-do of https://codereview.chromium.org/927183002/ which
was reverted in CL https://codereview.chromium.org/981853002/ because
of (unsubstantiated) concerns that it caused a crash.

Adding StringPiece/StringPiece16 read/write support to pickle, and
update unit tests.

The IPC perf tests do a lot of allocations which can, with large block
sizes, harm their performance. The high allocation counts also make
their performance very dependent on the quirks of the Windows heap, as
it applies undocumented heuristics to decide when to decommit memory
and when to save it for later.

And, doing unnecessary allocations is generally always a bad thing.

So, this change adds StringPiece support to PickleIterator (reading)
and Pickle (writing). The StringPiece function now handles all strings
when writing, but must be requested explicitly when reading.

ipc_mojo_perftests does more allocations than ipc_perftests. This
removes one message-sized allocation from both tests.

The unit tests were updated so that WriteString and WriteString16 are
exercised with both string objects and char/char16 arrays (no
allocations required!). Reading into StringPiece and StringPiece16 is
also tested and the tests were verified with:
out\Release\base_unittests --gtest_filter=PickleTest.*

The main performance test command line used was:

out\Release\ipc_mojo_perftests --gtest_filter=MojoChannelPerfTest.ChannelPingPong

Typical test results on HP Z620 (Windows 8.1) with thread affinity and
high-performance power settings prior to this change:
IPC_Channel_Perf_50000x_12      1140.01 ms
IPC_Channel_Perf_50000x_144     1182.55 ms
IPC_Channel_Perf_50000x_1728    1266.42 ms
IPC_Channel_Perf_12000x_20736   1289.19 ms
IPC_Channel_Perf_1000x_248832   584.619 ms

Typical test results with same settings but using base::StringPiece:
IPC_Channel_Perf_50000x_12      1123.33 ms
IPC_Channel_Perf_50000x_144     1164.53 ms
IPC_Channel_Perf_50000x_1728    1242.99 ms
IPC_Channel_Perf_12000x_20736   1186.84 ms
IPC_Channel_Perf_1000x_248832   496.469 ms

Modest improvement with small buffers, but significant speedup with large buffers.

Typical test results with large-blocks only prior to this change:
IPC_Channel_Perf_1000x_248832   1211.33 ms
IPC_Channel_Perf_1000x_248832   961.404 ms
IPC_Channel_Perf_1000x_248832   600.911 ms
IPC_Channel_Perf_1000x_248832   468.356 ms
IPC_Channel_Perf_1000x_248832   430.859 ms
IPC_Channel_Perf_1000x_248832   425.147 ms

Typical test results with large-blocks only (base::StringPiece):
IPC_Channel_Perf_1000x_248832   909.571 ms
IPC_Channel_Perf_1000x_248832   731.435 ms
IPC_Channel_Perf_1000x_248832   493.697 ms
IPC_Channel_Perf_1000x_248832   417.966 ms
IPC_Channel_Perf_1000x_248832   397.377 ms
IPC_Channel_Perf_1000x_248832   397.725 ms

Note that it takes a while for the Windows heap to 'realize' that it
should hang on to some of the large blocks which is why performance
increases over multiple runs.

Chrome will not immediately be affected because StringPiece reading has
to be explicitly selected. Note that the effect on ipc_perftests is
more variable due to the odd Windows heap heuristics.

Reliable results require setting the power plan to high-performance.
On Linux this is done with this command:
sudo cpupower frequency-set --governor performance

The ipc_perftests command-line is:
out/Release/ipc_perftests --gtest_filter=IPCChannelPerfTest.ChannelPingPong

Typical before/after Linux results for ipc_perftests are:
IPC_Channel_Perf_50000x_12      465.271 ms
IPC_Channel_Perf_50000x_144     480.224 ms
IPC_Channel_Perf_50000x_1728    510.871 ms
IPC_Channel_Perf_12000x_20736   318.016 ms
IPC_Channel_Perf_1000x_248832   309.325 ms

IPC_Channel_Perf_50000x_12      459.245 ms
IPC_Channel_Perf_50000x_144     479.347 ms
IPC_Channel_Perf_50000x_1728    506.57  ms
IPC_Channel_Perf_12000x_20736   289.583 ms
IPC_Channel_Perf_1000x_248832   255.083 ms

Before after Linux results for ipc_mojo_perftests:
IPC_Channel_Perf_50000x_12      670.727 ms
IPC_Channel_Perf_50000x_144     713.6   ms
IPC_Channel_Perf_50000x_1728    808.157 ms
IPC_Channel_Perf_12000x_20736   464.221 ms
IPC_Channel_Perf_1000x_248832   365.258 ms

IPC_Channel_Perf_50000x_12      653.12  ms
IPC_Channel_Perf_50000x_144     697.418 ms
IPC_Channel_Perf_50000x_1728    772.575 ms
IPC_Channel_Perf_12000x_20736   446.315 ms
IPC_Channel_Perf_1000x_248832   348.38  ms

So, consistent improvements on Linux.

Review URL: https://codereview.chromium.org/987033003

Cr-Commit-Position: refs/heads/master@{#319810}


CrOS-Libchrome-Original-Commit: eaa3896066e3b3c5e742a2dea4f87fe5e8e5dda4
diff --git a/base/pickle.cc b/base/pickle.cc
index d461f41..e66c5ba 100644
--- a/base/pickle.cc
+++ b/base/pickle.cc
@@ -152,6 +152,18 @@
   return true;
 }
 
+bool PickleIterator::ReadStringPiece(base::StringPiece* result) {
+  int len;
+  if (!ReadInt(&len))
+    return false;
+  const char* read_from = GetReadPointerAndAdvance(len);
+  if (!read_from)
+    return false;
+
+  *result = base::StringPiece(read_from, len);
+  return true;
+}
+
 bool PickleIterator::ReadWString(std::wstring* result) {
   int len;
   if (!ReadInt(&len))
@@ -176,6 +188,19 @@
   return true;
 }
 
+bool PickleIterator::ReadStringPiece16(base::StringPiece16* result) {
+  int len;
+  if (!ReadInt(&len))
+    return false;
+  const char* read_from = GetReadPointerAndAdvance(len, sizeof(char16));
+  if (!read_from)
+    return false;
+
+  *result = base::StringPiece16(reinterpret_cast<const char16*>(read_from),
+                                len);
+  return true;
+}
+
 bool PickleIterator::ReadData(const char** data, int* length) {
   *length = 0;
   *data = 0;
@@ -271,7 +296,7 @@
   return *this;
 }
 
-bool Pickle::WriteString(const std::string& value) {
+bool Pickle::WriteString(const base::StringPiece& value) {
   if (!WriteInt(static_cast<int>(value.size())))
     return false;
 
@@ -286,7 +311,7 @@
                     static_cast<int>(value.size() * sizeof(wchar_t)));
 }
 
-bool Pickle::WriteString16(const string16& value) {
+bool Pickle::WriteString16(const base::StringPiece16& value) {
   if (!WriteInt(static_cast<int>(value.size())))
     return false;
 
diff --git a/base/pickle.h b/base/pickle.h
index f2a198e..b9b4126 100644
--- a/base/pickle.h
+++ b/base/pickle.h
@@ -13,6 +13,7 @@
 #include "base/gtest_prod_util.h"
 #include "base/logging.h"
 #include "base/strings/string16.h"
+#include "base/strings/string_piece.h"
 
 class Pickle;
 
@@ -39,8 +40,12 @@
   bool ReadFloat(float* result) WARN_UNUSED_RESULT;
   bool ReadDouble(double* result) WARN_UNUSED_RESULT;
   bool ReadString(std::string* result) WARN_UNUSED_RESULT;
+  // The StringPiece data will only be valid for the lifetime of the message.
+  bool ReadStringPiece(base::StringPiece* result) WARN_UNUSED_RESULT;
   bool ReadWString(std::wstring* result) WARN_UNUSED_RESULT;
   bool ReadString16(base::string16* result) WARN_UNUSED_RESULT;
+  // The StringPiece16 data will only be valid for the lifetime of the message.
+  bool ReadStringPiece16(base::StringPiece16* result) WARN_UNUSED_RESULT;
 
   // A pointer to the data will be placed in |*data|, and the length will be
   // placed in |*length|. The pointer placed into |*data| points into the
@@ -195,9 +200,9 @@
   bool WriteDouble(double value) {
     return WritePOD(value);
   }
-  bool WriteString(const std::string& value);
+  bool WriteString(const base::StringPiece& value);
   bool WriteWString(const std::wstring& value);
-  bool WriteString16(const base::string16& value);
+  bool WriteString16(const base::StringPiece16& value);
   // "Data" is a blob with a length. When you read it out you will be given the
   // length. See also WriteBytes.
   bool WriteData(const char* data, int length);
diff --git a/base/pickle_unittest.cc b/base/pickle_unittest.cc
index 1fa1f32..466d32a 100644
--- a/base/pickle_unittest.cc
+++ b/base/pickle_unittest.cc
@@ -30,10 +30,13 @@
 const std::string teststring("Hello world");  // note non-aligned string length
 const std::wstring testwstring(L"Hello, world");
 const base::string16 teststring16(base::ASCIIToUTF16("Hello, world"));
+const char testrawstring[] = "Hello new world"; // Test raw string writing
+// Test raw char16 writing, assumes UTF16 encoding is ANSI for alpha chars.
+const base::char16 testrawstring16[] = {'A', 'l', 'o', 'h', 'a', 0};
 const char testdata[] = "AAA\0BBB\0";
 const int testdatalen = arraysize(testdata) - 1;
 
-// checks that the result
+// checks that the results can be read correctly from the Pickle
 void VerifyResult(const Pickle& pickle) {
   PickleIterator iter(pickle);
 
@@ -91,6 +94,14 @@
   EXPECT_TRUE(iter.ReadString16(&outstring16));
   EXPECT_EQ(teststring16, outstring16);
 
+  base::StringPiece outstringpiece;
+  EXPECT_TRUE(iter.ReadStringPiece(&outstringpiece));
+  EXPECT_EQ(testrawstring, outstringpiece);
+
+  base::StringPiece16 outstringpiece16;
+  EXPECT_TRUE(iter.ReadStringPiece16(&outstringpiece16));
+  EXPECT_EQ(testrawstring16, outstringpiece16);
+
   const char* outdata;
   int outdatalen;
   EXPECT_TRUE(iter.ReadData(&outdata, &outdatalen));
@@ -121,6 +132,8 @@
   EXPECT_TRUE(pickle.WriteString(teststring));
   EXPECT_TRUE(pickle.WriteWString(testwstring));
   EXPECT_TRUE(pickle.WriteString16(teststring16));
+  EXPECT_TRUE(pickle.WriteString(testrawstring));
+  EXPECT_TRUE(pickle.WriteString16(testrawstring16));
   EXPECT_TRUE(pickle.WriteData(testdata, testdatalen));
   VerifyResult(pickle);
 
diff --git a/ipc/ipc_perftest_support.cc b/ipc/ipc_perftest_support.cc
index f8a11c5..7144103 100644
--- a/ipc/ipc_perftest_support.cc
+++ b/ipc/ipc_perftest_support.cc
@@ -103,8 +103,8 @@
     EXPECT_TRUE(iter.ReadInt64(&time_internal));
     int msgid;
     EXPECT_TRUE(iter.ReadInt(&msgid));
-    std::string payload;
-    EXPECT_TRUE(iter.ReadString(&payload));
+    base::StringPiece payload;
+    EXPECT_TRUE(iter.ReadStringPiece(&payload));
 
     // Include message deserialization in latency.
     base::TimeTicks now = base::TimeTicks::Now();