[Remarks] Add string deduplication using a string table
* Add support for uniquing strings in the remark streamer and emitting the string table in the remarks section.
* Add parsing support for the string table in the RemarkParser.
From this remark:
```
--- !Missed
Pass: inline
Name: NoDefinition
DebugLoc: { File: 'test-suite/SingleSource/UnitTests/2002-04-17-PrintfChar.c',
Line: 7, Column: 3 }
Function: printArgsNoRet
Args:
- Callee: printf
- String: ' will not be inlined into '
- Caller: printArgsNoRet
DebugLoc: { File: 'test-suite/SingleSource/UnitTests/2002-04-17-PrintfChar.c',
Line: 6, Column: 0 }
- String: ' because its definition is unavailable'
...
```
to:
```
--- !Missed
Pass: 0
Name: 1
DebugLoc: { File: 3, Line: 7, Column: 3 }
Function: 2
Args:
- Callee: 4
- String: 5
- Caller: 2
DebugLoc: { File: 3, Line: 6, Column: 0 }
- String: 6
...
```
And the string table in the .remarks/__remarks section containing:
```
inline\0NoDefinition\0printArgsNoRet\0
test-suite/SingleSource/UnitTests/2002-04-17-PrintfChar.c\0printf\0
will not be inlined into \0 because its definition is unavailable\0
```
This is mostly supposed to be used for testing purposes, but it gives us
a 2x reduction in the remark size, and is an incremental change for the
updates to the remarks file format.
Differential Revision: https://reviews.llvm.org/D60227
llvm-svn: 359050
diff --git a/llvm/lib/Remarks/RemarkStringTable.cpp b/llvm/lib/Remarks/RemarkStringTable.cpp
new file mode 100644
index 0000000..984aa5b
--- /dev/null
+++ b/llvm/lib/Remarks/RemarkStringTable.cpp
@@ -0,0 +1,48 @@
+//===- RemarkStringTable.cpp ----------------------------------------------===//
+//
+// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
+// See https://llvm.org/LICENSE.txt for license information.
+// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
+//
+//===----------------------------------------------------------------------===//
+//
+// Implementation of the Remark string table used at remark generation.
+//
+//===----------------------------------------------------------------------===//
+
+#include "llvm/Remarks/RemarkStringTable.h"
+#include "llvm/Support/EndianStream.h"
+#include "llvm/Support/Error.h"
+#include <vector>
+
+using namespace llvm;
+using namespace llvm::remarks;
+
+std::pair<unsigned, StringRef> StringTable::add(StringRef Str) {
+ size_t NextID = StrTab.size();
+ auto KV = StrTab.insert({Str, NextID});
+ // If it's a new string, add it to the final size.
+ if (KV.second)
+ SerializedSize += KV.first->first().size() + 1; // +1 for the '\0'
+ // Can be either NextID or the previous ID if the string is already there.
+ return {KV.first->second, KV.first->first()};
+}
+
+void StringTable::serialize(raw_ostream &OS) const {
+ // Emit the number of strings.
+ uint64_t StrTabSize = SerializedSize;
+ support::endian::write(OS, StrTabSize, support::little);
+ // Emit the sequence of strings.
+ for (StringRef Str : serialize()) {
+ OS << Str;
+ // Explicitly emit a '\0'.
+ OS.write('\0');
+ }
+}
+
+std::vector<StringRef> StringTable::serialize() const {
+ std::vector<StringRef> Strings{StrTab.size()};
+ for (const auto &KV : StrTab)
+ Strings[KV.second] = KV.first();
+ return Strings;
+}