[WebAssembly] Add a flag to control merging data segments

Merging data segments produces smaller binaries because each segment
carries some fixed overhead. Therefore, merging data segments is
generally the right approach, especially for wasm, where binaries are
typically delivered over the network.

However, when analyzing wasm binaries, it can be helpful to get a
conservative picture of which functions are using which data
segments[0]. Perhaps there is a large data segment that you didn't
expect to be included in the wasm, introduced by some library you're
using, and you'd like to know which library it was. In this scenario,
merging data segments only makes the analysis worse.

Alternatively, perhaps you will remove some dead functions by hand[1]
that can't be statically proven dead by the compiler or lld. Removing
these functions might make some data garbage-collectable, and you'd
like to run `--gc-sections` again so that this now-unused data can be
collected. If the segments were originally merged, then a single use of
the merged data segment will keep all of its data alive.

This patch adds `--merge-data-segments` and `--no-merge-data-segments`.
Merging stays enabled by default, except when linking with
`--relocatable`.
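
As a rough illustration of why the flag has this effect, here is a
standalone sketch of the name-mapping decision. It is not the lld code:
`outputSegmentName` and its `mergeDataSegments` parameter are
hypothetical stand-ins for `getOutputDataSegmentName` and
`Config->MergeDataSegments`, and the `.data.` and `.bss.` prefixes are
an assumption beyond the `.text.` case visible in this patch.

```cpp
#include <string_view>

// Hypothetical helper: true if `s` begins with `prefix`.
static bool startsWith(std::string_view s, std::string_view prefix) {
  return s.substr(0, prefix.size()) == prefix;
}

// Sketch of the mapping from an input segment name to an output segment
// name. Inputs that map to the same output name end up in one segment.
std::string_view outputSegmentName(std::string_view name,
                                   bool mergeDataSegments) {
  // With merging disabled, every input segment keeps its own name and
  // therefore its own output segment.
  if (!mergeDataSegments)
    return name;
  // With merging enabled, per-symbol segments collapse onto their prefix.
  // Only the ".text." case appears in the patch; the rest are assumed.
  if (startsWith(name, ".text."))
    return ".text";
  if (startsWith(name, ".data."))
    return ".data";
  if (startsWith(name, ".bss."))
    return ".bss";
  return name;
}
```

Under these assumptions, `.data.a` and `.data.b` both map to `.data`
when merging is on, and stay distinct when it is off.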

[0] https://github.com/rustwasm/twiggy
[1] https://github.com/fitzgen/wasm-snip

Patch by Nick Fitzgerald!

Differential Revision: https://reviews.llvm.org/D46417

llvm-svn: 332013
diff --git a/lld/test/wasm/data-segment-merging.ll b/lld/test/wasm/data-segment-merging.ll
new file mode 100644
index 0000000..bce938c
--- /dev/null
+++ b/lld/test/wasm/data-segment-merging.ll
@@ -0,0 +1,48 @@
+target triple = "wasm32-unknown-unknown"
+
+@a = hidden global [6 x i8] c"hello\00", align 1
+@b = hidden global [8 x i8] c"goodbye\00", align 1
+@c = hidden global [9 x i8] c"whatever\00", align 1
+@d = hidden global i32 42, align 4
+
+; RUN: llc -filetype=obj %s -o %t.data-segment-merging.o
+
+; RUN: wasm-ld -no-gc-sections --allow-undefined -o %t.merged.wasm %t.data-segment-merging.o
+; RUN: obj2yaml %t.merged.wasm | FileCheck %s --check-prefix=MERGE
+; MERGE:       - Type:            DATA
+; MERGE-NEXT:    Segments:
+; MERGE-NEXT:      - SectionOffset:   7
+; MERGE-NEXT:        MemoryIndex:     0
+; MERGE-NEXT:        Offset:
+; MERGE-NEXT:          Opcode:          I32_CONST
+; MERGE-NEXT:          Value:           1024
+; MERGE-NEXT:        Content:         68656C6C6F00676F6F6462796500776861746576657200002A000000
+
+; RUN: wasm-ld -no-gc-sections --allow-undefined --no-merge-data-segments -o %t.separate.wasm %t.data-segment-merging.o
+; RUN: obj2yaml %t.separate.wasm | FileCheck %s --check-prefix=SEPARATE
+; SEPARATE:       - Type:            DATA
+; SEPARATE-NEXT:    Segments:
+; SEPARATE-NEXT:      - SectionOffset:   7
+; SEPARATE-NEXT:        MemoryIndex:     0
+; SEPARATE-NEXT:        Offset:
+; SEPARATE-NEXT:          Opcode:          I32_CONST
+; SEPARATE-NEXT:          Value:           1024
+; SEPARATE-NEXT:        Content:         68656C6C6F00
+; SEPARATE-NEXT:      - SectionOffset:   19
+; SEPARATE-NEXT:        MemoryIndex:     0
+; SEPARATE-NEXT:        Offset:
+; SEPARATE-NEXT:          Opcode:          I32_CONST
+; SEPARATE-NEXT:          Value:           1030
+; SEPARATE-NEXT:        Content:         676F6F6462796500
+; SEPARATE-NEXT:      - SectionOffset:   33
+; SEPARATE-NEXT:        MemoryIndex:     0
+; SEPARATE-NEXT:        Offset:
+; SEPARATE-NEXT:          Opcode:          I32_CONST
+; SEPARATE-NEXT:          Value:           1038
+; SEPARATE-NEXT:        Content:         '776861746576657200'
+; SEPARATE-NEXT:      - SectionOffset:   48
+; SEPARATE-NEXT:        MemoryIndex:     0
+; SEPARATE-NEXT:        Offset:
+; SEPARATE-NEXT:          Opcode:          I32_CONST
+; SEPARATE-NEXT:          Value:           1048
+; SEPARATE-NEXT:        Content:         2A000000
diff --git a/lld/wasm/Config.h b/lld/wasm/Config.h
index 3ade61c..da0da17 100644
--- a/lld/wasm/Config.h
+++ b/lld/wasm/Config.h
@@ -24,6 +24,7 @@
   bool GcSections;
   bool ImportMemory;
   bool ImportTable;
+  bool MergeDataSegments;
   bool PrintGcSections;
   bool Relocatable;
   bool StripAll;
diff --git a/lld/wasm/Driver.cpp b/lld/wasm/Driver.cpp
index 125a75a..18dcde1 100644
--- a/lld/wasm/Driver.cpp
+++ b/lld/wasm/Driver.cpp
@@ -292,6 +292,9 @@
   Config->Relocatable = Args.hasArg(OPT_relocatable);
   Config->GcSections =
       Args.hasFlag(OPT_gc_sections, OPT_no_gc_sections, !Config->Relocatable);
+  Config->MergeDataSegments =
+      Args.hasFlag(OPT_merge_data_segments, OPT_no_merge_data_segments,
+                   !Config->Relocatable);
   Config->PrintGcSections =
       Args.hasFlag(OPT_print_gc_sections, OPT_no_print_gc_sections, false);
   Config->SearchPaths = args::getStrings(Args, OPT_L);
diff --git a/lld/wasm/Options.td b/lld/wasm/Options.td
index 566b36a..8f7eed6 100644
--- a/lld/wasm/Options.td
+++ b/lld/wasm/Options.td
@@ -40,6 +40,10 @@
     "Enable garbage collection of unused sections",
     "Disable garbage collection of unused sections">;
 
+defm merge_data_segments: B<"merge-data-segments",
+    "Enable merging data segments",
+    "Disable merging data segments">;
+
 def help: F<"help">, HelpText<"Print option help">;
 
 def l: JoinedOrSeparate<["-"], "l">, MetaVarName<"<libName>">,
diff --git a/lld/wasm/Writer.cpp b/lld/wasm/Writer.cpp
index 0b515be..58677dc 100644
--- a/lld/wasm/Writer.cpp
+++ b/lld/wasm/Writer.cpp
@@ -914,7 +914,7 @@
 }
 
 static StringRef getOutputDataSegmentName(StringRef Name) {
-  if (Config->Relocatable)
+  if (!Config->MergeDataSegments)
     return Name;
   if (Name.startswith(".text."))
     return ".text";