|  | ================= | 
|  | DataFlowSanitizer | 
|  | ================= | 
|  |  | 
|  | .. toctree:: | 
|  | :hidden: | 
|  |  | 
|  | DataFlowSanitizerDesign | 
|  |  | 
|  | .. contents:: | 
|  | :local: | 
|  |  | 
|  | Introduction | 
|  | ============ | 
|  |  | 
|  | DataFlowSanitizer is a generalised dynamic data flow analysis. | 
|  |  | 
|  | Unlike other Sanitizer tools, this tool is not designed to detect a | 
|  | specific class of bugs on its own.  Instead, it provides a generic | 
|  | dynamic data flow analysis framework to be used by clients to help | 
|  | detect application-specific issues within their own code. | 
|  |  | 
|  | Usage | 
|  | ===== | 
|  |  | 
|  | With no program changes, applying DataFlowSanitizer to a program | 
|  | will not alter its behavior.  To use DataFlowSanitizer, the program | 
|  | uses API functions to apply tags to data to cause it to be tracked, and to | 
|  | check the tag of a specific data item.  DataFlowSanitizer manages | 
|  | the propagation of tags through the program according to its data flow. | 
|  |  | 
|  | The APIs are defined in the header file ``sanitizer/dfsan_interface.h``. | 
|  | For further information about each function, please refer to the header | 
|  | file. | 
|  |  | 
|  | ABI List | 
|  | -------- | 
|  |  | 
|  | DataFlowSanitizer uses a list of functions known as an ABI list to decide | 
|  | whether a call to a specific function should use the operating system's native | 
|  | ABI or whether it should use a variant of this ABI that also propagates labels | 
|  | through function parameters and return values.  The ABI list file also controls | 
|  | how labels are propagated in the former case.  DataFlowSanitizer comes with a | 
|  | default ABI list which is intended to eventually cover the glibc library on | 
|  | Linux but it may become necessary for users to extend the ABI list in cases | 
|  | where a particular library or function cannot be instrumented (e.g. because | 
|  | it is implemented in assembly or another language which DataFlowSanitizer does | 
|  | not support) or a function is called from a library or function which cannot | 
|  | be instrumented. | 
|  |  | 
|  | DataFlowSanitizer's ABI list file is a :doc:`SanitizerSpecialCaseList`. | 
|  | The pass treats every function in the ``uninstrumented`` category in the | 
|  | ABI list file as conforming to the native ABI.  Unless the ABI list contains | 
|  | additional categories for those functions, a call to one of those functions | 
|  | will produce a warning message, as the labelling behavior of the function | 
|  | is unknown.  The other supported categories are ``discard``, ``functional`` | 
|  | and ``custom``. | 
|  |  | 
|  | * ``discard`` -- To the extent that this function writes to (user-accessible) | 
|  | memory, it also updates labels in shadow memory (this condition is trivially | 
|  | satisfied for functions which do not write to user-accessible memory).  Its | 
|  | return value is unlabelled. | 
|  | * ``functional`` -- Like ``discard``, except that the label of its return value | 
|  | is the union of the label of its arguments. | 
|  | * ``custom`` -- Instead of calling the function, a custom wrapper ``__dfsw_F`` | 
|  | is called, where ``F`` is the name of the function.  This function may wrap | 
|  | the original function or provide its own implementation.  This category is | 
|  | generally used for uninstrumentable functions which write to user-accessible | 
|  | memory or which have more complex label propagation behavior.  The signature | 
|  | of ``__dfsw_F`` is based on that of ``F`` with each argument having a | 
|  | label of type ``dfsan_label`` appended to the argument list.  If ``F`` | 
|  | is of non-void return type a final argument of type ``dfsan_label *`` | 
|  | is appended to which the custom function can store the label for the | 
|  | return value.  For example: | 
|  |  | 
|  | .. code-block:: c++ | 
|  |  | 
|  | void f(int x); | 
|  | void __dfsw_f(int x, dfsan_label x_label); | 
|  |  | 
|  | void *memcpy(void *dest, const void *src, size_t n); | 
|  | void *__dfsw_memcpy(void *dest, const void *src, size_t n, | 
|  | dfsan_label dest_label, dfsan_label src_label, | 
|  | dfsan_label n_label, dfsan_label *ret_label); | 
|  |  | 
|  | If a function defined in the translation unit being compiled belongs to the | 
|  | ``uninstrumented`` category, it will be compiled so as to conform to the | 
|  | native ABI.  Its arguments will be assumed to be unlabelled, but it will | 
|  | propagate labels in shadow memory. | 
|  |  | 
|  | For example: | 
|  |  | 
|  | .. code-block:: none | 
|  |  | 
|  | # main is called by the C runtime using the native ABI. | 
|  | fun:main=uninstrumented | 
|  | fun:main=discard | 
|  |  | 
|  | # malloc only writes to its internal data structures, not user-accessible memory. | 
|  | fun:malloc=uninstrumented | 
|  | fun:malloc=discard | 
|  |  | 
|  | # tolower is a pure function. | 
|  | fun:tolower=uninstrumented | 
|  | fun:tolower=functional | 
|  |  | 
|  | # memcpy needs to copy the shadow from the source to the destination region. | 
|  | # This is done in a custom function. | 
|  | fun:memcpy=uninstrumented | 
|  | fun:memcpy=custom | 
|  |  | 
|  | Example | 
|  | ======= | 
|  |  | 
|  | The following program demonstrates label propagation by checking that | 
|  | the correct labels are propagated. | 
|  |  | 
|  | .. code-block:: c++ | 
|  |  | 
|  | #include <sanitizer/dfsan_interface.h> | 
|  | #include <assert.h> | 
|  |  | 
|  | int main(void) { | 
|  | int i = 1; | 
|  | dfsan_label i_label = dfsan_create_label("i", 0); | 
|  | dfsan_set_label(i_label, &i, sizeof(i)); | 
|  |  | 
|  | int j = 2; | 
|  | dfsan_label j_label = dfsan_create_label("j", 0); | 
|  | dfsan_set_label(j_label, &j, sizeof(j)); | 
|  |  | 
|  | int k = 3; | 
|  | dfsan_label k_label = dfsan_create_label("k", 0); | 
|  | dfsan_set_label(k_label, &k, sizeof(k)); | 
|  |  | 
|  | dfsan_label ij_label = dfsan_get_label(i + j); | 
|  | assert(dfsan_has_label(ij_label, i_label)); | 
|  | assert(dfsan_has_label(ij_label, j_label)); | 
|  | assert(!dfsan_has_label(ij_label, k_label)); | 
|  |  | 
|  | dfsan_label ijk_label = dfsan_get_label(i + j + k); | 
|  | assert(dfsan_has_label(ijk_label, i_label)); | 
|  | assert(dfsan_has_label(ijk_label, j_label)); | 
|  | assert(dfsan_has_label(ijk_label, k_label)); | 
|  |  | 
|  | return 0; | 
|  | } | 
|  |  | 
|  | Current status | 
|  | ============== | 
|  |  | 
|  | DataFlowSanitizer is a work in progress, currently under development for | 
|  | x86\_64 Linux. | 
|  |  | 
|  | Design | 
|  | ====== | 
|  |  | 
|  | Please refer to the :doc:`design document<DataFlowSanitizerDesign>`. |