Update README and mark it up with reStructuredText.

Update the API documentation, since bccCompileBC and bccLoadBinary
have been removed.  Update the description of the cache file format,
since the caching mechanism has been completely rewritten.

Note: We are adopting reStructuredText to mark up the documentation.
reStructuredText is a plain-text markup language that is easy for
humans to read (in contrast to HTML).
diff --git a/README b/README
deleted file mode 100644
index 838a404..0000000
--- a/README
+++ /dev/null
@@ -1,172 +0,0 @@
-Welcome to libbcc, Android's bitcode JIT. Note that bitcode refers to LLVM bitcode.
-
-------------------------------------------------------------------------------
-A. Highlights:
-------------------------------------------------------------------------------
-
-* libbcc supports bitcode from various language frontends (E.g. RenderScript,
-  GLSL).
-
-* libbcc strives to balance between library size, launch time and steady-state
-  performance:
-
-  ** The size of libbcc is aggressively reduced for a mobile device. We customize
-     and we don't use Execution Engine.
-
-  ** To reduce launch time, we support caching of binaries.
-
-  ** For steady-state performance, we enable VFP3 and aggressive optimizations.
-
-* Currently we disable Lazy JITting.
-
-
-------------------------------------------------------------------------------
-B. The following APIs are provided:
-------------------------------------------------------------------------------
-
-Basic end-to-end usage:
-* bccReadBC(): Read bitcode and convert it to LLVM module.
-* bccReadModule(): Read LLVM module, which has 1-1 mapping with bitcode format.
-* bccReadExe(): Read cached binary. No need to compiling the bitcode later.
-
-* bccLinkBC(): On-device linking of bitcodes.
-* bccPrepareExecutable(): Either compile bitcode or load from cache (cache-hit)
-
-* bccRegisterSymbolCallback(): Tell libbcc the function address of "dlsym"
-* bccGetError(): Get compilation... error
-
-Reflection:
-* bccGetExportFuncs()
-* bccGetExportVars()
-* bccGetPragmas()
-
-For debugging:
-* bccGetFunctions()
-* bccGetFunctionBinary()
-
-RenderScript-specific:
-* bccCreateScript()
-* bccDeleteScript()
-* bccGetScriptInfoLog()
-  Note that Bitcode JIT is invoked when a script is loaded.
-
-
-------------------------------------------------------------------------------
-C. Calling conventions:
-------------------------------------------------------------------------------
-
-Case 1. Calls from Execution Environment or from/to within script:
-  On ARM, the first 4 arguments will go into r0, r1, r2, and r3, in that order.
-  The remaining (if any) will go through stack.
-
-  For ext_vec_types such as float2, a set of registers will be used. In the case
-  of float2, a register pair will be used. Specifically, if float2 is the first
-  argument in the function prototype, float2.x will go into r0, and float2.y,
-  r1.
-  Note: stack will be aligned to the coarsest-grained argument. In the case of
-  float2 above as an argument, parameter stack will be aligned to an 8-byte
-  boundary (if the sizes of other arguments are no greater than 8.)
-
-Case 2. Calls from/to a separate compilation unit: (E.g., calls to Execution
-  Environment if those runtime library callees are not compiled using LLVM.)
-  On ARM, we use hardfp.
-  Note that double will be placed in a register pair.
-
-
-------------------------------------------------------------------------------
-D. BCC Cache File Format
-------------------------------------------------------------------------------
-
-The structure of BCC cache file consists of three sections:
-
-1) Header Section: Information about this cache file
-2) Relocation Info Section: Information to relocate a global variable
-3) Exported Var Info Section: Information for RenderScript
-   Exported Function Info Section: Information for RenderScript
-   Exported Pragma Info Section: Information for RenderScript
-4) Code Section: The jit'ed code on this platform
-5) Data Section: The global variable
-
-
---------------------------------------------------
-D.1 Header Section
---------------------------------------------------
-
-struct oBCCHeader {
-  uint8_t magic[4];             // includes version number
-  uint8_t magicVersion[4];
-
-  uint32_t sourceWhen;
-  uint32_t rslibWhen;
-  uint32_t libRSWhen;
-  uint32_t libbccWhen;
-
-  uint32_t cachedCodeMapAddr;
-  uint32_t rootAddr;
-  uint32_t initAddr;
-
-  uint32_t relocOffset;         // offset of reloc table.
-  uint32_t relocCount;
-  uint32_t exportVarsOffset;    // offset of export var table
-  uint32_t exportVarsCount;
-  uint32_t exportFuncsOffset;   // offset of export func table
-  uint32_t exportFuncsCount;
-  uint32_t exportPragmasOffset; // offset of export pragma table
-  uint32_t exportPragmasCount;
-  uint32_t exportPragmasSize;   // size of export pragma table (in bytes)
-
-  uint32_t codeOffset;          // offset of code: 64-bit alignment
-  uint32_t codeSize;
-  uint32_t dataOffset;          // offset of data section
-  uint32_t dataSize;
-
-  //  uint32_t flags;           // some info flags
-  uint32_t checksum;            // adler32 checksum covering deps/opt
-};
-
-Note: Both "offset" and "size" are in the unit of bytes. However, "count" is
-in the unit of entities.
-
-
---------------------------------------------------
-D.2 Relocation Section
---------------------------------------------------
-
-There may be many entries in the relocation section.  Each entry has following
-structure:
-
-struct oBCCRelocEntry {
-  uint32_t relocType;           // target instruction relocation type
-  uint32_t relocOffset;         // offset of hole (holeAddr - codeAddr)
-  uint32_t cachedResultAddr;    // address resolved at compile time
-};
-
-
---------------------------------------------------
-D.3 Export Pragma Section
---------------------------------------------------
-
-The export pragma section is consisted of two parts: (1) Offset and size
-table and (2) String constant pools.  Every string in constant pool should
-end with '\0', and the length of the string should obey the value of size
-in previous table.
-
-For each entry in offset and size table, they have four fields,
-see the following structure:
-
-struct oBCCPragmaEntry {
-  uint32_t pragmaNameOffset;    // offset to pragma name string
-  uint32_t pragmaNameSize;      // size of pragma name string (without '\0')
-  uint32_t pragmaValueOffset;   // offset to pragma value string
-  uint32_t pramgaValueSize;     // size of pragma value string (without '\0')
-};
-
-Note: The offset is related to the start of the export pragma section.
-
-
---------------------------------------------------
-D.4 Code Section
---------------------------------------------------
-
-This section should align to a page size.
-
diff --git a/README.rst b/README.rst
new file mode 100644
index 0000000..d77c665
--- /dev/null
+++ b/README.rst
@@ -0,0 +1,139 @@
+=========================================
+libbcc: A Hybrid Bitcode Execution Engine
+=========================================
+
+
+Introduction
+------------
+
+libbcc is an LLVM bitcode execution engine which compiles the bitcode
+to an in-memory executable.  It comes with a *just-in-time bitcode
+compiler*, which translates the bitcode to machine code, and a *caching
+mechanism*, which saves the in-memory executable after the compilation.
+Here are some highlights of libbcc:
+
+* libbcc supports bitcode from various language frontends, such as
+  RenderScript and GLSL.
+
+* libbcc strives to balance between library size, launch time and
+  steady-state performance:
+
+  * The size of libbcc is aggressively reduced for a mobile device.
+    We customize and we don't use Execution Engine.
+
+  * To reduce launch time, we support caching of binaries.
+
+  * For steady-state performance, we enable VFP3 and aggressive
+    optimizations.
+
+
+
+API
+---
+
+Basic:
+
+* **bccCreateScript** - Create new bcc script
+
+* **bccRegisterSymbolCallback** - Register the callback function for external
+  symbol lookup
+
+* **bccReadBC** - Set the source bitcode for compilation
+
+* **bccReadModule** - Set the llvm::Module for compilation
+
+* **bccLinkBC** - Set the library bitcode for linking
+
+* **bccPrepareExecutable** - Create the in-memory executable by either
+  just-in-time compilation or cache loading
+
+* **bccDeleteScript** - Destroy bcc script and release the resources
+
+* **bccGetError** - Get the error code
+
+* **bccGetScriptInfoLog** *deprecated* - Don't use this
+
+
+Reflection:
+
+* **bccGetExportVars** - Get the addresses of exported variables
+
+* **bccGetExportFuncs** - Get the addresses of exported functions
+
+* **bccGetPragmas** - Get the pragmas
+
+
+Debug:
+
+* **bccGetFunctions** - Get the function name list
+
+* **bccGetFunctionBinary** - Get the address and the size of function binary
+
+
+
+Cache File Format
+-----------------
+
+The cache file of libbcc (\*.oBCC) consists of several sections:
+header, string pool, dependencies table, relocation table, exported
+variable list, exported function list, pragma list, function information
+table, and bcc context.  Every section should be aligned to the word size.
+Here is a brief description of each section:
+
+* **Header** (OBCC_Header) - The header of the cache file.  Contains the
+  magic word, version, machine integer type information, and the size
+  and the offset of other sections.  The header section is guaranteed
+  to be at the beginning of the cache file.
+
+* **String Pool** (OBCC_StringPool) - A collection of serialized
+  variable-length strings.  A strp_index elsewhere in the cache file
+  is the index of the corresponding string in this string pool.
+
+* **Dependencies Table** (OBCC_DependencyTable) - The dependencies table.
+  This table stores the resource name (or file path), the resource
+  type (whether in the APK or on the file system), and the SHA1 checksum.
+
+* **Relocation Table** (OBCC_RelocationTable) *not finished*
+
+* **Exported Variable List** (OBCC_ExportVarList),
+  **Exported Function List** (OBCC_ExportFuncList) -
+  The list of the addresses of exported variables and exported functions.
+
+* **Pragma List** (OBCC_PragmaList) - The list of pragma key-value pairs.
+
+* **Function Information Table** (OBCC_FuncTable) - This is a table of
+  function information, such as function name, function entry address,
+  and function binary size.  The table should also be ordered by
+  function name.
+
+* **Context** - The context of the in-memory executable, including
+  the code and the data.  The offset of the context should be aligned to
+  the page size, so that we can mmap the context directly into memory.
+
+For further information, see `bcc_cache.h <include/bcc/bcc_cache.h>`_,
+`CacheReader.cpp <lib/bcc/CacheReader.cpp>`_, and
+`CacheWriter.cpp <lib/bcc/CacheWriter.cpp>`_.
+
+
+
+JIT'ed Code Calling Conventions
+-------------------------------
+
+1. Calls from Execution Environment or from/to within script:
+
+   On ARM, the first 4 arguments will go into r0, r1, r2, and r3, in that order.
+   The remaining arguments (if any) are passed on the stack.
+
+   For ext_vec_types such as float2, a set of registers will be used. In the case
+   of float2, a register pair will be used. Specifically, if float2 is the first
+   argument in the function prototype, float2.x will go into r0, and float2.y,
+   r1.
+
+   Note: The stack will be aligned to the coarsest-grained argument. In the case
+   of float2 above as an argument, the parameter stack will be aligned to an
+   8-byte boundary (if the sizes of other arguments are no greater than 8).
+
+2. Calls from/to a separate compilation unit: (E.g., calls to Execution
+   Environment if those runtime library callees are not compiled using LLVM.)
+
+   On ARM, we use hardfp.  Note that double will be placed in a register pair.