blob: 276ca3668fe5fd6d01afb78433c8e88c4a89a85b [file] [log] [blame]
Narayan Kamathf163f692012-09-14 15:02:00 +01001- Project name
2
3marisa-trie
4http://code.google.com/p/marisa-trie/
5
6- Project summary
7
8Matching Algorithms with Recursively Implemented StorAge
9
10- Version
11
120.1.4
13
14- Description
15
16This project *marisa-trie* provides a C++ library *libmarisa* and command line tools *`marisa-*`* for building and operating nesting patricia tries. The brand-new dictionary structure is designed to be static and space efficient. Also, *marisa-trie* enables not only simple lookups but also prefix searches and predictive searches.
17
18 * Prefix searches are to find keys from prefixes of a given query.
19 * Predictive searches are to find keys starting with a given query.
20
21The biggest advantage of *marisa-trie* is that it can build a considerably compact dictionary. See below for the size of dictionaries built with various trie implementations.
22
23 * Input
24 * Source: enwiki-20110115-all-titles-in-ns0.gz
25 * Contents: all page titles of English Wikipedia (Jan. 2011)
26 * Number of keys: 8,279,325
27 * Total size: 167,223,035 bytes (plain) / 46,344,646 bytes (gzipped)
28
29|| *Implementation* || *Size (bytes)* || *Remarks* ||
30|| darts-clone || 316,065,792 || Compacted double-array trie ||
31|| tx-trie || 107,119,864 || LOUDS-based trie ||
32|| *marisa-trie* || *42,688,271* || Nesting patricia trie ||
33
34 * Documentation (in Japanese)
35 * HowTo
36 * ListOfTools
37 * LibraryInterface
38 * BenchmarkResults
39
40- Version control system
41
42Subversion
43
44- Source code license
45
46New BSD License
47
48- Project labels
49
50Nesting
51Patricia
52Trie
53Static
54Dictionary
55CPlusPlus