blob: 8a8731da9bfe26257d6a5f6c313fee97584f037f [file] [log] [blame]
Stefan Krah1919b7e2012-03-21 18:25:23 +01001
2
3Bignum support (Fast Number Theoretic Transform or FNT):
4========================================================
5
6Bignum arithmetic in libmpdec uses the scheme for fast convolution
7of integer sequences from:
8
9J. M. Pollard: The fast Fourier transform in a finite field
10http://www.ams.org/journals/mcom/1971-25-114/S0025-5718-1971-0301966-0/home.html
11
12
13The transform in a finite field can be used for convolution in the same
14way as the Fourier Transform. The main advantages of the Number Theoretic
15Transform are that it is both exact and very memory efficient.
16
17
18Convolution in pseudo-code:
19~~~~~~~~~~~~~~~~~~~~~~~~~~~
20
21 fnt_convolute(a, b):
22 x = fnt(a) # forward transform of a
23 y = fnt(b) # forward transform of b
24 z = pairwise multiply x[i] and y[i]
25 result = inv_fnt(z) # backward transform of z.
26
27
28Extending the maximum transform length (Chinese Remainder Theorem):
29-------------------------------------------------------------------
30
31The maximum transform length is quite limited when using a single
32prime field. However, it is possible to use multiple primes and
33recover the result using the Chinese Remainder Theorem.
34
35
36Multiplication in pseudo-code:
37~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
38
39 _mpd_fntmul(u, v):
40 c1 = fnt_convolute(u, v, P1) # convolute modulo prime1
41 c2 = fnt_convolute(u, v, P2) # convolute modulo prime2
42 c3 = fnt_convolute(u, v, P3) # convolute modulo prime3
43 result = crt3(c1, c2, c3) # Chinese Remainder Theorem
44
45
46Optimized transform functions:
47------------------------------
48
49There are three different fnt() functions:
50
51 std_fnt: "standard" decimation in frequency transform for array lengths
52 of 2**n. Performs well up to 1024 words.
53
54 sixstep: Cache-friendly algorithm for array lengths of 2**n. Outperforms
55 std_fnt for large arrays.
56
57 fourstep: Algorithm for array lengths of 3 * 2**n. Also cache friendly
58 in large parts.
59
60
61List of bignum-only files:
62--------------------------
63
64Functions from these files are only used in _mpd_fntmul().
65
66 umodarith.h -> fast low level routines for unsigned modular arithmetic
67 numbertheory.c -> routines for setting up the FNT
68 difradix2.c -> decimation in frequency transform, used as the
69 "base case" by the following three files:
70
71 fnt.c -> standard transform for smaller arrays
72 sixstep.c -> transform large arrays of length 2**n
73 fourstep.c -> transform arrays of length 3 * 2**n
74
75 convolute.c -> do the actual fast convolution, using one of
76 the three transform functions.
77 transpose.c -> transpositions needed for the sixstep algorithm.
78 crt.c -> Chinese Remainder Theorem: use information from three
79 transforms modulo three different primes to get the
80 final result.
81
82
83