\section{\module{soundex} ---
         None}
\declaremodule{builtin}{soundex}

\modulesynopsis{None}


The soundex algorithm takes an English word and returns an
easily computed hash of it; this hash is intended to be the same for
words that sound alike.  This module provides an interface to the
soundex algorithm.

Note that the soundex algorithm is quite simple-minded and is far from
perfect.  Its main purpose is to help look up names in databases
when a name may be misspelled: soundex maps common misspellings to
the same hash.

\begin{funcdesc}{get_soundex}{string}
Return the soundex hash value for a word; the result is always a
6-character string.  \var{string} must contain the word to be hashed,
with no leading whitespace; the case of the word is ignored.  (Note
that the original algorithm produces a 4-character result.)
\end{funcdesc}

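To make the shape of the hash concrete, the following is a minimal
pure-Python sketch of a soundex-style hash padded to six characters.
It is an illustration only, not the module's own implementation: the
name \code{get_soundex} is reused for readability, while the
\code{length} parameter, the handling of an empty word, and the rule
that uncoded letters (vowels, h, w, y) break a run of equal codes are
all assumptions of this sketch.

```python
# Illustrative sketch only; NOT the soundex module's actual implementation.
# Classic letter groups: each group shares one digit (1 through 6).
_CODES = {}
for digit, letters in enumerate(["bfpv", "cgjkqsxz", "dt", "l", "mn", "r"], start=1):
    for letter in letters:
        _CODES[letter] = str(digit)


def get_soundex(string, length=6):
    """Return a soundex-style hash of `string`, zero-padded to `length`."""
    # Ignore case and anything that is not an ASCII letter.
    word = [ch for ch in string.lower() if "a" <= ch <= "z"]
    if not word:
        return "0" * length  # assumption: an empty word hashes to all zeros
    result = [word[0].upper()]
    previous = _CODES.get(word[0], "")
    for ch in word[1:]:
        code = _CODES.get(ch, "")
        # Emit a digit only when it differs from the preceding letter's code,
        # so adjacent letters from the same group count once.
        if code and code != previous:
            result.append(code)
        previous = code
    return "".join(result)[:length].ljust(length, "0")


print(get_soundex("Robert"))  # R16300
print(get_soundex("Rupert"))  # R16300
```

Under these assumptions the two names hash to the same value, which
is the property the algorithm is designed to provide.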
\begin{funcdesc}{sound_similar}{string1, string2}
Compare the word in \var{string1} with the word in \var{string2}; this
is equivalent to
\code{get_soundex(\var{string1})} \code{==}
\code{get_soundex(\var{string2})}.
\end{funcdesc}


\begin{seealso}
  \seetext{Donald E. Knuth, \citetitle{Sorting and Searching}, vol.\ 3
           in ``The Art of Computer Programming.''  Addison-Wesley
           Publishing Company: Reading, MA: 1973.  pp.\ 391--392.
           Discusses the origin and usefulness of the algorithm, as
           well as the algorithm itself.  Knuth gives his sources as
           \emph{U.S. Patents 1261167} (1918) and \emph{1435663}
           (1922), attributing the algorithm to Margaret K. Odell and
           Robert C. Russell.  Additional references are provided.}
\end{seealso}