:mod:`gettext` --- Multilingual internationalization services
=============================================================

.. module:: gettext
   :synopsis: Multilingual internationalization services.
.. moduleauthor:: Barry A. Warsaw <barry@zope.com>
.. sectionauthor:: Barry A. Warsaw <barry@zope.com>


The :mod:`gettext` module provides internationalization (I18N) and localization
(L10N) services for your Python modules and applications. It supports both the
GNU ``gettext`` message catalog API and a higher level, class-based API that may
be more appropriate for Python files.  The interface described below allows you
to write your module and application messages in one natural language, and
provide a catalog of translated messages for running under different natural
languages.

Some hints on localizing your Python modules and applications are also given.


GNU :program:`gettext` API
--------------------------

The :mod:`gettext` module defines the following API, which is very similar to
the GNU :program:`gettext` API.  If you use this API you will affect the
translation of your entire application globally.  Often this is what you want if
your application is monolingual, with the choice of language dependent on the
locale of your user.  If you are localizing a Python module, or if your
application needs to switch languages on the fly, you probably want to use the
class-based API instead.


.. function:: bindtextdomain(domain[, localedir])

   Bind the *domain* to the locale directory *localedir*.  More concretely,
   :mod:`gettext` will look for binary :file:`.mo` files for the given domain using
   the path (on Unix): :file:`localedir/language/LC_MESSAGES/domain.mo`, where
   *language* is searched for in the environment variables :envvar:`LANGUAGE`,
   :envvar:`LC_ALL`, :envvar:`LC_MESSAGES`, and :envvar:`LANG` respectively.

   If *localedir* is omitted or ``None``, then the current binding for *domain* is
   returned. [#]_
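
The binding and query behaviour of ``bindtextdomain`` described above can be
sketched as follows.  The domain name ``myapp`` and the directory are
placeholders chosen for illustration; the directory does not need to exist for
the binding itself to be recorded.

```python
import gettext

# "myapp" and this directory are hypothetical names used only for illustration.
locale_dir = "/tmp/myapp-locale"

# Binding a domain returns the directory now associated with it.
bound = gettext.bindtextdomain("myapp", locale_dir)

# Omitting localedir only queries the current binding.
queried = gettext.bindtextdomain("myapp")

print(bound == queried == locale_dir)
```

The binding is module-wide state, which is one reason the class-based API is
usually a better fit for library code.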


.. function:: bind_textdomain_codeset(domain[, codeset])

   Bind the *domain* to *codeset*, changing the encoding of strings returned by the
   :func:`gettext` family of functions. If *codeset* is omitted, then the current
   binding is returned.


.. function:: textdomain([domain])

   Change or query the current global domain.  If *domain* is ``None``, then the
   current global domain is returned, otherwise the global domain is set to
   *domain*, which is returned.
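
As a small sketch of the query and set behaviour of ``textdomain``, assuming
the placeholder domain name ``myapp``:

```python
import gettext

# Calling textdomain() without an argument only queries the global domain.
previous = gettext.textdomain()

# Passing a name sets the global domain and returns it.
# "myapp" is a hypothetical domain name used only for illustration.
current = gettext.textdomain("myapp")

# Restore the earlier value so other code in the process is unaffected.
restored = gettext.textdomain(previous)

print(current, restored == previous)
```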


.. function:: gettext(message)

   Return the localized translation of *message*, based on the current global
   domain, language, and locale directory.  This function is usually aliased as
   :func:`_` in the local namespace (see examples below).


.. function:: lgettext(message)

   Equivalent to :func:`gettext`, but the translation is returned in the preferred
   system encoding, if no other encoding was explicitly set with
   :func:`bind_textdomain_codeset`.


.. function:: dgettext(domain, message)

   Like :func:`gettext`, but look the message up in the specified *domain*.


.. function:: ldgettext(domain, message)

   Equivalent to :func:`dgettext`, but the translation is returned in the preferred
   system encoding, if no other encoding was explicitly set with
   :func:`bind_textdomain_codeset`.


.. function:: ngettext(singular, plural, n)

   Like :func:`gettext`, but consider plural forms. If a translation is found,
   apply the plural formula to *n*, and return the resulting message (some
   languages have more than two plural forms). If no translation is found, return
   *singular* if *n* is 1; return *plural* otherwise.

   The plural formula is taken from the catalog header. It is a C or Python
   expression that has a free variable *n*; the expression evaluates to the index
   of the plural in the catalog. See the GNU gettext documentation for the precise
   syntax to be used in :file:`.po` files and the formulas for a variety of
   languages.


.. function:: lngettext(singular, plural, n)

   Equivalent to :func:`ngettext`, but the translation is returned in the preferred
   system encoding, if no other encoding was explicitly set with
   :func:`bind_textdomain_codeset`.


.. function:: dngettext(domain, singular, plural, n)

   Like :func:`ngettext`, but look the message up in the specified *domain*.
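
The rule used when no translation is found (return *singular* if *n* is 1,
*plural* otherwise) can be observed with a domain that has no catalog.  This is
a minimal sketch; the domain name below is an assumption, picked so that no
:file:`.mo` file is expected to exist for it.

```python
import gettext

# Hypothetical domain name for which no catalog is expected to be installed.
DOMAIN = "example-domain-without-catalog"


def describe(n):
    # Without a catalog, dngettext() picks between the two source strings.
    template = gettext.dngettext(DOMAIN, "%d file", "%d files", n)
    return template % n


print(describe(1))
print(describe(3))
```

With a real catalog, the plural formula from the catalog header decides which
form is returned, so the number of forms is not limited to two.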


.. function:: ldngettext(domain, singular, plural, n)

   Equivalent to :func:`dngettext`, but the translation is returned in the
   preferred system encoding, if no other encoding was explicitly set with
   :func:`bind_textdomain_codeset`.


Note that GNU :program:`gettext` also defines a :func:`dcgettext` function, but
this was deemed not useful and so it is currently unimplemented.

Here's an example of typical usage for this API::

   import gettext
   gettext.bindtextdomain('myapplication', '/path/to/my/language/directory')
   gettext.textdomain('myapplication')
   _ = gettext.gettext
   # ...
   print(_('This is a translatable string.'))


Class-based API
---------------

The class-based API of the :mod:`gettext` module gives you more flexibility and
greater convenience than the GNU :program:`gettext` API.  It is the recommended
way of localizing your Python applications and modules.  :mod:`gettext` defines
a "translations" class which implements the parsing of GNU :file:`.mo` format
files, and has methods for returning strings. Instances of this "translations"
class can also install themselves in the built-in namespace as the function
:func:`_`.


.. function:: find(domain[, localedir[,  languages[, all]]])

   This function implements the standard :file:`.mo` file search algorithm.  It
   takes a *domain*, identical to what :func:`textdomain` takes.  Optional
   *localedir* is as in :func:`bindtextdomain`.  Optional *languages* is a list of
   strings, where each string is a language code.

   If *localedir* is not given, then the default system locale directory is used.
   [#]_  If *languages* is not given, then the following environment variables are
   searched: :envvar:`LANGUAGE`, :envvar:`LC_ALL`, :envvar:`LC_MESSAGES`, and
   :envvar:`LANG`.  The first one returning a non-empty value is used for the
   *languages* variable. The environment variables should contain a colon-separated
   list of languages, which will be split on the colon to produce the expected list
   of language code strings.

   :func:`find` then expands and normalizes the languages, and then iterates
   through them, searching for an existing file built of these components:

   :file:`localedir/language/LC_MESSAGES/domain.mo`

   The first such file name that exists is returned by :func:`find`. If no such
   file is found, then ``None`` is returned. If *all* is given, it returns a list
   of all file names, in the order in which they appear in the languages list or
   the environment variables.
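
The search can be tried out against a temporary directory.  This is only a
sketch: the domain ``myapp`` and the language codes are placeholders, and the
catalog is an empty file, which is enough here on the assumption that
``find`` only tests for the file's existence and does not parse it.

```python
import gettext
import os
import tempfile

with tempfile.TemporaryDirectory() as localedir:
    # Lay out localedir/de/LC_MESSAGES/myapp.mo with an empty placeholder file.
    catalog_dir = os.path.join(localedir, "de", "LC_MESSAGES")
    os.makedirs(catalog_dir)
    catalog_path = os.path.join(catalog_dir, "myapp.mo")
    open(catalog_path, "wb").close()

    # The first existing file for the requested languages is returned.
    found = gettext.find("myapp", localedir, languages=["de"])

    # No catalog exists for French, so the result is None.
    missing = gettext.find("myapp", localedir, languages=["fr"])

    # With all=True a list of every matching file is returned instead.
    every = gettext.find("myapp", localedir, languages=["fr", "de"], all=True)

print(found == catalog_path, missing, len(every))
```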


.. function:: translation(domain[, localedir[, languages[, class_[, fallback[, codeset]]]]])

   Return a :class:`Translations` instance based on the *domain*, *localedir*, and
   *languages*, which are first passed to :func:`find` to get a list of the
   associated :file:`.mo` file paths.  Instances with identical :file:`.mo` file
   names are cached.  The actual class instantiated is *class_* if provided,
   otherwise :class:`GNUTranslations`.  The class's constructor must take a single
   file object argument. If provided, *codeset* will change the charset used to
   encode translated strings.

   If multiple files are found, later files are used as fallbacks for earlier ones.
   To allow setting the fallback, :func:`copy.copy` is used to clone each
   translation object from the cache; the actual instance data is still shared with
   the cache.

   If no :file:`.mo` file is found, this function raises :exc:`IOError` if
   *fallback* is false (which is the default), and returns a
   :class:`NullTranslations` instance if *fallback* is true.
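
The effect of the *fallback* argument can be sketched with an empty locale
directory, so that no :file:`.mo` file can be found.  The domain ``myapp`` and
the language code are placeholders.

```python
import gettext
import tempfile

with tempfile.TemporaryDirectory() as empty_localedir:
    # With fallback=True a NullTranslations instance is returned instead of
    # an exception when no catalog exists.
    null = gettext.translation(
        "myapp", empty_localedir, languages=["fr"], fallback=True)
    is_null = isinstance(null, gettext.NullTranslations)

    # Without fallback, the missing catalog is reported as IOError.
    try:
        gettext.translation("myapp", empty_localedir, languages=["fr"])
        raised = False
    except IOError:
        raised = True

print(is_null, raised)
```

Passing ``fallback=True`` is a common choice for applications that should keep
running with untranslated messages when a catalog is missing.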


.. function:: install(domain[, localedir[, unicode [, codeset[, names]]]])

   This installs the function :func:`_` in Python's builtin namespace, based on
   *domain*, *localedir*, and *codeset* which are passed to the function
   :func:`translation`.  The *unicode* flag is passed to the resulting translation
   object's :meth:`install` method.

   For the *names* parameter, please see the description of the translation
   object's :meth:`install` method.

   As seen below, you usually mark the strings in your application that are
   candidates for translation, by wrapping them in a call to the :func:`_`
   function, like this::

      print(_('This string will be translated.'))

   For convenience, you want the :func:`_` function to be installed in Python's
   builtin namespace, so it is easily accessible in all modules of your
   application.


The :class:`NullTranslations` class
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Translation classes are what actually implement the translation of original
source file message strings to translated message strings. The base class used
by all translation classes is :class:`NullTranslations`; this provides the basic
interface you can use to write your own specialized translation classes.  Here
are the methods of :class:`NullTranslations`:


.. class:: NullTranslations([fp])

   Takes an optional file object *fp*, which is ignored by the base class.
   Initializes "protected" instance variables *_info* and *_charset* which are set
   by derived classes, as well as *_fallback*, which is set through
   :meth:`add_fallback`.  It then calls ``self._parse(fp)`` if *fp* is not
   ``None``.


   .. method:: _parse(fp)

      No-op'd in the base class, this method takes file object *fp*, and reads
      the data from the file, initializing its message catalog.  If you have an
      unsupported message catalog file format, you should override this method
      to parse your format.


   .. method:: add_fallback(fallback)

      Add *fallback* as the fallback object for the current translation
      object. A translation object should consult the fallback if it cannot provide a
      translation for a given message.


   .. method:: gettext(message)

      If a fallback has been set, forward :meth:`gettext` to the
      fallback. Otherwise, return the translated message.  Overridden in derived
      classes.


   .. method:: lgettext(message)

      If a fallback has been set, forward :meth:`lgettext` to the
      fallback. Otherwise, return the translated message.  Overridden in derived
      classes.

   .. method:: ugettext(message)

      If a fallback has been set, forward :meth:`ugettext` to the
      fallback. Otherwise, return the translated message as a string. Overridden
      in derived classes.


   .. method:: ngettext(singular, plural, n)

      If a fallback has been set, forward :meth:`ngettext` to the
      fallback. Otherwise, return the translated message.  Overridden in derived
      classes.

   .. method:: lngettext(singular, plural, n)

      If a fallback has been set, forward :meth:`lngettext` to the
      fallback. Otherwise, return the translated message.  Overridden in derived
      classes.

   .. method:: ungettext(singular, plural, n)

      If a fallback has been set, forward :meth:`ungettext` to the fallback.
      Otherwise, return the translated message as a string. Overridden in
      derived classes.

   .. method:: info()

      Return the "protected" :attr:`_info` variable.


   .. method:: charset()

      Return the "protected" :attr:`_charset` variable.


   .. method:: output_charset()

      Return the "protected" :attr:`_output_charset` variable, which defines the
      encoding used to return translated messages.


   .. method:: set_output_charset(charset)

      Change the "protected" :attr:`_output_charset` variable, which defines the
      encoding used to return translated messages.


   .. method:: install([unicode [, names]])

      If the *unicode* flag is false, this method installs :meth:`self.gettext`
      into the built-in namespace, binding it to ``_``.  If *unicode* is true,
      it binds :meth:`self.ugettext` instead.  By default, *unicode* is false.

      If the *names* parameter is given, it must be a sequence containing the
      names of functions you want to install in the builtin namespace in
      addition to :func:`_`.  Supported names are ``'gettext'`` (bound to
      :meth:`self.gettext` or :meth:`self.ugettext` according to the *unicode*
      flag), ``'ngettext'`` (bound to :meth:`self.ngettext` or
      :meth:`self.ungettext` according to the *unicode* flag), ``'lgettext'``
      and ``'lngettext'``.

      Note that this is only one way, albeit the most convenient way, to make
      the :func:`_` function available to your application.  Because it affects
      the entire application globally, and specifically the built-in namespace,
      localized modules should never install :func:`_`. Instead, they should use
      this code to make :func:`_` available to their module::

         import gettext
         t = gettext.translation('mymodule', ...)
         _ = t.gettext

      This puts :func:`_` only in the module's global namespace and so only
      affects calls within this module.
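
To illustrate how ``_parse`` and ``add_fallback`` fit together, here is a
minimal sketch of a derived class for a made-up catalog format.  The class name
``KeyValueTranslations`` and the ``msgid=translation`` line format are
assumptions invented for this example; they are not part of the
:mod:`gettext` module.

```python
import gettext
import io


class KeyValueTranslations(gettext.NullTranslations):
    """Hypothetical catalog format: one ``msgid=translation`` pair per line."""

    def __init__(self, fp=None):
        self._catalog = {}
        # The base constructor calls self._parse(fp) when fp is not None.
        super().__init__(fp)

    def _parse(self, fp):
        for line in fp:
            msgid, sep, text = line.rstrip("\n").partition("=")
            if sep:
                self._catalog[msgid] = text

    def gettext(self, message):
        if message in self._catalog:
            return self._catalog[message]
        # The base class forwards to the fallback, if one has been added.
        return super().gettext(message)


primary = KeyValueTranslations(io.StringIO("hello=bonjour\n"))
secondary = KeyValueTranslations(io.StringIO("goodbye=au revoir\n"))
primary.add_fallback(secondary)

print(primary.gettext("hello"))
print(primary.gettext("goodbye"))
print(primary.gettext("unknown"))
```

A message missing from ``primary`` is looked up in ``secondary``; a message
missing from both comes back unchanged.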


The :class:`GNUTranslations` class
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

The :mod:`gettext` module provides one additional class derived from
:class:`NullTranslations`: :class:`GNUTranslations`.  This class overrides
:meth:`_parse` to enable reading GNU :program:`gettext` format :file:`.mo` files
in both big-endian and little-endian format. It also coerces both message ids
and message strings to Unicode.

:class:`GNUTranslations` parses optional meta-data out of the translation
catalog.  It is a convention with GNU :program:`gettext` to include meta-data as
the translation for the empty string.  This meta-data is in :rfc:`822`\ -style
``key: value`` pairs, and should contain the ``Project-Id-Version`` key.  If the
key ``Content-Type`` is found, then the ``charset`` property is used to
initialize the "protected" :attr:`_charset` instance variable, defaulting to
``None`` if not found.  If the charset encoding is specified, then all message
ids and message strings read from the catalog are converted to Unicode using
this encoding.  The :meth:`ugettext` method always returns a Unicode string,
while the :meth:`gettext` method returns an encoded bytestring.  For the message
id arguments of both methods, either Unicode strings or bytestrings containing
only US-ASCII characters are acceptable.  Note that the Unicode versions of the
methods (i.e. :meth:`ugettext` and :meth:`ungettext`) are the recommended
interface to use for internationalized Python programs.

The entire set of key/value pairs is placed into a dictionary and set as the
"protected" :attr:`_info` instance variable.

If the :file:`.mo` file's magic number is invalid, or if other problems occur
while reading the file, instantiating a :class:`GNUTranslations` class can raise
:exc:`IOError`.
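
The meta-data handling and the magic number check can be explored without any
files on disk.  The sketch below packs a tiny little-endian catalog in memory;
the helper ``build_mo`` is a hypothetical function written for this example
(it is not part of :mod:`gettext`), and the lookup of the lower-cased key
``project-id-version`` reflects an assumption about how the implementation
stores the parsed keys.

```python
import gettext
import io
import struct


def build_mo(messages):
    """Pack a dict of msgid -> msgstr into little-endian GNU .mo bytes."""
    keys = sorted(messages)
    ids = b""
    strs = b""
    spans = []
    for key in keys:
        raw_id = key.encode("utf-8")
        raw_str = messages[key].encode("utf-8")
        spans.append((len(raw_id), len(ids), len(raw_str), len(strs)))
        ids += raw_id + b"\0"
        strs += raw_str + b"\0"
    count = len(keys)
    ids_start = 7 * 4 + 16 * count  # 7-word header plus both index tables
    strs_start = ids_start + len(ids)
    id_table = []
    str_table = []
    for id_len, id_off, str_len, str_off in spans:
        id_table += [id_len, ids_start + id_off]
        str_table += [str_len, strs_start + str_off]
    header = struct.pack(
        "<7I", 0x950412DE, 0, count, 7 * 4, 7 * 4 + 8 * count, 0, 0)
    tables = struct.pack("<%dI" % (4 * count), *(id_table + str_table))
    return header + tables + ids + strs


# The meta-data is stored as the translation of the empty string.
metadata = (
    "Project-Id-Version: demo 1.0\n"
    "Content-Type: text/plain; charset=UTF-8\n"
)
data = build_mo({"": metadata, "hello": "bonjour"})
catalog = gettext.GNUTranslations(io.BytesIO(data))

print(catalog.charset())
print(catalog.info().get("project-id-version"))

# Data that does not start with a valid magic number is rejected.
try:
    gettext.GNUTranslations(io.BytesIO(b"this is not a .mo file"))
    bad_magic_raised = False
except IOError:
    bad_magic_raised = True
```

In practice catalogs are produced by :program:`msgfmt.py` rather than packed
by hand; the helper exists only to keep the example self-contained.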

The following methods are overridden from the base class implementation:


.. method:: GNUTranslations.gettext(message)

   Look up the *message* id in the catalog and return the corresponding message
   string, as a bytestring encoded with the catalog's charset encoding, if
   known.  If there is no entry in the catalog for the *message* id, and a fallback
   has been set, the look up is forwarded to the fallback's :meth:`gettext` method.
   Otherwise, the *message* id is returned.


.. method:: GNUTranslations.lgettext(message)

   Equivalent to :meth:`gettext`, but the translation is returned in the preferred
   system encoding, if no other encoding was explicitly set with
   :meth:`set_output_charset`.


.. method:: GNUTranslations.ugettext(message)

   Look up the *message* id in the catalog and return the corresponding message
   string, as a string.  If there is no entry in the catalog for the
   *message* id, and a fallback has been set, the look up is forwarded to the
   fallback's :meth:`ugettext` method.  Otherwise, the *message* id is returned.


.. method:: GNUTranslations.ngettext(singular, plural, n)

   Do a plural-forms lookup of a message id.  *singular* is used as the message id
   for purposes of lookup in the catalog, while *n* is used to determine which
   plural form to use.  The returned message string is a bytestring encoded with
   the catalog's charset encoding, if known.

   If the message id is not found in the catalog, and a fallback is specified, the
   request is forwarded to the fallback's :meth:`ngettext` method.  Otherwise, when
   *n* is 1 *singular* is returned, and *plural* is returned in all other cases.


.. method:: GNUTranslations.lngettext(singular, plural, n)

   Equivalent to :meth:`ngettext`, but the translation is returned in the preferred
   system encoding, if no other encoding was explicitly set with
   :meth:`set_output_charset`.


.. method:: GNUTranslations.ungettext(singular, plural, n)

   Do a plural-forms lookup of a message id.  *singular* is used as the message id
   for purposes of lookup in the catalog, while *n* is used to determine which
   plural form to use.  The returned message string is a string.

   If the message id is not found in the catalog, and a fallback is specified, the
   request is forwarded to the fallback's :meth:`ungettext` method.  Otherwise,
   when *n* is 1, *singular* is returned, and *plural* is returned in all other
   cases.

   Here is an example::

      import os
      from gettext import GNUTranslations

      n = len(os.listdir('.'))
      # somefile is a file object opened on a binary .mo catalog
      cat = GNUTranslations(somefile)
      message = cat.ungettext(
          'There is %(num)d file in this directory',
          'There are %(num)d files in this directory',
          n) % {'num': n}


Solaris message catalog support
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

The Solaris operating system defines its own binary :file:`.mo` file format, but
since no documentation can be found on this format, it is not supported at this
time.


The Catalog constructor
^^^^^^^^^^^^^^^^^^^^^^^

.. index:: single: GNOME

GNOME uses a version of the :mod:`gettext` module by James Henstridge, but this
version has a slightly different API.  Its documented usage was::

   import gettext
   cat = gettext.Catalog(domain, localedir)
   _ = cat.gettext
   print(_('hello world'))

For compatibility with this older module, the function :func:`Catalog` is an
alias for the :func:`translation` function described above.

One difference between this module and Henstridge's: his catalog objects
supported access through a mapping API, but this appears to be unused and so is
not currently supported.


Internationalizing your programs and modules
--------------------------------------------

Internationalization (I18N) refers to the operation by which a program is made
aware of multiple languages.  Localization (L10N) refers to the adaptation of
your program, once internationalized, to the local language and cultural habits.
In order to provide multilingual messages for your Python programs, you need to
take the following steps:

#. prepare your program or module by specially marking translatable strings

#. run a suite of tools over your marked files to generate raw message catalogs

#. create language-specific translations of the message catalogs

#. use the :mod:`gettext` module so that message strings are properly translated

In order to prepare your code for I18N, you need to look at all the strings in
your files.  Any string that needs to be translated should be marked by wrapping
it in ``_('...')`` --- that is, a call to the function :func:`_`.  For example::

   filename = 'mylog.txt'
   message = _('writing a log message')
   fp = open(filename, 'w')
   fp.write(message)
   fp.close()

In this example, the string ``'writing a log message'`` is marked as a candidate
for translation, while the strings ``'mylog.txt'`` and ``'w'`` are not.

The Python distribution comes with two tools which help you generate the message
catalogs once you've prepared your source code.  These may or may not be
available from a binary distribution, but they can be found in a source
distribution, in the :file:`Tools/i18n` directory.

The :program:`pygettext` [#]_ program scans all your Python source code looking
for the strings you previously marked as translatable.  It is similar to the GNU
:program:`gettext` program except that it understands all the intricacies of
Python source code, but knows nothing about C or C++ source code.  You don't
need GNU ``gettext`` unless you're also going to be translating C code (such as
C extension modules).

 | 502 | :program:`pygettext` generates textual, Uniforum-style message catalog | 
 | 503 | :file:`.pot` files: structured, human-readable files which | 
 | 504 | contain every marked string in the source code, along with a placeholder for the | 
 | 505 | translation strings. :program:`pygettext` is a command line script whose | 
 | 506 | command line interface is similar to that of :program:`xgettext`; for details on its use, | 
 | 507 | run:: | 
 | 508 |  | 
 | 509 |    pygettext.py --help | 
 | 510 |  | 
 | 511 | Copies of these :file:`.pot` files are then handed over to the individual human | 
 | 512 | translators who write language-specific versions for every supported natural | 
 | 513 | language.  They send you back the filled-in language-specific versions as | 
 | 514 | :file:`.po` files.  Using the :program:`msgfmt.py` [#]_ program (in the | 
 | 515 | :file:`Tools/i18n` directory), you take the :file:`.po` files from your | 
 | 516 | translators and generate the machine-readable :file:`.mo` binary catalog files. | 
 | 517 | The :file:`.mo` files are what the :mod:`gettext` module uses for the actual | 
 | 518 | translation processing during run-time. | 
 | 519 |  | 
 | 520 | How you use the :mod:`gettext` module in your code depends on whether you are | 
 | 521 | internationalizing a single module or your entire application. The next two | 
 | 522 | sections will discuss each case. | 
 | 523 |  | 
 | 524 |  | 
 | 525 | Localizing your module | 
 | 526 | ^^^^^^^^^^^^^^^^^^^^^^ | 
 | 527 |  | 
 | 528 | If you are localizing your module, you must take care not to make global | 
 | 529 | changes, e.g. to the built-in namespace.  You should not use the GNU ``gettext`` | 
 | 530 | API but instead the class-based API. | 
 | 531 |  | 
 | 532 | Let's say your module is called "spam" and the module's various natural language | 
 | 533 | translation :file:`.mo` files reside in :file:`/usr/share/locale` in GNU | 
 | 534 | :program:`gettext` format.  Here's what you would put at the top of your | 
 | 535 | module:: | 
 | 536 |  | 
 | 537 |    import gettext | 
 | 538 |    t = gettext.translation('spam', '/usr/share/locale') | 
 | 539 |    _ = t.lgettext | 
 | 540 |  | 
 | 541 | If your translators were providing you with Unicode strings in their :file:`.po` | 
 | 542 | files, you'd instead do:: | 
 | 543 |  | 
 | 544 |    import gettext | 
 | 545 |    t = gettext.translation('spam', '/usr/share/locale') | 
 | 546 |    _ = t.ugettext | 
 | 547 |  | 
 | 548 |  | 
 | 549 | Localizing your application | 
 | 550 | ^^^^^^^^^^^^^^^^^^^^^^^^^^^ | 
 | 551 |  | 
 | 552 | If you are localizing your application, you can install the :func:`_` function | 
 | 553 | globally into the built-in namespace, usually in the main driver file of your | 
 | 554 | application.  This will let all your application-specific files just use | 
 | 555 | ``_('...')`` without having to explicitly install it in each file. | 
 | 556 |  | 
 | 557 | In the simple case, then, you need only add the following bit of code to the main | 
 | 558 | driver file of your application:: | 
 | 559 |  | 
 | 560 |    import gettext | 
 | 561 |    gettext.install('myapplication') | 
 | 562 |  | 
 | 563 | If you need to set the locale directory or the *unicode* flag, you can pass | 
 | 564 | these into the :func:`install` function:: | 
 | 565 |  | 
 | 566 |    import gettext | 
 | 567 |    gettext.install('myapplication', '/usr/share/locale', unicode=1) | 
 | 568 |  | 
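The effect of :func:`install` can be checked with a short sketch, assuming a
current Python 3 release and that no ``myapplication`` catalog is present on
the machine (in which case the installed function returns its argument
unchanged):

```python
import builtins
import gettext

# install() looks up the catalog with fallback enabled, so a missing
# 'myapplication' catalog does not raise; messages pass through as-is.
gettext.install('myapplication')

# _ now lives in the built-in namespace and is visible from every module.
greeting = _('Hello')
print(greeting)
print(hasattr(builtins, '_'))
```

Any module imported after this point can call ``_('...')`` without importing
:mod:`gettext` itself.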
 | 569 |  | 
 | 570 | Changing languages on the fly | 
 | 571 | ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ | 
 | 572 |  | 
 | 573 | If your program needs to support many languages at the same time, you may want | 
 | 574 | to create multiple translation instances and then switch between them | 
 | 575 | explicitly, like so:: | 
 | 576 |  | 
 | 577 |    import gettext | 
 | 578 |  | 
 | 579 |    lang1 = gettext.translation('myapplication', languages=['en']) | 
 | 580 |    lang2 = gettext.translation('myapplication', languages=['fr']) | 
 | 581 |    lang3 = gettext.translation('myapplication', languages=['de']) | 
 | 582 |  | 
 | 583 |    # start by using language1 | 
 | 584 |    lang1.install() | 
 | 585 |  | 
 | 586 |    # ... time goes by, user selects language 2 | 
 | 587 |    lang2.install() | 
 | 588 |  | 
 | 589 |    # ... more time goes by, user selects language 3 | 
 | 590 |    lang3.install() | 
 | 591 |  | 
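To see the switch happen without real :file:`.mo` files, the sketch below
substitutes a small in-memory catalog.  ``DictTranslations`` is a hypothetical
helper written for this illustration, not part of the :mod:`gettext` module;
it only relies on :class:`NullTranslations` providing :meth:`install`.

```python
import gettext


class DictTranslations(gettext.NullTranslations):
    """Hypothetical in-memory catalog standing in for a real .mo file."""

    def __init__(self, messages):
        super().__init__()
        self._messages = messages

    def gettext(self, message):
        # Fall back to the original message when no translation is known.
        return self._messages.get(message, message)


lang_en = DictTranslations({})
lang_fr = DictTranslations({'Hello': 'Bonjour'})
lang_de = DictTranslations({'Hello': 'Hallo'})

seen = []
for lang in (lang_en, lang_fr, lang_de):
    lang.install()            # rebinds the built-in _ to lang.gettext
    seen.append(_('Hello'))

print(seen)
```

Each call to :meth:`install` replaces the previous binding of ``_``, so only
one language is active at a time.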
 | 592 |  | 
 | 593 | Deferred translations | 
 | 594 | ^^^^^^^^^^^^^^^^^^^^^ | 
 | 595 |  | 
 | 596 | In most coding situations, strings are translated where they are coded. | 
 | 597 | Occasionally, however, you need to mark strings for translation, but defer actual | 
 | 598 | translation until later.  A classic example is:: | 
 | 599 |  | 
 | 600 |    animals = ['mollusk', | 
 | 601 |               'albatross', | 
 | 602 |               'rat', | 
 | 603 |               'penguin', | 
 | 604 |               'python', | 
 | 605 |               ] | 
 | 606 |    # ... | 
 | 607 |    for a in animals: | 
 | 608 |        print(a) | 
 | 609 |  | 
 | 610 | Here, you want to mark the strings in the ``animals`` list as being | 
 | 611 | translatable, but you don't actually want to translate them until they are | 
 | 612 | printed. | 
 | 613 |  | 
 | 614 | Here is one way you can handle this situation:: | 
 | 615 |  | 
 | 616 |    def _(message): return message | 
 | 617 |  | 
 | 618 |    animals = [_('mollusk'), | 
 | 619 |               _('albatross'), | 
 | 620 |               _('rat'), | 
 | 621 |               _('penguin'), | 
 | 622 |               _('python'), | 
 | 623 |               ] | 
 | 624 |  | 
 | 625 |    del _ | 
 | 626 |  | 
 | 627 |    # ... | 
 | 628 |    for a in animals: | 
 | 629 |        print(_(a)) | 
 | 630 |  | 
 | 631 | This works because the dummy definition of :func:`_` simply returns the string | 
 | 632 | unchanged.  And this dummy definition will temporarily override any definition | 
 | 633 | of :func:`_` in the built-in namespace (until the :keyword:`del` command). Take | 
 | 634 | care, though, if you have a previous definition of :func:`_` in the local | 
 | 635 | namespace. | 
 | 636 |  | 
 | 637 | Note that the second use of :func:`_` will not identify "a" as being | 
 | 638 | translatable to the :program:`pygettext` program, since it is a variable rather than a string literal. | 
 | 639 |  | 
 | 640 | Another way to handle this is with the following example:: | 
 | 641 |  | 
 | 642 |    def N_(message): return message | 
 | 643 |  | 
 | 644 |    animals = [N_('mollusk'), | 
 | 645 |               N_('albatross'), | 
 | 646 |               N_('rat'), | 
 | 647 |               N_('penguin'), | 
 | 648 |               N_('python'), | 
 | 649 |               ] | 
 | 650 |  | 
 | 651 |    # ... | 
 | 652 |    for a in animals: | 
 | 653 |        print(_(a)) | 
 | 654 |  | 
 | 655 | In this case, you are marking translatable strings with the function :func:`N_`, | 
 | 656 | [#]_ which won't conflict with any definition of :func:`_`.  However, you will | 
 | 657 | need to teach your message extraction program to look for translatable strings | 
 | 658 | marked with :func:`N_`. :program:`pygettext` and :program:`xpot` both support | 
 | 659 | this through the use of command line switches. | 
 | 660 |  | 
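The marker approach can be exercised end to end with a short sketch.  The
``catalog`` dictionary and the local definition of ``_`` are assumptions made
for illustration; in a real program ``_`` would come from a translation object
or from :func:`install`.

```python
def N_(message):
    # Marker only: returns the string unchanged so it can be translated later.
    return message


animals = [N_('mollusk'), N_('albatross'), N_('rat')]

# Hypothetical stand-in for a real message catalog.
catalog = {'mollusk': 'mollusque', 'albatross': 'albatros'}


def _(message):
    # Translate at the point of use; unknown messages pass through unchanged.
    return catalog.get(message, message)


translated = [_(a) for a in animals]
print(translated)
```

The list itself keeps the original strings, so the same data can be translated
again if the active language changes later.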
 | 661 |  | 
 | 662 | :func:`gettext` vs. :func:`lgettext` | 
 | 663 | ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ | 
 | 664 |  | 
 | 665 | In Python 2.4 the :func:`lgettext` family of functions was introduced. The | 
 | 666 | intention of these functions is to provide an alternative which is more | 
 | 667 | compliant with the current implementation of GNU gettext. Unlike | 
 | 668 | :func:`gettext`, which returns strings encoded with the same codeset used in the | 
 | 669 | translation file, :func:`lgettext` will return strings encoded with the | 
 | 670 | preferred system encoding, as returned by :func:`locale.getpreferredencoding`. | 
 | 671 | Also notice that Python 2.4 introduces new functions to explicitly choose the | 
 | 672 | codeset used in translated strings. If a codeset is explicitly set, even | 
 | 673 | :func:`lgettext` will return translated strings in the requested codeset, as | 
 | 674 | would be expected in the GNU gettext implementation. | 
 | 675 |  | 
 | 676 |  | 
 | 677 | Acknowledgements | 
 | 678 | ---------------- | 
 | 679 |  | 
 | 680 | The following people contributed code, feedback, design suggestions, previous | 
 | 681 | implementations, and valuable experience to the creation of this module: | 
 | 682 |  | 
 | 683 | * Peter Funk | 
 | 684 |  | 
 | 685 | * James Henstridge | 
 | 686 |  | 
 | 687 | * Juan David Ibáñez Palomar | 
 | 688 |  | 
 | 689 | * Marc-André Lemburg | 
 | 690 |  | 
 | 691 | * Martin von Löwis | 
 | 692 |  | 
 | 693 | * François Pinard | 
 | 694 |  | 
 | 695 | * Barry Warsaw | 
 | 696 |  | 
 | 697 | * Gustavo Niemeyer | 
 | 698 |  | 
 | 699 | .. rubric:: Footnotes | 
 | 700 |  | 
 | 701 | .. [#] The default locale directory is system dependent; for example, on Red Hat Linux | 
 | 702 |    it is :file:`/usr/share/locale`, but on Solaris it is :file:`/usr/lib/locale`. | 
 | 703 |    The :mod:`gettext` module does not try to support these system dependent | 
 | 704 |    defaults; instead its default is :file:`sys.prefix/share/locale`. For this | 
 | 705 |    reason, it is always best to call :func:`bindtextdomain` with an explicit | 
 | 706 |    absolute path at the start of your application. | 
 | 707 |  | 
 | 708 | .. [#] See the footnote for :func:`bindtextdomain` above. | 
 | 709 |  | 
 | 710 | .. [#] François Pinard has written a program called :program:`xpot` which does a | 
 | 711 |    similar job.  It is available as part of his :program:`po-utils` package at | 
 | 712 |    http://po-utils.progiciels-bpi.ca/. | 
 | 713 |  | 
 | 714 | .. [#] :program:`msgfmt.py` is binary compatible with GNU :program:`msgfmt` except that | 
 | 715 |    it provides a simpler, all-Python implementation.  With this and | 
 | 716 |    :program:`pygettext.py`, you generally won't need to install the GNU | 
 | 717 |    :program:`gettext` package to internationalize your Python applications. | 
 | 718 |  | 
 | 719 | .. [#] The choice of :func:`N_` here is totally arbitrary; it could have just as easily | 
 | 720 |    been :func:`MarkThisStringForTranslation`. | 
 | 721 |  |