:mod:`gettext` --- Multilingual internationalization services
=============================================================

.. module:: gettext
   :synopsis: Multilingual internationalization services.
.. moduleauthor:: Barry A. Warsaw <barry@zope.com>
.. sectionauthor:: Barry A. Warsaw <barry@zope.com>

**Source code:** :source:`Lib/gettext.py`

--------------

The :mod:`gettext` module provides internationalization (I18N) and localization
(L10N) services for your Python modules and applications. It supports both the
GNU ``gettext`` message catalog API and a higher level, class-based API that may
be more appropriate for Python files. The interface described below allows you
to write your module and application messages in one natural language, and
provide a catalog of translated messages for running under different natural
languages.

Some hints on localizing your Python modules and applications are also given.


GNU :program:`gettext` API
--------------------------

The :mod:`gettext` module defines the following API, which is very similar to
the GNU :program:`gettext` API. If you use this API you will affect the
translation of your entire application globally. Often this is what you want if
your application is monolingual, with the choice of language dependent on the
locale of your user. If you are localizing a Python module, or if your
application needs to switch languages on the fly, you probably want to use the
class-based API instead.


.. function:: bindtextdomain(domain[, localedir])

   Bind the *domain* to the locale directory *localedir*. More concretely,
   :mod:`gettext` will look for binary :file:`.mo` files for the given domain using
   the path (on Unix): :file:`localedir/language/LC_MESSAGES/domain.mo`, where
   *language* is searched for in the environment variables :envvar:`LANGUAGE`,
   :envvar:`LC_ALL`, :envvar:`LC_MESSAGES`, and :envvar:`LANG` respectively.

   If *localedir* is omitted or ``None``, then the current binding for *domain* is
   returned. [#]_


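As a minimal sketch of the set-then-query behaviour of :func:`bindtextdomain`,
the snippet below binds a domain and then reads the binding back. The domain
name and the directory are placeholders chosen for illustration, not real paths.

```python
import gettext

# Hypothetical domain name and locale directory, used only for illustration.
gettext.bindtextdomain('myapplication', '/path/to/my/language/directory')

# Calling bindtextdomain() without *localedir* returns the current binding.
current = gettext.bindtextdomain('myapplication')
print(current)
```
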
.. function:: bind_textdomain_codeset(domain[, codeset])

   Bind the *domain* to *codeset*, changing the encoding of strings returned by the
   :func:`gettext` family of functions. If *codeset* is omitted, then the current
   binding is returned.

   .. versionadded:: 2.4


.. function:: textdomain([domain])

   Change or query the current global domain. If *domain* is ``None``, then the
   current global domain is returned, otherwise the global domain is set to
   *domain*, which is returned.


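The query and set forms of :func:`textdomain` can be sketched as follows. The
domain name ``myapplication`` is a placeholder.

```python
import gettext

# With no argument, textdomain() only reports the current global domain.
previous = gettext.textdomain()

# With an argument, it sets the global domain and returns the new value.
selected = gettext.textdomain('myapplication')
print(selected)
```
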
.. function:: gettext(message)

   Return the localized translation of *message*, based on the current global
   domain, language, and locale directory. This function is usually aliased as
   :func:`_` in the local namespace (see examples below).


.. function:: lgettext(message)

   Equivalent to :func:`gettext`, but the translation is returned in the preferred
   system encoding, if no other encoding was explicitly set with
   :func:`bind_textdomain_codeset`.

   .. versionadded:: 2.4


.. function:: dgettext(domain, message)

   Like :func:`gettext`, but look the message up in the specified *domain*.


.. function:: ldgettext(domain, message)

   Equivalent to :func:`dgettext`, but the translation is returned in the preferred
   system encoding, if no other encoding was explicitly set with
   :func:`bind_textdomain_codeset`.

   .. versionadded:: 2.4


.. function:: ngettext(singular, plural, n)

   Like :func:`gettext`, but consider plural forms. If a translation is found,
   apply the plural formula to *n*, and return the resulting message (some
   languages have more than two plural forms). If no translation is found, return
   *singular* if *n* is 1; return *plural* otherwise.

   The plural formula is taken from the catalog header. It is a C or Python
   expression that has a free variable *n*; the expression evaluates to the index
   of the plural in the catalog. See the GNU gettext documentation for the precise
   syntax to be used in :file:`.po` files and the formulas for a variety of
   languages.

   .. versionadded:: 2.3


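To make the idea of a plural formula concrete, here is a small sketch that
writes two formulas as plain Python functions and then shows the untranslated
fallback. The function names and the domain ``no-such-domain-example`` are
made up for illustration; the Polish rule follows the formula listed in the GNU
gettext manual.

```python
import gettext


def plural_index_english(n):
    # Two forms: index 0 for exactly one item, index 1 otherwise.
    return int(n != 1)


def plural_index_polish(n):
    # Three forms, transcribed from the C expression in the GNU manual:
    # n==1 ? 0 : n%10>=2 && n%10<=4 && (n%100<10 || n%100>=20) ? 1 : 2
    if n == 1:
        return 0
    if 2 <= n % 10 <= 4 and (n % 100 < 10 or n % 100 >= 20):
        return 1
    return 2


# With no catalog for the (hypothetical) domain, the plural lookup falls
# back to *singular* when n is 1 and to *plural* otherwise.
one = gettext.dngettext('no-such-domain-example', '%d file', '%d files', 1)
many = gettext.dngettext('no-such-domain-example', '%d file', '%d files', 3)
print(one % 1)
print(many % 3)
```
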
.. function:: lngettext(singular, plural, n)

   Equivalent to :func:`ngettext`, but the translation is returned in the preferred
   system encoding, if no other encoding was explicitly set with
   :func:`bind_textdomain_codeset`.

   .. versionadded:: 2.4


.. function:: dngettext(domain, singular, plural, n)

   Like :func:`ngettext`, but look the message up in the specified *domain*.

   .. versionadded:: 2.3


.. function:: ldngettext(domain, singular, plural, n)

   Equivalent to :func:`dngettext`, but the translation is returned in the
   preferred system encoding, if no other encoding was explicitly set with
   :func:`bind_textdomain_codeset`.

   .. versionadded:: 2.4

Note that GNU :program:`gettext` also defines a :func:`dcgettext` function, but
this was deemed not useful and so it is currently unimplemented.

Here's an example of typical usage for this API::

   import gettext
   gettext.bindtextdomain('myapplication', '/path/to/my/language/directory')
   gettext.textdomain('myapplication')
   _ = gettext.gettext
   # ...
   print _('This is a translatable string.')


Class-based API
---------------

The class-based API of the :mod:`gettext` module gives you more flexibility and
greater convenience than the GNU :program:`gettext` API. It is the recommended
way of localizing your Python applications and modules. :mod:`gettext` defines
a "translations" class which implements the parsing of GNU :file:`.mo` format
files, and has methods for returning either standard 8-bit strings or Unicode
strings. Instances of this "translations" class can also install themselves in
the built-in namespace as the function :func:`_`.


.. function:: find(domain[, localedir[, languages[, all]]])

   This function implements the standard :file:`.mo` file search algorithm. It
   takes a *domain*, identical to what :func:`textdomain` takes. Optional
   *localedir* is as in :func:`bindtextdomain`. Optional *languages* is a list of
   strings, where each string is a language code.

   If *localedir* is not given, then the default system locale directory is used.
   [#]_ If *languages* is not given, then the following environment variables are
   searched: :envvar:`LANGUAGE`, :envvar:`LC_ALL`, :envvar:`LC_MESSAGES`, and
   :envvar:`LANG`. The first one returning a non-empty value is used for the
   *languages* variable. The environment variables should contain a colon-separated
   list of languages, which will be split on the colon to produce the expected list
   of language code strings.

   :func:`find` then expands and normalizes the languages, and then iterates
   through them, searching for an existing file built of these components:

   :file:`localedir/language/LC_MESSAGES/domain.mo`

   The first such file name that exists is returned by :func:`find`. If no such
   file is found, then ``None`` is returned. If *all* is given, it returns a list
   of all file names, in the order in which they appear in the languages list or
   the environment variables.


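The search performed by :func:`find` can be tried out against a throwaway
directory tree. This is only a sketch: the domain ``spam`` and the temporary
directory are placeholders, and the :file:`.mo` file is an empty stand-in,
which is sufficient here because the search only looks for the file's presence.

```python
import gettext
import os
import shutil
import tempfile

# Build <localedir>/de/LC_MESSAGES/spam.mo inside a temporary directory.
localedir = tempfile.mkdtemp()
catalog_dir = os.path.join(localedir, 'de', 'LC_MESSAGES')
os.makedirs(catalog_dir)
mo_path = os.path.join(catalog_dir, 'spam.mo')
open(mo_path, 'wb').close()  # empty placeholder file

# First match for the requested languages, or None when nothing matches.
found = gettext.find('spam', localedir, languages=['de'])
missing = gettext.find('spam', localedir, languages=['fr'])

# With all=True a list of every matching file is returned instead.
every = gettext.find('spam', localedir, languages=['fr', 'de'], all=True)

shutil.rmtree(localedir)
```
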
.. function:: translation(domain[, localedir[, languages[, class_[, fallback[, codeset]]]]])

   Return a :class:`Translations` instance based on the *domain*, *localedir*, and
   *languages*, which are first passed to :func:`find` to get a list of the
   associated :file:`.mo` file paths. Instances with identical :file:`.mo` file
   names are cached. The actual class instantiated is *class_* if provided,
   otherwise :class:`GNUTranslations`. The class's constructor must take a single
   file object argument. If provided, *codeset* will change the charset used to
   encode translated strings.

   If multiple files are found, later files are used as fallbacks for earlier ones.
   To allow setting the fallback, :func:`copy.copy` is used to clone each
   translation object from the cache; the actual instance data is still shared with
   the cache.

   If no :file:`.mo` file is found, this function raises :exc:`IOError` if
   *fallback* is false (which is the default), and returns a
   :class:`NullTranslations` instance if *fallback* is true.

   .. versionchanged:: 2.4
      Added the *codeset* parameter.


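The effect of the *fallback* argument can be sketched with a domain that has no
catalog at all. Both the domain name and the directory below are hypothetical.

```python
import gettext

# Without fallback, a missing catalog is reported as an IOError.
try:
    gettext.translation('no-such-domain-example', '/nonexistent/locale')
    raised = False
except IOError:
    raised = True

# With fallback=True, a NullTranslations instance is returned instead,
# which hands messages back untranslated.
t = gettext.translation('no-such-domain-example', '/nonexistent/locale',
                        fallback=True)
message = t.gettext('Hello')
```

Passing ``fallback=True`` is a convenient way to keep a program usable while
its catalogs are not yet installed.
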
.. function:: install(domain[, localedir[, unicode [, codeset[, names]]]])

   This installs the function :func:`_` in Python's builtins namespace, based on
   *domain*, *localedir*, and *codeset* which are passed to the function
   :func:`translation`. The *unicode* flag is passed to the resulting translation
   object's :meth:`~NullTranslations.install` method.

   For the *names* parameter, please see the description of the translation
   object's :meth:`~NullTranslations.install` method.

   As seen below, you usually mark the strings in your application that are
   candidates for translation by wrapping them in a call to the :func:`_`
   function, like this::

      print _('This string will be translated.')

   For convenience, you want the :func:`_` function to be installed in Python's
   builtins namespace, so it is easily accessible in all modules of your
   application.

   .. versionchanged:: 2.4
      Added the *codeset* parameter.

   .. versionchanged:: 2.5
      Added the *names* parameter.


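A short sketch of :func:`install` in use follows. The domain
``example-app-domain`` is a placeholder with no catalog behind it; the example
relies on the stock implementation requesting a fallback translation object in
that case, so :func:`_` simply returns its argument.

```python
import gettext

# Hypothetical domain; no catalog is expected to exist for it.
gettext.install('example-app-domain')

# _ is now available from the built-in namespace without an import.
greeting = _('Hello')
print(greeting)
```
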
The :class:`NullTranslations` class
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Translation classes are what actually implement the translation of original
source file message strings to translated message strings. The base class used
by all translation classes is :class:`NullTranslations`; this provides the basic
interface you can use to write your own specialized translation classes. Here
are the methods of :class:`NullTranslations`:


.. class:: NullTranslations([fp])

   Takes an optional file object *fp*, which is ignored by the base class.
   Initializes "protected" instance variables *_info* and *_charset* which are set
   by derived classes, as well as *_fallback*, which is set through
   :meth:`add_fallback`. It then calls ``self._parse(fp)`` if *fp* is not
   ``None``.


   .. method:: _parse(fp)

      No-op'd in the base class, this method takes file object *fp*, and reads
      the data from the file, initializing its message catalog. If you have an
      unsupported message catalog file format, you should override this method
      to parse your format.


   .. method:: add_fallback(fallback)

      Add *fallback* as the fallback object for the current translation
      object. A translation object should consult the fallback if it cannot provide a
      translation for a given message.


   .. method:: gettext(message)

      If a fallback has been set, forward :meth:`gettext` to the
      fallback. Otherwise, return *message* unchanged. Overridden in derived
      classes.


   .. method:: lgettext(message)

      If a fallback has been set, forward :meth:`lgettext` to the
      fallback. Otherwise, return *message* unchanged. Overridden in derived
      classes.

      .. versionadded:: 2.4


   .. method:: ugettext(message)

      If a fallback has been set, forward :meth:`ugettext` to the
      fallback. Otherwise, return *message* as a Unicode
      string. Overridden in derived classes.


   .. method:: ngettext(singular, plural, n)

      If a fallback has been set, forward :meth:`ngettext` to the
      fallback. Otherwise, return *singular* if *n* is 1; return *plural*
      otherwise. Overridden in derived classes.

      .. versionadded:: 2.3


   .. method:: lngettext(singular, plural, n)

      If a fallback has been set, forward :meth:`lngettext` to the
      fallback. Otherwise, return *singular* if *n* is 1; return *plural*
      otherwise. Overridden in derived classes.

      .. versionadded:: 2.4


   .. method:: ungettext(singular, plural, n)

      If a fallback has been set, forward :meth:`ungettext` to the fallback.
      Otherwise, return *singular* if *n* is 1 and *plural* otherwise, as a
      Unicode string. Overridden in derived classes.

      .. versionadded:: 2.3


   .. method:: info()

      Return the "protected" :attr:`_info` variable.


   .. method:: charset()

      Return the "protected" :attr:`_charset` variable.


   .. method:: output_charset()

      Return the "protected" :attr:`_output_charset` variable, which defines the
      encoding used to return translated messages.

      .. versionadded:: 2.4


   .. method:: set_output_charset(charset)

      Change the "protected" :attr:`_output_charset` variable, which defines the
      encoding used to return translated messages.

      .. versionadded:: 2.4


   .. method:: install([unicode [, names]])

      If the *unicode* flag is false, this method installs :meth:`self.gettext`
      into the built-in namespace, binding it to ``_``. If *unicode* is true,
      it binds :meth:`self.ugettext` instead. By default, *unicode* is false.

      If the *names* parameter is given, it must be a sequence containing the
      names of functions you want to install in the builtins namespace in
      addition to :func:`_`. Supported names are ``'gettext'`` (bound to
      :meth:`self.gettext` or :meth:`self.ugettext` according to the *unicode*
      flag), ``'ngettext'`` (bound to :meth:`self.ngettext` or
      :meth:`self.ungettext` according to the *unicode* flag), ``'lgettext'``
      and ``'lngettext'``.

      Note that this is only one way, albeit the most convenient way, to make
      the :func:`_` function available to your application. Because it affects
      the entire application globally, and specifically the built-in namespace,
      localized modules should never install :func:`_`. Instead, they should use
      this code to make :func:`_` available to their module::

         import gettext
         t = gettext.translation('mymodule', ...)
         _ = t.gettext

      This puts :func:`_` only in the module's global namespace and so only
      affects calls within this module.

      .. versionchanged:: 2.5
         Added the *names* parameter.


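As an illustration of writing a specialized translation class on top of
:class:`NullTranslations`, the sketch below parses an invented catalog format
with one ``msgid=msgstr`` pair per line and chains two catalogs with
:meth:`add_fallback`. The class name ``KeyValueTranslations`` and the file
format are hypothetical and exist only for this example.

```python
import gettext
import io


class KeyValueTranslations(gettext.NullTranslations):
    """Hypothetical catalog format: one ``msgid=msgstr`` pair per line."""

    def _parse(self, fp):
        # Invoked by the base-class constructor when fp is not None.
        self._catalog = {}
        for line in fp:
            line = line.strip()
            if '=' in line:
                msgid, msgstr = line.split('=', 1)
                self._catalog[msgid] = msgstr

    def gettext(self, message):
        if message in self._catalog:
            return self._catalog[message]
        # Consult the fallback before giving up on a translation.
        if self._fallback is not None:
            return self._fallback.gettext(message)
        return message


primary = KeyValueTranslations(io.StringIO(u'hello=bonjour\n'))
secondary = KeyValueTranslations(io.StringIO(u'goodbye=au revoir\n'))
primary.add_fallback(secondary)

results = [primary.gettext('hello'),     # found in the primary catalog
           primary.gettext('goodbye'),   # found via the fallback
           primary.gettext('thanks')]    # found nowhere, returned unchanged
```
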
The :class:`GNUTranslations` class
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

The :mod:`gettext` module provides one additional class derived from
:class:`NullTranslations`: :class:`GNUTranslations`. This class overrides
:meth:`_parse` to enable reading GNU :program:`gettext` format :file:`.mo` files
in both big-endian and little-endian format. It also coerces both message ids
and message strings to Unicode.

:class:`GNUTranslations` parses optional meta-data out of the translation
catalog. It is convention with GNU :program:`gettext` to include meta-data as
the translation for the empty string. This meta-data is in :rfc:`822`\ -style
``key: value`` pairs, and should contain the ``Project-Id-Version`` key. If the
key ``Content-Type`` is found, then the ``charset`` property is used to
initialize the "protected" :attr:`_charset` instance variable, defaulting to
``None`` if not found. If the charset encoding is specified, then all message
ids and message strings read from the catalog are converted to Unicode using
this encoding. The :meth:`ugettext` method always returns a Unicode string,
while the :meth:`gettext` method returns an encoded 8-bit string. For the
message id arguments of both methods, either Unicode strings or 8-bit strings
containing only US-ASCII characters are acceptable. Note that the Unicode
versions of the methods (i.e. :meth:`ugettext` and :meth:`ungettext`) are the
recommended interface to use for internationalized Python programs.

The entire set of key/value pairs is placed into a dictionary and set as the
"protected" :attr:`_info` instance variable.

If the :file:`.mo` file's magic number is invalid, or if other problems occur
while reading the file, instantiating a :class:`GNUTranslations` class can raise
:exc:`IOError`.

The following methods are overridden from the base class implementation:


.. method:: GNUTranslations.gettext(message)

   Look up the *message* id in the catalog and return the corresponding message
   string, as an 8-bit string encoded with the catalog's charset encoding, if
   known. If there is no entry in the catalog for the *message* id, and a fallback
   has been set, the look up is forwarded to the fallback's :meth:`gettext` method.
   Otherwise, the *message* id is returned.


.. method:: GNUTranslations.lgettext(message)

   Equivalent to :meth:`gettext`, but the translation is returned in the preferred
   system encoding, if no other encoding was explicitly set with
   :meth:`set_output_charset`.

   .. versionadded:: 2.4


.. method:: GNUTranslations.ugettext(message)

   Look up the *message* id in the catalog and return the corresponding message
   string, as a Unicode string. If there is no entry in the catalog for the
   *message* id, and a fallback has been set, the look up is forwarded to the
   fallback's :meth:`ugettext` method. Otherwise, the *message* id is returned.


.. method:: GNUTranslations.ngettext(singular, plural, n)

   Do a plural-forms lookup of a message id. *singular* is used as the message id
   for purposes of lookup in the catalog, while *n* is used to determine which
   plural form to use. The returned message string is an 8-bit string encoded with
   the catalog's charset encoding, if known.

   If the message id is not found in the catalog, and a fallback is specified, the
   request is forwarded to the fallback's :meth:`ngettext` method. Otherwise, when
   *n* is 1 *singular* is returned, and *plural* is returned in all other cases.

   .. versionadded:: 2.3


.. method:: GNUTranslations.lngettext(singular, plural, n)

   Equivalent to :meth:`ngettext`, but the translation is returned in the preferred
   system encoding, if no other encoding was explicitly set with
   :meth:`set_output_charset`.

   .. versionadded:: 2.4


.. method:: GNUTranslations.ungettext(singular, plural, n)

   Do a plural-forms lookup of a message id. *singular* is used as the message id
   for purposes of lookup in the catalog, while *n* is used to determine which
   plural form to use. The returned message string is a Unicode string.

   If the message id is not found in the catalog, and a fallback is specified, the
   request is forwarded to the fallback's :meth:`ungettext` method. Otherwise,
   when *n* is 1 *singular* is returned, and *plural* is returned in all other
   cases.

   Here is an example::

      n = len(os.listdir('.'))
      cat = GNUTranslations(somefile)
      message = cat.ungettext(
          'There is %(num)d file in this directory',
          'There are %(num)d files in this directory',
          n) % {'num': n}

   .. versionadded:: 2.3


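To see the meta-data and plural handling of :class:`GNUTranslations` without an
installed catalog, the following sketch assembles a tiny little-endian
:file:`.mo` image in memory and loads it. The helper ``build_mo`` is written
for this example only and is not part of the module; real catalogs would
normally be produced by :program:`msgfmt`. The French strings and the
``plural=(n > 1)`` rule are sample data.

```python
import gettext
import io
import struct


def build_mo(messages):
    """Serialize a {msgid: msgstr} dict into little-endian .mo bytes."""
    ids = b''
    strs = b''
    entries = []
    for msgid in sorted(messages):
        raw_id = msgid.encode('utf-8')
        raw_str = messages[msgid].encode('utf-8')
        entries.append((len(raw_id), len(ids), len(raw_str), len(strs)))
        ids += raw_id + b'\0'
        strs += raw_str + b'\0'
    count = len(entries)
    id_table = 7 * 4                    # the header is seven 32-bit words
    str_table = id_table + 8 * count    # each table entry is (length, offset)
    id_start = str_table + 8 * count
    str_start = id_start + len(ids)
    # magic, version, entry count, both table offsets, empty hash table
    out = struct.pack('<7I', 0x950412de, 0, count, id_table, str_table, 0, 0)
    for id_len, id_off, str_len, str_off in entries:
        out += struct.pack('<2I', id_len, id_start + id_off)
    for id_len, id_off, str_len, str_off in entries:
        out += struct.pack('<2I', str_len, str_start + str_off)
    return out + ids + strs


catalog = {
    # The empty msgid carries the meta-data as "key: value" lines.
    '': 'Content-Type: text/plain; charset=UTF-8\n'
        'Plural-Forms: nplurals=2; plural=(n > 1);\n',
    'hello': 'bonjour',
    # Plural entries join the forms with a NUL byte.
    '%d file\0%d files': '%d fichier\0%d fichiers',
}

t = gettext.GNUTranslations(io.BytesIO(build_mo(catalog)))
simple = t.gettext('hello')
zero = t.ngettext('%d file', '%d files', 0)
two = t.ngettext('%d file', '%d files', 2)
charset = t.charset()
```

With this sample rule, a count of 0 selects the first form and a count of 2
selects the second, which differs from the untranslated English fallback.
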
Solaris message catalog support
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

The Solaris operating system defines its own binary :file:`.mo` file format, but
since no documentation can be found on this format, it is not supported at this
time.


The Catalog constructor
^^^^^^^^^^^^^^^^^^^^^^^

.. index:: single: GNOME

GNOME uses a version of the :mod:`gettext` module by James Henstridge, but this
version has a slightly different API. Its documented usage was::

   import gettext
   cat = gettext.Catalog(domain, localedir)
   _ = cat.gettext
   print _('hello world')

For compatibility with this older module, the function :func:`Catalog` is an
alias for the :func:`translation` function described above.

One difference between this module and Henstridge's: his catalog objects
supported access through a mapping API, but this appears to be unused and so is
not currently supported.


Internationalizing your programs and modules
--------------------------------------------

Internationalization (I18N) refers to the operation by which a program is made
aware of multiple languages. Localization (L10N) refers to the adaptation of
your program, once internationalized, to the local language and cultural habits.
In order to provide multilingual messages for your Python programs, you need to
take the following steps:

#. prepare your program or module by specially marking translatable strings

#. run a suite of tools over your marked files to generate raw message catalogs

#. create language-specific translations of the message catalogs

#. use the :mod:`gettext` module so that message strings are properly translated

In order to prepare your code for I18N, you need to look at all the strings in
your files. Any string that needs to be translated should be marked by wrapping
it in ``_('...')``, that is, a call to the function :func:`_`. For example::

   filename = 'mylog.txt'
   message = _('writing a log message')
   fp = open(filename, 'w')
   fp.write(message)
   fp.close()

In this example, the string ``'writing a log message'`` is marked as a candidate
for translation, while the strings ``'mylog.txt'`` and ``'w'`` are not.

The Python distribution comes with two tools which help you generate the message
catalogs once you've prepared your source code. These may or may not be
available from a binary distribution, but they can be found in a source
distribution, in the :file:`Tools/i18n` directory.

The :program:`pygettext` [#]_ program scans all your Python source code looking
for the strings you previously marked as translatable. It is similar to the GNU
:program:`gettext` program except that it understands all the intricacies of
Python source code, but knows nothing about C or C++ source code. You don't
need GNU ``gettext`` unless you're also going to be translating C code (such as
C extension modules).

:program:`pygettext` generates textual Uniforum-style human-readable message
catalog :file:`.pot` files, essentially structured human-readable files which
contain every marked string in the source code, along with a placeholder for the
translation strings. :program:`pygettext` is a command line script that supports
a command line interface similar to that of :program:`xgettext`; for details on
its use, run::

   pygettext.py --help

Copies of these :file:`.pot` files are then handed over to the individual human
translators who write language-specific versions for every supported natural
language. They send you back the filled-in language-specific versions as a
:file:`.po` file. Using the :program:`msgfmt.py` [#]_ program (in the
:file:`Tools/i18n` directory), you take the :file:`.po` files from your
translators and generate the machine-readable :file:`.mo` binary catalog files.
The :file:`.mo` files are what the :mod:`gettext` module uses for the actual
translation processing during run-time.

How you use the :mod:`gettext` module in your code depends on whether you are
internationalizing a single module or your entire application. The next two
sections will discuss each case.


Localizing your module
^^^^^^^^^^^^^^^^^^^^^^

If you are localizing your module, you must take care not to make global
changes, e.g. to the built-in namespace. You should not use the GNU ``gettext``
API but instead the class-based API.

Let's say your module is called "spam" and the module's various natural language
translation :file:`.mo` files reside in :file:`/usr/share/locale` in GNU
:program:`gettext` format. Here's what you would put at the top of your
module::

   import gettext
   t = gettext.translation('spam', '/usr/share/locale')
   _ = t.lgettext

If your translators were providing you with Unicode strings in their :file:`.po`
files, you'd instead do::

   import gettext
   t = gettext.translation('spam', '/usr/share/locale')
   _ = t.ugettext


Localizing your application
^^^^^^^^^^^^^^^^^^^^^^^^^^^

If you are localizing your application, you can install the :func:`_` function
globally into the built-in namespace, usually in the main driver file of your
application. This will let all your application-specific files just use
``_('...')`` without having to explicitly install it in each file.

In the simple case then, you need only add the following bit of code to the main
driver file of your application::

   import gettext
   gettext.install('myapplication')

If you need to set the locale directory or the *unicode* flag, you can pass
these into the :func:`install` function::

   import gettext
   gettext.install('myapplication', '/usr/share/locale', unicode=True)


Changing languages on the fly
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

If your program needs to support many languages at the same time, you may want
to create multiple translation instances and then switch between them
explicitly, like so::

   import gettext

   lang1 = gettext.translation('myapplication', languages=['en'])
   lang2 = gettext.translation('myapplication', languages=['fr'])
   lang3 = gettext.translation('myapplication', languages=['de'])

   # start by using language1
   lang1.install()

   # ... time goes by, user selects language 2
   lang2.install()

   # ... more time goes by, user selects language 3
   lang3.install()


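If touching the built-in namespace is undesirable, one alternative is to keep
the active translation object in module state and look it up on every call.
The sketch below is only one possible arrangement: ``ShoutingTranslations``,
``select_language`` and the language codes are invented for the example, and a
real program would obtain its objects from :func:`translation`.

```python
import gettext


class ShoutingTranslations(gettext.NullTranslations):
    """Stand-in 'language' that upper-cases every message (illustration only)."""

    def gettext(self, message):
        return message.upper()


# In a real application these would come from gettext.translation(...).
available = {
    'plain': gettext.NullTranslations(),
    'shout': ShoutingTranslations(),
}
state = {'current': available['plain']}


def _(message):
    # Resolve the active translation at call time, so a switch takes
    # effect for every later call.
    return state['current'].gettext(message)


def select_language(code):
    state['current'] = available[code]


before = _('hello')
select_language('shout')
after = _('hello')
```
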
Deferred translations
^^^^^^^^^^^^^^^^^^^^^

In most coding situations, strings are translated where they are coded.
Occasionally however, you need to mark strings for translation, but defer actual
translation until later. A classic example is::

   animals = ['mollusk',
              'albatross',
              'rat',
              'penguin',
              'python', ]
   # ...
   for a in animals:
       print a

Here, you want to mark the strings in the ``animals`` list as being
translatable, but you don't actually want to translate them until they are
printed.

Here is one way you can handle this situation::

   def _(message): return message

   animals = [_('mollusk'),
              _('albatross'),
              _('rat'),
              _('penguin'),
              _('python'), ]

   del _

   # ...
   for a in animals:
       print _(a)

This works because the dummy definition of :func:`_` simply returns the string
unchanged. And this dummy definition will temporarily override any definition
of :func:`_` in the built-in namespace (until the :keyword:`del` command). Take
care, though, if you have a previous definition of :func:`_` in the local
namespace.

Note that the second use of :func:`_` will not identify "a" as being
translatable to the :program:`pygettext` program, since it is not a string.

Another way to handle this is with the following example::

   def N_(message): return message

   animals = [N_('mollusk'),
              N_('albatross'),
              N_('rat'),
              N_('penguin'),
              N_('python'), ]

   # ...
   for a in animals:
       print _(a)

In this case, you are marking translatable strings with the function :func:`N_`,
[#]_ which won't conflict with any definition of :func:`_`. However, you will
need to teach your message extraction program to look for translatable strings
marked with :func:`N_`. :program:`pygettext` and :program:`xpot` both support
this through the use of command line switches.


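The :func:`N_` approach can be exercised end to end with a stand-in
translation object. In this sketch ``DictTranslations`` and its one-entry table
are hypothetical; they merely play the role of a real catalog so the deferred
lookup has something to find.

```python
import gettext


class DictTranslations(gettext.NullTranslations):
    """Stand-in catalog backed by a plain dict (illustration only)."""

    def __init__(self, table):
        gettext.NullTranslations.__init__(self)
        self._table = table

    def gettext(self, message):
        return self._table.get(message, message)


def N_(message):
    # Marks the string for extraction; returns it unchanged.
    return message


animals = [N_('mollusk'), N_('python')]

# Translation happens later, at the point of use.
_ = DictTranslations({'mollusk': 'mollusque'}).gettext
translated = [_(a) for a in animals]
```

The list itself keeps the original strings, so it can be translated again after
a language switch.
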
:func:`gettext` vs. :func:`lgettext`
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

In Python 2.4 the :func:`lgettext` family of functions was introduced. The
intention of these functions is to provide an alternative which is more
compliant with the current implementation of GNU gettext. Unlike
:func:`gettext`, which returns strings encoded with the same codeset used in the
translation file, :func:`lgettext` will return strings encoded with the
preferred system encoding, as returned by :func:`locale.getpreferredencoding`.
Also notice that Python 2.4 introduces new functions to explicitly choose the
codeset used in translated strings. If a codeset is explicitly set, even
:func:`lgettext` will return translated strings in the requested codeset, as
would be expected in the GNU gettext implementation.


Acknowledgements
----------------

The following people contributed code, feedback, design suggestions, previous
implementations, and valuable experience to the creation of this module:

* Peter Funk

* James Henstridge

* Juan David Ibáñez Palomar

* Marc-André Lemburg

* Martin von Löwis

* François Pinard

* Barry Warsaw

* Gustavo Niemeyer

.. rubric:: Footnotes

.. [#] The default locale directory is system dependent; for example, on Red Hat Linux
   it is :file:`/usr/share/locale`, but on Solaris it is :file:`/usr/lib/locale`.
   The :mod:`gettext` module does not try to support these system dependent
   defaults; instead its default is :file:`sys.prefix/share/locale`. For this
   reason, it is always best to call :func:`bindtextdomain` with an explicit
   absolute path at the start of your application.

.. [#] See the footnote for :func:`bindtextdomain` above.

.. [#] François Pinard has written a program called :program:`xpot` which does a
   similar job. It is available as part of his `po-utils package
   <http://po-utils.progiciels-bpi.ca/>`_.

.. [#] :program:`msgfmt.py` is binary compatible with GNU :program:`msgfmt` except that
   it provides a simpler, all-Python implementation. With this and
   :program:`pygettext.py`, you generally won't need to install the GNU
   :program:`gettext` package to internationalize your Python applications.

.. [#] The choice of :func:`N_` here is totally arbitrary; it could have just as easily
   been :func:`MarkThisStringForTranslation`.