:mod:`gettext` --- Multilingual internationalization services
=============================================================

.. module:: gettext
   :synopsis: Multilingual internationalization services.
.. moduleauthor:: Barry A. Warsaw <barry@zope.com>
.. sectionauthor:: Barry A. Warsaw <barry@zope.com>


The :mod:`gettext` module provides internationalization (I18N) and localization
(L10N) services for your Python modules and applications. It supports both the
GNU ``gettext`` message catalog API and a higher level, class-based API that may
be more appropriate for Python files. The interface described below allows you
to write your module and application messages in one natural language, and
provide a catalog of translated messages for running under different natural
languages.

Some hints on localizing your Python modules and applications are also given.


GNU :program:`gettext` API
--------------------------

The :mod:`gettext` module defines the following API, which is very similar to
the GNU :program:`gettext` API. If you use this API you will affect the
translation of your entire application globally. Often this is what you want if
your application is monolingual, with the choice of language dependent on the
locale of your user. If you are localizing a Python module, or if your
application needs to switch languages on the fly, you probably want to use the
class-based API instead.


.. function:: bindtextdomain(domain[, localedir])

   Bind the *domain* to the locale directory *localedir*. More concretely,
   :mod:`gettext` will look for binary :file:`.mo` files for the given domain using
   the path (on Unix): :file:`localedir/language/LC_MESSAGES/domain.mo`, where
   *language* is searched for in the environment variables :envvar:`LANGUAGE`,
   :envvar:`LC_ALL`, :envvar:`LC_MESSAGES`, and :envvar:`LANG`, in that order.

   If *localedir* is omitted or ``None``, then the current binding for *domain* is
   returned. [#]_


.. function:: bind_textdomain_codeset(domain[, codeset])

   Bind the *domain* to *codeset*, changing the encoding of strings returned by the
   :func:`gettext` family of functions. If *codeset* is omitted, then the current
   binding is returned.

   .. versionadded:: 2.4


.. function:: textdomain([domain])

   Change or query the current global domain. If *domain* is ``None``, then the
   current global domain is returned, otherwise the global domain is set to
   *domain*, which is returned.


.. function:: gettext(message)

   Return the localized translation of *message*, based on the current global
   domain, language, and locale directory. This function is usually aliased as
   :func:`_` in the local namespace (see examples below).


.. function:: lgettext(message)

   Equivalent to :func:`gettext`, but the translation is returned in the preferred
   system encoding, if no other encoding was explicitly set with
   :func:`bind_textdomain_codeset`.

   .. versionadded:: 2.4


.. function:: dgettext(domain, message)

   Like :func:`gettext`, but look the message up in the specified *domain*.


.. function:: ldgettext(domain, message)

   Equivalent to :func:`dgettext`, but the translation is returned in the preferred
   system encoding, if no other encoding was explicitly set with
   :func:`bind_textdomain_codeset`.

   .. versionadded:: 2.4


.. function:: ngettext(singular, plural, n)

   Like :func:`gettext`, but consider plural forms. If a translation is found,
   apply the plural formula to *n*, and return the resulting message (some
   languages have more than two plural forms). If no translation is found, return
   *singular* if *n* is 1; return *plural* otherwise.

   The plural formula is taken from the catalog header. It is a C or Python
   expression that has a free variable *n*; the expression evaluates to the index
   of the plural in the catalog. See the GNU gettext documentation for the precise
   syntax to be used in :file:`.po` files and the formulas for a variety of
   languages.

   .. versionadded:: 2.3


.. function:: lngettext(singular, plural, n)

   Equivalent to :func:`ngettext`, but the translation is returned in the preferred
   system encoding, if no other encoding was explicitly set with
   :func:`bind_textdomain_codeset`.

   .. versionadded:: 2.4


.. function:: dngettext(domain, singular, plural, n)

   Like :func:`ngettext`, but look the message up in the specified *domain*.

   .. versionadded:: 2.3


.. function:: ldngettext(domain, singular, plural, n)

   Equivalent to :func:`dngettext`, but the translation is returned in the
   preferred system encoding, if no other encoding was explicitly set with
   :func:`bind_textdomain_codeset`.

   .. versionadded:: 2.4

Note that GNU :program:`gettext` also defines a :func:`dcgettext` function, but
this was deemed not useful and so it is currently unimplemented.

Here's an example of typical usage for this API::

   import gettext
   gettext.bindtextdomain('myapplication', '/path/to/my/language/directory')
   gettext.textdomain('myapplication')
   _ = gettext.gettext
   # ...
   print _('This is a translatable string.')

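As a rough illustration of how these functions behave when no catalog can be
found, the sketch below binds a made-up domain (``hypothetical_app``, an
assumption for this example) to an empty temporary directory. With nothing to
look up, the lookup functions simply hand back their arguments, and the plural
variants choose between *singular* and *plural* based on *n*.

```python
import gettext
import tempfile

# 'hypothetical_app' and the empty directory are assumptions for this sketch.
locale_dir = tempfile.mkdtemp()

# Setting a binding returns the new value; querying returns the current one.
bound = gettext.bindtextdomain('hypothetical_app', locale_dir)
queried = gettext.bindtextdomain('hypothetical_app')

# No .mo file exists, so the original strings come back untranslated.
plain = gettext.dgettext('hypothetical_app', 'Hello')
one = gettext.dngettext('hypothetical_app', '%d file', '%d files', 1) % 1
many = gettext.dngettext('hypothetical_app', '%d file', '%d files', 3) % 3
```

The domain-specific variants are used here so that the global domain set by
:func:`textdomain` is left untouched.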

Class-based API
---------------

The class-based API of the :mod:`gettext` module gives you more flexibility and
greater convenience than the GNU :program:`gettext` API. It is the recommended
way of localizing your Python applications and modules. :mod:`gettext` defines
a "translations" class which implements the parsing of GNU :file:`.mo` format
files, and has methods for returning either standard 8-bit strings or Unicode
strings. Instances of this "translations" class can also install themselves in
the built-in namespace as the function :func:`_`.


.. function:: find(domain[, localedir[, languages[, all]]])

   This function implements the standard :file:`.mo` file search algorithm. It
   takes a *domain*, identical to what :func:`textdomain` takes. Optional
   *localedir* is as in :func:`bindtextdomain`. Optional *languages* is a list of
   strings, where each string is a language code.

   If *localedir* is not given, then the default system locale directory is used.
   [#]_ If *languages* is not given, then the following environment variables are
   searched: :envvar:`LANGUAGE`, :envvar:`LC_ALL`, :envvar:`LC_MESSAGES`, and
   :envvar:`LANG`. The first one returning a non-empty value is used for the
   *languages* variable. The environment variables should contain a colon-separated
   list of languages, which will be split on the colon to produce the expected list
   of language code strings.

   :func:`find` then expands and normalizes the languages, and then iterates
   through them, searching for an existing file built of these components:

   :file:`localedir/language/LC_MESSAGES/domain.mo`

   The first such file name that exists is returned by :func:`find`. If no such
   file is found, then ``None`` is returned. If *all* is true, it returns a list
   of all file names, in the order in which they appear in the languages list or
   the environment variables.

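The search can be tried out against a throwaway directory tree. In the sketch
below, the domain ``spam``, the temporary directory, and the empty placeholder
file are all assumptions made for illustration; :func:`find` only checks that a
file exists at the expected path, so the file does not need to be a valid
catalog.

```python
import gettext
import os
import tempfile

# Build a throwaway locale tree; the names used here are assumptions.
locale_dir = tempfile.mkdtemp()
messages_dir = os.path.join(locale_dir, 'de', 'LC_MESSAGES')
os.makedirs(messages_dir)
mo_path = os.path.join(messages_dir, 'spam.mo')
open(mo_path, 'wb').close()  # find() only checks that the file exists

# 'de_DE' is expanded to less specific variants, so the 'de' file matches.
first_match = gettext.find('spam', locale_dir, languages=['de_DE'])

# No catalog exists for French, so the result is None.
no_match = gettext.find('spam', locale_dir, languages=['fr'])

# With all=True a list is returned instead of a single path.
all_matches = gettext.find('spam', locale_dir,
                           languages=['de_DE', 'fr'], all=True)
```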

.. function:: translation(domain[, localedir[, languages[, class_[, fallback[, codeset]]]]])

   Return a :class:`Translations` instance based on the *domain*, *localedir*, and
   *languages*, which are first passed to :func:`find` to get a list of the
   associated :file:`.mo` file paths. Instances with identical :file:`.mo` file
   names are cached. The actual class instantiated is *class_* if provided,
   otherwise :class:`GNUTranslations`. The class's constructor must take a single
   file object argument. If provided, *codeset* will change the charset used to
   encode translated strings.

   If multiple files are found, later files are used as fallbacks for earlier ones.
   To allow setting the fallback, :func:`copy.copy` is used to clone each
   translation object from the cache; the actual instance data is still shared with
   the cache.

   If no :file:`.mo` file is found, this function raises :exc:`IOError` if
   *fallback* is false (which is the default), and returns a
   :class:`NullTranslations` instance if *fallback* is true.

   .. versionchanged:: 2.4
      Added the *codeset* parameter.

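The effect of the *fallback* flag can be seen with a directory that contains no
catalogs at all. This is a minimal sketch; the domain ``hypothetical_app`` and
the empty temporary directory are assumptions for the example.

```python
import gettext
import tempfile

# An empty directory and the domain 'hypothetical_app' are assumptions here.
empty_dir = tempfile.mkdtemp()

try:
    gettext.translation('hypothetical_app', empty_dir, languages=['fr'])
    catalog_found = True
except IOError:
    # Raised because no .mo file exists and fallback is false.
    catalog_found = False

# With fallback enabled a NullTranslations instance is returned instead.
t = gettext.translation('hypothetical_app', empty_dir,
                        languages=['fr'], fallback=True)
untranslated = t.gettext('Hello')
```

Passing ``fallback=True`` is one way to keep a module importable on systems
where its catalogs have not been installed.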

.. function:: install(domain[, localedir[, unicode [, codeset[, names]]]])

   This installs the function :func:`_` in Python's builtin namespace, based on
   *domain*, *localedir*, and *codeset* which are passed to the function
   :func:`translation`. The *unicode* flag is passed to the resulting translation
   object's :meth:`install` method.

   For the *names* parameter, please see the description of the translation
   object's :meth:`install` method.

   As seen below, you usually mark the strings in your application that are
   candidates for translation, by wrapping them in a call to the :func:`_`
   function, like this::

      print _('This string will be translated.')

   For convenience, you want the :func:`_` function to be installed in Python's
   builtin namespace, so it is easily accessible in all modules of your
   application.

   .. versionchanged:: 2.4
      Added the *codeset* parameter.

   .. versionchanged:: 2.5
      Added the *names* parameter.


The :class:`NullTranslations` class
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Translation classes are what actually implement the translation of original
source file message strings to translated message strings. The base class used
by all translation classes is :class:`NullTranslations`; this provides the basic
interface you can use to write your own specialized translation classes. Here
are the methods of :class:`NullTranslations`:


.. class:: NullTranslations([fp])

   Takes an optional file object *fp*, which is ignored by the base class.
   Initializes "protected" instance variables *_info* and *_charset* which are set
   by derived classes, as well as *_fallback*, which is set through
   :meth:`add_fallback`. It then calls ``self._parse(fp)`` if *fp* is not
   ``None``.


   .. method:: _parse(fp)

      A no-op in the base class. Derived classes override this method to read
      the data from the file object *fp* and initialize the message catalog. If
      you have an unsupported message catalog file format, you should override
      this method to parse your format.


   .. method:: add_fallback(fallback)

      Add *fallback* as the fallback object for the current translation
      object. A translation object should consult the fallback if it cannot provide a
      translation for a given message.


   .. method:: gettext(message)

      If a fallback has been set, forward :meth:`gettext` to the
      fallback. Otherwise, return *message* unchanged. Overridden in derived
      classes.


   .. method:: lgettext(message)

      If a fallback has been set, forward :meth:`lgettext` to the
      fallback. Otherwise, return *message* unchanged. Overridden in derived
      classes.

      .. versionadded:: 2.4


   .. method:: ugettext(message)

      If a fallback has been set, forward :meth:`ugettext` to the
      fallback. Otherwise, return *message* as a Unicode
      string. Overridden in derived classes.


   .. method:: ngettext(singular, plural, n)

      If a fallback has been set, forward :meth:`ngettext` to the
      fallback. Otherwise, return *singular* if *n* is 1 and *plural*
      otherwise. Overridden in derived classes.

      .. versionadded:: 2.3


   .. method:: lngettext(singular, plural, n)

      If a fallback has been set, forward :meth:`lngettext` to the
      fallback. Otherwise, return *singular* if *n* is 1 and *plural*
      otherwise. Overridden in derived classes.

      .. versionadded:: 2.4


   .. method:: ungettext(singular, plural, n)

      If a fallback has been set, forward :meth:`ungettext` to the fallback.
      Otherwise, return *singular* if *n* is 1 and *plural* otherwise, as a
      Unicode string. Overridden in derived classes.

      .. versionadded:: 2.3


   .. method:: info()

      Return the "protected" :attr:`_info` variable.


   .. method:: charset()

      Return the "protected" :attr:`_charset` variable.


   .. method:: output_charset()

      Return the "protected" :attr:`_output_charset` variable, which defines the
      encoding used to return translated messages.

      .. versionadded:: 2.4


   .. method:: set_output_charset(charset)

      Change the "protected" :attr:`_output_charset` variable, which defines the
      encoding used to return translated messages.

      .. versionadded:: 2.4


   .. method:: install([unicode [, names]])

      If the *unicode* flag is false, this method installs :meth:`self.gettext`
      into the built-in namespace, binding it to ``_``. If *unicode* is true,
      it binds :meth:`self.ugettext` instead. By default, *unicode* is false.

      If the *names* parameter is given, it must be a sequence containing the
      names of functions you want to install in the builtin namespace in
      addition to :func:`_`. Supported names are ``'gettext'`` (bound to
      :meth:`self.gettext` or :meth:`self.ugettext` according to the *unicode*
      flag), ``'ngettext'`` (bound to :meth:`self.ngettext` or
      :meth:`self.ungettext` according to the *unicode* flag), ``'lgettext'``
      and ``'lngettext'``.

      Note that this is only one way, albeit the most convenient way, to make
      the :func:`_` function available to your application. Because it affects
      the entire application globally, and specifically the built-in namespace,
      localized modules should never install :func:`_`. Instead, they should use
      this code to make :func:`_` available to their module::

         import gettext
         t = gettext.translation('mymodule', ...)
         _ = t.gettext

      This puts :func:`_` only in the module's global namespace and so only
      affects calls within this module.

      .. versionchanged:: 2.5
         Added the *names* parameter.

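To make the subclassing interface more concrete, here is a minimal sketch of a
translation class for a made-up catalog format with one ``msgid=msgstr`` pair
per line. The class name ``KeyValueTranslations``, the file format, and the
``_catalog`` attribute are assumptions for illustration and are not part of the
:mod:`gettext` module.

```python
import gettext
import io


class KeyValueTranslations(gettext.NullTranslations):
    """Hypothetical catalog format: one ``msgid=msgstr`` pair per line."""

    def __init__(self, fp=None):
        self._catalog = {}
        gettext.NullTranslations.__init__(self, fp)

    def _parse(self, fp):
        # Called by the base class constructor when fp is not None.
        for line in fp:
            line = line.strip()
            if line and not line.startswith('#'):
                msgid, _sep, msgstr = line.partition('=')
                self._catalog[msgid] = msgstr

    def gettext(self, message):
        if message in self._catalog:
            return self._catalog[message]
        # Consult the fallback before giving up on a translation.
        if self._fallback:
            return self._fallback.gettext(message)
        return message


primary = KeyValueTranslations(io.StringIO(u'hello=bonjour\n'))
secondary = KeyValueTranslations(io.StringIO(u'goodbye=au revoir\n'))
primary.add_fallback(secondary)
```

With this setup, ``primary.gettext('goodbye')`` is answered by the fallback,
and a message unknown to both objects is returned unchanged.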

The :class:`GNUTranslations` class
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

The :mod:`gettext` module provides one additional class derived from
:class:`NullTranslations`: :class:`GNUTranslations`. This class overrides
:meth:`_parse` to enable reading GNU :program:`gettext` format :file:`.mo` files
in both big-endian and little-endian format. It also coerces both message ids
and message strings to Unicode.

:class:`GNUTranslations` parses optional meta-data out of the translation
catalog. It is convention with GNU :program:`gettext` to include meta-data as
the translation for the empty string. This meta-data is in :rfc:`822`\ -style
``key: value`` pairs, and should contain the ``Project-Id-Version`` key. If the
key ``Content-Type`` is found, then the ``charset`` property is used to
initialize the "protected" :attr:`_charset` instance variable, defaulting to
``None`` if not found. If the charset encoding is specified, then all message
ids and message strings read from the catalog are converted to Unicode using
this encoding. The :meth:`ugettext` method always returns a Unicode string,
while the :meth:`gettext` method returns an encoded 8-bit string. For the
message id arguments of both methods, either Unicode strings or 8-bit strings
containing only US-ASCII characters are acceptable. Note that the Unicode
versions of the methods (i.e. :meth:`ugettext` and :meth:`ungettext`) are the
recommended interface to use for internationalized Python programs.

The entire set of key/value pairs is placed into a dictionary and set as the
"protected" :attr:`_info` instance variable.

If the :file:`.mo` file's magic number is invalid, or if other problems occur
while reading the file, instantiating a :class:`GNUTranslations` class can raise
:exc:`IOError`.

The following methods are overridden from the base class implementation:


.. method:: GNUTranslations.gettext(message)

   Look up the *message* id in the catalog and return the corresponding message
   string, as an 8-bit string encoded with the catalog's charset encoding, if
   known. If there is no entry in the catalog for the *message* id, and a fallback
   has been set, the look up is forwarded to the fallback's :meth:`gettext` method.
   Otherwise, the *message* id is returned.


.. method:: GNUTranslations.lgettext(message)

   Equivalent to :meth:`gettext`, but the translation is returned in the preferred
   system encoding, if no other encoding was explicitly set with
   :meth:`set_output_charset`.

   .. versionadded:: 2.4


.. method:: GNUTranslations.ugettext(message)

   Look up the *message* id in the catalog and return the corresponding message
   string, as a Unicode string. If there is no entry in the catalog for the
   *message* id, and a fallback has been set, the look up is forwarded to the
   fallback's :meth:`ugettext` method. Otherwise, the *message* id is returned.


.. method:: GNUTranslations.ngettext(singular, plural, n)

   Do a plural-forms lookup of a message id. *singular* is used as the message id
   for purposes of lookup in the catalog, while *n* is used to determine which
   plural form to use. The returned message string is an 8-bit string encoded with
   the catalog's charset encoding, if known.

   If the message id is not found in the catalog, and a fallback is specified, the
   request is forwarded to the fallback's :meth:`ngettext` method. Otherwise, when
   *n* is 1 *singular* is returned, and *plural* is returned in all other cases.

   .. versionadded:: 2.3


.. method:: GNUTranslations.lngettext(singular, plural, n)

   Equivalent to :meth:`ngettext`, but the translation is returned in the preferred
   system encoding, if no other encoding was explicitly set with
   :meth:`set_output_charset`.

   .. versionadded:: 2.4


.. method:: GNUTranslations.ungettext(singular, plural, n)

   Do a plural-forms lookup of a message id. *singular* is used as the message id
   for purposes of lookup in the catalog, while *n* is used to determine which
   plural form to use. The returned message string is a Unicode string.

   If the message id is not found in the catalog, and a fallback is specified, the
   request is forwarded to the fallback's :meth:`ungettext` method. Otherwise,
   when *n* is 1 *singular* is returned, and *plural* is returned in all other
   cases.

   Here is an example::

      import os
      from gettext import GNUTranslations

      n = len(os.listdir('.'))
      # somefile is an open file object for a .mo catalog
      cat = GNUTranslations(somefile)
      message = cat.ungettext(
          'There is %(num)d file in this directory',
          'There are %(num)d files in this directory',
          n) % {'num': n}

   .. versionadded:: 2.3

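The catalog handling described above can be exercised without any files on disk
by packing a tiny catalog in memory. The sketch below is only an illustration:
``build_mo`` is a hypothetical helper written for this example (real projects
would normally produce :file:`.mo` files with :program:`msgfmt`), and the
French-style messages and plural rule are assumptions. It writes a little-endian
file consisting of the magic number, a header of offsets, the two string tables,
and the NUL-terminated strings.

```python
import gettext
import io
import struct


def build_mo(messages):
    """Pack a {msgid: msgstr} dict of byte strings into .mo file contents."""
    keys = sorted(messages)
    ids = b''
    strs = b''
    entries = []
    for key in keys:
        value = messages[key]
        entries.append((len(key), len(ids), len(value), len(strs)))
        ids += key + b'\0'
        strs += value + b'\0'
    # Layout: 7-word header, msgid table, msgstr table, then the strings.
    key_start = 7 * 4 + 16 * len(keys)
    value_start = key_start + len(ids)
    key_table = b''
    value_table = b''
    for key_len, key_off, value_len, value_off in entries:
        key_table += struct.pack('<II', key_len, key_off + key_start)
        value_table += struct.pack('<II', value_len, value_off + value_start)
    header = struct.pack('<7I', 0x950412de, 0, len(keys),
                         7 * 4, 7 * 4 + 8 * len(keys), 0, 0)
    return header + key_table + value_table + ids + strs


mo_bytes = build_mo({
    # The empty msgid carries the catalog meta-data.
    b'': (b'Content-Type: text/plain; charset=UTF-8\n'
          b'Plural-Forms: nplurals=2; plural=(n > 1);\n'),
    b'hello': b'bonjour',
    # Plural entries join their forms with a NUL byte.
    b'%d file\0%d files': b'%d fichier\0%d fichiers',
})
cat = gettext.GNUTranslations(io.BytesIO(mo_bytes))

greeting = cat.gettext('hello')
zero_files = cat.ngettext('%d file', '%d files', 0) % 0
two_files = cat.ngettext('%d file', '%d files', 2) % 2

# Data that does not start with the .mo magic number is rejected.
try:
    gettext.GNUTranslations(io.BytesIO(b'not a catalog'))
    bad_magic_rejected = False
except IOError:
    bad_magic_rejected = True
```

Because the plural formula in this made-up header is ``n > 1``, a count of zero
selects the first (singular) form, which differs from the English-style default
used when no catalog is available.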

Solaris message catalog support
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

The Solaris operating system defines its own binary :file:`.mo` file format, but
since no documentation can be found on this format, it is not supported at this
time.


The Catalog constructor
^^^^^^^^^^^^^^^^^^^^^^^

.. index:: single: GNOME

GNOME uses a version of the :mod:`gettext` module by James Henstridge, but this
version has a slightly different API. Its documented usage was::

   import gettext
   cat = gettext.Catalog(domain, localedir)
   _ = cat.gettext
   print _('hello world')

For compatibility with this older module, the function :func:`Catalog` is an
alias for the :func:`translation` function described above.

One difference between this module and Henstridge's: his catalog objects
supported access through a mapping API, but this appears to be unused and so is
not currently supported.


Internationalizing your programs and modules
--------------------------------------------

Internationalization (I18N) refers to the operation by which a program is made
aware of multiple languages. Localization (L10N) refers to the adaptation of
your program, once internationalized, to the local language and cultural habits.
In order to provide multilingual messages for your Python programs, you need to
take the following steps:

#. prepare your program or module by specially marking translatable strings

#. run a suite of tools over your marked files to generate raw message catalogs

#. create language-specific translations of the message catalogs

#. use the :mod:`gettext` module so that message strings are properly translated

In order to prepare your code for I18N, you need to look at all the strings in
your files. Any string that needs to be translated should be marked by wrapping
it in ``_('...')`` --- that is, a call to the function :func:`_`. For example::

   filename = 'mylog.txt'
   message = _('writing a log message')
   fp = open(filename, 'w')
   fp.write(message)
   fp.close()

In this example, the string ``'writing a log message'`` is marked as a candidate
for translation, while the strings ``'mylog.txt'`` and ``'w'`` are not.

The Python distribution comes with two tools which help you generate the message
catalogs once you've prepared your source code. These may or may not be
available from a binary distribution, but they can be found in a source
distribution, in the :file:`Tools/i18n` directory.

The :program:`pygettext` [#]_ program scans all your Python source code looking
for the strings you previously marked as translatable. It is similar to the GNU
:program:`gettext` program except that it understands all the intricacies of
Python source code, but knows nothing about C or C++ source code. You don't
need GNU ``gettext`` unless you're also going to be translating C code (such as
C extension modules).

:program:`pygettext` generates textual Uniforum-style message catalog
:file:`.pot` files: structured, human-readable files which contain every marked
string in the source code, along with a placeholder for the translation
strings. :program:`pygettext` is a command line script with a command line
interface similar to that of :program:`xgettext`; for details on its use,
run::

   pygettext.py --help

Copies of these :file:`.pot` files are then handed over to the individual human
translators who write language-specific versions for every supported natural
language. They send you back the filled-in language-specific versions as
:file:`.po` files. Using the :program:`msgfmt.py` [#]_ program (in the
:file:`Tools/i18n` directory), you take the :file:`.po` files from your
translators and generate the machine-readable :file:`.mo` binary catalog files.
The :file:`.mo` files are what the :mod:`gettext` module uses for the actual
translation processing during run-time.

How you use the :mod:`gettext` module in your code depends on whether you are
internationalizing a single module or your entire application. The next two
sections will discuss each case.


Localizing your module
^^^^^^^^^^^^^^^^^^^^^^

If you are localizing your module, you must take care not to make global
changes, e.g. to the built-in namespace. You should not use the GNU ``gettext``
API but instead the class-based API.

Let's say your module is called "spam" and the module's various natural language
translation :file:`.mo` files reside in :file:`/usr/share/locale` in GNU
:program:`gettext` format. Here's what you would put at the top of your
module::

   import gettext
   t = gettext.translation('spam', '/usr/share/locale')
   _ = t.lgettext

If your translators were providing you with Unicode strings in their :file:`.po`
files, you'd instead do::

   import gettext
   t = gettext.translation('spam', '/usr/share/locale')
   _ = t.ugettext


Localizing your application
^^^^^^^^^^^^^^^^^^^^^^^^^^^

If you are localizing your application, you can install the :func:`_` function
globally into the built-in namespace, usually in the main driver file of your
application. This will let all your application-specific files just use
``_('...')`` without having to explicitly install it in each file.

In the simple case then, you need only add the following bit of code to the main
driver file of your application::

   import gettext
   gettext.install('myapplication')

If you need to set the locale directory or the *unicode* flag, you can pass
these into the :func:`install` function::

   import gettext
   gettext.install('myapplication', '/usr/share/locale', unicode=1)

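In the CPython implementation, :func:`install` requests its translation object
with the fallback enabled, so a missing catalog does not raise an error; the
installed :func:`_` then returns its argument unchanged. The sketch below shows
this with a made-up domain (``hypothetical_app``) and an empty temporary
directory, both of which are assumptions for the example.

```python
import gettext
import tempfile

# 'hypothetical_app' and the empty locale directory are assumptions.
locale_dir = tempfile.mkdtemp()
gettext.install('hypothetical_app', locale_dir)

# _ is now a builtin; with no catalog present it returns its argument.
greeting = _('Hello')
```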

Changing languages on the fly
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

If your program needs to support many languages at the same time, you may want
to create multiple translation instances and then switch between them
explicitly, like so::

   import gettext

   lang1 = gettext.translation('myapplication', languages=['en'])
   lang2 = gettext.translation('myapplication', languages=['fr'])
   lang3 = gettext.translation('myapplication', languages=['de'])

   # start by using language1
   lang1.install()

   # ... time goes by, user selects language 2
   lang2.install()

   # ... more time goes by, user selects language 3
   lang3.install()

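When several languages are in use at once, for example one per request in a
server, an alternative is to keep the translation objects in a dictionary and
choose one per call instead of reinstalling :func:`_`. This is a sketch of that
idea rather than a pattern prescribed by the module; the domain name, the
``translate`` helper, and the empty locale directory are assumptions, and
``fallback=True`` is used only so the example runs without real catalogs.

```python
import gettext
import tempfile

# Hypothetical domain and an empty locale directory; fallback=True keeps the
# sketch runnable without real catalogs.
locale_dir = tempfile.mkdtemp()
catalogs = {}
for code in ('en', 'fr', 'de'):
    catalogs[code] = gettext.translation(
        'hypothetical_app', locale_dir, languages=[code], fallback=True)


def translate(language, message):
    """Translate message with the catalog chosen for this call."""
    return catalogs[language].gettext(message)


greeting = translate('fr', 'Hello')
```

This keeps the built-in namespace untouched, at the cost of passing the
language explicitly.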

Deferred translations
^^^^^^^^^^^^^^^^^^^^^

In most coding situations, strings are translated where they are coded.
Occasionally, however, you need to mark strings for translation but defer the
actual translation until later. A classic example is::

   animals = ['mollusk',
              'albatross',
              'rat',
              'penguin',
              'python',
              ]
   # ...
   for a in animals:
       print a

Here, you want to mark the strings in the ``animals`` list as being
translatable, but you don't actually want to translate them until they are
printed.

Here is one way you can handle this situation::

   def _(message): return message

   animals = [_('mollusk'),
              _('albatross'),
              _('rat'),
              _('penguin'),
              _('python'),
              ]

   del _

   # ...
   for a in animals:
       print _(a)

This works because the dummy definition of :func:`_` simply returns the string
unchanged. And this dummy definition will temporarily override any definition
of :func:`_` in the built-in namespace (until the :keyword:`del` command). Take
care, though, if you have a previous definition of :func:`_` in the local
namespace.

Note that the second use of :func:`_` will not identify "a" as being
translatable to the :program:`pygettext` program, since it is not a string.

Another way to handle this is with the following example::

   def N_(message): return message

   animals = [N_('mollusk'),
              N_('albatross'),
              N_('rat'),
              N_('penguin'),
              N_('python'),
              ]

   # ...
   for a in animals:
       print _(a)

In this case, you are marking translatable strings with the function :func:`N_`,
[#]_ which won't conflict with any definition of :func:`_`. However, you will
need to teach your message extraction program to look for translatable strings
marked with :func:`N_`. :program:`pygettext` and :program:`xpot` both support
this through the use of command line switches.

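The following self-contained sketch shows the :func:`N_` approach end to end.
Since no catalog is assumed to be available, a plain dictionary named
``french`` stands in for the real lookup; it and the sample translations are
assumptions made purely for illustration.

```python
def N_(message):
    """Mark message for extraction without translating it."""
    return message


animals = [N_('mollusk'), N_('albatross'), N_('rat')]

# Stand-in for a real catalog lookup; this dictionary is an assumption made
# so the sketch runs without a .mo file.
french = {'mollusk': 'mollusque', 'albatross': 'albatros'}


def _(message):
    return french.get(message, message)


# The list keeps the original strings; translation happens only here.
translated = [_(a) for a in animals]
```

The ``animals`` list itself is never modified, so the same data can later be
translated into a different language.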

:func:`gettext` vs. :func:`lgettext`
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

In Python 2.4 the :func:`lgettext` family of functions was introduced. The
intention of these functions is to provide an alternative which is more
compliant with the current implementation of GNU gettext. Unlike
:func:`gettext`, which returns strings encoded with the same codeset used in the
translation file, :func:`lgettext` will return strings encoded with the
preferred system encoding, as returned by :func:`locale.getpreferredencoding`.
Also notice that Python 2.4 introduces new functions to explicitly choose the
codeset used in translated strings. If a codeset is explicitly set, even
:func:`lgettext` will return translated strings in the requested codeset, as
would be expected in the GNU gettext implementation.


Acknowledgements
----------------

The following people contributed code, feedback, design suggestions, previous
implementations, and valuable experience to the creation of this module:

* Peter Funk

* James Henstridge

* Juan David Ibáñez Palomar

* Marc-André Lemburg

* Martin von Löwis

* François Pinard

* Barry Warsaw

* Gustavo Niemeyer

.. rubric:: Footnotes

.. [#] The default locale directory is system dependent; for example, on RedHat Linux
   it is :file:`/usr/share/locale`, but on Solaris it is :file:`/usr/lib/locale`.
   The :mod:`gettext` module does not try to support these system dependent
   defaults; instead its default is :file:`sys.prefix/share/locale`. For this
   reason, it is always best to call :func:`bindtextdomain` with an explicit
   absolute path at the start of your application.

.. [#] See the footnote for :func:`bindtextdomain` above.

.. [#] François Pinard has written a program called :program:`xpot` which does a
   similar job. It is available as part of his :program:`po-utils` package at
   http://po-utils.progiciels-bpi.ca/.

.. [#] :program:`msgfmt.py` is binary compatible with GNU :program:`msgfmt` except that
   it provides a simpler, all-Python implementation. With this and
   :program:`pygettext.py`, you generally won't need to install the GNU
   :program:`gettext` package to internationalize your Python applications.

.. [#] The choice of :func:`N_` here is totally arbitrary; it could have just as easily
   been :func:`MarkThisStringForTranslation`.
