:mod:`gettext` --- Multilingual internationalization services
=============================================================

.. module:: gettext
   :synopsis: Multilingual internationalization services.

.. moduleauthor:: Barry A. Warsaw <barry@python.org>
.. sectionauthor:: Barry A. Warsaw <barry@python.org>

**Source code:** :source:`Lib/gettext.py`

--------------

The :mod:`gettext` module provides internationalization (I18N) and localization
(L10N) services for your Python modules and applications. It supports both the
GNU ``gettext`` message catalog API and a higher level, class-based API that may
be more appropriate for Python files. The interface described below allows you
to write your module and application messages in one natural language, and
provide a catalog of translated messages for running under different natural
languages.

Some hints on localizing your Python modules and applications are also given.


GNU :program:`gettext` API
--------------------------

The :mod:`gettext` module defines the following API, which is very similar to
the GNU :program:`gettext` API. If you use this API you will affect the
translation of your entire application globally. Often this is what you want if
your application is monolingual, with the choice of language dependent on the
locale of your user. If you are localizing a Python module, or if your
application needs to switch languages on the fly, you probably want to use the
class-based API instead.


.. function:: bindtextdomain(domain, localedir=None)

   Bind the *domain* to the locale directory *localedir*. More concretely,
   :mod:`gettext` will look for binary :file:`.mo` files for the given domain using
   the path (on Unix): :file:`{localedir}/{language}/LC_MESSAGES/{domain}.mo`, where
   *language* is searched for in the environment variables :envvar:`LANGUAGE`,
   :envvar:`LC_ALL`, :envvar:`LC_MESSAGES`, and :envvar:`LANG` respectively.

   If *localedir* is omitted or ``None``, then the current binding for *domain* is
   returned. [#]_


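   As a minimal sketch of the set-then-query behavior described above (the
   domain name ``myapplication`` and the temporary directory are placeholders
   chosen for illustration)::

      import gettext
      import tempfile

      # Hypothetical locale directory; a real application would ship its own.
      localedir = tempfile.mkdtemp()

      # Passing a directory sets the binding and returns it.
      bound = gettext.bindtextdomain('myapplication', localedir)

      # Omitting the directory only queries the current binding.
      queried = gettext.bindtextdomain('myapplication')
      print(bound == queried == localedir)

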
.. function:: bind_textdomain_codeset(domain, codeset=None)

   Bind the *domain* to *codeset*, changing the encoding of byte strings
   returned by the :func:`lgettext`, :func:`ldgettext`, :func:`lngettext`
   and :func:`ldngettext` functions.
   If *codeset* is omitted, then the current binding is returned.


.. function:: textdomain(domain=None)

   Change or query the current global domain. If *domain* is ``None``, then the
   current global domain is returned, otherwise the global domain is set to
   *domain*, which is returned.


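   For illustration, a short sketch of the query and set forms, using the
   placeholder domain name ``myapplication``::

      import gettext

      # Query only: nothing changes.
      previous = gettext.textdomain()

      # Set the global domain; the new value is returned.
      current = gettext.textdomain('myapplication')
      print(current)

      # Restore the earlier global domain.
      gettext.textdomain(previous)

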
.. function:: gettext(message)

   Return the localized translation of *message*, based on the current global
   domain, language, and locale directory. This function is usually aliased as
   :func:`_` in the local namespace (see examples below).


.. function:: dgettext(domain, message)

   Like :func:`.gettext`, but look the message up in the specified *domain*.


.. function:: ngettext(singular, plural, n)

   Like :func:`.gettext`, but consider plural forms. If a translation is found,
   apply the plural formula to *n*, and return the resulting message (some
   languages have more than two plural forms). If no translation is found, return
   *singular* if *n* is 1; return *plural* otherwise.

   The plural formula is taken from the catalog header. It is a C or Python
   expression that has a free variable *n*; the expression evaluates to the index
   of the plural in the catalog. See
   `the GNU gettext documentation <https://www.gnu.org/software/gettext/manual/gettext.html>`__
   for the precise syntax to be used in :file:`.po` files and the
   formulas for a variety of languages.


.. function:: dngettext(domain, singular, plural, n)

   Like :func:`ngettext`, but look the message up in the specified *domain*.


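   The fallback rule can be sketched with a domain that has no catalog. The
   domain name ``no-such-domain`` and the empty temporary directory are
   assumptions made so that no translation is found::

      import gettext
      import tempfile

      # Bind a hypothetical domain to an empty directory so that no .mo file
      # can be found and the untranslated fallback is used.
      gettext.bindtextdomain('no-such-domain', tempfile.mkdtemp())

      results = [
          gettext.dngettext('no-such-domain', '%d file', '%d files', n) % n
          for n in (0, 1, 2)
      ]
      print(results)   # ['0 files', '1 file', '2 files']

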
.. function:: lgettext(message)
.. function:: ldgettext(domain, message)
.. function:: lngettext(singular, plural, n)
.. function:: ldngettext(domain, singular, plural, n)

   Equivalent to the corresponding functions without the ``l`` prefix
   (:func:`.gettext`, :func:`dgettext`, :func:`ngettext` and :func:`dngettext`),
   but the translation is returned as a byte string encoded in the preferred
   system encoding if no other encoding was explicitly set with
   :func:`bind_textdomain_codeset`.

   .. warning::

      These functions should be avoided in Python 3, because they return
      encoded bytes. It's much better to use alternatives which return
      Unicode strings instead, since most Python applications will want to
      manipulate human readable text as strings instead of bytes. Further,
      it's possible that you may get unexpected Unicode-related exceptions
      if there are encoding problems with the translated strings. It is
      possible that the ``l*()`` functions will be deprecated in future Python
      versions due to their inherent problems and limitations.


Note that GNU :program:`gettext` also defines a :func:`dcgettext` function, but
this was deemed not useful and so it is currently unimplemented.

Here's an example of typical usage for this API::

   import gettext
   gettext.bindtextdomain('myapplication', '/path/to/my/language/directory')
   gettext.textdomain('myapplication')
   _ = gettext.gettext
   # ...
   print(_('This is a translatable string.'))


Class-based API
---------------

The class-based API of the :mod:`gettext` module gives you more flexibility and
greater convenience than the GNU :program:`gettext` API. It is the recommended
way of localizing your Python applications and modules. :mod:`!gettext` defines
a "translations" class which implements the parsing of GNU :file:`.mo` format
files, and has methods for returning strings. Instances of this "translations"
class can also install themselves in the built-in namespace as the function
:func:`_`.


.. function:: find(domain, localedir=None, languages=None, all=False)

   This function implements the standard :file:`.mo` file search algorithm. It
   takes a *domain*, identical to what :func:`textdomain` takes. Optional
   *localedir* is as in :func:`bindtextdomain`. Optional *languages* is a list of
   strings, where each string is a language code.

   If *localedir* is not given, then the default system locale directory is used.
   [#]_ If *languages* is not given, then the following environment variables are
   searched: :envvar:`LANGUAGE`, :envvar:`LC_ALL`, :envvar:`LC_MESSAGES`, and
   :envvar:`LANG`. The first one returning a non-empty value is used for the
   *languages* variable. The environment variables should contain a colon-separated
   list of languages, which will be split on the colon to produce the expected list
   of language code strings.

   :func:`find` then expands and normalizes the languages, and then iterates
   through them, searching for an existing file built of these components:

   :file:`{localedir}/{language}/LC_MESSAGES/{domain}.mo`

   The first such file name that exists is returned by :func:`find`. If no such
   file is found, then ``None`` is returned. If *all* is true, it returns a list
   of all file names, in the order in which they appear in the languages list or
   the environment variables.


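   A small sketch of the search, using a throwaway directory tree. The domain
   ``myapplication`` and the language codes are placeholders; the empty file
   stands in for a real catalog, on the assumption (true of the current
   implementation) that :func:`find` only checks whether the file exists::

      import gettext
      import os
      import tempfile

      localedir = tempfile.mkdtemp()
      catalog_dir = os.path.join(localedir, 'fr', 'LC_MESSAGES')
      os.makedirs(catalog_dir)
      mo_path = os.path.join(catalog_dir, 'myapplication.mo')
      open(mo_path, 'wb').close()   # placeholder file, not a valid catalog

      # 'de' has no catalog here, so the search moves on to 'fr'.
      first = gettext.find('myapplication', localedir, languages=['de', 'fr'])
      every = gettext.find('myapplication', localedir,
                           languages=['de', 'fr'], all=True)
      missing = gettext.find('myapplication', localedir, languages=['de'])
      print(first == mo_path, every == [mo_path], missing is None)

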
.. function:: translation(domain, localedir=None, languages=None, class_=None, fallback=False, codeset=None)

   Return a :class:`Translations` instance based on the *domain*, *localedir*,
   and *languages*, which are first passed to :func:`find` to get a list of the
   associated :file:`.mo` file paths. Instances with identical :file:`.mo` file
   names are cached. The actual class instantiated is *class_* if
   provided, otherwise :class:`GNUTranslations`. The class's constructor must
   take a single :term:`file object` argument. If provided, *codeset* will change
   the charset used to encode translated strings in the
   :meth:`~NullTranslations.lgettext` and :meth:`~NullTranslations.lngettext`
   methods.

   If multiple files are found, later files are used as fallbacks for earlier ones.
   To allow setting the fallback, :func:`copy.copy` is used to clone each
   translation object from the cache; the actual instance data is still shared with
   the cache.

   If no :file:`.mo` file is found, this function raises :exc:`OSError` if
   *fallback* is false (which is the default), and returns a
   :class:`NullTranslations` instance if *fallback* is true.

   .. versionchanged:: 3.3
      :exc:`IOError` used to be raised instead of :exc:`OSError`.


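   The effect of *fallback* can be sketched as follows. The domain name and
   the empty temporary directory are placeholders, chosen so that no catalog
   is found::

      import gettext
      import tempfile

      empty_dir = tempfile.mkdtemp()

      error = None
      try:
          gettext.translation('myapplication', empty_dir, languages=['fr'])
      except OSError as exc:
          error = exc   # raised because no .mo file exists

      t = gettext.translation('myapplication', empty_dir, languages=['fr'],
                              fallback=True)
      print(type(t).__name__)     # NullTranslations
      print(t.gettext('Hello'))   # Hello (returned unchanged)

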
.. function:: install(domain, localedir=None, codeset=None, names=None)

   This installs the function :func:`_` in Python's builtins namespace, based on
   *domain*, *localedir*, and *codeset* which are passed to the function
   :func:`translation`.

   For the *names* parameter, please see the description of the translation
   object's :meth:`~NullTranslations.install` method.

   As seen below, you usually mark the strings in your application that are
   candidates for translation, by wrapping them in a call to the :func:`_`
   function, like this::

      print(_('This string will be translated.'))

   For convenience, you want the :func:`_` function to be installed in Python's
   builtins namespace, so it is easily accessible in all modules of your
   application.


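   A brief sketch of what the call does. ``myapplication`` and the empty
   directory are placeholders; the sketch relies on the current implementation
   falling back to :class:`NullTranslations` when no catalog is found, so the
   installed functions return their arguments unchanged::

      import gettext
      import tempfile

      gettext.install('myapplication', tempfile.mkdtemp(), names=['ngettext'])

      # Both names are now found through the builtins namespace.
      print(_('Hello'))
      print(ngettext('%d file', '%d files', 2) % 2)

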
The :class:`NullTranslations` class
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Translation classes are what actually implement the translation of original
source file message strings to translated message strings. The base class used
by all translation classes is :class:`NullTranslations`; this provides the basic
interface you can use to write your own specialized translation classes. Here
are the methods of :class:`!NullTranslations`:


.. class:: NullTranslations(fp=None)

   Takes an optional :term:`file object` *fp*, which is ignored by the base class.
   Initializes "protected" instance variables *_info* and *_charset* which are set
   by derived classes, as well as *_fallback*, which is set through
   :meth:`add_fallback`. It then calls ``self._parse(fp)`` if *fp* is not
   ``None``.

   .. method:: _parse(fp)

      A no-op in the base class. In derived classes, this method takes file
      object *fp* and reads the data from the file, initializing the message
      catalog. If you have an unsupported message catalog file format, you
      should override this method to parse your format.


   .. method:: add_fallback(fallback)

      Add *fallback* as the fallback object for the current translation object.
      A translation object should consult the fallback if it cannot provide a
      translation for a given message.


   .. method:: gettext(message)

      If a fallback has been set, forward :meth:`!gettext` to the fallback.
      Otherwise, return *message*. Overridden in derived classes.


   .. method:: ngettext(singular, plural, n)

      If a fallback has been set, forward :meth:`!ngettext` to the fallback.
      Otherwise, return *singular* if *n* is 1; return *plural* otherwise.
      Overridden in derived classes.


   .. method:: lgettext(message)
   .. method:: lngettext(singular, plural, n)

      Equivalent to :meth:`.gettext` and :meth:`.ngettext`, but the translation
      is returned as a byte string encoded in the preferred system encoding
      if no encoding was explicitly set with :meth:`set_output_charset`.
      Overridden in derived classes.

      .. warning::

         These methods should be avoided in Python 3. See the warning for the
         :func:`lgettext` function.


   .. method:: info()

      Return the "protected" :attr:`_info` variable.


   .. method:: charset()

      Return the encoding of the message catalog file.


   .. method:: output_charset()

      Return the encoding used to return translated messages in :meth:`.lgettext`
      and :meth:`.lngettext`.


   .. method:: set_output_charset(charset)

      Change the encoding used to return translated messages.


   .. method:: install(names=None)

      This method installs :meth:`.gettext` into the built-in namespace,
      binding it to ``_``.

      If the *names* parameter is given, it must be a sequence containing the
      names of functions you want to install in the builtins namespace in
      addition to :func:`_`. Supported names are ``'gettext'``, ``'ngettext'``,
      ``'lgettext'`` and ``'lngettext'``.

      Note that this is only one way, albeit the most convenient way, to make
      the :func:`_` function available to your application. Because it affects
      the entire application globally, and specifically the built-in namespace,
      localized modules should never install :func:`_`. Instead, they should use
      this code to make :func:`_` available to their module::

         import gettext
         t = gettext.translation('mymodule', ...)
         _ = t.gettext

      This puts :func:`_` only in the module's global namespace and so only
      affects calls within this module.


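As a sketch of how the base class can be extended, the hypothetical
``DictTranslations`` class below keeps its catalog in a plain dictionary and
consults the fallback for anything it cannot translate. The class name and
its constructor argument are inventions for this example, not part of the
module::

   import gettext

   class DictTranslations(gettext.NullTranslations):
       """Toy translation class backed by a dict (illustrative only)."""

       def __init__(self, catalog, fp=None):
           super().__init__(fp)
           self._catalog = dict(catalog)

       def gettext(self, message):
           if message in self._catalog:
               return self._catalog[message]
           # Defer to the base class, which forwards to the fallback if one
           # is set and otherwise returns the message unchanged.
           return super().gettext(message)

   french = DictTranslations({'Hello': 'Bonjour'})
   extra = DictTranslations({'Goodbye': 'Au revoir'})
   french.add_fallback(extra)

   print(french.gettext('Hello'))     # Bonjour
   print(french.gettext('Goodbye'))   # Au revoir (from the fallback)
   print(french.gettext('Thanks'))    # Thanks (no translation anywhere)

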
The :class:`GNUTranslations` class
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

The :mod:`gettext` module provides one additional class derived from
:class:`NullTranslations`: :class:`GNUTranslations`. This class overrides
:meth:`_parse` to enable reading GNU :program:`gettext` format :file:`.mo` files
in both big-endian and little-endian format.

:class:`GNUTranslations` parses optional meta-data out of the translation
catalog. It is convention with GNU :program:`gettext` to include meta-data as
the translation for the empty string. This meta-data is in :rfc:`822`\ -style
``key: value`` pairs, and should contain the ``Project-Id-Version`` key. If the
key ``Content-Type`` is found, then the ``charset`` property is used to
initialize the "protected" :attr:`_charset` instance variable, defaulting to
``None`` if not found. If the charset encoding is specified, then all message
ids and message strings read from the catalog are converted to Unicode using
this encoding, else ASCII encoding is assumed.

Since message ids are read as Unicode strings too, all :meth:`*gettext` methods
expect message ids to be Unicode strings, not byte strings.

The entire set of key/value pairs is placed into a dictionary and set as the
"protected" :attr:`_info` instance variable.

If the :file:`.mo` file's magic number is invalid, the major version number is
unexpected, or if other problems occur while reading the file, instantiating
:class:`GNUTranslations` can raise :exc:`OSError`.

.. class:: GNUTranslations

   The following methods are overridden from the base class implementation:

   .. method:: gettext(message)

      Look up the *message* id in the catalog and return the corresponding message
      string, as a Unicode string. If there is no entry in the catalog for the
      *message* id, and a fallback has been set, the lookup is forwarded to the
      fallback's :meth:`~NullTranslations.gettext` method. Otherwise, the
      *message* id is returned.


   .. method:: ngettext(singular, plural, n)

      Do a plural-forms lookup of a message id. *singular* is used as the message id
      for purposes of lookup in the catalog, while *n* is used to determine which
      plural form to use. The returned message string is a Unicode string.

      If the message id is not found in the catalog, and a fallback is specified,
      the request is forwarded to the fallback's :meth:`~NullTranslations.ngettext`
      method. Otherwise, when *n* is 1 *singular* is returned, and *plural* is
      returned in all other cases.

      Here is an example::

         import os
         from gettext import GNUTranslations

         n = len(os.listdir('.'))
         # somefile is a .mo catalog opened in binary mode
         cat = GNUTranslations(somefile)
         message = cat.ngettext(
             'There is %(num)d file in this directory',
             'There are %(num)d files in this directory',
             n) % {'num': n}


   .. method:: lgettext(message)
   .. method:: lngettext(singular, plural, n)

      Equivalent to :meth:`.gettext` and :meth:`.ngettext`, but the translation
      is returned as a byte string encoded in the preferred system encoding
      if no encoding was explicitly set with
      :meth:`~NullTranslations.set_output_charset`.

      .. warning::

         These methods should be avoided in Python 3. See the warning for the
         :func:`lgettext` function.


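To see the header handling in action without external tools, the sketch below
assembles a tiny little-endian :file:`.mo` image in memory and loads it. The
helper ``build_mo`` is a hypothetical function written for this example,
following the file layout described in the GNU :program:`gettext` manual, and
the messages are made up::

   import gettext
   import io
   import struct

   def build_mo(messages):
       """Return the bytes of a minimal .mo file for a dict of str -> str."""
       keys = sorted(messages)        # the empty msgid (header) sorts first
       ids = b''
       strs = b''
       spans = []
       for key in keys:
           msgid = key.encode('utf-8')
           msgstr = messages[key].encode('utf-8')
           spans.append((len(ids), len(msgid), len(strs), len(msgstr)))
           ids += msgid + b'\0'       # strings are NUL-terminated
           strs += msgstr + b'\0'

       header_size = 7 * 4            # seven 32-bit fields
       table_size = 8 * len(keys)     # one (length, offset) pair per entry
       ids_start = header_size + 2 * table_size
       strs_start = ids_start + len(ids)

       id_table = b''.join(
           struct.pack('<II', id_len, ids_start + id_off)
           for id_off, id_len, str_off, str_len in spans)
       str_table = b''.join(
           struct.pack('<II', str_len, strs_start + str_off)
           for id_off, id_len, str_off, str_len in spans)
       header = struct.pack(
           '<7I',
           0x950412de,                  # magic number
           0,                           # format version
           len(keys),                   # number of entries
           header_size,                 # offset of the msgid table
           header_size + table_size,    # offset of the msgstr table
           0, 0)                        # no hash table
       return header + id_table + str_table + ids + strs

   data = build_mo({
       '': 'Content-Type: text/plain; charset=UTF-8\n',
       'Hello': 'Salut à tous',
   })
   cat = gettext.GNUTranslations(io.BytesIO(data))

   print(cat.charset())            # UTF-8
   print(cat.gettext('Hello'))     # Salut à tous
   print(cat.gettext('Unknown'))   # Unknown

The empty message id carries the ``Content-Type`` header, which is why the
catalog can decode the non-ASCII translation.

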
Solaris message catalog support
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

The Solaris operating system defines its own binary :file:`.mo` file format, but
since no documentation can be found on this format, it is not supported at this
time.


The Catalog constructor
^^^^^^^^^^^^^^^^^^^^^^^

.. index:: single: GNOME

GNOME uses a version of the :mod:`gettext` module by James Henstridge, but this
version has a slightly different API. Its documented usage was::

   import gettext
   cat = gettext.Catalog(domain, localedir)
   _ = cat.gettext
   print(_('hello world'))

For compatibility with this older module, the function :func:`Catalog` is an
alias for the :func:`translation` function described above.

One difference between this module and Henstridge's: his catalog objects
supported access through a mapping API, but this appears to be unused and so is
not currently supported.


Internationalizing your programs and modules
--------------------------------------------

Internationalization (I18N) refers to the operation by which a program is made
aware of multiple languages. Localization (L10N) refers to the adaptation of
your program, once internationalized, to the local language and cultural habits.
In order to provide multilingual messages for your Python programs, you need to
take the following steps:

#. prepare your program or module by specially marking translatable strings

#. run a suite of tools over your marked files to generate raw message catalogs

#. create language-specific translations of the message catalogs

#. use the :mod:`gettext` module so that message strings are properly translated

In order to prepare your code for I18N, you need to look at all the strings in
your files. Any string that needs to be translated should be marked by wrapping
it in ``_('...')``, that is, a call to the function :func:`_`. For example::

   filename = 'mylog.txt'
   message = _('writing a log message')
   with open(filename, 'w') as fp:
       fp.write(message)

In this example, the string ``'writing a log message'`` is marked as a candidate
for translation, while the strings ``'mylog.txt'`` and ``'w'`` are not.

There are a few tools to extract the strings meant for translation.
The original GNU :program:`gettext` only supported C or C++ source
code but its extended version :program:`xgettext` scans code written
in a number of languages, including Python, to find strings marked as
translatable. `Babel <http://babel.pocoo.org/>`__ is a Python
internationalization library that includes a :file:`pybabel` script to
extract and compile message catalogs. François Pinard's program
called :program:`xpot` does a similar job and is available as part of
his `po-utils package <https://github.com/pinard/po-utils>`__.

(Python also includes pure-Python versions of these programs, called
:program:`pygettext.py` and :program:`msgfmt.py`; some Python distributions
will install them for you. :program:`pygettext.py` is similar to
:program:`xgettext`, but only understands Python source code and
cannot handle other programming languages such as C or C++.
:program:`pygettext.py` supports a command-line interface similar to
:program:`xgettext`; for details on its use, run ``pygettext.py --help``.
:program:`msgfmt.py` is binary compatible with GNU
:program:`msgfmt`. With these two programs, you may not need the GNU
:program:`gettext` package to internationalize your Python
applications.)

:program:`xgettext`, :program:`pygettext`, and similar tools generate
:file:`.po` files that are message catalogs. They are structured
human-readable files that contain every marked string in the source
code, along with a placeholder for the translated versions of these
strings.

Copies of these :file:`.po` files are then handed over to the
individual human translators who write translations for every
supported natural language. They send back the completed
language-specific versions as a :file:`<language-name>.po` file that's
compiled into a machine-readable :file:`.mo` binary catalog file using
the :program:`msgfmt` program. The :file:`.mo` files are used by the
:mod:`gettext` module for the actual translation processing at
run-time.

How you use the :mod:`gettext` module in your code depends on whether you are
internationalizing a single module or your entire application. The next two
sections will discuss each case.


Localizing your module
^^^^^^^^^^^^^^^^^^^^^^

If you are localizing your module, you must take care not to make global
changes, e.g. to the built-in namespace. You should not use the GNU ``gettext``
API but instead the class-based API.

Let's say your module is called "spam" and the module's various natural language
translation :file:`.mo` files reside in :file:`/usr/share/locale` in GNU
:program:`gettext` format. Here's what you would put at the top of your
module::

   import gettext
   t = gettext.translation('spam', '/usr/share/locale')
   _ = t.gettext


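If the catalogs might be absent, for example while running from a source
checkout, one option is to request a fallback so that importing the module
does not fail. This is a sketch rather than a requirement; the locale
directory below is a placeholder::

   import gettext
   import tempfile

   localedir = tempfile.mkdtemp()   # stands in for the real locale directory

   # fallback=True yields a NullTranslations object when no catalog is found,
   # so _() then returns its argument unchanged.
   t = gettext.translation('spam', localedir, fallback=True)
   _ = t.gettext

   print(_('Spam, spam, eggs and spam'))

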
Localizing your application
^^^^^^^^^^^^^^^^^^^^^^^^^^^

If you are localizing your application, you can install the :func:`_` function
globally into the built-in namespace, usually in the main driver file of your
application. This will let all your application-specific files just use
``_('...')`` without having to explicitly install it in each file.

In the simple case then, you need only add the following bit of code to the main
driver file of your application::

   import gettext
   gettext.install('myapplication')

If you need to set the locale directory, you can pass it into the
:func:`install` function::

   import gettext
   gettext.install('myapplication', '/usr/share/locale')


Changing languages on the fly
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

If your program needs to support many languages at the same time, you may want
to create multiple translation instances and then switch between them
explicitly, like so::

   import gettext

   lang1 = gettext.translation('myapplication', languages=['en'])
   lang2 = gettext.translation('myapplication', languages=['fr'])
   lang3 = gettext.translation('myapplication', languages=['de'])

   # start by using language1
   lang1.install()

   # ... time goes by, user selects language 2
   lang2.install()

   # ... more time goes by, user selects language 3
   lang3.install()


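Where rebinding the built-in :func:`_` is undesirable, the same idea can be
sketched with a dictionary of translation objects. The helper name
``translate`` is hypothetical, the temporary directory is a placeholder, and
``fallback=True`` is used so that the sketch also runs without catalogs::

   import gettext
   import tempfile

   localedir = tempfile.mkdtemp()   # placeholder for the real locale directory
   translations = {
       code: gettext.translation('myapplication', localedir,
                                 languages=[code], fallback=True)
       for code in ('en', 'fr', 'de')
   }

   def translate(code, message):
       """Translate *message* with the catalog selected by *code*."""
       return translations[code].gettext(message)

   # No French catalog exists here, so the message comes back unchanged.
   print(translate('fr', 'Hello'))

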
Deferred translations
^^^^^^^^^^^^^^^^^^^^^

In most coding situations, strings are translated where they are coded.
Occasionally however, you need to mark strings for translation, but defer actual
translation until later. A classic example is::

   animals = ['mollusk',
              'albatross',
              'rat',
              'penguin',
              'python', ]
   # ...
   for a in animals:
       print(a)

Here, you want to mark the strings in the ``animals`` list as being
translatable, but you don't actually want to translate them until they are
printed.

Here is one way you can handle this situation::

   def _(message): return message

   animals = [_('mollusk'),
              _('albatross'),
              _('rat'),
              _('penguin'),
              _('python'), ]

   del _

   # ...
   for a in animals:
       print(_(a))

This works because the dummy definition of :func:`_` simply returns the string
unchanged. And this dummy definition will temporarily override any definition
of :func:`_` in the built-in namespace (until the :keyword:`del` statement). Take
care, though, if you have a previous definition of :func:`_` in the local
namespace.

Note that the second use of :func:`_` will not identify "a" as being
translatable to the :program:`gettext` program, because the parameter
is not a string literal.

Another way to handle this is with the following example::

   def N_(message): return message

   animals = [N_('mollusk'),
              N_('albatross'),
              N_('rat'),
              N_('penguin'),
              N_('python'), ]

   # ...
   for a in animals:
       print(_(a))

In this case, you are marking translatable strings with the function
:func:`N_`, which won't conflict with any definition of :func:`_`.
However, you will need to teach your message extraction program to
look for translatable strings marked with :func:`N_`. :program:`xgettext`,
:program:`pygettext`, ``pybabel extract``, and :program:`xpot` all
support this through the use of the :option:`!-k` command-line switch.
The choice of :func:`N_` here is totally arbitrary; it could have just
as easily been :func:`MarkThisStringForTranslation`.


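Putting the second approach together, a self-contained sketch might look like
this. The dictionary-backed ``_`` stands in for a real catalog lookup and is
purely illustrative::

   def N_(message):
       """Mark *message* for extraction without translating it."""
       return message

   animals = [N_('mollusk'), N_('albatross'), N_('rat')]

   # Hypothetical stand-in for a translation function installed elsewhere.
   french = {'mollusk': 'mollusque', 'albatross': 'albatros'}

   def _(message):
       return french.get(message, message)

   # Translation happens only here, at the point of use.
   translated = [_(a) for a in animals]
   print(translated)   # ['mollusque', 'albatros', 'rat']

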
Acknowledgements
----------------

The following people contributed code, feedback, design suggestions, previous
implementations, and valuable experience to the creation of this module:

* Peter Funk

* James Henstridge

* Juan David Ibáñez Palomar

* Marc-André Lemburg

* Martin von Löwis

* François Pinard

* Barry Warsaw

* Gustavo Niemeyer

.. rubric:: Footnotes

.. [#] The default locale directory is system dependent; for example, on RedHat Linux
   it is :file:`/usr/share/locale`, but on Solaris it is :file:`/usr/lib/locale`.
   The :mod:`gettext` module does not try to support these system dependent
   defaults; instead its default is :file:`sys.prefix/share/locale`. For this
   reason, it is always best to call :func:`bindtextdomain` with an explicit
   absolute path at the start of your application.

.. [#] See the footnote for :func:`bindtextdomain` above.