\section{\module{locale} ---
         Internationalization services}

\declaremodule{standard}{locale}
\modulesynopsis{Internationalization services.}


The \module{locale} module provides access to the \POSIX{} locale database
and functionality. The \POSIX{} locale mechanism allows programmers
to deal with certain cultural issues in an application, without
requiring the programmer to know all the specifics of each country
where the software is executed.

The \module{locale} module is implemented on top of the
\module{_locale}\refbimodindex{_locale} module, which in turn uses an
ANSI C locale implementation if available.

The \module{locale} module defines the following exception and
functions:


\begin{funcdesc}{setlocale}{category\optional{, value}}
If \var{value} is specified, modifies the locale setting for the
\var{category}. The available categories are listed in the data
description below. The value is the name of a locale. An empty string
specifies the user's default settings. If the modification of the
locale fails, the exception \exception{Error} is
raised. If successful, the new locale setting is returned.

If no \var{value} is specified, the current setting for the
\var{category} is returned.

\function{setlocale()} is not thread safe on most systems. Applications
typically start with a call of
\begin{verbatim}
import locale
locale.setlocale(locale.LC_ALL, "")
\end{verbatim}
This sets the locale for all categories to the user's default setting
(typically specified in the \envvar{LANG} environment variable). If
the locale is not changed thereafter, using multithreading should not
cause problems.
\end{funcdesc}

\begin{excdesc}{Error}
Exception raised when \function{setlocale()} fails.
\end{excdesc}
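
As a minimal sketch of how a failed \function{setlocale()} call might be
handled, the following tries a locale name and falls back to another
one when \exception{Error} is raised. The helper
\code{set_locale_or_fallback()} and the locale name passed to it are
hypothetical and only serve as illustration; whether a given name is
accepted depends on the platform.

\begin{verbatim}
import locale

def set_locale_or_fallback(name, fallback="C"):
    """Select `name` for all categories, or `fallback` if that fails."""
    try:
        return locale.setlocale(locale.LC_ALL, name)
    except locale.Error:
        # The requested locale is not available on this system.
        return locale.setlocale(locale.LC_ALL, fallback)

saved = locale.setlocale(locale.LC_ALL)    # no value given: query only

# Hypothetical locale name, most likely not installed anywhere.
chosen = set_locale_or_fallback("xx_XX.no-such-locale")

locale.setlocale(locale.LC_ALL, saved)     # restore the saved setting
\end{verbatim}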

\begin{funcdesc}{localeconv}{}
Returns the database of the local conventions as a dictionary. This
dictionary has the following strings as keys:
\begin{itemize}
\item \code{decimal_point} specifies the decimal point used in
floating point number representations for the \constant{LC_NUMERIC}
category.
\item \code{grouping} is a sequence of numbers specifying at which
relative positions the \code{thousands_sep} is expected. If the
sequence is terminated with \constant{CHAR_MAX}, no further
grouping is performed. If the sequence terminates with a \code{0}, the last
group size is repeatedly used.
\item \code{thousands_sep} is the character used between groups.
\item \code{int_curr_symbol} specifies the international currency
symbol from the \constant{LC_MONETARY} category.
\item \code{currency_symbol} is the local currency symbol.
\item \code{mon_decimal_point} is the decimal point used in monetary
values.
\item \code{mon_thousands_sep} is the separator for grouping of
monetary values.
\item \code{mon_grouping} has the same format as the \code{grouping}
key; it is used for monetary values.
\item \code{positive_sign} and \code{negative_sign} give the signs
used for positive and negative monetary quantities.
\item \code{int_frac_digits} and \code{frac_digits} specify the number
of fractional digits used in the international and local formatting
of monetary values.
\item \code{p_cs_precedes} and \code{n_cs_precedes} specify whether
the currency symbol precedes the value for positive or negative
values.
\item \code{p_sep_by_space} and \code{n_sep_by_space} specify
whether there is a space between the positive or negative value and
the currency symbol.
\item \code{p_sign_posn} and \code{n_sign_posn} indicate how the
sign should be placed for positive and negative monetary values.
\end{itemize}

The possible values for \code{p_sign_posn} and
\code{n_sign_posn} are given below.

\begin{tableii}{c|l}{code}{Value}{Explanation}
\lineii{0}{Currency and value are surrounded by parentheses.}
\lineii{1}{The sign should precede the value and currency symbol.}
\lineii{2}{The sign should follow the value and currency symbol.}
\lineii{3}{The sign should immediately precede the value.}
\lineii{4}{The sign should immediately follow the value.}
\lineii{CHAR_MAX}{Nothing is specified in this locale.}
\end{tableii}
\end{funcdesc}
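
The rules for the \code{grouping} sequence can be sketched in a few
lines. The helper \code{group_digits()} below is hypothetical and is
not part of the \module{locale} module; it only illustrates how a
terminating \code{0} or \constant{CHAR_MAX} is interpreted, and it
treats a sequence that simply ends as ``no further grouping''.

\begin{verbatim}
import locale

def group_digits(digits, grouping, sep):
    """Insert `sep` into a string of digits, following `grouping`."""
    if not grouping or not sep:
        return digits
    groups = []                  # collected from right to left
    last = 0
    for size in grouping:
        if size == locale.CHAR_MAX:
            break                # no further grouping is performed
        if size == 0:
            # Repeat the last group size for the remaining digits.
            while last and len(digits) > last:
                groups.append(digits[-last:])
                digits = digits[:-last]
            break
        if len(digits) <= size:
            break
        groups.append(digits[-size:])
        digits = digits[:-size]
        last = size
    groups.append(digits)        # the remainder is the leftmost group
    groups.reverse()
    return sep.join(groups)

repeated = group_digits("1234567", [3, 0], ",")                # "1,234,567"
mixed = group_digits("1234567", [3, 2, 0], ",")                # "12,34,567"
once = group_digits("1234567", [3, locale.CHAR_MAX], ",")      # "1234,567"

# With the values of the current locale:
conv = locale.localeconv()
current = group_digits("1234567", conv["grouping"], conv["thousands_sep"])
\end{verbatim}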

\begin{funcdesc}{strcoll}{string1, string2}
Compares two strings according to the current \constant{LC_COLLATE}
setting. Like any other comparison function, returns a negative value,
a positive value, or \code{0}, depending on whether \var{string1}
collates before or after \var{string2} or is equal to it.
\end{funcdesc}

\begin{funcdesc}{strxfrm}{string}
Transforms a string to one that can be passed to the built-in function
\function{cmp()}\bifuncindex{cmp} and still yield locale-aware
results. This function can be used when the same string is compared
repeatedly, e.g. when collating a sequence of strings.
\end{funcdesc}
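
The difference between the two functions shows up when sorting. The
sketch below assumes a Python version in which \function{sorted()}
accepts a \code{key} argument and \function{functools.cmp_to_key()} is
available (neither is described in this section); the word list is
made up for illustration. Both calls are expected to produce the same
order, but the second one transforms each string only once.

\begin{verbatim}
import functools
import locale

words = ["mango", "Zebra", "apple"]

# strcoll() is called once for every comparison the sort performs.
by_strcoll = sorted(words, key=functools.cmp_to_key(locale.strcoll))

# strxfrm() is called once per string; the transformed keys are then
# compared directly.
by_strxfrm = sorted(words, key=locale.strxfrm)
\end{verbatim}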

\begin{funcdesc}{format}{format, val\optional{, grouping\code{ = 0}}}
Formats a number \var{val} according to the current
\constant{LC_NUMERIC} setting. The format follows the conventions of
the \code{\%} operator. For floating point values, the decimal point
is modified if appropriate. If \var{grouping} is true, also takes the
grouping into account.
\end{funcdesc}

\begin{funcdesc}{str}{float}
Formats a floating point number using the same format as the built-in
function \code{str(\var{float})}, but takes the decimal point into
account.
\end{funcdesc}

\begin{funcdesc}{atof}{string}
Converts a string to a floating point number, following the
\constant{LC_NUMERIC} settings.
\end{funcdesc}

\begin{funcdesc}{atoi}{string}
Converts a string to an integer, following the \constant{LC_NUMERIC}
conventions.
\end{funcdesc}
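
A short sketch of the conversion functions follows. To keep the
results predictable it selects the \samp{C} numeric locale, which is
an assumption made only for this illustration; under another locale
the accepted and produced decimal point may differ.

\begin{verbatim}
import locale

saved = locale.setlocale(locale.LC_NUMERIC)   # remember current setting
locale.setlocale(locale.LC_NUMERIC, "C")

as_float = locale.atof("3.14")    # 3.14
as_int = locale.atoi("42")        # 42
as_text = locale.str(3.14)        # '3.14'

locale.setlocale(locale.LC_NUMERIC, saved)    # restore it
\end{verbatim}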

\begin{datadesc}{LC_CTYPE}
\refstmodindex{string}
Locale category for the character type functions. Depending on the
settings of this category, the functions of module \refmodule{string}
dealing with case change their behaviour.
\end{datadesc}

\begin{datadesc}{LC_COLLATE}
Locale category for sorting strings. The functions
\function{strcoll()} and \function{strxfrm()} of the \module{locale}
module are affected.
\end{datadesc}

\begin{datadesc}{LC_TIME}
Locale category for the formatting of time. The function
\function{time.strftime()} follows these conventions.
\end{datadesc}

\begin{datadesc}{LC_MONETARY}
Locale category for formatting of monetary values. The available
options can be obtained from the \function{localeconv()} function.
\end{datadesc}

\begin{datadesc}{LC_MESSAGES}
Locale category for message display. Python currently does not support
application-specific locale-aware messages. Messages displayed by the
operating system, like those returned by \function{os.strerror()},
might be affected by this category.
\end{datadesc}

\begin{datadesc}{LC_NUMERIC}
Locale category for formatting numbers. The functions
\function{format()}, \function{atoi()}, \function{atof()} and
\function{str()} of the \module{locale} module are affected by that
category. All other numeric formatting operations are not affected.
\end{datadesc}

\begin{datadesc}{LC_ALL}
Combination of all locale settings. If this flag is used when the
locale is changed, setting the locale for all categories is
attempted. If that fails for any category, no category is changed at
all. When the locale is retrieved using this flag, a string indicating
the setting for all categories is returned. This string can be later
used to restore the settings.
\end{datadesc}

\begin{datadesc}{CHAR_MAX}
This is a symbolic constant used for different values returned by
\function{localeconv()}.
\end{datadesc}

Example:

\begin{verbatim}
>>> import locale
>>> loc = locale.setlocale(locale.LC_ALL) # get current locale
>>> locale.setlocale(locale.LC_ALL, "de") # use German locale
>>> locale.strcoll("f\344n", "foo") # compare a string containing an umlaut
>>> locale.setlocale(locale.LC_ALL, "") # use user's preferred locale
>>> locale.setlocale(locale.LC_ALL, "C") # use default (C) locale
>>> locale.setlocale(locale.LC_ALL, loc) # restore saved locale
\end{verbatim}

\subsection{Background, details, hints, tips and caveats}

The C standard defines the locale as a program-wide property that may
be relatively expensive to change. On top of that, some
implementations are broken in such a way that frequent locale changes
may cause core dumps. This makes the locale somewhat painful to use
correctly.

Initially, when a program is started, the locale is the \samp{C} locale, no
matter what the user's preferred locale is. The program must
explicitly say that it wants the user's preferred locale settings by
calling \code{setlocale(LC_ALL, "")}.

It is generally a bad idea to call \function{setlocale()} in some library
routine, since as a side effect it affects the entire program. Saving
and restoring it is almost as bad: it is expensive and affects other
threads that happen to run before the settings have been restored.

If, when coding a module for general use, you need a locale
independent version of an operation that is affected by the locale
(e.g. \function{string.lower()}, or certain formats used with
\function{time.strftime()}), you will have to find a way to do it
without using the standard library routine. Even better is convincing
yourself that using locale settings is okay. Only as a last resort
should you document that your module is not compatible with
non-\samp{C} locale settings.

The case conversion functions in the
\refmodule{string}\refstmodindex{string} and
\module{strop}\refbimodindex{strop} modules are affected by the locale
settings. When a call to the \function{setlocale()} function changes
the \constant{LC_CTYPE} settings, the variables
\code{string.lowercase}, \code{string.uppercase} and
\code{string.letters} (and their counterparts in \module{strop}) are
recalculated. Note that code that uses these variables through
`\keyword{from} ... \keyword{import} ...', e.g. \code{from string
import letters}, is not affected by subsequent \function{setlocale()}
calls.
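
The reason is ordinary name binding, which the following sketch tries
to show without touching the locale at all. The module
\code{fake_string} is a hypothetical stand-in created on the fly; it
is not the real \module{string} module, and the values assigned to
its \code{letters} variable are made up.

\begin{verbatim}
import sys
import types

# Hypothetical stand-in for a module that recalculates a variable.
fake_string = types.ModuleType("fake_string")
fake_string.letters = "abc"
sys.modules["fake_string"] = fake_string    # make it importable

from fake_string import letters    # binds a name to the current value
import fake_string as mod          # keeps a reference to the module

# The module "recalculates" its variable, as after setlocale().
fake_string.letters = "abc\344"

via_module = mod.letters    # sees the recalculated value
via_copy = letters          # still the value bound at import time
\end{verbatim}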

The only way to perform numeric operations according to the locale
is to use the special functions defined by this module:
\function{atof()}, \function{atoi()}, \function{format()},
\function{str()}.

\subsection{For extension writers and programs that embed Python}
\label{embedding-locale}

Extension modules should never call \function{setlocale()}, except to
find out what the current locale is. But since the return value can
only be used portably to restore it, that is not very useful (except
perhaps to find out whether or not the locale is \samp{C}).

When Python is embedded in an application, if the application sets the
locale to something specific before initializing Python, that is
generally okay, and Python will use whatever locale is set,
\emph{except} that the \constant{LC_NUMERIC} locale should always be
\samp{C}.

The \function{setlocale()} function in the \module{locale} module
gives the Python programmer the impression that you can manipulate the
\constant{LC_NUMERIC} locale setting, but this is not the case at the C
level: C code will always find that the \constant{LC_NUMERIC} locale
setting is \samp{C}. This is because too much would break when the
decimal point character is set to something other than a period
(e.g. the Python parser would break). Caveat: threads that run
without holding Python's global interpreter lock may occasionally find
that the numeric locale setting differs; this is because the only
portable way to implement this feature is to set the numeric locale
settings to what the user requests, extract the relevant
characteristics, and then restore the \samp{C} numeric locale.

When Python code uses the \module{locale} module to change the locale,
this also affects the embedding application. If the embedding
application doesn't want this to happen, it should remove the
\module{_locale} extension module (which does all the work) from the
table of built-in modules in the \file{config.c} file, and make sure
that the \module{_locale} module is not accessible as a shared library.