\section{\module{locale} ---
         Internationalization services}

\declaremodule{standard}{locale}
\modulesynopsis{Internationalization services.}
\moduleauthor{Martin von Loewis}{loewis@informatik.hu-berlin.de}
\sectionauthor{Martin von Loewis}{loewis@informatik.hu-berlin.de}


The \module{locale} module opens access to the \POSIX{} locale database
and functionality. The \POSIX{} locale mechanism allows programmers
to deal with certain cultural issues in an application, without
requiring the programmer to know all the specifics of each country
where the software is executed.

The \module{locale} module is implemented on top of the
\module{_locale}\refbimodindex{_locale} module, which in turn uses an
ANSI C locale implementation if available.

The \module{locale} module defines the following exception and
functions:


\begin{funcdesc}{setlocale}{category\optional{, value}}
If \var{value} is specified, modifies the locale setting for the
\var{category}. The available categories are listed in the data
description below. The value is the name of a locale. An empty string
specifies the user's default settings. If the modification of the
locale fails, the exception \exception{Error} is
raised. If successful, the new locale setting is returned.

If no \var{value} is specified, the current setting for the
\var{category} is returned.

\function{setlocale()} is not thread safe on most systems. Applications
typically start with a call of
\begin{verbatim}
import locale
locale.setlocale(locale.LC_ALL, "")
\end{verbatim}
This sets the locale for all categories to the user's default setting
(typically specified in the \envvar{LANG} environment variable). If
the locale is not changed thereafter, using multithreading should not
cause problems.
\end{funcdesc}

\begin{excdesc}{Error}
Exception raised when \function{setlocale()} fails.
\end{excdesc}
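
As a hedged illustration of handling \exception{Error}, the sketch below
tries a list of locale names in turn and falls back to the portable
\samp{C} locale. The helper name \code{set_first_available()} is an
assumption made for this example and is not part of the module; which
names are accepted depends on the platform.

```python
import locale


def set_first_available(candidates, category=locale.LC_ALL):
    """Try each locale name in turn; fall back to "C" if none is accepted.

    Hypothetical helper for illustration only.
    """
    for name in candidates:
        try:
            # setlocale() returns the new setting on success.
            return locale.setlocale(category, name)
        except locale.Error:
            # This name is not supported here; try the next candidate.
            continue
    return locale.setlocale(category, "C")


print(set_first_available(["C"]))
```

Because \function{setlocale()} changes program-wide state, a helper like
this is best called once, early in the main program.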

\begin{funcdesc}{localeconv}{}
Returns the database of the local conventions as a dictionary. This
dictionary has the following strings as keys:
\begin{itemize}
\item \code{decimal_point} specifies the decimal point used in
floating point number representations for the \constant{LC_NUMERIC}
category.
\item \code{grouping} is a sequence of numbers specifying at which
relative positions the \code{thousands_sep} is expected. If the
sequence is terminated with \constant{CHAR_MAX}, no further
grouping is performed. If the sequence terminates with a \code{0}, the last
group size is repeatedly used.
\item \code{thousands_sep} is the character used between groups.
\item \code{int_curr_symbol} specifies the international currency
symbol from the \constant{LC_MONETARY} category.
\item \code{currency_symbol} is the local currency symbol.
\item \code{mon_decimal_point} is the decimal point used in monetary
values.
\item \code{mon_thousands_sep} is the separator for grouping of
monetary values.
\item \code{mon_grouping} has the same format as the \code{grouping}
key; it is used for monetary values.
\item \code{positive_sign} and \code{negative_sign} give the signs
used for positive and negative monetary quantities.
\item \code{int_frac_digits} and \code{frac_digits} specify the number
of fractional digits used in the international and local formatting
of monetary values.
\item \code{p_cs_precedes} and \code{n_cs_precedes} specify whether
the currency symbol precedes the value for positive or negative
values.
\item \code{p_sep_by_space} and \code{n_sep_by_space} specify
whether there is a space between the positive or negative value and
the currency symbol.
\item \code{p_sign_posn} and \code{n_sign_posn} indicate how the
sign should be placed for positive and negative monetary values.
\end{itemize}

The possible values for \code{p_sign_posn} and
\code{n_sign_posn} are given below.

\begin{tableii}{c|l}{code}{Value}{Explanation}
\lineii{0}{Currency and value are surrounded by parentheses.}
\lineii{1}{The sign should precede the value and currency symbol.}
\lineii{2}{The sign should follow the value and currency symbol.}
\lineii{3}{The sign should immediately precede the value.}
\lineii{4}{The sign should immediately follow the value.}
\lineii{CHAR_MAX}{Nothing is specified in this locale.}
\end{tableii}
\end{funcdesc}
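
The \code{grouping} rules described above can be sketched in a few lines.
The function \code{group_digits()} below is a hypothetical helper written
for this illustration, not something the module provides; it assumes the
list contains its terminator (\code{0} or \constant{CHAR_MAX}) and simply
stops grouping if the list ends without one.

```python
import locale


def group_digits(digits, grouping, sep):
    """Insert sep into a digit string following a localeconv()-style list.

    Hypothetical helper for illustration only.
    """
    groups = []      # groups are collected from right to left
    rest = digits
    last = 0         # most recent explicit group size
    for size in grouping:
        if size == locale.CHAR_MAX:
            break    # no further grouping is performed
        if size == 0:
            # Reuse the last group size until the digits run out.
            while last and len(rest) > last:
                groups.append(rest[-last:])
                rest = rest[:-last]
            break
        if len(rest) <= size:
            break    # not enough digits left to split off another group
        groups.append(rest[-size:])
        rest = rest[:-size]
        last = size
    groups.append(rest)
    return sep.join(reversed(groups))


print(group_digits("1234567", [3, 0], ","))
print(group_digits("1234567", [3, 2, 0], ","))
print(group_digits("1234567", [3, locale.CHAR_MAX], ","))
```

In real code the second and third arguments would normally come from the
\code{grouping} and \code{thousands_sep} entries of the dictionary
returned by \function{localeconv()}.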

\begin{funcdesc}{strcoll}{string1, string2}
Compares two strings according to the current \constant{LC_COLLATE}
setting. Like any other comparison function, returns a negative value, a
positive value, or \code{0}, depending on whether \var{string1}
collates before or after \var{string2} or is equal to it.
\end{funcdesc}

\begin{funcdesc}{strxfrm}{string}
Transforms a string to one that can be used for the built-in function
\function{cmp()}\bifuncindex{cmp}, and still returns locale-aware
results. This function can be used when the same string is compared
repeatedly, e.g. when collating a sequence of strings.
\end{funcdesc}
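
The following sketch contrasts the two collation functions when sorting.
It assumes a current Python 3 interpreter, where a sort key (or
\code{functools.cmp_to_key()}) takes the place of \function{cmp()}, and
it selects the \samp{C} locale only so that the result does not depend on
the locales installed on the machine.

```python
import functools
import locale

# Portable choice for this sketch; a real program would normally use "".
locale.setlocale(locale.LC_COLLATE, "C")

words = ["pear", "Apple", "fig", "apple"]

# Each string is transformed once; the transformed keys are then compared.
by_xfrm = sorted(words, key=locale.strxfrm)

# Same ordering, but strcoll() is called for every pairwise comparison.
by_coll = sorted(words, key=functools.cmp_to_key(locale.strcoll))

print(by_xfrm)
print(by_coll)
```

For long sequences the \function{strxfrm()} variant is usually the better
fit, since each string is transformed only once.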

\begin{funcdesc}{format}{format, val\optional{, grouping\code{ = 0}}}
Formats a number \var{val} according to the current
\constant{LC_NUMERIC} setting. The format follows the conventions of
the \code{\%} operator. For floating point values, the decimal point
is modified if appropriate. If \var{grouping} is true, also takes the
grouping into account.
\end{funcdesc}

\begin{funcdesc}{str}{float}
Formats a floating point number using the same format as the built-in
function \code{str(\var{float})}, but takes the decimal point into
account.
\end{funcdesc}

\begin{funcdesc}{atof}{string}
Converts a string to a floating point number, following the
\constant{LC_NUMERIC} settings.
\end{funcdesc}

\begin{funcdesc}{atoi}{string}
Converts a string to an integer, following the \constant{LC_NUMERIC}
conventions.
\end{funcdesc}
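
A minimal sketch of the numeric helpers follows. It assumes a current
Python 3 interpreter, where \function{format_string()} stands in for the
\function{format()} function described above (which was removed in
Python 3.12), and it uses the \samp{C} locale so that the values are the
same on every system.

```python
import locale

# Portable choice for this sketch; with "" the user's conventions apply.
locale.setlocale(locale.LC_NUMERIC, "C")

# The "C" locale defines no thousands separator, so grouping has no effect.
text = locale.format_string("%.2f", 1234.5, grouping=True)
print(text)

print(locale.str(0.5))         # float to string, locale decimal point
print(locale.atof("1234.5"))   # string to float
print(locale.atoi("42"))       # string to integer
```

Under a locale with a different decimal point or a thousands separator,
the same calls would produce and accept the localized forms instead.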

\begin{datadesc}{LC_CTYPE}
\refstmodindex{string}
Locale category for the character type functions. Depending on the
settings of this category, the functions of module \refmodule{string}
dealing with case change their behaviour.
\end{datadesc}

\begin{datadesc}{LC_COLLATE}
Locale category for sorting strings. The functions
\function{strcoll()} and \function{strxfrm()} of the \module{locale}
module are affected.
\end{datadesc}

\begin{datadesc}{LC_TIME}
Locale category for the formatting of time. The function
\function{time.strftime()} follows these conventions.
\end{datadesc}

\begin{datadesc}{LC_MONETARY}
Locale category for formatting of monetary values. The available
options are returned by the \function{localeconv()} function.
\end{datadesc}

\begin{datadesc}{LC_MESSAGES}
Locale category for message display. Python currently does not support
application specific locale-aware messages. Messages displayed by the
operating system, like those returned by \function{os.strerror()},
might be affected by this category.
\end{datadesc}

\begin{datadesc}{LC_NUMERIC}
Locale category for formatting numbers. The functions
\function{format()}, \function{atoi()}, \function{atof()} and
\function{str()} of the \module{locale} module are affected by that
category. All other numeric formatting operations are not affected.
\end{datadesc}

\begin{datadesc}{LC_ALL}
Combination of all locale settings. If this flag is used when the
locale is changed, setting the locale for all categories is
attempted. If that fails for any category, no category is changed at
all. When the locale is retrieved using this flag, a string indicating
the setting for all categories is returned. This string can be later
used to restore the settings.
\end{datadesc}
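
The save-and-restore use of \constant{LC_ALL} might look like the sketch
below. It is meant for single-threaded code only, since the change is
visible to the whole program while it is in effect; the variable names
are chosen for this example.

```python
import locale

# Query only: with no value argument, nothing is changed.
saved = locale.setlocale(locale.LC_ALL)

try:
    locale.setlocale(locale.LC_ALL, "C")       # temporary change
    inside = locale.setlocale(locale.LC_ALL)
finally:
    # The string returned for LC_ALL restores every category at once.
    locale.setlocale(locale.LC_ALL, saved)

restored = locale.setlocale(locale.LC_ALL)
print(inside)
print(restored == saved)
```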

\begin{datadesc}{CHAR_MAX}
This is a symbolic constant used for different values returned by
\function{localeconv()}.
\end{datadesc}

Example:

\begin{verbatim}
>>> import locale
>>> loc = locale.setlocale(locale.LC_ALL) # get current locale
>>> locale.setlocale(locale.LC_ALL, "de") # use German locale
>>> locale.strcoll("f\344n", "foo") # compare a string containing an umlaut
>>> locale.setlocale(locale.LC_ALL, "") # use user's preferred locale
>>> locale.setlocale(locale.LC_ALL, "C") # use default (C) locale
>>> locale.setlocale(locale.LC_ALL, loc) # restore saved locale
\end{verbatim}

\subsection{Background, details, hints, tips and caveats}

The C standard defines the locale as a program-wide property that may
be relatively expensive to change. On top of that, some
implementations are broken in such a way that frequent locale changes
may cause core dumps. This makes the locale somewhat painful to use
correctly.

Initially, when a program is started, the locale is the \samp{C} locale, no
matter what the user's preferred locale is. The program must
explicitly say that it wants the user's preferred locale settings by
calling \code{setlocale(LC_ALL, "")}.

It is generally a bad idea to call \function{setlocale()} in some library
routine, since as a side effect it affects the entire program. Saving
and restoring it is almost as bad: it is expensive and affects other
threads that happen to run before the settings have been restored.

If, when coding a module for general use, you need a locale
independent version of an operation that is affected by the locale
(e.g. \function{string.lower()}, or certain formats used with
\function{time.strftime()}), you will have to find a way to do it
without using the standard library routine. Even better is convincing
yourself that using locale settings is okay. Only as a last resort
should you document that your module is not compatible with
non-\samp{C} locale settings.
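
One way to obtain a locale independent operation is to spell out the
mapping yourself. The sketch below does this for lower-casing; it assumes
a current Python 3 interpreter, and \code{ascii_lower()} is a
hypothetical name chosen for the example.

```python
import string

# Translation table that maps only the 26 ASCII capitals to lower case.
_ASCII_LOWER = str.maketrans(string.ascii_uppercase, string.ascii_lowercase)


def ascii_lower(text):
    """Lower-case ASCII letters only, leaving all other characters alone.

    Hypothetical helper for illustration only.
    """
    return text.translate(_ASCII_LOWER)


print(ascii_lower("Content-Type"))
```

Because the table is fixed, the result cannot change when another part
of the program calls \function{setlocale()}.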

The case conversion functions in the
\refmodule{string}\refstmodindex{string} and
\module{strop}\refbimodindex{strop} modules are affected by the locale
settings. When a call to the \function{setlocale()} function changes
the \constant{LC_CTYPE} settings, the variables
\code{string.lowercase}, \code{string.uppercase} and
\code{string.letters} (and their counterparts in \module{strop}) are
recalculated. Note that code that uses these variables through
`\keyword{from} ... \keyword{import} ...', e.g. \code{from string
import letters}, is not affected by subsequent \function{setlocale()}
calls.
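
The reason is ordinary name binding, which the sketch below illustrates.
It uses a \code{types.SimpleNamespace} object named \code{fake_string}
as a hypothetical stand-in for the \module{string} module, so that it
runs on a current Python 3 interpreter where \code{string.letters} no
longer exists.

```python
import types

# Hypothetical stand-in for a module whose attribute is rebound later.
fake_string = types.SimpleNamespace(letters="abcABC")

# Same effect as "from fake_string import letters": the local name is
# bound to the object that the attribute refers to at this moment.
letters = fake_string.letters

# Simulates the recalculation triggered by a later setlocale() call.
fake_string.letters = "abc\xe4ABC\xc4"

# The local name still refers to the old value.
print(letters == fake_string.letters)
```

Code that needs the current value should therefore look the attribute up
through the module each time it is used.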

The only way to perform numeric operations according to the locale
is to use the special functions defined by this module:
\function{atof()}, \function{atoi()}, \function{format()},
\function{str()}.

\subsection{For extension writers and programs that embed Python}
\label{embedding-locale}

Extension modules should never call \function{setlocale()}, except to
find out what the current locale is. But since the return value can
only be used portably to restore it, that is not very useful (except
perhaps to find out whether or not the locale is \samp{C}).

When Python is embedded in an application, if the application sets the
locale to something specific before initializing Python, that is
generally okay, and Python will use whatever locale is set,
\emph{except} that the \constant{LC_NUMERIC} locale should always be
\samp{C}.

The \function{setlocale()} function in the \module{locale} module
gives the Python programmer the impression that you can manipulate the
\constant{LC_NUMERIC} locale setting, but this is not the case at the C
level: C code will always find that the \constant{LC_NUMERIC} locale
setting is \samp{C}. This is because too much would break when the
decimal point character is set to something other than a period
(e.g. the Python parser would break). Caveat: threads that run
without holding Python's global interpreter lock may occasionally find
that the numeric locale setting differs; this is because the only
portable way to implement this feature is to set the numeric locale
settings to what the user requests, extract the relevant
characteristics, and then restore the \samp{C} numeric locale.

When Python code uses the \module{locale} module to change the locale,
this also affects the embedding application. If the embedding
application doesn't want this to happen, it should remove the
\module{_locale} extension module (which does all the work) from the
table of built-in modules in the \file{config.c} file, and make sure
that the \module{_locale} module is not accessible as a shared library.