\section{Standard Module \sectcode{locale}}
\stmodindex{locale}

\label{module-locale}

The \code{locale} module opens access to the \POSIX{} locale database
and functionality. The \POSIX{} locale mechanism allows programmers
to deal with certain cultural conventions in an application, without
requiring them to know all the specifics of each country
where the software is executed.

The \code{locale} module is implemented on top of the \code{_locale}
module, which in turn uses an ANSI \C{} locale implementation if
available.
\refbimodindex{_locale}

The \code{locale} module defines the following functions:

\setindexsubitem{(in module locale)}

\begin{funcdesc}{setlocale}{category\optional{\, value}}
If \var{value} is specified, modifies the locale setting for the
\var{category}. The available categories are listed in the data
description below. The value is the name of a locale. An empty string
specifies the user's default settings. If the modification of the
locale fails, the exception \code{locale.Error} is
raised. If successful, the new locale setting is returned.

If no \var{value} is specified, the current setting for the
\var{category} is returned.

\code{setlocale()} is not thread safe on most systems. Applications
typically start with a call of
\begin{verbatim}
import locale
locale.setlocale(locale.LC_ALL,"")
\end{verbatim}
This sets the locale for all categories to the user's default setting
(typically specified in the \code{LANG} environment variable). If the
locale is not changed thereafter, using multithreading should not
cause problems.
\end{funcdesc}
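
A minimal sketch of a startup call that tolerates failure is shown
here. The helper name \code{try_setlocale} is hypothetical and not part
of the module; it only wraps \code{setlocale()} and turns
\code{locale.Error} into a \code{None} return value, assuming a
reasonably recent Python interpreter.

```python
import locale


def try_setlocale(category, name):
    """Set the locale for category; return the new setting, or None on failure.

    try_setlocale is a hypothetical helper, not part of the locale module.
    """
    try:
        return locale.setlocale(category, name)
    except locale.Error:
        # The requested locale is not available on this system.
        return None


# The "C" locale is always available, so this call should not fail.
print(try_setlocale(locale.LC_ALL, "C"))
```

A caller can then test the result and continue in the ``C'' locale when
the user's preferred locale cannot be set.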

\begin{funcdesc}{localeconv}{}
Returns the database of the local conventions as a dictionary. This
dictionary has the following strings as keys:
\begin{itemize}
\item \code{decimal_point} specifies the decimal point used in
floating point number representations for the \code{LC_NUMERIC}
category.
\item \code{grouping} is a sequence of numbers specifying at which
relative positions the \code{thousands_sep} is expected. If the
sequence is terminated with \code{locale.CHAR_MAX}, no further
grouping is performed. If the sequence terminates with a \code{0}, the last
group size is repeatedly used.
\item \code{thousands_sep} is the character used between groups.
\item \code{int_curr_symbol} specifies the international currency
symbol from the \code{LC_MONETARY} category.
\item \code{currency_symbol} is the local currency symbol.
\item \code{mon_decimal_point} is the decimal point used in monetary
values.
\item \code{mon_thousands_sep} is the separator for grouping of
monetary values.
\item \code{mon_grouping} has the same format as the \code{grouping}
key; it is used for monetary values.
\item \code{positive_sign} and \code{negative_sign} give the signs
used for positive and negative monetary quantities.
\item \code{int_frac_digits} and \code{frac_digits} specify the number
of fractional digits used in the international and local formatting
of monetary values.
\item \code{p_cs_precedes} and \code{n_cs_precedes} specify whether
the currency symbol precedes the value for positive or negative
values.
\item \code{p_sep_by_space} and \code{n_sep_by_space} specify
whether there is a space between the positive or negative value and
the currency symbol.
\item \code{p_sign_posn} and \code{n_sign_posn} indicate how the
sign should be placed for positive and negative monetary values.
\end{itemize}
The possible values for \code{p_sign_posn} and \code{n_sign_posn}
are given below.
\begin{itemize}
\item 0 - Currency and value are surrounded by parentheses.
\item 1 - The sign should precede the value and currency symbol.
\item 2 - The sign should follow the value and currency symbol.
\item 3 - The sign should immediately precede the value.
\item 4 - The sign should immediately follow the value.
\item \code{CHAR_MAX} - Nothing is specified in this locale.
\end{itemize}
\end{funcdesc}
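
The rules for the \code{grouping} key can be sketched as follows. The
function \code{group_digits} is a hypothetical illustration, not part
of the module, and it assumes the grouping list ends with either
\code{0} or \code{locale.CHAR_MAX} as described above; a list with
neither terminator is treated as if no further grouping were wanted.

```python
import locale


def group_digits(digits, grouping, sep):
    """Insert sep into a digit string following a localeconv()-style grouping.

    group_digits is a hypothetical helper used only for illustration.
    """
    groups = []
    rest = digits
    last = 0
    for size in grouping:
        if size == locale.CHAR_MAX:
            break                      # no further grouping is performed
        repeat = (size == 0)
        if repeat:
            size = last                # 0 means: reuse the last group size
        if size <= 0:
            break
        while len(rest) > size:
            groups.append(rest[-size:])
            rest = rest[:-size]
            if not repeat:
                break                  # a plain size is applied only once
        if repeat:
            break
        last = size
    groups.append(rest)
    return sep.join(reversed(groups))


print(group_digits("1234567", [3, 0], ","))                # 1,234,567
print(group_digits("1234567", [3, locale.CHAR_MAX], ","))  # 1234,567
print(group_digits("1234567", [3, 2, 0], ","))             # 12,34,567
```

In practice the \code{grouping} and \code{thousands_sep} arguments would
come from the dictionary returned by \code{localeconv()}.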

\begin{funcdesc}{strcoll}{string1,string2}
Compares two strings according to the current \code{LC_COLLATE}
setting. Like any other compare function, it returns a negative value,
a positive value, or \code{0}, depending on whether \var{string1}
collates before or after \var{string2} or is equal to it.
\end{funcdesc}

\begin{funcdesc}{strxfrm}{string}
Transforms a string to one that can be compared with the built-in function
\code{cmp()} while still giving locale-aware results. This function can be
used when the same string is compared repeatedly, e.g. when collating
a sequence of strings.
\end{funcdesc}
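
The difference between the two functions can be sketched with a sort.
This example assumes a Python 3 interpreter, where sorting takes a key
function rather than \code{cmp()}; the helper names are hypothetical.
The ``C'' locale is selected only so that the example runs anywhere.

```python
import functools
import locale


def sort_with_strxfrm(words):
    """Sort using strxfrm() keys: each string is transformed only once."""
    return sorted(words, key=locale.strxfrm)


def sort_with_strcoll(words):
    """Sort using pairwise strcoll() comparisons."""
    return sorted(words, key=functools.cmp_to_key(locale.strcoll))


# Use the "C" collation so the example does not depend on installed locales.
locale.setlocale(locale.LC_COLLATE, "C")
print(sort_with_strxfrm(["pear", "apple", "fig"]))
print(sort_with_strcoll(["pear", "apple", "fig"]))
```

Both helpers should produce the same order; the \code{strxfrm()} variant
avoids repeating the locale-aware work for every comparison.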

\begin{funcdesc}{format}{format, val\optional{\, grouping=0}}
Formats a number \var{val} according to the current \code{LC_NUMERIC}
setting. The format follows the conventions of the \code{\%} operator. For
floating point values, the decimal point is modified if
appropriate. If \var{grouping} is true, the grouping is also taken into
account.
\end{funcdesc}

\begin{funcdesc}{str}{float}
Formats a floating point number using the same format as the built-in
function \code{str(\var{float})}, but takes the decimal point into
account.
\end{funcdesc}

\begin{funcdesc}{atof}{string}
Converts a string to a floating point number, following the \code{LC_NUMERIC}
settings.
\end{funcdesc}

\begin{funcdesc}{atoi}{string}
Converts a string to an integer, following the \code{LC_NUMERIC} conventions.
\end{funcdesc}
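
A small round trip through the numeric functions might look like this.
The sketch assumes a current Python 3 interpreter, where the role of
\code{format()} is played by \code{locale.format_string()}; the helper
name \code{format_and_parse} is hypothetical. The ``C'' locale is used
so that the result does not depend on which locales are installed.

```python
import locale


def format_and_parse(value):
    """Format value with two decimals, then parse the text back to a float."""
    # format_string() is assumed here as the modern counterpart of format().
    text = locale.format_string("%.2f", value, grouping=True)
    return text, locale.atof(text)


locale.setlocale(locale.LC_NUMERIC, "C")
text, number = format_and_parse(1234.5)
print(text, number)
print(locale.atoi("42"), locale.str(3.14))
```

With a locale that uses a comma as the decimal point, the formatted text
would contain a comma, and \code{atof()} would accept it again.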

\begin{datadesc}{LC_CTYPE}
\refstmodindex{string}
Locale category for the character type functions. Depending on the
settings of this category, the functions of module \code{string}
dealing with case change their behaviour.
\end{datadesc}

\begin{datadesc}{LC_COLLATE}
Locale category for sorting strings. The functions \code{strcoll()} and
\code{strxfrm()} of the \code{locale} module are affected.
\end{datadesc}

\begin{datadesc}{LC_TIME}
Locale category for the formatting of time. The function
\code{time.strftime()} follows these conventions.
\end{datadesc}

\begin{datadesc}{LC_MONETARY}
Locale category for formatting of monetary values. The available
options can be obtained from the \code{localeconv()} function.
\end{datadesc}

\begin{datadesc}{LC_MESSAGES}
Locale category for message display. Python currently does not support
application-specific locale-aware messages. Messages displayed by the
operating system, like those returned by \code{posix.strerror()}, might
be affected by this category.
\end{datadesc}

\begin{datadesc}{LC_NUMERIC}
Locale category for formatting numbers. The functions
\code{format()}, \code{atoi()}, \code{atof()} and \code{str()} of the
\code{locale} module are affected by that category. All other numeric
formatting operations are not affected.
\end{datadesc}

\begin{datadesc}{LC_ALL}
Combination of all locale settings. If this flag is used when the
locale is changed, setting the locale for all categories is
attempted. If that fails for any category, no category is changed at
all. When the locale is retrieved using this flag, a string indicating
the setting for all categories is returned. This string can be later
used to restore the settings.
\end{datadesc}

\begin{datadesc}{CHAR_MAX}
This is a symbolic constant used for different values returned by
\code{localeconv()}.
\end{datadesc}

\begin{excdesc}{Error}
Exception raised when \code{setlocale()} fails.
\end{excdesc}

Example:

\begin{verbatim}
>>> import locale
>>> loc = locale.setlocale(locale.LC_ALL) # get current locale
>>> locale.setlocale(locale.LC_ALL, "de") # use German locale
>>> locale.strcoll("f\344n", "foo") # compare a string containing an umlaut
>>> locale.setlocale(locale.LC_ALL, "") # use user's preferred locale
>>> locale.setlocale(locale.LC_ALL, "C") # use default (C) locale
>>> locale.setlocale(locale.LC_ALL, loc) # restore saved locale
\end{verbatim}
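
The save and restore steps of the example can be wrapped so that the
saved setting is restored even when an exception occurs. This is only a
sketch: \code{temporary_locale} is a hypothetical helper, not part of
the module, and because the locale is a program-wide property it still
affects every thread that runs while the block is active.

```python
import contextlib
import locale


@contextlib.contextmanager
def temporary_locale(category, name):
    """Switch category to name for the duration of a with block.

    temporary_locale is a hypothetical helper used only for illustration.
    """
    saved = locale.setlocale(category)         # query the current setting
    try:
        yield locale.setlocale(category, name)
    finally:
        locale.setlocale(category, saved)      # restore, even on error


with temporary_locale(locale.LC_NUMERIC, "C") as active:
    print(active)
```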

\subsection{Background, details, hints, tips and caveats}

The C standard defines the locale as a program-wide property that may
be relatively expensive to change. On top of that, some
implementations are broken in such a way that frequent locale changes
may cause core dumps. This makes the locale somewhat painful to use
correctly.

Initially, when a program is started, the locale is the ``C'' locale, no
matter what the user's preferred locale is. The program must
explicitly say that it wants the user's preferred locale settings by
calling \code{setlocale(LC_ALL, "")}.

It is generally a bad idea to call \code{setlocale()} in some library
routine, since as a side effect it affects the entire program. Saving
and restoring it is almost as bad: it is expensive and affects other
threads that happen to run before the settings have been restored.

If, when coding a module for general use, you need a locale
independent version of an operation that is affected by the locale
(e.g. \code{string.lower()}, or certain formats used with
\code{time.strftime()}), you will have to find a way to do it without
using the standard library routine. Even better is convincing
yourself that using locale settings is okay. Only as a last resort should
you document that your module is not compatible with non-C locale
settings.

The case conversion functions in the \code{string} and \code{strop}
modules are affected by the locale settings. When a call to the
\code{setlocale()} function changes the \code{LC_CTYPE} settings, the
variables \code{string.lowercase}, \code{string.uppercase} and
\code{string.letters} (and their counterparts in \code{strop}) are
recalculated. Note that code that uses these variables through
\code{from ... import ...}, e.g. \code{from string import letters}, is
not affected by subsequent \code{setlocale()} calls.
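
The reason is that \code{from ... import ...} copies the current value
into a new name. The sketch below imitates this with a stand-in module
object; \code{fake_string} and its \code{letters} attribute are
hypothetical and only illustrate the binding behaviour, not the real
\code{string} module.

```python
import types

# Hypothetical stand-in for a module whose attribute gets recalculated.
fake_string = types.ModuleType("fake_string")
fake_string.letters = "abc"

# Equivalent in effect to: from fake_string import letters
letters = fake_string.letters

# Simulate the recalculation that a locale change would trigger.
fake_string.letters = "abc\344"

# The copied name still refers to the old value.
print(letters == fake_string.letters)
```

Code that needs the recalculated value should therefore look up the
attribute on the module each time it is used.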

The only way to perform numeric operations according to the locale
is to use the special functions defined by this module:
\code{atof()}, \code{atoi()}, \code{format()}, \code{str()}.

\subsection{For extension writers and programs that embed Python}

Extension modules should never call \code{setlocale()}, except to find
out what the current locale is. But since the return value can only
be used portably to restore it, that is not very useful (except
perhaps to find out whether or not the locale is ``C'').
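
At the Python level, the corresponding query can be sketched as
follows. The helper name \code{uses_c_locale} is hypothetical, and
accepting the name ``POSIX'' as a synonym for ``C'' is an assumption
about what some platforms may report.

```python
import locale


def uses_c_locale(category):
    """Return True if the current setting for category is the C locale.

    uses_c_locale is a hypothetical helper used only for illustration.
    """
    # Calling setlocale() without a value only queries; nothing is changed.
    return locale.setlocale(category) in ("C", "POSIX")


locale.setlocale(locale.LC_NUMERIC, "C")
print(uses_c_locale(locale.LC_NUMERIC))
```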

When Python is embedded in an application, if the application sets the
locale to something specific before initializing Python, that is
generally okay, and Python will use whatever locale is set,
\strong{except} that the \code{LC_NUMERIC} locale should always be
``C''.

The \code{setlocale()} function in the \code{locale} module
gives the Python programmer the impression that you can manipulate the
\code{LC_NUMERIC} locale setting, but this is not the case at the C
level: C code will always find that the \code{LC_NUMERIC} locale
setting is ``C''. This is because too much would break when the
decimal point character is set to something other than a period
(e.g. the Python parser would break). Caveat: threads that run
without holding Python's global interpreter lock may occasionally find
that the numeric locale setting differs; this is because the only
portable way to implement this feature is to set the numeric locale
settings to what the user requests, extract the relevant
characteristics, and then restore the ``C'' numeric locale.

When Python code uses the \code{locale} module to change the locale,
this also affects the embedding application. If the embedding
application doesn't want this to happen, it should remove the
\code{_locale} extension module (which does all the work) from the
table of built-in modules in the \code{config.c} file, and make sure
that the \code{_locale} module is not accessible as a shared library.