\section{Standard Module \module{locale}}
\stmodindex{locale}

\label{module-locale}

The \module{locale} module opens access to the \POSIX{} locale database
and functionality. The \POSIX{} locale mechanism allows programmers
to deal with certain cultural issues in an application, without
requiring the programmer to know all the specifics of each country
where the software is executed.

The \module{locale} module is implemented on top of the
\module{_locale}\refbimodindex{_locale} module, which in turn uses an
ANSI \C{} locale implementation if available.

The \module{locale} module defines the following exception and
functions:


\begin{funcdesc}{setlocale}{category\optional{, value}}
If \var{value} is specified, modifies the locale setting for the
\var{category}. The available categories are listed in the data
description below. The value is the name of a locale. An empty string
specifies the user's default settings. If the modification of the
locale fails, the exception \exception{Error} is
raised. If successful, the new locale setting is returned.

If no \var{value} is specified, the current setting for the
\var{category} is returned.

\function{setlocale()} is not thread safe on most systems. Applications
typically start with a call of
\begin{verbatim}
import locale
locale.setlocale(locale.LC_ALL, "")
\end{verbatim}
This sets the locale for all categories to the user's default setting
(typically specified in the \code{LANG} environment variable). If the
locale is not changed thereafter, using multithreading should not
cause problems.
\end{funcdesc}

\begin{excdesc}{Error}
Exception raised when \function{setlocale()} fails.
\end{excdesc}
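
A minimal sketch of handling a failed \function{setlocale()} call
follows. The helper \code{set_first_available()} is hypothetical and
not part of this module: it tries several locale names in turn and
moves on whenever \exception{Error} is raised. The locale name
\code{"de"} is only an illustration and may not exist on a given
system.

\begin{verbatim}
import locale

def set_first_available(category, candidates):
    """Try each locale name in turn (hypothetical helper).

    Returns the setting reported by setlocale() for the first
    name that works, or None if every name is rejected.
    """
    for name in candidates:
        try:
            return locale.setlocale(category, name)
        except locale.Error:
            continue    # not supported here; try the next name
    return None

# Fall back to the "C" locale if "de" is not installed.
chosen = set_first_available(locale.LC_COLLATE, ["de", "C"])
\end{verbatim}

Listing \code{"C"} last gives the loop a name that should always
succeed, so the caller does not have to handle the exception itself.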

\begin{funcdesc}{localeconv}{}
Returns the database of the local conventions as a dictionary. This
dictionary has the following strings as keys:
\begin{itemize}
\item \code{decimal_point} specifies the decimal point used in
floating point number representations for the \code{LC_NUMERIC}
category.
\item \code{grouping} is a sequence of numbers specifying at which
relative positions the \code{thousands_sep} is expected. If the
sequence is terminated with \code{locale.CHAR_MAX}, no further
grouping is performed. If the sequence terminates with a \code{0}, the last
group size is repeatedly used.
\item \code{thousands_sep} is the character used between groups.
\item \code{int_curr_symbol} specifies the international currency
symbol from the \code{LC_MONETARY} category.
\item \code{currency_symbol} is the local currency symbol.
\item \code{mon_decimal_point} is the decimal point used in monetary
values.
\item \code{mon_thousands_sep} is the separator for grouping of
monetary values.
\item \code{mon_grouping} has the same format as the \code{grouping}
key; it is used for monetary values.
\item \code{positive_sign} and \code{negative_sign} give the sign
used for positive and negative monetary quantities.
\item \code{int_frac_digits} and \code{frac_digits} specify the number
of fractional digits used in the international and local formatting
of monetary values.
\item \code{p_cs_precedes} and \code{n_cs_precedes} specify whether
the currency symbol precedes the value for positive or negative
values.
\item \code{p_sep_by_space} and \code{n_sep_by_space} specify
whether there is a space between the positive or negative value and
the currency symbol.
\item \code{p_sign_posn} and \code{n_sign_posn} indicate how the
sign should be placed for positive and negative monetary values.
\end{itemize}

The possible values for \code{p_sign_posn} and \code{n_sign_posn}
are given below.

\begin{tableii}{c|l}{code}{Value}{Explanation}
\lineii{0}{Currency and value are surrounded by parentheses.}
\lineii{1}{The sign should precede the value and currency symbol.}
\lineii{2}{The sign should follow the value and currency symbol.}
\lineii{3}{The sign should immediately precede the value.}
\lineii{4}{The sign should immediately follow the value.}
\lineii{CHAR_MAX}{Nothing is specified in this locale.}
\end{tableii}
\end{funcdesc}
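
The rules for the \code{grouping} key can be illustrated with a small
helper. This is only a sketch: \code{group_digits()} is a hypothetical
function, not part of this module, and it assumes that a sequence
which ends without a \code{0} or \code{CHAR_MAX} terminator means that
no further grouping is performed.

\begin{verbatim}
import locale

def group_digits(digits, grouping, sep):
    """Insert sep into a digit string (hypothetical helper).

    grouping has the format of localeconv()["grouping"]: group
    sizes counted from the right, where 0 repeats the last size
    and CHAR_MAX stops grouping.
    """
    groups = []     # collected from the right-hand end
    rest = digits
    size = 0
    for spec in grouping:
        if spec == locale.CHAR_MAX:
            break                   # no further grouping
        if spec == 0:
            # Reuse the last group size for the remaining digits.
            while size > 0 and len(rest) > size:
                groups.append(rest[-size:])
                rest = rest[:-size]
            break
        size = spec
        if len(rest) <= size:
            break                   # too few digits for a new group
        groups.append(rest[-size:])
        rest = rest[:-size]
    groups.append(rest)
    groups.reverse()
    return sep.join(groups)

group_digits("1234567", [3, 0], ",")                 # "1,234,567"
group_digits("1234567", [3, locale.CHAR_MAX], ",")   # "1234,567"
\end{verbatim}

In practice the second and third arguments would come from the
\code{grouping} and \code{thousands_sep} entries of the dictionary
returned by \function{localeconv()}.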

\begin{funcdesc}{strcoll}{string1, string2}
Compares two strings according to the current \constant{LC_COLLATE}
setting. Like any other compare function, returns a negative value, a
positive value, or \code{0}, depending on whether \var{string1}
collates before or after \var{string2} or is equal to it.
\end{funcdesc}

\begin{funcdesc}{strxfrm}{string}
Transforms a string to one that can be passed to the built-in function
\function{cmp()}\bifuncindex{cmp} and still yield locale-aware
results. This function can be used when the same string is compared
repeatedly, e.g. when collating a sequence of strings.
\end{funcdesc}
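
The difference between the two collation functions can be sketched as
follows. The example assumes a Python version in which
\function{sorted()} accepts a \code{key} argument and
\function{functools.cmp_to_key()} is available; neither is described
in this section. It uses the \samp{C} locale only because that locale
is always present.

\begin{verbatim}
import functools
import locale

locale.setlocale(locale.LC_COLLATE, "C")

words = ["pear", "Apple", "fig", "apple"]

# Transform each string once, then compare the transformed keys.
by_key = sorted(words, key=locale.strxfrm)

# Same ordering, but strcoll() runs for every single comparison.
by_cmp = sorted(words, key=functools.cmp_to_key(locale.strcoll))
\end{verbatim}

Both lists should come out in the same order; the first form does the
locale-dependent work once per string instead of once per comparison.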

\begin{funcdesc}{format}{format, val\optional{, grouping\code{ = 0}}}
Formats a number \var{val} according to the current
\constant{LC_NUMERIC} setting. The format follows the conventions of
the \code{\%} operator. For floating point values, the decimal point
is modified if appropriate. If \var{grouping} is true, also takes the
grouping into account.
\end{funcdesc}

\begin{funcdesc}{str}{float}
Formats a floating point number using the same format as the built-in
function \code{str(\var{float})}, but takes the decimal point into
account.
\end{funcdesc}

\begin{funcdesc}{atof}{string}
Converts a string to a floating point number, following the
\constant{LC_NUMERIC} settings.
\end{funcdesc}

\begin{funcdesc}{atoi}{string}
Converts a string to an integer, following the \constant{LC_NUMERIC}
conventions.
\end{funcdesc}
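
As a brief illustration, the conversion functions can be combined into
a round trip. The sketch selects the \samp{C} numeric locale so that
the result does not depend on which locales are installed; under
another locale the intermediate string may use a different decimal
point.

\begin{verbatim}
import locale

locale.setlocale(locale.LC_NUMERIC, "C")

text = locale.str(1234.5)     # uses the locale's decimal point
value = locale.atof(text)     # parsed with the same convention
count = locale.atoi("42")
\end{verbatim}

Because formatting and parsing follow the same
\constant{LC_NUMERIC} setting, \code{value} is again the number that
was formatted.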

\begin{datadesc}{LC_CTYPE}
\refstmodindex{string}
Locale category for the character type functions. Depending on the
settings of this category, the functions of module \module{string}
dealing with case change their behaviour.
\end{datadesc}

\begin{datadesc}{LC_COLLATE}
Locale category for sorting strings. The functions
\function{strcoll()} and \function{strxfrm()} of the \module{locale}
module are affected.
\end{datadesc}

\begin{datadesc}{LC_TIME}
Locale category for the formatting of time. The function
\function{time.strftime()} follows these conventions.
\end{datadesc}

\begin{datadesc}{LC_MONETARY}
Locale category for formatting of monetary values. The
options are available from the \function{localeconv()} function.
\end{datadesc}

\begin{datadesc}{LC_MESSAGES}
Locale category for message display. Python currently does not support
application-specific locale-aware messages. Messages displayed by the
operating system, like those returned by \function{os.strerror()},
might be affected by this category.
\end{datadesc}

\begin{datadesc}{LC_NUMERIC}
Locale category for formatting numbers. The functions
\function{format()}, \function{atoi()}, \function{atof()} and
\function{str()} of the \module{locale} module are affected by that
category. All other numeric formatting operations are not affected.
\end{datadesc}

\begin{datadesc}{LC_ALL}
Combination of all locale settings. If this flag is used when the
locale is changed, setting the locale for all categories is
attempted. If that fails for any category, no category is changed at
all. When the locale is retrieved using this flag, a string indicating
the setting for all categories is returned. This string can be later
used to restore the settings.
\end{datadesc}

\begin{datadesc}{CHAR_MAX}
This is a symbolic constant used for different values returned by
\function{localeconv()}.
\end{datadesc}

Example:

\begin{verbatim}
>>> import locale
>>> loc = locale.setlocale(locale.LC_ALL) # get current locale
>>> locale.setlocale(locale.LC_ALL, "de") # use German locale
>>> locale.strcoll("f\344n", "foo") # compare a string containing an umlaut
>>> locale.setlocale(locale.LC_ALL, "") # use user's preferred locale
>>> locale.setlocale(locale.LC_ALL, "C") # use default (C) locale
>>> locale.setlocale(locale.LC_ALL, loc) # restore saved locale
\end{verbatim}

\subsection{Background, details, hints, tips and caveats}

The C standard defines the locale as a program-wide property that may
be relatively expensive to change. On top of that, some
implementations are broken in such a way that frequent locale changes
may cause core dumps. This makes the locale somewhat painful to use
correctly.

Initially, when a program is started, the locale is the \samp{C} locale, no
matter what the user's preferred locale is. The program must
explicitly say that it wants the user's preferred locale settings by
calling \code{setlocale(LC_ALL, "")}.

It is generally a bad idea to call \function{setlocale()} in some library
routine, since as a side effect it affects the entire program. Saving
and restoring it is almost as bad: it is expensive and affects other
threads that happen to run before the settings have been restored.

If, when coding a module for general use, you need a
locale-independent version of an operation that is affected by the locale
(e.g. \function{string.lower()}, or certain formats used with
\function{time.strftime()}), you will have to find a way to do it
without using the standard library routine. Even better is convincing
yourself that using locale settings is okay. Only as a last resort
should you document that your module is not compatible with
non-\samp{C} locale settings.

The case conversion functions in the
\module{string}\refstmodindex{string} and
\module{strop}\refbimodindex{strop} modules are affected by the locale
settings. When a call to the \function{setlocale()} function changes
the \constant{LC_CTYPE} settings, the variables
\code{string.lowercase}, \code{string.uppercase} and
\code{string.letters} (and their counterparts in \module{strop}) are
recalculated. Note that code that uses these variables through
`\keyword{from} ... \keyword{import} ...', e.g. \code{from string
import letters}, is not affected by subsequent \function{setlocale()}
calls.

The only way to perform numeric operations according to the locale
is to use the special functions defined by this module:
\function{atof()}, \function{atoi()}, \function{format()},
\function{str()}.

\subsection{For extension writers and programs that embed Python}
\label{embedding-locale}

Extension modules should never call \function{setlocale()}, except to
find out what the current locale is. But since the return value can
only be used portably to restore it, that is not very useful (except
perhaps to find out whether or not the locale is \samp{C}).

When Python is embedded in an application, if the application sets the
locale to something specific before initializing Python, that is
generally okay, and Python will use whatever locale is set,
\emph{except} that the \constant{LC_NUMERIC} locale should always be
\samp{C}.

The \function{setlocale()} function in the \module{locale} module
gives the Python programmer the impression that you can manipulate the
\constant{LC_NUMERIC} locale setting, but this is not the case at the \C{}
level: \C{} code will always find that the \constant{LC_NUMERIC} locale
setting is \samp{C}. This is because too much would break when the
decimal point character is set to something other than a period
(e.g. the Python parser would break). Caveat: threads that run
without holding Python's global interpreter lock may occasionally find
that the numeric locale setting differs; this is because the only
portable way to implement this feature is to set the numeric locale
settings to what the user requests, extract the relevant
characteristics, and then restore the \samp{C} numeric locale.

When Python code uses the \module{locale} module to change the locale,
this also affects the embedding application. If the embedding
application doesn't want this to happen, it should remove the
\module{_locale} extension module (which does all the work) from the
table of built-in modules in the \file{config.c} file, and make sure
that the \module{_locale} module is not accessible as a shared library.