\section{\module{locale} ---
         Internationalization services.}
\declaremodule{standard}{locale}

\modulesynopsis{Internationalization services.}

The \module{locale} module opens access to the \POSIX{} locale database
and functionality. The \POSIX{} locale mechanism allows programmers
to integrate certain cultural aspects into an application, without
having to know all the specifics of each country where the software
is executed.

The \module{locale} module is implemented on top of the
\module{_locale}\refbimodindex{_locale} module, which in turn uses an
ANSI \C{} locale implementation if available.

The \module{locale} module defines the following exception and
functions:

\begin{funcdesc}{setlocale}{category\optional{, value}}
If \var{value} is specified, modifies the locale setting for
\var{category}. The available categories are listed in the data
description below. The value is the name of a locale. An empty string
specifies the user's default settings. If the modification of the
locale fails, the exception \exception{Error} is
raised. If successful, the new locale setting is returned.

If no \var{value} is specified, the current setting for
\var{category} is returned.

\function{setlocale()} is not thread safe on most systems. Applications
typically start with a call to
\begin{verbatim}
import locale
locale.setlocale(locale.LC_ALL, "")
\end{verbatim}
This sets the locale for all categories to the user's default setting
(typically specified in the \code{LANG} environment variable). If the
locale is not changed thereafter, using multithreading should not
cause problems.
\end{funcdesc}

\begin{excdesc}{Error}
Exception raised when \function{setlocale()} fails.
\end{excdesc}
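As a minimal sketch of how a program might react to \exception{Error},
the following tries a locale name and falls back to the \samp{C}
locale if the request fails. The helper name \code{set_locale_or_c}
and the locale name \code{"xx_NOT_A_LOCALE"} are made up for
illustration; whether an unknown name is actually rejected depends on
the platform's \C{} library.

```python
import locale


def set_locale_or_c(name, category=locale.LC_ALL):
    """Try to select the locale `name`; fall back to "C" on failure.

    Hypothetical helper for illustration, not part of the locale module.
    """
    try:
        return locale.setlocale(category, name)
    except locale.Error:
        # The requested locale is unknown or not installed here.
        return locale.setlocale(category, "C")


# The name below is deliberately bogus; most C libraries reject it.
chosen = set_locale_or_c("xx_NOT_A_LOCALE")
```

Because the locale is a program-wide setting, a helper like this
belongs in application start-up code rather than in a library routine.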

\begin{funcdesc}{localeconv}{}
Returns the database of the local conventions as a dictionary. This
dictionary has the following strings as keys:
\begin{itemize}
\item \code{decimal_point} specifies the decimal point used in
floating point number representations for the \code{LC_NUMERIC}
category.
\item \code{grouping} is a sequence of numbers specifying at which
relative positions the \code{thousands_sep} is expected. If the
sequence is terminated with \code{locale.CHAR_MAX}, no further
grouping is performed. If the sequence terminates with a \code{0}, the last
group size is repeatedly used.
\item \code{thousands_sep} is the character used between groups.
\item \code{int_curr_symbol} specifies the international currency
symbol from the \code{LC_MONETARY} category.
\item \code{currency_symbol} is the local currency symbol.
\item \code{mon_decimal_point} is the decimal point used in monetary
values.
\item \code{mon_thousands_sep} is the separator for grouping of
monetary values.
\item \code{mon_grouping} has the same format as the \code{grouping}
key; it is used for monetary values.
\item \code{positive_sign} and \code{negative_sign} give the signs
used for positive and negative monetary quantities.
\item \code{int_frac_digits} and \code{frac_digits} specify the number
of fractional digits used in the international and local formatting
of monetary values.
\item \code{p_cs_precedes} and \code{n_cs_precedes} specify whether
the currency symbol precedes the value for positive or negative
values.
\item \code{p_sep_by_space} and \code{n_sep_by_space} specify
whether there is a space between the positive or negative value and
the currency symbol.
\item \code{p_sign_posn} and \code{n_sign_posn} indicate how the
sign should be placed for positive and negative monetary values.
\end{itemize}

The possible values for \code{p_sign_posn} and \code{n_sign_posn}
are given below.

\begin{tableii}{c|l}{code}{Value}{Explanation}
\lineii{0}{Currency and value are surrounded by parentheses.}
\lineii{1}{The sign should precede the value and currency symbol.}
\lineii{2}{The sign should follow the value and currency symbol.}
\lineii{3}{The sign should immediately precede the value.}
\lineii{4}{The sign should immediately follow the value.}
\lineii{CHAR_MAX}{Nothing is specified in this locale.}
\end{tableii}
\end{funcdesc}
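The rules for the \code{grouping} key can be sketched in a few lines.
The function \code{group_digits} below is a hypothetical helper, not
part of the module; it takes the grouping sequence and separator as
arguments (instead of reading them from \function{localeconv()}) so
that its behaviour does not depend on the locales installed on a given
system. It assumes the sequence ends with either \code{0} or
\code{locale.CHAR_MAX}, as described above.

```python
import locale


def group_digits(digits, grouping, sep):
    """Insert `sep` into a digit string following a grouping sequence.

    Hypothetical helper illustrating the `grouping` rules; group sizes
    are applied from right to left.
    """
    if not grouping or not sep:
        return digits
    groups = []  # collected right to left
    rest = digits
    size = 0
    for entry in grouping:
        if entry == locale.CHAR_MAX:
            break  # no further grouping is performed
        if entry == 0:
            # Repeat the last group size for the remaining digits.
            while size and len(rest) > size:
                groups.append(rest[-size:])
                rest = rest[:-size]
            break
        size = entry
        if len(rest) <= size:
            break
        groups.append(rest[-size:])
        rest = rest[:-size]
    groups.append(rest)
    return sep.join(reversed(groups))


repeated = group_digits("1234567", [3, 0], ",")               # "1,234,567"
mixed = group_digits("1234567", [3, 2, 0], ",")               # "12,34,567"
capped = group_digits("1234567", [3, locale.CHAR_MAX], ",")   # "1234,567"
```

In a real program the last two arguments would come from the
\code{grouping} and \code{thousands_sep} entries of the dictionary
returned by \function{localeconv()}.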

\begin{funcdesc}{strcoll}{string1, string2}
Compares two strings according to the current \constant{LC_COLLATE}
setting. Like any other comparison function, returns a negative
value, a positive value, or \code{0}, depending on whether \var{string1}
collates before or after \var{string2} or is equal to it.
\end{funcdesc}

\begin{funcdesc}{strxfrm}{string}
Transforms a string to one that can be used for the built-in function
\function{cmp()}\bifuncindex{cmp}, and still returns locale-aware
results. This function can be used when the same string is compared
repeatedly, e.g. when collating a sequence of strings.
\end{funcdesc}
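The relationship between \function{strcoll()} and \function{strxfrm()}
can be sketched as follows. The example assumes a Python version whose
\function{sorted()} accepts a \code{key} argument and which provides
\function{functools.cmp_to_key()}; neither is described in this
section. It selects the \samp{C} collation locale only so that the
result does not depend on which locales are installed.

```python
import functools
import locale

# Any installed locale name would do; "C" is always available.
locale.setlocale(locale.LC_COLLATE, "C")

words = ["pear", "Apple", "fig", "apple"]

# Pairwise comparison: strcoll() is called for every comparison made.
by_cmp = sorted(words, key=functools.cmp_to_key(locale.strcoll))

# Transform once per string, then compare the transformed strings.
by_key = sorted(words, key=locale.strxfrm)
```

Both calls should produce the same ordering; the second transforms
each string only once, which is the case \function{strxfrm()} is meant
for.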

\begin{funcdesc}{format}{format, val\optional{, grouping\code{ = 0}}}
Formats a number \var{val} according to the current
\constant{LC_NUMERIC} setting. The format follows the conventions of
the \code{\%} operator. For floating point values, the decimal point
is modified if appropriate. If \var{grouping} is true, also takes the
grouping into account.
\end{funcdesc}

\begin{funcdesc}{str}{float}
Formats a floating point number using the same format as the built-in
function \code{str(\var{float})}, but takes the decimal point into
account.
\end{funcdesc}

\begin{funcdesc}{atof}{string}
Converts a string to a floating point number, following the
\constant{LC_NUMERIC} settings.
\end{funcdesc}

\begin{funcdesc}{atoi}{string}
Converts a string to an integer, following the \constant{LC_NUMERIC}
conventions.
\end{funcdesc}
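A short sketch of the conversion functions, run under the \samp{C}
numeric locale so that the results are the same everywhere. Under a
locale whose \code{decimal_point} is a comma, the same calls would
accept and produce \code{"3,5"} instead.

```python
import locale

# Use the "C" conventions for numbers: the decimal point is a period.
locale.setlocale(locale.LC_NUMERIC, "C")

value = locale.atof("3.5")   # string -> float
count = locale.atoi("42")    # string -> integer
text = locale.str(value)     # float -> string, using the locale's decimal point
```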

\begin{datadesc}{LC_CTYPE}
\refstmodindex{string}
Locale category for the character type functions. Depending on the
settings of this category, the functions of module \module{string}
dealing with case change their behaviour.
\end{datadesc}

\begin{datadesc}{LC_COLLATE}
Locale category for sorting strings. The functions
\function{strcoll()} and \function{strxfrm()} of the \module{locale}
module are affected.
\end{datadesc}

\begin{datadesc}{LC_TIME}
Locale category for the formatting of time. The function
\function{time.strftime()} follows these conventions.
\end{datadesc}

\begin{datadesc}{LC_MONETARY}
Locale category for formatting of monetary values. The available
options can be obtained from the \function{localeconv()} function.
\end{datadesc}

\begin{datadesc}{LC_MESSAGES}
Locale category for message display. Python currently does not support
application-specific locale-aware messages. Messages displayed by the
operating system, like those returned by \function{os.strerror()},
might be affected by this category.
\end{datadesc}

\begin{datadesc}{LC_NUMERIC}
Locale category for formatting numbers. The functions
\function{format()}, \function{atoi()}, \function{atof()} and
\function{str()} of the \module{locale} module are affected by this
category. All other numeric formatting operations are not affected.
\end{datadesc}

\begin{datadesc}{LC_ALL}
Combination of all locale settings. If this flag is used when the
locale is changed, setting the locale for all categories is
attempted. If that fails for any category, no category is changed at
all. When the locale is retrieved using this flag, a string indicating
the setting for all categories is returned. This string can later be
used to restore the settings.
\end{datadesc}

\begin{datadesc}{CHAR_MAX}
This is a symbolic constant used for different values returned by
\function{localeconv()}.
\end{datadesc}

Example:

\begin{verbatim}
>>> import locale
>>> loc = locale.setlocale(locale.LC_ALL) # get current locale
>>> locale.setlocale(locale.LC_ALL, "de") # use German locale
>>> locale.strcoll("f\344n", "foo") # compare a string containing an umlaut
>>> locale.setlocale(locale.LC_ALL, "") # use user's preferred locale
>>> locale.setlocale(locale.LC_ALL, "C") # use default (C) locale
>>> locale.setlocale(locale.LC_ALL, loc) # restore saved locale
\end{verbatim}

\subsection{Background, details, hints, tips and caveats}

The C standard defines the locale as a program-wide property that may
be relatively expensive to change. On top of that, some
implementations are broken in such a way that frequent locale changes
may cause core dumps. This makes the locale somewhat painful to use
correctly.

Initially, when a program is started, the locale is the \samp{C} locale, no
matter what the user's preferred locale is. The program must
explicitly say that it wants the user's preferred locale settings by
calling \code{setlocale(LC_ALL, "")}.

It is generally a bad idea to call \function{setlocale()} in some library
routine, since as a side effect it affects the entire program. Saving
and restoring it is almost as bad: it is expensive and affects other
threads that happen to run before the settings have been restored.

If, when coding a module for general use, you need a locale
independent version of an operation that is affected by the locale
(e.g. \function{string.lower()}, or certain formats used with
\function{time.strftime()}), you will have to find a way to do it
without using the standard library routine. Even better is convincing
yourself that using locale settings is okay. Only as a last resort
should you document that your module is not compatible with
non-\samp{C} locale settings.
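One way to obtain a locale-independent operation is to spell out the
mapping yourself. The sketch below lowercases only the ASCII letters
and leaves every other character untouched. The name
\code{ascii_lower} is hypothetical, and the code assumes a Python
version that provides \code{str.maketrans()} and
\code{string.ascii_uppercase}, which this section does not describe.

```python
import string

# Translation table that maps only A-Z to a-z; built once at import time.
_ASCII_LOWER = str.maketrans(string.ascii_uppercase, string.ascii_lowercase)


def ascii_lower(text):
    """Lowercase the ASCII letters of `text`, ignoring the locale.

    Hypothetical helper for illustration.
    """
    return text.translate(_ASCII_LOWER)


plain = ascii_lower("Hello, WORLD")      # "hello, world"
accented = ascii_lower("\u00c4BC")       # the non-ASCII letter is left as is
```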

The case conversion functions in the
\module{string}\refstmodindex{string} and
\module{strop}\refbimodindex{strop} modules are affected by the locale
settings. When a call to the \function{setlocale()} function changes
the \constant{LC_CTYPE} settings, the variables
\code{string.lowercase}, \code{string.uppercase} and
\code{string.letters} (and their counterparts in \module{strop}) are
recalculated. Note that code that uses these variables through
`\keyword{from} ... \keyword{import} ...', e.g. \code{from string
import letters}, is not affected by subsequent \function{setlocale()}
calls.
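The reason is that `\keyword{from} ... \keyword{import} ...' copies
the current binding into the importing namespace. The sketch below
imitates this with a stand-in object, since the recalculated variables
are not available in every Python version; \code{fake_string} and its
\code{letters} attribute are hypothetical and only mimic a module.

```python
import types

# Stand-in for a module whose attribute is recalculated later.
fake_string = types.SimpleNamespace(letters="abc")

# Roughly what "from fake_string import letters" amounts to:
letters = fake_string.letters

# Simulated recalculation after a locale change.
fake_string.letters = "abc\u00e4"

stale = letters              # still the value captured at import time
fresh = fake_string.letters  # attribute access sees the new value
```

Code that needs the current value should therefore look the attribute
up on the module each time it is used.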

The only way to perform numeric operations according to the locale
is to use the special functions defined by this module:
\function{atof()}, \function{atoi()}, \function{format()},
\function{str()}.

\subsection{For extension writers and programs that embed Python}
\label{embedding-locale}

Extension modules should never call \function{setlocale()}, except to
find out what the current locale is. But since the return value can
only be used portably to restore it, that is not very useful (except
perhaps to find out whether or not the locale is \samp{C}).

When Python is embedded in an application, if the application sets the
locale to something specific before initializing Python, that is
generally okay, and Python will use whatever locale is set,
\emph{except} that the \constant{LC_NUMERIC} locale should always be
\samp{C}.

The \function{setlocale()} function in the \module{locale} module
gives the Python programmer the impression that you can manipulate the
\constant{LC_NUMERIC} locale setting, but this is not the case at the \C{}
level: \C{} code will always find that the \constant{LC_NUMERIC} locale
setting is \samp{C}. This is because too much would break when the
decimal point character is set to something other than a period
(e.g. the Python parser would break). Caveat: threads that run
without holding Python's global interpreter lock may occasionally find
that the numeric locale setting differs; this is because the only
portable way to implement this feature is to set the numeric locale
settings to what the user requests, extract the relevant
characteristics, and then restore the \samp{C} numeric locale.

When Python code uses the \module{locale} module to change the locale,
this also affects the embedding application. If the embedding
application doesn't want this to happen, it should remove the
\module{_locale} extension module (which does all the work) from the
table of built-in modules in the \file{config.c} file, and make sure
that the \module{_locale} module is not accessible as a shared library.
277that the \module{_locale} module is not accessible as a shared library.