\section{\module{locale} ---
         Internationalization services}

\declaremodule{standard}{locale}
\modulesynopsis{Internationalization services.}
\moduleauthor{Martin von L\"owis}{martin@v.loewis.de}
\sectionauthor{Martin von L\"owis}{martin@v.loewis.de}


The \module{locale} module opens access to the \POSIX{} locale
database and functionality. The \POSIX{} locale mechanism allows
programmers to deal with certain cultural issues in an application,
without requiring the programmer to know all the specifics of each
country where the software is executed.

The \module{locale} module is implemented on top of the
\module{_locale}\refbimodindex{_locale} module, which in turn uses an
ANSI C locale implementation if available.

The \module{locale} module defines the following exception and
functions:


\begin{excdesc}{Error}
  Exception raised when \function{setlocale()} fails.
\end{excdesc}

\begin{funcdesc}{setlocale}{category\optional{, locale}}
  If \var{locale} is specified, it may be a string, a tuple of the
  form \code{(\var{language code}, \var{encoding})}, or \code{None}.
  If it is a tuple, it is converted to a string using the locale
  aliasing engine. If \var{locale} is given and not \code{None},
  \function{setlocale()} modifies the locale setting for the
  \var{category}. The available categories are listed in the data
  description below. The value is the name of a locale. An empty
  string specifies the user's default settings. If the modification of
  the locale fails, the exception \exception{Error} is raised. If
  successful, the new locale setting is returned.

  If \var{locale} is omitted or \code{None}, the current setting for
  \var{category} is returned.

  \function{setlocale()} is not thread safe on most systems.
  Applications typically start with a call of

\begin{verbatim}
import locale
locale.setlocale(locale.LC_ALL, '')
\end{verbatim}

  This sets the locale for all categories to the user's default
  setting (typically specified in the \envvar{LANG} environment
  variable). If the locale is not changed thereafter, using
  multithreading should not cause problems.

  \versionchanged[Added support for tuple values of the \var{locale}
                  parameter]{2.0}
\end{funcdesc}
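The calling conventions of \function{setlocale()} can be illustrated with
a minimal sketch. The helper name \function{try_setlocale()} is
hypothetical and not part of the module; the sketch assumes only that the
portable \code{'C'} locale is available.

```python
import locale

def try_setlocale(category, name):
    """Set the locale for category; return the new setting, or None on failure."""
    try:
        return locale.setlocale(category, name)
    except locale.Error:
        # The requested locale is not installed or not supported.
        return None

# Query the current setting without changing it (locale argument omitted).
saved = locale.setlocale(locale.LC_ALL)

# Switch every category to the portable 'C' locale.
result = try_setlocale(locale.LC_ALL, 'C')

# Restore the saved setting; the string returned by a query is valid input.
locale.setlocale(locale.LC_ALL, saved)
```

Returning \code{None} rather than propagating \exception{Error} is a
design choice of this sketch; callers that cannot continue without the
requested locale may prefer to let the exception propagate.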

\begin{funcdesc}{localeconv}{}
  Returns the database of the local conventions as a dictionary.
  This dictionary has the following strings as keys:

  \begin{tableiii}{l|l|p{3in}}{constant}{Category}{Key}{Meaning}
    \lineiii{LC_NUMERIC}{\code{'decimal_point'}}
            {Decimal point character.}
    \lineiii{}{\code{'grouping'}}
            {Sequence of numbers specifying the relative positions at
             which the \code{'thousands_sep'} is expected. If the
             sequence is terminated with \constant{CHAR_MAX}, no further
             grouping is performed. If the sequence terminates with a
             \code{0}, the last group size is repeatedly used.}
    \lineiii{}{\code{'thousands_sep'}}
            {Character used between groups.}\hline
    \lineiii{LC_MONETARY}{\code{'int_curr_symbol'}}
            {International currency symbol.}
    \lineiii{}{\code{'currency_symbol'}}
            {Local currency symbol.}
    \lineiii{}{\code{'mon_decimal_point'}}
            {Decimal point used for monetary values.}
    \lineiii{}{\code{'mon_thousands_sep'}}
            {Group separator used for monetary values.}
    \lineiii{}{\code{'mon_grouping'}}
            {Equivalent to \code{'grouping'}, used for monetary
             values.}
    \lineiii{}{\code{'positive_sign'}}
            {Symbol used to annotate a positive monetary value.}
    \lineiii{}{\code{'negative_sign'}}
            {Symbol used to annotate a negative monetary value.}
    \lineiii{}{\code{'frac_digits'}}
            {Number of fractional digits used in local formatting
             of monetary values.}
    \lineiii{}{\code{'int_frac_digits'}}
            {Number of fractional digits used in international
             formatting of monetary values.}
  \end{tableiii}

  The possible values for \code{'p_sign_posn'} and
  \code{'n_sign_posn'} are given below.

  \begin{tableii}{c|l}{code}{Value}{Explanation}
    \lineii{0}{Currency and value are surrounded by parentheses.}
    \lineii{1}{The sign should precede the value and currency symbol.}
    \lineii{2}{The sign should follow the value and currency symbol.}
    \lineii{3}{The sign should immediately precede the value.}
    \lineii{4}{The sign should immediately follow the value.}
    \lineii{\constant{CHAR_MAX}}{Nothing is specified in this locale.}
  \end{tableii}
\end{funcdesc}
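The rules for the \code{'grouping'} sequence are easiest to follow in
code. The following is an illustrative sketch only, not the module's
implementation; the function name \function{group_digits()} and the
sample grouping lists are assumptions made for the example.

```python
import locale

def group_digits(digits, grouping, sep):
    """Insert sep into a digit string following a localeconv()-style grouping list."""
    groups = []          # collected from the right-hand end of the number
    last = 0             # most recent non-zero group size
    sizes = iter(grouping)
    while digits:
        # An exhausted list is treated like CHAR_MAX: stop grouping.
        size = next(sizes, locale.CHAR_MAX)
        if size == locale.CHAR_MAX:
            break
        if size == 0:
            # 0 means: repeat the last group size for all remaining digits.
            while last and len(digits) > last:
                groups.append(digits[-last:])
                digits = digits[:-last]
            break
        if len(digits) <= size:
            break
        groups.append(digits[-size:])
        digits = digits[:-size]
        last = size
    groups.append(digits)            # whatever is left forms the leading group
    return sep.join(reversed(groups))

# Hypothetical grouping lists; real ones come from localeconv()['grouping'].
repeated = group_digits('1234567', [3, 0], ',')
mixed = group_digits('1234567', [3, 2, 0], ',')
once = group_digits('1234567', [3, locale.CHAR_MAX], ',')
```

With these inputs, \code{[3, 0]} groups by threes throughout,
\code{[3, 2, 0]} uses one group of three followed by groups of two, and
\code{[3, CHAR_MAX]} separates only the last three digits.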

\begin{funcdesc}{nl_langinfo}{option}

Return some locale-specific information as a string. This function is
not available on all systems, and the set of possible options might
also vary across platforms. The possible argument values are numbers,
for which symbolic constants are available in the \module{locale} module.

\end{funcdesc}
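Because \function{nl_langinfo()} may be missing, portable code has to
test for it before use. A minimal sketch, in which the helper name
\function{codeset_or_none()} is hypothetical:

```python
import locale

def codeset_or_none():
    """Return nl_langinfo(CODESET), or None where the function is unavailable."""
    if hasattr(locale, 'nl_langinfo') and hasattr(locale, 'CODESET'):
        return locale.nl_langinfo(locale.CODESET)
    return None

encoding_name = codeset_or_none()
```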

\begin{funcdesc}{getdefaultlocale}{\optional{envvars}}
  Tries to determine the default locale settings and returns
  them as a tuple of the form \code{(\var{language code},
  \var{encoding})}.

  According to \POSIX, a program which has not called
  \code{setlocale(LC_ALL, '')} runs using the portable \code{'C'}
  locale. Calling \code{setlocale(LC_ALL, '')} lets it use the
  default locale as defined by the \envvar{LANG} variable. Since this
  function must not interfere with the current locale setting, it
  emulates that behavior instead of calling \function{setlocale()}.

  To maintain compatibility with other platforms, not only the
  \envvar{LANG} variable is tested, but a list of variables given as
  the \var{envvars} parameter. The first one found to be defined will
  be used. \var{envvars} defaults to the search path used in GNU gettext;
  it must always contain the variable name \samp{LANG}. The GNU
  gettext search path contains \code{'LANGUAGE'}, \code{'LC_ALL'},
  \code{'LC_CTYPE'}, and \code{'LANG'}, in that order.

  Except for the code \code{'C'}, the language code corresponds to
  \rfc{1766}. \var{language code} and \var{encoding} may be
  \code{None} if their values cannot be determined.
  \versionadded{2.0}
\end{funcdesc}
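The search over environment variables can be pictured with the
following simplified sketch. It is not the module's implementation: the
name \function{first_defined()} is hypothetical, an empty value is
assumed to count as undefined, and the parsing of the value into a
language code and an encoding is left out.

```python
import os

# Search order of the GNU gettext path, as listed in the description.
GETTEXT_ENVVARS = ('LANGUAGE', 'LC_ALL', 'LC_CTYPE', 'LANG')

def first_defined(envvars=GETTEXT_ENVVARS, environ=None):
    """Return (name, value) of the first variable that is set, else (None, None)."""
    if environ is None:
        environ = os.environ
    for name in envvars:
        value = environ.get(name)
        if value:                      # assumption: empty counts as undefined
            return name, value
    return None, None

# LC_CTYPE comes before LANG in the search path, so it wins here.
example = first_defined(environ={'LANG': 'de_DE.UTF-8',
                                 'LC_CTYPE': 'en_US.UTF-8'})
```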

\begin{funcdesc}{getlocale}{\optional{category}}
  Returns the current setting for the given locale category as a
  sequence containing \var{language code}, \var{encoding}.
  \var{category} may be one of the \constant{LC_*} values except
  \constant{LC_ALL}. It defaults to \constant{LC_CTYPE}.

  Except for the code \code{'C'}, the language code corresponds to
  \rfc{1766}. \var{language code} and \var{encoding} may be
  \code{None} if their values cannot be determined.
  \versionadded{2.0}
\end{funcdesc}

\begin{funcdesc}{getpreferredencoding}{\optional{do_setlocale}}
  Return the encoding used for text data, according to user
  preferences. User preferences are expressed differently on
  different systems, and might not be available programmatically on
  some systems, so this function only returns a guess.

  On some systems, it is necessary to invoke \function{setlocale()}
  to obtain the user preferences, so this function is not thread-safe.
  If invoking \function{setlocale()} is not necessary or desired,
  \var{do_setlocale} should be set to \code{False}.

  \versionadded{2.3}
\end{funcdesc}

\begin{funcdesc}{normalize}{localename}
  Returns a normalized locale code for the given locale name. The
  returned locale code is formatted for use with
  \function{setlocale()}. If normalization fails, the original name
  is returned unchanged.

  If the given encoding is not known, the function defaults to
  the default encoding for the locale code just like
  \function{setlocale()}.
  \versionadded{2.0}
\end{funcdesc}
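A short sketch of both outcomes of \function{normalize()}. The exact
encoding suffix chosen for \code{'de_DE'} depends on the aliasing table
and is therefore not shown; the name \code{'not-a-locale'} is a made-up
value assumed to be absent from that table.

```python
import locale

# A bare language code is completed with the default encoding for it.
normalized = locale.normalize('de_DE')

# A name the aliasing table does not know is returned unchanged.
unknown = locale.normalize('not-a-locale')
```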

\begin{funcdesc}{resetlocale}{\optional{category}}
  Sets the locale for \var{category} to the default setting.

  The default setting is determined by calling
  \function{getdefaultlocale()}. \var{category} defaults to
  \constant{LC_ALL}.
  \versionadded{2.0}
\end{funcdesc}

\begin{funcdesc}{strcoll}{string1, string2}
  Compares two strings according to the current
  \constant{LC_COLLATE} setting. Like any other comparison function,
  it returns a negative value, a positive value, or \code{0},
  depending on whether \var{string1} collates before or after
  \var{string2} or is equal to it.
\end{funcdesc}

\begin{funcdesc}{strxfrm}{string}
  Transforms a string to one that can be passed to the built-in
  function \function{cmp()}\bifuncindex{cmp} while still yielding
  locale-aware results. This function can be used when the same
  string is compared repeatedly, e.g. when collating a sequence of
  strings.
\end{funcdesc}
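The difference between the two collation functions shows up when
sorting. The sketch below assumes a Python version in which
\function{sorted()} accepts a \var{key} argument and
\function{functools.cmp_to_key()} exists; the \code{'C'} locale is used
only because it is always available, and the word list is invented.

```python
import functools
import locale

locale.setlocale(locale.LC_COLLATE, 'C')   # assumption: any installed locale works

words = ['banana', 'apple', 'Cherry']

# Transform each string once, then let ordinary comparison do the work.
by_xfrm = sorted(words, key=locale.strxfrm)

# Same ordering, but strcoll() is called for every pairwise comparison.
by_coll = sorted(words, key=functools.cmp_to_key(locale.strcoll))
```

Both calls produce the same order; the \function{strxfrm()} variant
does the locale-dependent work once per string rather than once per
comparison.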

\begin{funcdesc}{format}{format, val\optional{, grouping}}
  Formats a number \var{val} according to the current
  \constant{LC_NUMERIC} setting. The format follows the conventions
  of the \code{\%} operator. For floating point values, the decimal
  point is modified if appropriate. If \var{grouping} is true, also
  takes the grouping into account.
\end{funcdesc}

\begin{funcdesc}{str}{float}
  Formats a floating point number using the same format as the
  built-in function \code{str(\var{float})}, but takes the decimal
  point into account.
\end{funcdesc}

\begin{funcdesc}{atof}{string}
  Converts a string to a floating point number, following the
  \constant{LC_NUMERIC} settings.
\end{funcdesc}

\begin{funcdesc}{atoi}{string}
  Converts a string to an integer, following the
  \constant{LC_NUMERIC} conventions.
\end{funcdesc}
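A small round trip through the numeric helpers, sketched under the
assumption that \constant{LC_NUMERIC} is set to the \code{'C'} locale
so that the decimal point is a period. With another locale, the
intermediate string would use that locale's decimal point instead.

```python
import locale

locale.setlocale(locale.LC_NUMERIC, 'C')   # decimal point is '.' in the 'C' locale

text = locale.str(3.14)      # float -> string using the locale's decimal point
value = locale.atof(text)    # locale-formatted string -> float
count = locale.atoi('42')    # locale-formatted string -> integer
```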

\begin{datadesc}{LC_CTYPE}
\refstmodindex{string}
  Locale category for the character type functions. Depending on the
  settings of this category, the functions of module
  \refmodule{string} dealing with case change their behaviour.
\end{datadesc}

\begin{datadesc}{LC_COLLATE}
  Locale category for sorting strings. The functions
  \function{strcoll()} and \function{strxfrm()} of the
  \module{locale} module are affected.
\end{datadesc}

\begin{datadesc}{LC_TIME}
  Locale category for the formatting of time. The function
  \function{time.strftime()} follows these conventions.
\end{datadesc}

\begin{datadesc}{LC_MONETARY}
  Locale category for formatting of monetary values. The available
  options are returned by the \function{localeconv()} function.
\end{datadesc}

\begin{datadesc}{LC_MESSAGES}
  Locale category for message display. Python currently does not
  support application-specific locale-aware messages. Messages
  displayed by the operating system, like those returned by
  \function{os.strerror()}, might be affected by this category.
\end{datadesc}

\begin{datadesc}{LC_NUMERIC}
  Locale category for formatting numbers. The functions
  \function{format()}, \function{atoi()}, \function{atof()} and
  \function{str()} of the \module{locale} module are affected by this
  category. All other numeric formatting operations are not
  affected.
\end{datadesc}

\begin{datadesc}{LC_ALL}
  Combination of all locale settings. If this flag is used when the
  locale is changed, setting the locale for all categories is
  attempted. If that fails for any category, no category is changed at
  all. When the locale is retrieved using this flag, a string
  indicating the setting for all categories is returned. This string
  can be later used to restore the settings.
\end{datadesc}

\begin{datadesc}{CHAR_MAX}
  This is a symbolic constant used for different values returned by
  \function{localeconv()}.
\end{datadesc}

The \function{nl_langinfo()} function accepts one of the following keys.
Most descriptions are taken from the corresponding description in the
GNU C library.

\begin{datadesc}{CODESET}
Return a string with the name of the character encoding used in the
selected locale.
\end{datadesc}

\begin{datadesc}{D_T_FMT}
Return a string that can be used as a format string for
\cfunction{strftime()} to represent time and date in a locale-specific
way.
\end{datadesc}

\begin{datadesc}{D_FMT}
Return a string that can be used as a format string for
\cfunction{strftime()} to represent a date in a locale-specific way.
\end{datadesc}

\begin{datadesc}{T_FMT}
Return a string that can be used as a format string for
\cfunction{strftime()} to represent a time in a locale-specific way.
\end{datadesc}

\begin{datadesc}{T_FMT_AMPM}
Return a string that can be used as a format string for
\cfunction{strftime()} to represent time in the am/pm format.
\end{datadesc}

\begin{datadesc}{DAY_1 ... DAY_7}
Return name of the n-th day of the week. \warning{This
follows the US convention of \constant{DAY_1} being Sunday, not the
international convention (ISO 8601) that Monday is the first day of
the week.}
\end{datadesc}
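The numbering of the \constant{DAY_*} keys can be checked with a short
sketch. The helper name \function{weekday_names()} is hypothetical, and
\constant{LC_TIME} is set to the \code{'C'} locale only so that the
names are English on every system.

```python
import locale

def weekday_names():
    """Return the seven full day names, DAY_1 first, or None if unsupported."""
    if not hasattr(locale, 'nl_langinfo'):
        return None
    keys = [getattr(locale, 'DAY_%d' % n) for n in range(1, 8)]
    return [locale.nl_langinfo(key) for key in keys]

locale.setlocale(locale.LC_TIME, 'C')
names = weekday_names()   # names[0] corresponds to DAY_1, i.e. Sunday
```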

\begin{datadesc}{ABDAY_1 ... ABDAY_7}
Return abbreviated name of the n-th day of the week.
\end{datadesc}

\begin{datadesc}{MON_1 ... MON_12}
Return name of the n-th month.
\end{datadesc}

\begin{datadesc}{ABMON_1 ... ABMON_12}
Return abbreviated name of the n-th month.
\end{datadesc}

\begin{datadesc}{RADIXCHAR}
Return radix character (decimal dot, decimal comma, etc.).
\end{datadesc}

\begin{datadesc}{THOUSEP}
Return separator character for thousands (groups of three digits).
\end{datadesc}

\begin{datadesc}{YESEXPR}
Return a regular expression that can be used with the
\cfunction{regex()} function to recognize a positive response to a
yes/no question.
\warning{The expression is in the syntax suitable for the
\cfunction{regex()} function from the C library, which might differ
from the syntax used in \refmodule{re}.}
\end{datadesc}

\begin{datadesc}{NOEXPR}
Return a regular expression that can be used with the
\cfunction{regex()} function to recognize a negative response to a
yes/no question.
\end{datadesc}

\begin{datadesc}{CRNCYSTR}
Return the currency symbol, preceded by \code{"-"} if the symbol should
appear before the value, \code{"+"} if the symbol should appear after the
value, or \code{"."} if the symbol should replace the radix character.
\end{datadesc}

\begin{datadesc}{ERA}
The return value represents the era used in the current locale.

Most locales do not define this value. An example of a locale which
does define this value is the Japanese one. In Japan, the traditional
representation of dates includes the name of the era corresponding to
the then-emperor's reign.

Normally it should not be necessary to use this value directly.
Specifying the \code{E} modifier in a format string causes the
\function{strftime()} function to use this information. The format of
the returned string is not specified, and therefore you should not
assume knowledge of it on different systems.
\end{datadesc}

\begin{datadesc}{ERA_YEAR}
The return value gives the year in the relevant era of the locale.
\end{datadesc}

\begin{datadesc}{ERA_D_T_FMT}
This return value can be used as a format string for
\function{strftime()} to represent dates and times in a locale-specific
era-based way.
\end{datadesc}

\begin{datadesc}{ERA_D_FMT}
This return value can be used as a format string for
\function{strftime()} to represent a date in a locale-specific era-based
way.
\end{datadesc}

\begin{datadesc}{ALT_DIGITS}
The return value is a representation of up to 100 values used to
represent the values 0 to 99.
\end{datadesc}

Example:

\begin{verbatim}
>>> import locale
>>> loc = locale.setlocale(locale.LC_ALL) # get current locale
>>> locale.setlocale(locale.LC_ALL, 'de_DE') # use German locale; name might vary with platform
>>> locale.strcoll('f\xe4n', 'foo') # compare a string containing an umlaut 
>>> locale.setlocale(locale.LC_ALL, '') # use user's preferred locale
>>> locale.setlocale(locale.LC_ALL, 'C') # use default (C) locale
>>> locale.setlocale(locale.LC_ALL, loc) # restore saved locale
\end{verbatim}


\subsection{Background, details, hints, tips and caveats}

The C standard defines the locale as a program-wide property that may
be relatively expensive to change. On top of that, some
implementations are broken in such a way that frequent locale changes
may cause core dumps. This makes the locale somewhat painful to use
correctly.

Initially, when a program is started, the locale is the \samp{C} locale, no
matter what the user's preferred locale is. The program must
explicitly say that it wants the user's preferred locale settings by
calling \code{setlocale(LC_ALL, '')}.

It is generally a bad idea to call \function{setlocale()} in some library
routine, since as a side effect it affects the entire program. Saving
and restoring it is almost as bad: it is expensive and affects other
threads that happen to run before the settings have been restored.

If, when coding a module for general use, you need a locale
independent version of an operation that is affected by the locale
(such as \function{string.lower()}, or certain formats used with
\function{time.strftime()}), you will have to find a way to do it
without using the standard library routine. Even better is convincing
yourself that using locale settings is okay. Only as a last resort
should you document that your module is not compatible with
non-\samp{C} locale settings.

The case conversion functions in the
\refmodule{string}\refstmodindex{string} module are affected by the
locale settings. When a call to the \function{setlocale()} function
changes the \constant{LC_CTYPE} settings, the variables
\code{string.lowercase}, \code{string.uppercase} and
\code{string.letters} are recalculated. Note that code that uses
these variables through `\keyword{from} ... \keyword{import} ...',
e.g.\ \code{from string import letters}, is not affected by subsequent
\function{setlocale()} calls.

The only way to perform numeric operations according to the locale
is to use the special functions defined by this module:
\function{atof()}, \function{atoi()}, \function{format()},
\function{str()}.

\subsection{For extension writers and programs that embed Python
            \label{embedding-locale}}

Extension modules should never call \function{setlocale()}, except to
find out what the current locale is. But since the return value can
only be used portably to restore it, that is not very useful (except
perhaps to find out whether or not the locale is \samp{C}).

When Python is embedded in an application, if the application sets the
locale to something specific before initializing Python, that is
generally okay, and Python will use whatever locale is set,
\emph{except} that the \constant{LC_NUMERIC} locale should always be
\samp{C}.

The \function{setlocale()} function in the \module{locale} module
gives the Python programmer the impression that you can manipulate the
\constant{LC_NUMERIC} locale setting, but this is not the case at the C
level: C code will always find that the \constant{LC_NUMERIC} locale
setting is \samp{C}. This is because too much would break when the
decimal point character is set to something other than a period
(e.g. the Python parser would break). Caveat: threads that run
without holding Python's global interpreter lock may occasionally find
that the numeric locale setting differs; this is because the only
portable way to implement this feature is to set the numeric locale
settings to what the user requests, extract the relevant
characteristics, and then restore the \samp{C} numeric locale.

When Python code uses the \module{locale} module to change the locale,
this also affects the embedding application. If the embedding
application doesn't want this to happen, it should remove the
\module{_locale} extension module (which does all the work) from the
table of built-in modules in the \file{config.c} file, and make sure
that the \module{_locale} module is not accessible as a shared library.


\subsection{Access to message catalogs \label{locale-gettext}}

The \module{locale} module exposes the C library's gettext interface on
systems that provide this interface. It consists of the functions
\function{gettext()}, \function{dgettext()}, \function{dcgettext()},
\function{textdomain()}, and \function{bindtextdomain()}. These are
similar to the same functions in the \refmodule{gettext} module, but use
the C library's binary format for message catalogs, and the C
library's search algorithms for locating message catalogs.

Python applications should normally find no need to invoke these
functions, and should use \refmodule{gettext} instead. A known
exception to this rule are applications that link with additional C
libraries which internally invoke \cfunction{gettext()} or
\cfunction{dcgettext()}. For these applications, it may be necessary to
bind the text domain, so that the libraries can properly locate their
message catalogs.
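
Binding a text domain for such a C library might look like the
following sketch. The helper name \function{bind_catalogs()}, the
domain \code{'myapp'} and the \file{locale} subdirectory are
hypothetical placeholders; the check with \function{hasattr()} covers
systems on which the C gettext interface is not exposed.

```python
import locale
import os

def bind_catalogs(domain, directory):
    """Bind domain to directory for the C library; return the bound path or None."""
    if not hasattr(locale, 'bindtextdomain'):
        return None    # the platform does not expose the C gettext interface
    return locale.bindtextdomain(domain, directory)

# 'myapp' and the 'locale' subdirectory are hypothetical placeholders.
bound = bind_catalogs('myapp', os.path.join(os.getcwd(), 'locale'))
```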