blob: bd9c0bca60121102c53b3833165298be828ec4f7 [file] [log] [blame]
Barry Warsawaf572511999-08-11 21:40:38 +00001#! /usr/bin/env python
Barry Warsaw6e972412001-05-21 19:35:20 +00002# Originally written by Barry Warsaw <barry@digicool.com>
Barry Warsawc8f08922000-02-26 20:56:47 +00003#
Barry Warsaw6e972412001-05-21 19:35:20 +00004# Minimally patched to make it even more xgettext compatible
Barry Warsawc8f08922000-02-26 20:56:47 +00005# by Peter Funk <pf@artcom-gmbh.de>
Barry Warsawe27db5a1999-08-13 20:59:48 +00006
Barry Warsaw08a8a352000-10-27 04:56:28 +00007"""pygettext -- Python equivalent of xgettext(1)
Barry Warsawe27db5a1999-08-13 20:59:48 +00008
9Many systems (Solaris, Linux, Gnu) provide extensive tools that ease the
10internationalization of C programs. Most of these tools are independent of
11the programming language and can be used from within Python programs. Martin
12von Loewis' work[1] helps considerably in this regard.
13
Barry Warsaw5dbf5261999-11-03 18:47:52 +000014There's one problem though; xgettext is the program that scans source code
Barry Warsawe27db5a1999-08-13 20:59:48 +000015looking for message strings, but it groks only C (or C++). Python introduces
16a few wrinkles, such as dual quoting characters, triple quoted strings, and
17raw strings. xgettext understands none of this.
18
19Enter pygettext, which uses Python's standard tokenize module to scan Python
20source code, generating .pot files identical to what GNU xgettext[2] generates
Barry Warsaw5dbf5261999-11-03 18:47:52 +000021for C and C++ code. From there, the standard GNU tools can be used.
Barry Warsawe27db5a1999-08-13 20:59:48 +000022
23A word about marking Python strings as candidates for translation. GNU
24xgettext recognizes the following keywords: gettext, dgettext, dcgettext, and
25gettext_noop. But those can be a lot of text to include all over your code.
Barry Warsaw5dbf5261999-11-03 18:47:52 +000026C and C++ have a trick: they use the C preprocessor. Most internationalized C
Barry Warsawe27db5a1999-08-13 20:59:48 +000027source includes a #define for gettext() to _() so that what has to be written
28in the source is much less. Thus these are both translatable strings:
29
30 gettext("Translatable String")
31 _("Translatable String")
32
33Python of course has no preprocessor so this doesn't work so well. Thus,
34pygettext searches only for _() by default, but see the -k/--keyword flag
35below for how to augment this.
36
37 [1] http://www.python.org/workshops/1997-10/proceedings/loewis.html
38 [2] http://www.gnu.org/software/gettext/gettext.html
39
Barry Warsawe27db5a1999-08-13 20:59:48 +000040NOTE: pygettext attempts to be option and feature compatible with GNU xgettext
Barry Warsawc8f08922000-02-26 20:56:47 +000041where ever possible. However some options are still missing or are not fully
Barry Warsawa17e0f12000-03-08 15:18:35 +000042implemented. Also, xgettext's use of command line switches with option
43arguments is broken, and in these cases, pygettext just defines additional
44switches.
Barry Warsawe27db5a1999-08-13 20:59:48 +000045
Barry Warsawa17e0f12000-03-08 15:18:35 +000046Usage: pygettext [options] inputfile ...
Barry Warsawe27db5a1999-08-13 20:59:48 +000047
48Options:
49
50 -a
51 --extract-all
52 Extract all strings
53
Barry Warsawc8f08922000-02-26 20:56:47 +000054 -d name
55 --default-domain=name
56 Rename the default output file from messages.pot to name.pot
57
58 -E
59 --escape
Barry Warsaw08a8a352000-10-27 04:56:28 +000060 Replace non-ASCII characters with octal escape sequences.
61
62 -D
63 --docstrings
64 Extract module, class, method, and function docstrings. These do not
65 need to be wrapped in _() markers, and in fact cannot be for Python to
66 consider them docstrings.
Barry Warsawc8f08922000-02-26 20:56:47 +000067
68 -h
69 --help
70 print this help message and exit
Barry Warsawe27db5a1999-08-13 20:59:48 +000071
Barry Warsawa17e0f12000-03-08 15:18:35 +000072 -k word
73 --keyword=word
74 Keywords to look for in addition to the default set, which are:
75 %(DEFAULTKEYWORDS)s
Barry Warsawe27db5a1999-08-13 20:59:48 +000076
Barry Warsawa17e0f12000-03-08 15:18:35 +000077 You can have multiple -k flags on the command line.
78
79 -K
80 --no-default-keywords
81 Disable the default set of keywords (see above). Any keywords
82 explicitly added with the -k/--keyword option are still recognized.
Barry Warsawe27db5a1999-08-13 20:59:48 +000083
84 --no-location
Barry Warsawa17e0f12000-03-08 15:18:35 +000085 Do not write filename/lineno location comments.
Barry Warsawe27db5a1999-08-13 20:59:48 +000086
Barry Warsawa17e0f12000-03-08 15:18:35 +000087 -n
88 --add-location
Barry Warsawe27db5a1999-08-13 20:59:48 +000089 Write filename/lineno location comments indicating where each
90 extracted string is found in the source. These lines appear before
Barry Warsawa17e0f12000-03-08 15:18:35 +000091 each msgid. The style of comments is controlled by the -S/--style
92 option. This is the default.
93
Barry Warsaw08a8a352000-10-27 04:56:28 +000094 -o filename
95 --output=filename
96 Rename the default output file from messages.pot to filename. If
97 filename is `-' then the output is sent to standard out.
98
99 -p dir
100 --output-dir=dir
101 Output files will be placed in directory dir.
102
Barry Warsawa17e0f12000-03-08 15:18:35 +0000103 -S stylename
104 --style stylename
105 Specify which style to use for location comments. Two styles are
106 supported:
Barry Warsawe27db5a1999-08-13 20:59:48 +0000107
108 Solaris # File: filename, line: line-number
Barry Warsawa17e0f12000-03-08 15:18:35 +0000109 GNU #: filename:line
Barry Warsawe27db5a1999-08-13 20:59:48 +0000110
Barry Warsawa17e0f12000-03-08 15:18:35 +0000111 The style name is case insensitive. GNU style is the default.
Barry Warsawe27db5a1999-08-13 20:59:48 +0000112
Barry Warsaw5dbf5261999-11-03 18:47:52 +0000113 -v
114 --verbose
115 Print the names of the files being processed.
116
Barry Warsawc8f08922000-02-26 20:56:47 +0000117 -V
118 --version
119 Print the version of pygettext and exit.
120
121 -w columns
122 --width=columns
123 Set width of output to columns.
124
125 -x filename
126 --exclude-file=filename
127 Specify a file that contains a list of strings that are not be
128 extracted from the input files. Each string to be excluded must
129 appear on a line by itself in the file.
Barry Warsawe27db5a1999-08-13 20:59:48 +0000130
Barry Warsawa17e0f12000-03-08 15:18:35 +0000131If `inputfile' is -, standard input is read.
132
Barry Warsaw08a8a352000-10-27 04:56:28 +0000133"""
Barry Warsawe27db5a1999-08-13 20:59:48 +0000134
135import os
136import sys
Barry Warsawe27db5a1999-08-13 20:59:48 +0000137import time
138import getopt
139import tokenize
Barry Warsaw16b62c12001-05-21 19:51:26 +0000140import operator
Barry Warsawe27db5a1999-08-13 20:59:48 +0000141
Barry Warsaw08a8a352000-10-27 04:56:28 +0000142# for selftesting
143try:
144 import fintl
145 _ = fintl.gettext
146except ImportError:
147 def _(s): return s
148
Martin v. Löwis0f6b3832001-03-01 22:56:17 +0000149__version__ = '1.3'
Barry Warsawa17e0f12000-03-08 15:18:35 +0000150
151default_keywords = ['_']
152DEFAULTKEYWORDS = ', '.join(default_keywords)
153
154EMPTYSTRING = ''
Barry Warsawe27db5a1999-08-13 20:59:48 +0000155
156
157
Barry Warsaw5dbf5261999-11-03 18:47:52 +0000158# The normal pot-file header. msgmerge and EMACS' po-mode work better if
159# it's there.
160pot_header = _('''\
161# SOME DESCRIPTIVE TITLE.
162# Copyright (C) YEAR ORGANIZATION
163# FIRST AUTHOR <EMAIL@ADDRESS>, YEAR.
164#
165msgid ""
166msgstr ""
167"Project-Id-Version: PACKAGE VERSION\\n"
Martin v. Löwis0f6b3832001-03-01 22:56:17 +0000168"POT-Creation-Date: %(time)s\\n"
169"PO-Revision-Date: YEAR-MO-DA HO:MI+ZONE\\n"
Barry Warsaw5dbf5261999-11-03 18:47:52 +0000170"Last-Translator: FULL NAME <EMAIL@ADDRESS>\\n"
171"Language-Team: LANGUAGE <LL@li.org>\\n"
172"MIME-Version: 1.0\\n"
173"Content-Type: text/plain; charset=CHARSET\\n"
174"Content-Transfer-Encoding: ENCODING\\n"
175"Generated-By: pygettext.py %(version)s\\n"
176
177''')
178
179
Barry Warsawe27db5a1999-08-13 20:59:48 +0000180def usage(code, msg=''):
Barry Warsaw08a8a352000-10-27 04:56:28 +0000181 print >> sys.stderr, _(__doc__) % globals()
Barry Warsawe27db5a1999-08-13 20:59:48 +0000182 if msg:
Barry Warsaw08a8a352000-10-27 04:56:28 +0000183 print >> sys.stderr, msg
Barry Warsawe27db5a1999-08-13 20:59:48 +0000184 sys.exit(code)
185
Barry Warsawc8f08922000-02-26 20:56:47 +0000186
Barry Warsawe27db5a1999-08-13 20:59:48 +0000187
Barry Warsaw5dbf5261999-11-03 18:47:52 +0000188escapes = []
Barry Warsaw5dbf5261999-11-03 18:47:52 +0000189
Barry Warsawc8f08922000-02-26 20:56:47 +0000190def make_escapes(pass_iso8859):
191 global escapes
Barry Warsaw7733e122000-02-27 14:30:48 +0000192 if pass_iso8859:
193 # Allow iso-8859 characters to pass through so that e.g. 'msgid
194 # "Höhe"' would result not result in 'msgid "H\366he"'. Otherwise we
195 # escape any character outside the 32..126 range.
196 mod = 128
197 else:
198 mod = 256
Barry Warsawc8f08922000-02-26 20:56:47 +0000199 for i in range(256):
Barry Warsaw7733e122000-02-27 14:30:48 +0000200 if 32 <= (i % mod) <= 126:
Barry Warsawc8f08922000-02-26 20:56:47 +0000201 escapes.append(chr(i))
202 else:
203 escapes.append("\\%03o" % i)
204 escapes[ord('\\')] = '\\\\'
205 escapes[ord('\t')] = '\\t'
206 escapes[ord('\r')] = '\\r'
207 escapes[ord('\n')] = '\\n'
208 escapes[ord('\"')] = '\\"'
209
Barry Warsaw5dbf5261999-11-03 18:47:52 +0000210
211def escape(s):
Barry Warsawc8f08922000-02-26 20:56:47 +0000212 global escapes
Barry Warsaw5dbf5261999-11-03 18:47:52 +0000213 s = list(s)
214 for i in range(len(s)):
215 s[i] = escapes[ord(s[i])]
Barry Warsawa17e0f12000-03-08 15:18:35 +0000216 return EMPTYSTRING.join(s)
Barry Warsaw5dbf5261999-11-03 18:47:52 +0000217
218
219def safe_eval(s):
220 # unwrap quotes, safely
221 return eval(s, {'__builtins__':{}}, {})
222
223
Barry Warsawe27db5a1999-08-13 20:59:48 +0000224def normalize(s):
225 # This converts the various Python string types into a format that is
226 # appropriate for .po files, namely much closer to C style.
Barry Warsawa17e0f12000-03-08 15:18:35 +0000227 lines = s.split('\n')
Barry Warsaw5dbf5261999-11-03 18:47:52 +0000228 if len(lines) == 1:
229 s = '"' + escape(s) + '"'
Barry Warsawe27db5a1999-08-13 20:59:48 +0000230 else:
Barry Warsaw5dbf5261999-11-03 18:47:52 +0000231 if not lines[-1]:
232 del lines[-1]
233 lines[-1] = lines[-1] + '\n'
234 for i in range(len(lines)):
235 lines[i] = escape(lines[i])
Barry Warsawa17e0f12000-03-08 15:18:35 +0000236 lineterm = '\\n"\n"'
237 s = '""\n"' + lineterm.join(lines) + '"'
Barry Warsaw5dbf5261999-11-03 18:47:52 +0000238 return s
Barry Warsawe27db5a1999-08-13 20:59:48 +0000239
240
241
242class TokenEater:
243 def __init__(self, options):
244 self.__options = options
245 self.__messages = {}
246 self.__state = self.__waiting
247 self.__data = []
248 self.__lineno = -1
Barry Warsaw08a8a352000-10-27 04:56:28 +0000249 self.__freshmodule = 1
Barry Warsawe27db5a1999-08-13 20:59:48 +0000250
251 def __call__(self, ttype, tstring, stup, etup, line):
252 # dispatch
Barry Warsaw08a8a352000-10-27 04:56:28 +0000253## import token
254## print >> sys.stderr, 'ttype:', token.tok_name[ttype], \
255## 'tstring:', tstring
Barry Warsawe27db5a1999-08-13 20:59:48 +0000256 self.__state(ttype, tstring, stup[0])
257
258 def __waiting(self, ttype, tstring, lineno):
Barry Warsaw08a8a352000-10-27 04:56:28 +0000259 # Do docstring extractions, if enabled
260 if self.__options.docstrings:
261 # module docstring?
262 if self.__freshmodule:
263 if ttype == tokenize.STRING:
Barry Warsaw16b62c12001-05-21 19:51:26 +0000264 self.__addentry(safe_eval(tstring), lineno, isdocstring=1)
Barry Warsaw08a8a352000-10-27 04:56:28 +0000265 self.__freshmodule = 0
266 elif ttype not in (tokenize.COMMENT, tokenize.NL):
267 self.__freshmodule = 0
268 return
269 # class docstring?
270 if ttype == tokenize.NAME and tstring in ('class', 'def'):
271 self.__state = self.__suiteseen
272 return
Barry Warsawe27db5a1999-08-13 20:59:48 +0000273 if ttype == tokenize.NAME and tstring in self.__options.keywords:
274 self.__state = self.__keywordseen
275
Barry Warsaw08a8a352000-10-27 04:56:28 +0000276 def __suiteseen(self, ttype, tstring, lineno):
277 # ignore anything until we see the colon
278 if ttype == tokenize.OP and tstring == ':':
279 self.__state = self.__suitedocstring
280
281 def __suitedocstring(self, ttype, tstring, lineno):
282 # ignore any intervening noise
283 if ttype == tokenize.STRING:
Barry Warsaw16b62c12001-05-21 19:51:26 +0000284 self.__addentry(safe_eval(tstring), lineno, isdocstring=1)
Barry Warsaw08a8a352000-10-27 04:56:28 +0000285 self.__state = self.__waiting
286 elif ttype not in (tokenize.NEWLINE, tokenize.INDENT,
287 tokenize.COMMENT):
288 # there was no class docstring
289 self.__state = self.__waiting
290
Barry Warsawe27db5a1999-08-13 20:59:48 +0000291 def __keywordseen(self, ttype, tstring, lineno):
292 if ttype == tokenize.OP and tstring == '(':
293 self.__data = []
294 self.__lineno = lineno
295 self.__state = self.__openseen
296 else:
297 self.__state = self.__waiting
298
299 def __openseen(self, ttype, tstring, lineno):
300 if ttype == tokenize.OP and tstring == ')':
301 # We've seen the last of the translatable strings. Record the
302 # line number of the first line of the strings and update the list
303 # of messages seen. Reset state for the next batch. If there
304 # were no strings inside _(), then just ignore this entry.
305 if self.__data:
Barry Warsaw08a8a352000-10-27 04:56:28 +0000306 self.__addentry(EMPTYSTRING.join(self.__data))
Barry Warsawe27db5a1999-08-13 20:59:48 +0000307 self.__state = self.__waiting
308 elif ttype == tokenize.STRING:
Barry Warsaw5dbf5261999-11-03 18:47:52 +0000309 self.__data.append(safe_eval(tstring))
Barry Warsawe27db5a1999-08-13 20:59:48 +0000310 # TBD: should we warn if we seen anything else?
311
Barry Warsaw16b62c12001-05-21 19:51:26 +0000312 def __addentry(self, msg, lineno=None, isdocstring=0):
Barry Warsaw08a8a352000-10-27 04:56:28 +0000313 if lineno is None:
314 lineno = self.__lineno
315 if not msg in self.__options.toexclude:
316 entry = (self.__curfile, lineno)
Barry Warsaw16b62c12001-05-21 19:51:26 +0000317 self.__messages.setdefault(msg, {})[entry] = isdocstring
Barry Warsaw08a8a352000-10-27 04:56:28 +0000318
Barry Warsawe27db5a1999-08-13 20:59:48 +0000319 def set_filename(self, filename):
320 self.__curfile = filename
321
322 def write(self, fp):
323 options = self.__options
324 timestamp = time.ctime(time.time())
Barry Warsaw08a8a352000-10-27 04:56:28 +0000325 # The time stamp in the header doesn't have the same format as that
326 # generated by xgettext...
327 print >> fp, pot_header % {'time': timestamp, 'version': __version__}
Barry Warsaw128c77d2001-05-23 16:59:45 +0000328 # Sort the entries. First sort each particular entry's keys, then
329 # sort all the entries by their first item.
330 reverse = {}
Fred Drake33e2c3e2000-10-26 03:49:15 +0000331 for k, v in self.__messages.items():
Barry Warsaw128c77d2001-05-23 16:59:45 +0000332 keys = v.keys()
333 keys.sort()
Barry Warsaw50cf7062001-05-24 23:06:13 +0000334 reverse.setdefault(tuple(keys), []).append((k, v))
Barry Warsaw128c77d2001-05-23 16:59:45 +0000335 rkeys = reverse.keys()
336 rkeys.sort()
337 for rkey in rkeys:
Barry Warsaw50cf7062001-05-24 23:06:13 +0000338 rentries = reverse[rkey]
339 rentries.sort()
340 for k, v in rentries:
Barry Warsaw5c94ce52001-06-20 19:41:40 +0000341 isdocstring = 0
Barry Warsaw50cf7062001-05-24 23:06:13 +0000342 # If the entry was gleaned out of a docstring, then add a
343 # comment stating so. This is to aid translators who may wish
344 # to skip translating some unimportant docstrings.
345 if reduce(operator.__add__, v.values()):
Barry Warsaw5c94ce52001-06-20 19:41:40 +0000346 isdocstring = 1
Barry Warsaw50cf7062001-05-24 23:06:13 +0000347 # k is the message string, v is a dictionary-set of (filename,
348 # lineno) tuples. We want to sort the entries in v first by
349 # file name and then by line number.
350 v = v.keys()
351 v.sort()
352 if not options.writelocations:
353 pass
354 # location comments are different b/w Solaris and GNU:
355 elif options.locationstyle == options.SOLARIS:
356 for filename, lineno in v:
357 d = {'filename': filename, 'lineno': lineno}
358 print >>fp, _(
359 '# File: %(filename)s, line: %(lineno)d') % d
360 elif options.locationstyle == options.GNU:
361 # fit as many locations on one line, as long as the
362 # resulting line length doesn't exceeds 'options.width'
363 locline = '#:'
364 for filename, lineno in v:
365 d = {'filename': filename, 'lineno': lineno}
366 s = _(' %(filename)s:%(lineno)d') % d
367 if len(locline) + len(s) <= options.width:
368 locline = locline + s
369 else:
370 print >> fp, locline
371 locline = "#:" + s
372 if len(locline) > 2:
Barry Warsaw08a8a352000-10-27 04:56:28 +0000373 print >> fp, locline
Barry Warsaw5c94ce52001-06-20 19:41:40 +0000374 if isdocstring:
375 print >> fp, '#, docstring'
Barry Warsaw50cf7062001-05-24 23:06:13 +0000376 print >> fp, 'msgid', normalize(k)
377 print >> fp, 'msgstr ""\n'
Barry Warsaw08a8a352000-10-27 04:56:28 +0000378
Barry Warsawe27db5a1999-08-13 20:59:48 +0000379
380
381def main():
Barry Warsawa17e0f12000-03-08 15:18:35 +0000382 global default_keywords
Barry Warsawe27db5a1999-08-13 20:59:48 +0000383 try:
384 opts, args = getopt.getopt(
385 sys.argv[1:],
Barry Warsaw08a8a352000-10-27 04:56:28 +0000386 'ad:DEhk:Kno:p:S:Vvw:x:',
Barry Warsaw2b639692001-05-21 19:58:23 +0000387 ['extract-all', 'default-domain=', 'escape', 'help',
Barry Warsawa17e0f12000-03-08 15:18:35 +0000388 'keyword=', 'no-default-keywords',
Barry Warsawc8f08922000-02-26 20:56:47 +0000389 'add-location', 'no-location', 'output=', 'output-dir=',
Barry Warsawa17e0f12000-03-08 15:18:35 +0000390 'style=', 'verbose', 'version', 'width=', 'exclude-file=',
Barry Warsaw08a8a352000-10-27 04:56:28 +0000391 'docstrings',
Barry Warsawc8f08922000-02-26 20:56:47 +0000392 ])
Barry Warsawe27db5a1999-08-13 20:59:48 +0000393 except getopt.error, msg:
394 usage(1, msg)
395
396 # for holding option values
397 class Options:
398 # constants
399 GNU = 1
400 SOLARIS = 2
401 # defaults
Barry Warsawc8f08922000-02-26 20:56:47 +0000402 extractall = 0 # FIXME: currently this option has no effect at all.
403 escape = 0
Barry Warsawe27db5a1999-08-13 20:59:48 +0000404 keywords = []
Barry Warsawc8f08922000-02-26 20:56:47 +0000405 outpath = ''
Barry Warsawe27db5a1999-08-13 20:59:48 +0000406 outfile = 'messages.pot'
Barry Warsawa17e0f12000-03-08 15:18:35 +0000407 writelocations = 1
408 locationstyle = GNU
Barry Warsaw5dbf5261999-11-03 18:47:52 +0000409 verbose = 0
Barry Warsawc8f08922000-02-26 20:56:47 +0000410 width = 78
411 excludefilename = ''
Barry Warsaw08a8a352000-10-27 04:56:28 +0000412 docstrings = 0
Barry Warsawe27db5a1999-08-13 20:59:48 +0000413
414 options = Options()
415 locations = {'gnu' : options.GNU,
416 'solaris' : options.SOLARIS,
417 }
418
419 # parse options
420 for opt, arg in opts:
421 if opt in ('-h', '--help'):
422 usage(0)
Barry Warsawc8f08922000-02-26 20:56:47 +0000423 elif opt in ('-a', '--extract-all'):
424 options.extractall = 1
425 elif opt in ('-d', '--default-domain'):
426 options.outfile = arg + '.pot'
427 elif opt in ('-E', '--escape'):
428 options.escape = 1
Barry Warsaw08a8a352000-10-27 04:56:28 +0000429 elif opt in ('-D', '--docstrings'):
430 options.docstrings = 1
Barry Warsawe27db5a1999-08-13 20:59:48 +0000431 elif opt in ('-k', '--keyword'):
Barry Warsawe27db5a1999-08-13 20:59:48 +0000432 options.keywords.append(arg)
Barry Warsawa17e0f12000-03-08 15:18:35 +0000433 elif opt in ('-K', '--no-default-keywords'):
434 default_keywords = []
Barry Warsawe27db5a1999-08-13 20:59:48 +0000435 elif opt in ('-n', '--add-location'):
Barry Warsawa17e0f12000-03-08 15:18:35 +0000436 options.writelocations = 1
Barry Warsawe27db5a1999-08-13 20:59:48 +0000437 elif opt in ('--no-location',):
Barry Warsawa17e0f12000-03-08 15:18:35 +0000438 options.writelocations = 0
439 elif opt in ('-S', '--style'):
440 options.locationstyle = locations.get(arg.lower())
441 if options.locationstyle is None:
442 usage(1, _('Invalid value for --style: %s') % arg)
Barry Warsawc8f08922000-02-26 20:56:47 +0000443 elif opt in ('-o', '--output'):
444 options.outfile = arg
445 elif opt in ('-p', '--output-dir'):
446 options.outpath = arg
Barry Warsaw5dbf5261999-11-03 18:47:52 +0000447 elif opt in ('-v', '--verbose'):
448 options.verbose = 1
Barry Warsawc8f08922000-02-26 20:56:47 +0000449 elif opt in ('-V', '--version'):
450 print _('pygettext.py (xgettext for Python) %s') % __version__
451 sys.exit(0)
452 elif opt in ('-w', '--width'):
453 try:
454 options.width = int(arg)
455 except ValueError:
Barry Warsawa17e0f12000-03-08 15:18:35 +0000456 usage(1, _('--width argument must be an integer: %s') % arg)
Barry Warsawc8f08922000-02-26 20:56:47 +0000457 elif opt in ('-x', '--exclude-file'):
458 options.excludefilename = arg
459
460 # calculate escapes
Barry Warsaw7733e122000-02-27 14:30:48 +0000461 make_escapes(options.escape)
Barry Warsawe27db5a1999-08-13 20:59:48 +0000462
463 # calculate all keywords
464 options.keywords.extend(default_keywords)
465
Barry Warsawc8f08922000-02-26 20:56:47 +0000466 # initialize list of strings to exclude
467 if options.excludefilename:
468 try:
469 fp = open(options.excludefilename)
470 options.toexclude = fp.readlines()
471 fp.close()
472 except IOError:
Barry Warsaw6e972412001-05-21 19:35:20 +0000473 print >> sys.stderr, _(
474 "Can't read --exclude-file: %s") % options.excludefilename
Barry Warsawc8f08922000-02-26 20:56:47 +0000475 sys.exit(1)
476 else:
477 options.toexclude = []
478
Barry Warsawe27db5a1999-08-13 20:59:48 +0000479 # slurp through all the files
480 eater = TokenEater(options)
481 for filename in args:
Barry Warsawa17e0f12000-03-08 15:18:35 +0000482 if filename == '-':
483 if options.verbose:
484 print _('Reading standard input')
485 fp = sys.stdin
486 closep = 0
487 else:
488 if options.verbose:
489 print _('Working on %s') % filename
490 fp = open(filename)
491 closep = 1
492 try:
493 eater.set_filename(filename)
Barry Warsaw75ee8f52001-02-26 04:46:53 +0000494 try:
495 tokenize.tokenize(fp.readline, eater)
496 except tokenize.TokenError, e:
Barry Warsaw6e972412001-05-21 19:35:20 +0000497 print >> sys.stderr, '%s: %s, line %d, column %d' % (
498 e[0], filename, e[1][0], e[1][1])
Barry Warsawa17e0f12000-03-08 15:18:35 +0000499 finally:
500 if closep:
501 fp.close()
Barry Warsawe27db5a1999-08-13 20:59:48 +0000502
Barry Warsawa17e0f12000-03-08 15:18:35 +0000503 # write the output
504 if options.outfile == '-':
505 fp = sys.stdout
506 closep = 0
507 else:
508 if options.outpath:
509 options.outfile = os.path.join(options.outpath, options.outfile)
510 fp = open(options.outfile, 'w')
511 closep = 1
512 try:
513 eater.write(fp)
514 finally:
515 if closep:
516 fp.close()
Barry Warsawe27db5a1999-08-13 20:59:48 +0000517
518
519if __name__ == '__main__':
520 main()
Barry Warsaw75a6e672000-05-02 19:28:30 +0000521 # some more test strings
522 _(u'a unicode string')