#! /usr/bin/env python
# Originally written by Barry Warsaw <barry@digicool.com>
#
# Minimally patched to make it even more xgettext compatible
# by Peter Funk <pf@artcom-gmbh.de>

"""pygettext -- Python equivalent of xgettext(1)

Many systems (Solaris, Linux, Gnu) provide extensive tools that ease the
internationalization of C programs. Most of these tools are independent of
the programming language and can be used from within Python programs. Martin
von Loewis' work[1] helps considerably in this regard.

There's one problem though; xgettext is the program that scans source code
looking for message strings, but it groks only C (or C++). Python introduces
a few wrinkles, such as dual quoting characters, triple quoted strings, and
raw strings. xgettext understands none of this.

Enter pygettext, which uses Python's standard tokenize module to scan Python
source code, generating .pot files identical to what GNU xgettext[2] generates
for C and C++ code. From there, the standard GNU tools can be used.

A word about marking Python strings as candidates for translation. GNU
xgettext recognizes the following keywords: gettext, dgettext, dcgettext, and
gettext_noop. But those can be a lot of text to include all over your code.
C and C++ have a trick: they use the C preprocessor. Most internationalized C
source includes a #define that makes _() an alias for gettext(), so that much
less has to be written in the source. Thus these are both translatable
strings:

    gettext("Translatable String")
    _("Translatable String")

Python of course has no preprocessor so this doesn't work so well. Thus,
pygettext searches only for _() by default, but see the -k/--keyword flag
below for how to augment this.

 [1] http://www.python.org/workshops/1997-10/proceedings/loewis.html
 [2] http://www.gnu.org/software/gettext/gettext.html

NOTE: pygettext attempts to be option and feature compatible with GNU xgettext
wherever possible. However, some options are still missing or are not fully
implemented. Also, xgettext's use of command line switches with option
arguments is broken, and in these cases, pygettext just defines additional
switches.

Usage: pygettext [options] inputfile ...

Options:

    -a
    --extract-all
        Extract all strings. (Not yet implemented; currently has no effect.)

    -d name
    --default-domain=name
        Rename the default output file from messages.pot to name.pot.

    -E
    --escape
        Replace non-ASCII characters with octal escape sequences.

    -D
    --docstrings
        Extract module, class, method, and function docstrings. These do not
        need to be wrapped in _() markers, and in fact cannot be for Python to
        consider them docstrings.

    -h
    --help
        Print this help message and exit.

    -k word
    --keyword=word
        Keywords to look for in addition to the default set, which are:
        %(DEFAULTKEYWORDS)s

        You can have multiple -k flags on the command line.

    -K
    --no-default-keywords
        Disable the default set of keywords (see above). Any keywords
        explicitly added with the -k/--keyword option are still recognized.

    --no-location
        Do not write filename/lineno location comments.

    -n
    --add-location
        Write filename/lineno location comments indicating where each
        extracted string is found in the source. These lines appear before
        each msgid. The style of comments is controlled by the -S/--style
        option. This is the default.

    -o filename
    --output=filename
        Rename the default output file from messages.pot to filename. If
        filename is `-' then the output is sent to standard out.

    -p dir
    --output-dir=dir
        Output files will be placed in directory dir.

    -S stylename
    --style=stylename
        Specify which style to use for location comments. Two styles are
        supported:

        Solaris  # File: filename, line: line-number
        GNU      #: filename:line

        The style name is case insensitive. GNU style is the default.

    -v
    --verbose
        Print the names of the files being processed.

    -V
    --version
        Print the version of pygettext and exit.

    -w columns
    --width=columns
        Set width of output to columns.

    -x filename
    --exclude-file=filename
        Specify a file that contains a list of strings that are not to be
        extracted from the input files. Each string to be excluded must
        appear on a line by itself in the file.

If `inputfile' is -, standard input is read.

"""

import os
import sys
import time
import getopt
import tokenize

# for selftesting
try:
    import fintl
    _ = fintl.gettext
except ImportError:
    def _(s): return s

__version__ = '1.3'

default_keywords = ['_']
DEFAULTKEYWORDS = ', '.join(default_keywords)

EMPTYSTRING = ''


# The normal pot-file header. msgmerge and EMACS' po-mode work better if
# it's there.
pot_header = _('''\
# SOME DESCRIPTIVE TITLE.
# Copyright (C) YEAR ORGANIZATION
# FIRST AUTHOR <EMAIL@ADDRESS>, YEAR.
#
msgid ""
msgstr ""
"Project-Id-Version: PACKAGE VERSION\\n"
"POT-Creation-Date: %(time)s\\n"
"PO-Revision-Date: YEAR-MO-DA HO:MI+ZONE\\n"
"Last-Translator: FULL NAME <EMAIL@ADDRESS>\\n"
"Language-Team: LANGUAGE <LL@li.org>\\n"
"MIME-Version: 1.0\\n"
"Content-Type: text/plain; charset=CHARSET\\n"
"Content-Transfer-Encoding: ENCODING\\n"
"Generated-By: pygettext.py %(version)s\\n"

''')


def usage(code, msg=''):
    print >> sys.stderr, _(__doc__) % globals()
    if msg:
        print >> sys.stderr, msg
    sys.exit(code)


escapes = []

def make_escapes(pass_iso8859):
    global escapes
    if pass_iso8859:
        # Allow iso-8859 characters to pass through so that e.g. 'msgid
        # "Höhe"' would not result in 'msgid "H\366he"'. Otherwise we
        # escape any character outside the 32..126 range.
        mod = 128
    else:
        mod = 256
    for i in range(256):
        if 32 <= (i % mod) <= 126:
            escapes.append(chr(i))
        else:
            escapes.append("\\%03o" % i)
    escapes[ord('\\')] = '\\\\'
    escapes[ord('\t')] = '\\t'
    escapes[ord('\r')] = '\\r'
    escapes[ord('\n')] = '\\n'
    escapes[ord('\"')] = '\\"'


def escape(s):
    global escapes
    s = list(s)
    for i in range(len(s)):
        s[i] = escapes[ord(s[i])]
    return EMPTYSTRING.join(s)


def safe_eval(s):
    # unwrap quotes, safely
    return eval(s, {'__builtins__':{}}, {})


def normalize(s):
    # This converts the various Python string types into a format that is
    # appropriate for .po files, namely much closer to C style.
    lines = s.split('\n')
    if len(lines) == 1:
        s = '"' + escape(s) + '"'
    else:
        if not lines[-1]:
            del lines[-1]
            lines[-1] = lines[-1] + '\n'
        for i in range(len(lines)):
            lines[i] = escape(lines[i])
        lineterm = '\\n"\n"'
        s = '""\n"' + lineterm.join(lines) + '"'
    return s



class TokenEater:
    def __init__(self, options):
        self.__options = options
        self.__messages = {}
        self.__state = self.__waiting
        self.__data = []
        self.__lineno = -1
        self.__freshmodule = 1

    def __call__(self, ttype, tstring, stup, etup, line):
        # dispatch
##        import token
##        print >> sys.stderr, 'ttype:', token.tok_name[ttype], \
##              'tstring:', tstring
        self.__state(ttype, tstring, stup[0])

    def __waiting(self, ttype, tstring, lineno):
        # Do docstring extractions, if enabled
        if self.__options.docstrings:
            # module docstring?
            if self.__freshmodule:
                if ttype == tokenize.STRING:
                    self.__addentry(safe_eval(tstring), lineno)
                    self.__freshmodule = 0
                elif ttype not in (tokenize.COMMENT, tokenize.NL):
                    self.__freshmodule = 0
                return
            # class or function docstring?
            if ttype == tokenize.NAME and tstring in ('class', 'def'):
                self.__state = self.__suiteseen
                return
        if ttype == tokenize.NAME and tstring in self.__options.keywords:
            self.__state = self.__keywordseen

    def __suiteseen(self, ttype, tstring, lineno):
        # ignore anything until we see the colon
        if ttype == tokenize.OP and tstring == ':':
            self.__state = self.__suitedocstring

    def __suitedocstring(self, ttype, tstring, lineno):
        # ignore any intervening noise
        if ttype == tokenize.STRING:
            self.__addentry(safe_eval(tstring), lineno)
            self.__state = self.__waiting
        elif ttype not in (tokenize.NEWLINE, tokenize.INDENT,
                           tokenize.COMMENT):
            # there was no class or function docstring
            self.__state = self.__waiting

    def __keywordseen(self, ttype, tstring, lineno):
        if ttype == tokenize.OP and tstring == '(':
            self.__data = []
            self.__lineno = lineno
            self.__state = self.__openseen
        else:
            self.__state = self.__waiting

    def __openseen(self, ttype, tstring, lineno):
        if ttype == tokenize.OP and tstring == ')':
            # We've seen the last of the translatable strings. Record the
            # line number of the first line of the strings and update the list
            # of messages seen. Reset state for the next batch. If there
            # were no strings inside _(), then just ignore this entry.
            if self.__data:
                self.__addentry(EMPTYSTRING.join(self.__data))
            self.__state = self.__waiting
        elif ttype == tokenize.STRING:
            self.__data.append(safe_eval(tstring))
        # TBD: should we warn if we see anything else?

    def __addentry(self, msg, lineno=None):
        if lineno is None:
            lineno = self.__lineno
        if msg not in self.__options.toexclude:
            entry = (self.__curfile, lineno)
            self.__messages.setdefault(msg, {})[entry] = 1

    def set_filename(self, filename):
        self.__curfile = filename
        # each input file is a new module, so look for its docstring again
        self.__freshmodule = 1

    def write(self, fp):
        options = self.__options
        timestamp = time.ctime(time.time())
        # The time stamp in the header doesn't have the same format as that
        # generated by xgettext...
        print >> fp, pot_header % {'time': timestamp, 'version': __version__}
        for k, v in self.__messages.items():
            # k is the message string, v is a dictionary-set of (filename,
            # lineno) tuples. We want to sort the entries in v first by file
            # name and then by line number.
            v = v.keys()
            v.sort()
            if not options.writelocations:
                pass
            # location comments are different b/w Solaris and GNU:
            elif options.locationstyle == options.SOLARIS:
                for filename, lineno in v:
                    d = {'filename': filename, 'lineno': lineno}
                    print >> fp, _('# File: %(filename)s, line: %(lineno)d') % d
            elif options.locationstyle == options.GNU:
                # fit as many locations on one line as possible, as long as
                # the resulting line length doesn't exceed 'options.width'
                locline = '#:'
                for filename, lineno in v:
                    d = {'filename': filename, 'lineno': lineno}
                    s = _(' %(filename)s:%(lineno)d') % d
                    if len(locline) + len(s) <= options.width:
                        locline = locline + s
                    else:
                        print >> fp, locline
                        locline = "#:" + s
                if len(locline) > 2:
                    print >> fp, locline
            # TBD: sorting, normalizing
            print >> fp, 'msgid', normalize(k)
            print >> fp, 'msgstr ""\n'



def main():
    global default_keywords
    try:
        # long options that take an argument need a trailing '='
        opts, args = getopt.getopt(
            sys.argv[1:],
            'ad:DEhk:Kno:p:S:Vvw:x:',
            ['extract-all', 'default-domain=', 'escape', 'help',
             'keyword=', 'no-default-keywords',
             'add-location', 'no-location', 'output=', 'output-dir=',
             'style=', 'verbose', 'version', 'width=', 'exclude-file=',
             'docstrings',
             ])
    except getopt.error, msg:
        usage(1, msg)

    # for holding option values
    class Options:
        # constants
        GNU = 1
        SOLARIS = 2
        # defaults
        extractall = 0 # FIXME: currently this option has no effect at all.
        escape = 0
        keywords = []
        outpath = ''
        outfile = 'messages.pot'
        writelocations = 1
        locationstyle = GNU
        verbose = 0
        width = 78
        excludefilename = ''
        docstrings = 0

    options = Options()
    locations = {'gnu' : options.GNU,
                 'solaris' : options.SOLARIS,
                 }

    # parse options
    for opt, arg in opts:
        if opt in ('-h', '--help'):
            usage(0)
        elif opt in ('-a', '--extract-all'):
            options.extractall = 1
        elif opt in ('-d', '--default-domain'):
            options.outfile = arg + '.pot'
        elif opt in ('-E', '--escape'):
            options.escape = 1
        elif opt in ('-D', '--docstrings'):
            options.docstrings = 1
        elif opt in ('-k', '--keyword'):
            options.keywords.append(arg)
        elif opt in ('-K', '--no-default-keywords'):
            default_keywords = []
        elif opt in ('-n', '--add-location'):
            options.writelocations = 1
        elif opt in ('--no-location',):
            options.writelocations = 0
        elif opt in ('-S', '--style'):
            options.locationstyle = locations.get(arg.lower())
            if options.locationstyle is None:
                usage(1, _('Invalid value for --style: %s') % arg)
        elif opt in ('-o', '--output'):
            options.outfile = arg
        elif opt in ('-p', '--output-dir'):
            options.outpath = arg
        elif opt in ('-v', '--verbose'):
            options.verbose = 1
        elif opt in ('-V', '--version'):
            print _('pygettext.py (xgettext for Python) %s') % __version__
            sys.exit(0)
        elif opt in ('-w', '--width'):
            try:
                options.width = int(arg)
            except ValueError:
                usage(1, _('--width argument must be an integer: %s') % arg)
        elif opt in ('-x', '--exclude-file'):
            options.excludefilename = arg

    # calculate escapes
    # -E/--escape asks for non-ASCII characters to be escaped, so iso-8859
    # characters pass through only when the option is NOT given
    make_escapes(not options.escape)

    # calculate all keywords
    options.keywords.extend(default_keywords)

    # initialize list of strings to exclude
    if options.excludefilename:
        try:
            fp = open(options.excludefilename)
            # drop each line's terminator so that the entries can compare
            # equal to the extracted message strings
            options.toexclude = []
            for line in fp.readlines():
                if line[-1:] == '\n':
                    line = line[:-1]
                options.toexclude.append(line)
            fp.close()
        except IOError:
            print >> sys.stderr, _(
                "Can't read --exclude-file: %s") % options.excludefilename
            sys.exit(1)
    else:
        options.toexclude = []

    # slurp through all the files
    eater = TokenEater(options)
    for filename in args:
        if filename == '-':
            if options.verbose:
                print _('Reading standard input')
            fp = sys.stdin
            closep = 0
        else:
            if options.verbose:
                print _('Working on %s') % filename
            fp = open(filename)
            closep = 1
        try:
            eater.set_filename(filename)
            try:
                tokenize.tokenize(fp.readline, eater)
            except tokenize.TokenError, e:
                print >> sys.stderr, '%s: %s, line %d, column %d' % (
                    e[0], filename, e[1][0], e[1][1])
        finally:
            if closep:
                fp.close()

    # write the output
    if options.outfile == '-':
        fp = sys.stdout
        closep = 0
    else:
        if options.outpath:
            options.outfile = os.path.join(options.outpath, options.outfile)
        fp = open(options.outfile, 'w')
        closep = 1
    try:
        eater.write(fp)
    finally:
        if closep:
            fp.close()


if __name__ == '__main__':
    main()
    # some more test strings
    _(u'a unicode string')