#! /usr/bin/env python

"""
combinerefs path

A helper for analyzing PYTHONDUMPREFS output.

When the PYTHONDUMPREFS envar is set in a debug build, at Python shutdown
time Py_Finalize() prints the list of all live objects twice: first it
prints the repr() of each object while the interpreter is still fully intact.
After cleaning up everything it can, it prints all remaining live objects
again, but the second time just prints their addresses, refcounts, and type
names.

Save all this output into a file, then run this script passing the path to
that file. The script finds both output chunks, combines them, then prints
a line of output for each object still alive at the end:

    address refcnt typename repr

address is the address of the object, in whatever format the platform C
produces for a %p format code.

refcnt is of the form

    "[" ref "]"

when the object's refcount is the same in both PYTHONDUMPREFS output blocks,
or

    "[" ref_before "->" ref_after "]"

if the refcount changed.
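
As a rough illustration of the refcnt rule above, the field could be built
like this. This is only a sketch; format_refcnt is a hypothetical helper
name and is not part of this script.

```python
# Hypothetical helper, not part of combinerefs: build the refcnt field
# from the refcounts seen in the two PYTHONDUMPREFS output blocks.
def format_refcnt(ref_before, ref_after):
    if ref_before == ref_after:
        # Unchanged refcount: a single number in brackets.
        return '[%s]' % ref_before
    # Changed refcount: show the value before and after cleanup.
    return '[%s->%s]' % (ref_before, ref_after)

print(format_refcnt(14, 14))   # [14]
print(format_refcnt(46, 5))    # [46->5]
```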

typename is object->ob_type->tp_name, extracted from the second PYTHONDUMPREFS
output block.

repr is repr(object), extracted from the first PYTHONDUMPREFS output block.

The objects are listed in allocation order, with most-recently allocated
printed first, and the first object allocated printed last.


Simple examples:

    00857060 [14] str '__len__'

The str object '__len__' is alive at shutdown time, and both PYTHONDUMPREFS
output blocks said there were 14 references to it. This is probably due to
C modules that intern the string "__len__" and keep a reference to it in a
file static.

    00857038 [46->5] tuple ()

46-5 = 41 references to the empty tuple were removed by the cleanup actions
between the times PYTHONDUMPREFS produced output.

    00858028 [1025->1456] str '<dummy key>'

The string '<dummy key>', which is used in dictobject.c as the name of the
dummy key that overwrites a real key that gets deleted, actually grew
several hundred references during cleanup. It suggests that stuff did get
removed from dicts by cleanup, but that the dicts themselves are staying
alive for some reason.
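
The arithmetic in the examples above can be automated when post-processing
this script's output. The following is a minimal sketch, assuming each line
has the "address refcnt typename repr" layout described earlier;
refcount_delta is a hypothetical name and is not part of this script.

```python
# Hypothetical helper, not part of combinerefs: report how much an
# object's refcount changed, given one line of combined output.
def refcount_delta(line):
    # Fields are: address, refcnt, then "typename repr" (repr may
    # itself contain spaces, so split at most twice).
    address, refcnt, rest = line.split(None, 2)
    counts = refcnt.strip('[]').split('->')
    before = int(counts[0])
    after = int(counts[-1])  # equals before when the count did not change
    return after - before

print(refcount_delta("00857038 [46->5] tuple ()"))   # -41
print(refcount_delta("00857060 [14] str '__len__'")) # 0
```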
"""

import re
import sys

# Generate lines from fileiter. If whilematch is true, continue reading
# while the regexp object pat matches line. If whilematch is false, lines
# are read so long as pat doesn't match them. In any case, the first line
# that doesn't match pat (when whilematch is true), or that does match pat
# (when whilematch is false), is lost, and fileiter will resume at the line
# following it.
def read(fileiter, pat, whilematch):
    for line in fileiter:
        if bool(pat.match(line)) == whilematch:
            yield line
        else:
            break

def combine(fname):
    f = file(fname)
    fi = iter(f)

    for line in read(fi, re.compile(r'^Remaining objects:$'), False):
        pass

    crack = re.compile(r'([a-zA-Z\d]+) \[(\d+)\] (.*)')
    addr2rc = {}
    addr2guts = {}
    before = 0
    for line in read(fi, re.compile(r'^Remaining object addresses:$'), False):
        m = crack.match(line)
        if m:
            addr, addr2rc[addr], addr2guts[addr] = m.groups()
            before += 1
        else:
            print '??? skipped:', line

    after = 0
    for line in read(fi, crack, True):
        after += 1
        m = crack.match(line)
        assert m
        addr, rc, guts = m.groups() # guts is type name here
        if addr not in addr2rc:
            print '??? new object created while tearing down:', line.rstrip()
            continue
        print addr,
        if rc == addr2rc[addr]:
            print '[%s]' % rc,
        else:
            print '[%s->%s]' % (addr2rc[addr], rc),
        print guts, addr2guts[addr]

    f.close()
    print "%d objects before, %d after" % (before, after)

if __name__ == '__main__':
    combine(sys.argv[1])