Patch #943206:
`glob.glob()` currently calls itself recursively to build a list of matches of
the dirname part of the pattern and then filters by the basename part. This is
effectively BFS. ``glob.glob('*/*/*/*/*/foo')`` will build a huge list of all
directories 5 levels deep even if only a handful of them contain a ``foo``
entry. A generator-based recursion that walks the tree depth-first (DFS) never
has to hold these lists in memory at once. This patch converts the `glob`
function into a recursive generator, `iglob`. `glob()` now just returns
``list(iglob(pattern))``.
I also cleaned up the code a bit (reduced duplicate `has_magic()` checks and
created a second `glob0` helper func so that the main loop need not be
duplicated).
Thanks to Cherniavsky Beni for the patch!
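
To illustrate the lazy traversal described above, here is a minimal sketch of a
recursive generator in the same spirit. It is not the code from this patch: the
names `iglob_sketch` and `_match_in_dir` are hypothetical, it is written for
current Python 3 rather than 2.5, and it folds the helper logic into a single
function for brevity.

```python
import fnmatch
import os
import re

# Characters that make a path component a pattern rather than a literal name.
_MAGIC = re.compile(r"[*?\[]")


def has_magic(s):
    return _MAGIC.search(s) is not None


def _match_in_dir(dirname, basename):
    """Return the entries of dirname that match basename (hypothetical helper)."""
    if not has_magic(basename):
        # Literal component: no directory listing needed.
        if basename == "":
            return [""] if os.path.isdir(dirname) else []
        path = os.path.join(dirname, basename)
        return [basename] if os.path.lexists(path) else []
    try:
        names = os.listdir(dirname)
    except OSError:
        return []
    if not basename.startswith("."):
        # Like the shell, skip hidden entries unless the pattern asks for them.
        names = [n for n in names if not n.startswith(".")]
    return fnmatch.filter(names, basename)


def iglob_sketch(pathname):
    """Lazily yield paths matching pathname, one at a time."""
    if not has_magic(pathname):
        if os.path.lexists(pathname):
            yield pathname
        return
    dirname, basename = os.path.split(pathname)
    if not dirname:
        yield from _match_in_dir(os.curdir, basename)
        return
    # The recursive call is itself a generator, so parent directories are
    # produced one at a time instead of being collected into a full list.
    dirs = iglob_sketch(dirname) if has_magic(dirname) else [dirname]
    for d in dirs:
        for name in _match_in_dir(d, basename):
            yield os.path.join(d, name)
```

A caller that only needs the first match can write
``next(iglob_sketch('*/*/*/*/*/foo'), None)`` and stop as soon as one is found;
only one directory listing per level is held at a time.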
diff --git a/Doc/lib/libglob.tex b/Doc/lib/libglob.tex
index 0d0d712..f3f4fb7 100644
--- a/Doc/lib/libglob.tex
+++ b/Doc/lib/libglob.tex
@@ -16,7 +16,7 @@
\index{filenames!pathname expansion}
\begin{funcdesc}{glob}{pathname}
-Returns a possibly-empty list of path names that match \var{pathname},
+Return a possibly-empty list of path names that match \var{pathname},
which must be a string containing a path specification.
\var{pathname} can be either absolute (like
\file{/usr/src/Python-1.5/Makefile}) or relative (like
@@ -24,6 +24,12 @@
Broken symlinks are included in the results (as in the shell).
\end{funcdesc}
+\begin{funcdesc}{iglob}{pathname}
+Return an iterator which yields the same values as \function{glob()}
+without actually storing them all simultaneously.
+\versionadded{2.5}
+\end{funcdesc}
+
For example, consider a directory containing only the following files:
\file{1.gif}, \file{2.txt}, and \file{card.gif}. \function{glob()}
will produce the following results. Notice how any leading components