Python has a great function for walking a directory tree: os.walk(). It’s a simple generator (meaning that you just iterate over it), and, for each directory it visits, it gives you 1) the current path, 2) a list of child directories, and 3) a list of child files. You can even modify the list of child directories in place to adjust, on the fly, which ones it will descend into. However, it doesn’t accept any filters. What if you just want to give it inclusion/exclusion rules and then see the matching results?
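To make the limitation concrete, here is a minimal sketch of what hand-rolled filtering on top of os.walk() might look like, using only the standard library. The helper name filtered_walk and its parameters are hypothetical and exist only for this illustration; they are not part of os or of any library discussed here, and the rule semantics (a file must match an include pattern and no exclude pattern) are an assumption.

```python
import fnmatch
import os


def filtered_walk(root, include_files=("*",), exclude_files=(), exclude_dirs=()):
    """Yield file paths under root that pass simple glob rules.

    Hypothetical helper for illustration only; not part of os.
    """
    for current_path, child_dirs, child_files in os.walk(root):
        # Prune in place: in top-down mode (the default), os.walk() only
        # descends into the directories still present in child_dirs.
        child_dirs[:] = [
            name for name in child_dirs
            if not any(fnmatch.fnmatch(name, pattern) for pattern in exclude_dirs)
        ]

        for name in child_files:
            included = any(fnmatch.fnmatch(name, p) for p in include_files)
            excluded = any(fnmatch.fnmatch(name, p) for p in exclude_files)
            # Assumed semantics: keep a file only if it matches an include
            # pattern and no exclude pattern.
            if included and not excluded:
                yield os.path.join(current_path, name)
```

A call such as filtered_walk('/etc', include_files=['net*'], exclude_files=['networking.conf']) would then yield only the matching file paths. Note that the slice assignment (child_dirs[:] = ...) matters: rebinding the name to a new list would leave os.walk()'s own list untouched and nothing would be pruned.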
Enter pathscan. This library transparently starts a background worker (as a separate process) that scans the directory structure in parallel and forwards the results to the foreground. To install it, just install the pathscan package. It requires Python 3.4.
The library runs as a generator:
import fss.constants
import fss.config.log
import fss.orchestrator

root_path = '/etc'

filter_rules = [
    (fss.constants.FT_DIR, fss.constants.FILTER_INCLUDE, 'init'),
    (fss.constants.FT_FILE, fss.constants.FILTER_INCLUDE, 'net*'),
    (fss.constants.FT_FILE, fss.constants.FILTER_EXCLUDE, 'networking.conf'),
]

o = fss.orchestrator.Orchestrator(root_path, filter_rules)
for (entry_type, entry_filepath) in o.recurse():
    if entry_type == fss.constants.FT_DIR:
        print("Directory: [%s]" % (entry_filepath,))
    else: # entry_type == fss.constants.FT_FILE:
        print("File: [%s]" % (entry_filepath,))

# Directory: [/etc/init]
# File: [/etc/networks]
# File: [/etc/netconfig]
# File: [/etc/init/network-interface-container.conf]
# File: [/etc/init/networking.conf]
# File: [/etc/init/network-interface-security.conf]
# File: [/etc/init/network-interface.conf]
A command-line tool is also included:
$ pathscan -i "i*.h" -id php /usr/include
F /usr/include/iconv.h
F /usr/include/ifaddrs.h
F /usr/include/inttypes.h
F /usr/include/iso646.h
D /usr/include/php