Repo: How to Parse and Use a Manifest Directly From Python

Repo is a tool from AOSP (Android) that allows you to manage a vast hierarchy of individual Git repositories. It’s basically a small Python tool that adds some abstraction around Git commands. The manifest that controls the project tree is written in XML, can include submanifests, can assign projects into different groups (so you do not have to clone all of them every time), can include additional command primitives to do file copies and tweak how the manifests are loaded, etc. The manifest is written against a basic specification but, still, it is a lot easier to find a way to avoid doing this yourself.

You can access the built-in manifest-parsing functionality directly from the Repo tool. We can also use the version of the tool that’s embedded directly in the Repo tree.

For example, to load a manifest:

/tree/.repo/repo$ python
>>> import manifest_xml
>>> xm = manifest_xml.XmlManifest('/tree/.repo')

Obviously, you’ll be [temporarily] manipulating the sys.path to load this from your integration.

To explore, you can play with the “projects” (list of project objects) and “paths” properties (a dictionary of paths to project objects).

Number of projects:

>>> print(len(xm.projects))
>>> print(len(xm.paths))

The path dictionary:

>>> print(xm.paths.__class__)
<type 'dict'>

A project object looks like:

>>> p = xm.projects[0]
>>> p
<project.Project object at 0x7f7f5b0837d0>

>>> dir(p)
['AbandonBranch', 'AddAnnotation', 'AddCopyFile', 'AddLinkFile', 'CheckoutBranch', 'CleanPublishedCache', 'CurrentBranch', 'Derived', 'DownloadPatchSet', 'Exists', 'GetBranch', 'GetBranches', 'GetCommitRevisionId', 'GetDerivedSubprojects', 'GetRegisteredSubprojects', 'GetRemote', 'GetRevisionId', 'GetUploadableBranch', 'GetUploadableBranches', 'HasChanges', 'IsDirty', 'IsRebaseInProgress', 'MatchesGroups', 'PostRepoUpgrade', 'PrintWorkTreeDiff', 'PrintWorkTreeStatus', 'PruneHeads', 'StartBranch', 'Sync_LocalHalf', 'Sync_NetworkHalf', 'UncommitedFiles', 'UploadForReview', 'UserEmail', 'UserName', 'WasPublished', '_ApplyCloneBundle', '_CheckDirReference', '_CheckForSha1', '_Checkout', '_CherryPick', '_CopyAndLinkFiles', '_ExtractArchive', '_FastForward', '_FetchArchive', '_FetchBundle', '_GetSubmodules', '_GitGetByExec', '_InitAnyMRef', '_InitGitDir', '_InitHooks', '_InitMRef', '_InitMirrorHead', '_InitRemote', '_InitWorkTree', '_IsValidBundle', '_LoadUserIdentity', '_Rebase', '_ReferenceGitDir', '_RemoteFetch', '_ResetHard', '_Revert', '_UpdateHooks', '__class__', '__delattr__', '__dict__', '__doc__', '__format__', '__getattribute__', '__hash__', '__init__', '__module__', '__new__', '__reduce__', '__reduce_ex__', '__repr__', '__setattr__', '__sizeof__', '__str__', '__subclasshook__', '__weakref__', '_allrefs', '_getLogs', '_gitdir_path', '_revlist', '_userident_email', '_userident_name', 'annotations', 'bare_git', 'bare_objdir', 'bare_ref', 'clone_depth', 'config', 'copyfiles', 'dest_branch', 'enabled_repo_hooks', 'getAddedAndRemovedLogs', 'gitdir', 'groups', 'is_derived', 'linkfiles', 'manifest', 'name', 'objdir', 'old_revision', 'optimized_fetch', 'parent', 'rebase', 'relpath', 'remote', 'revisionExpr', 'revisionId', 'shareable_dirs', 'shareable_files', 'snapshots', 'subprojects', 'sync_c', 'sync_s', 'upstream', 'work_git', 'working_tree_dirs', 'working_tree_files', 'worktree']

The relative path for the project:

>>> path = p.relpath
>>> xm.paths[path]
<project.Project object at 0x7f7f5b0837d0>

The revision for the project:

>>> p.revisionExpr

The remote for the project:

>>> p.GetRemote('origin').url

You can also get a config object representing the Git config for the bare archive of the project:

>>> p.config
<git_config.GitConfig object at 0x7f7f5b083810>

>>> dir(p.config)
['ForRepository', 'ForUser', 'GetBoolean', 'GetBranch', 'GetRemote', 'GetString', 'GetSubSections', 'Global', 'Has', 'HasSection', 'SetString', 'UrlInsteadOf', '_ForUser', '_Global', '_Read', '_ReadGit', '_ReadJson', '_SaveJson', '__class__', '__delattr__', '__dict__', '__doc__', '__format__', '__getattribute__', '__hash__', '__init__', '__module__', '__new__', '__reduce__', '__reduce_ex__', '__repr__', '__setattr__', '__sizeof__', '__str__', '__subclasshook__', '__weakref__', '_branches', '_cache', '_cache_dict', '_do', '_json', '_remotes', '_section_dict', '_sections', 'defaults', 'file']

>>> p.config.file

An example of how to efficiently establish a tree of projects to paths:


def get_repo_project_to_path_mapping(path):
        return _MAPPING_CACHE[path]
    except KeyError:

    repo_meta_path = os.path.join(path, '.repo')
    repo_tool_path = os.path.join(repo_meta_path, 'repo')

    if repo_tool_path not in sys.path:
        sys.path.insert(0, repo_tool_path)

    import manifest_xml

    xm = manifest_xml.XmlManifest(repo_meta_path)
    project_to_path_mapping = {}
    for path, p in xm.paths.items():
        project_to_path_mapping[str(] = str(path)

    _MAPPING_CACHE[path] = project_to_path_mapping
    return project_to_path_mapping

Go: Image-Processing Microservice

Imaginary is a fun little package that can be Dockerized and deployed next to the rest of your services to offload image-processing:

Fast HTTP microservice written in Go for high-level image processing backed by bimg and libvips. imaginary can be used as private or public HTTP service for massive image processing with first-class support for Docker & Heroku. It’s almost dependency-free and only uses net/http native package without additional abstractions for better performance.

Go: Functions that satisfy interfaces

I ran across this while sifting through the http package:

// The HandlerFunc type is an adapter to allow the use of
// ordinary functions as HTTP handlers. If f is a function
// with the appropriate signature, HandlerFunc(f) is a
// Handler that calls f.
type HandlerFunc func(ResponseWriter, *Request)

// ServeHTTP calls f(w, r).
func (f HandlerFunc) ServeHTTP(w ResponseWriter, r *Request) {
    f(w, r)

A function that satisfies http.HandlerFunc will also automatically get a function-on-a-function (who knew) that makes it look like a struct that satisfies the http.Handler interface as well.

I had looked for a footnote about this in the section on function declarations in the Go language specification but was not successful. The receiver is little more than just a repositioned first-argument. There’s nothing that says “MUST be a struct or a func”.

Add Timestamps to Your Bash History

Put this in your profile script (.bashrc):

export HISTTIMEFORMAT="%d/%m/%y %T "

I also find it helpful to increase the size of my command-line history:

export HISTFILESIZE=10000
export HISTSIZE=10000

The first controls how many are stored. The second controls how many are kept in memory. You can also tweak it to hold an actual unlimited number if you so wish.

Git: Get the Latest Commits in a Repository

Sometimes you have a large repository (many commits, many branches) and it either might be an old clone or you might be operating out of a secondary branch and not aware of which branch has the majority of activity. You can do a git-log that displays all commits without respect to the current branch and order them by parents before children, in descending commit-timestamp (as opposed to author-timestamp) order.

This prints the top five:

$ git log -5 --all --date-order

Browse Your Image Library With a Webpage

Easily install via PyPI and start a documentation server with RemoteImageBrowser. It also supports more robust/scalable installs (via uWSGI).

You can also reuse and augment the local Gnome thumbnail cache. This means that if you have a want to serve a large image library, you can precache all of your thumbnails from a scheduled process and benefit from them both when browsing/manipulating them directly as well as when you browse them from the website. A basic PIL-based thumbnailer is used by default.


$ git clone
$ cd RemoteImageBrowser
$ sudo pip install -r requirements.txt
$ rib/resources/scripts/development \
    --env IMAGE_ROOT_PATH=~/Downloads \
    --env THUMBNAIL_ROOT_PATH=/tmp/thumbnails