Python: Writing Hex Values into YAML

YAML can express hex values, which are decoded as ordinary integers. However, when you dump a YAML document, those integers are written back out as decimals. In order to write actual hex values, you need to wrap your value in another type and then tell the YAML encoder how to represent it.

This is possible with the ruamel.yaml encoder (available on PyPI).

An example of how to do this:

import sys

import ruamel.yaml


class HexInt(int):
    """Marker type for integers that should be dumped in hex notation."""
    pass


def representer(dumper, data):
    # Emit the value as a YAML int scalar, formatted as zero-padded hex.
    return dumper.represent_scalar(
        'tag:yaml.org,2002:int',
        '0x{:04x}'.format(data))


ruamel.yaml.add_representer(HexInt, representer)

data = {
    'item1': {
        'string_value': 'some_string',
        'hex_value': HexInt(641),
    }
}

ruamel.yaml.dump(data, sys.stdout, default_flow_style=False)

Output:

item1:
  hex_value: 0x0281
  string_value: some_string

Please note that I require my hex values to be two bytes wide and zero-padded, so the example above prints four hex digits (plus the 0x prefix): 0x{:04x}. If that doesn't work for you, change the format to whatever you require.

Thanks to this post.
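
Note that ruamel.yaml.dump() and ruamel.yaml.add_representer() are the older, PyYAML-compatible API. If your version of ruamel.yaml steers you toward the newer YAML() instance API instead, a minimal sketch of the same idea (reusing HexInt and representer() from above) might look like this:

import sys

import ruamel.yaml

# Instance-based API; reuses HexInt and representer() from the example above.
yaml = ruamel.yaml.YAML()
yaml.default_flow_style = False
yaml.representer.add_representer(HexInt, representer)

yaml.dump(data, sys.stdout)

The representer itself does not change; only the registration and dump calls do.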


Git: Producing a Revert Commit for a Previous Change

Create an inverse commit to flip a previous change. Child’s play:

$ git revert <REFSPEC>

The new commit looks like:

commit 09cc98e3fa121774750728f5fa337befeb02d914
Author: Dustin Oprea <dustin@randomingenuity.com>
Date:   Tue Mar 27 16:02:25 2018 -0400

    Revert "What's the worst that could happen?"

    This reverts commit cf4fc9a50a20a633b82ee28ef9efa46df86db18d.

A lot nicer and a lot more professional than copying and pasting into a new commit or dropping an old commit with a rebase.

It is also nearly identical to the equivalent features provided by many code-review systems.

GoogleAutoAuth: Automate the Google API Authentication Flow in Your Project

I write a ton of system software and tools. I’ve written a few independent tools against the Google APIs. They use OAuth, as most reputable APIs do.

Unfortunately, manually integrating the authentication flow into your system tools (read: headless, non-interactive) gets painful after you have done it a couple of times, and it is at least as painful for your users, especially when they have to log in to a system that they do not usually touch just to periodically debug authentication.

The normal flow:

  1. Developer: Request a URL from the Google client tools.
  2. Developer: Present the URL to the user and have them open it in a browser.
  3. User: Log in.
  4. User: Acknowledge that the tool will be able to access the user's data.
  5. Google: The authorization portal provides a code/token.
  6. User: Provide the code to the tool at the command line.
  7. Developer: Perform a final authorization with Google using the token.

With some mild wizardry in our Python tools, we can reduce this to two basic steps:

  1. Developer: Initialize the auto-authentication framework with the Google application-identity information at program load.
  2. User: Call the tool, then authenticate and authorize when prompted.

The tool will automatically launch a webserver on an open port, open the default browser with the Google login and authorization prompt, and then write the authorization to disk.
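
python-googleautoauth handles this for you. Just to illustrate the underlying trick (and not that library's actual API), here is a rough sketch of the same local-webserver flow using Google's google-auth-oauthlib package; the client-secrets filename, scope, and token filename below are placeholders:

from google_auth_oauthlib.flow import InstalledAppFlow

# Placeholder values; substitute your own application identity and scopes.
CLIENT_SECRETS_FILENAME = 'client_secrets.json'
SCOPES = ['https://www.googleapis.com/auth/drive.readonly']

flow = InstalledAppFlow.from_client_secrets_file(
    CLIENT_SECRETS_FILENAME, scopes=SCOPES)

# Starts a local web server on an open port, opens the default browser to
# Google's login/consent page, and captures the redirect automatically.
credentials = flow.run_local_server(port=0)

# Persist the authorization so subsequent runs are non-interactive.
with open('authorization.json', 'w') as f:
    f.write(credentials.to_json())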

In the event that you want or need to do things manually, a generic tool is provided that can produce URLs and accept authorization codes.

For more information, see python-googleautoauth.

 

 

Repo: How to Parse and Use a Manifest Directly From Python

Repo is a tool from AOSP (Android) that allows you to manage a vast hierarchy of individual Git repositories. It's basically a small Python tool that adds some abstraction around Git commands. The manifest that controls the project tree is written in XML, can include submanifests, can assign projects to different groups (so you do not have to clone all of them every time), can include additional command primitives to do file copies and tweak how the manifests are loaded, etc. The manifest format has a basic specification but, still, it is a lot easier to avoid parsing it yourself.

You can access the built-in manifest-parsing functionality directly from the Repo tool, using the copy of the tool that is embedded in your tree under .repo/repo.

For example, to load a manifest:

/tree/.repo/repo$ python
>>> import manifest_xml
>>> xm = manifest_xml.XmlManifest('/tree/.repo')

Obviously, you’ll be [temporarily] manipulating the sys.path to load this from your integration.

To explore, you can play with the "projects" property (a list of project objects) and the "paths" property (a dictionary mapping relative paths to project objects).

Number of projects:

>>> print(len(xm.projects))
878
>>> print(len(xm.paths))
878

paths is a dictionary.

A project object looks like:

>>> p = xm.projects[0]
>>> p


>>> dir(p)
['AbandonBranch', 'AddAnnotation', 'AddCopyFile', 'AddLinkFile', 'CheckoutBranch', 'CleanPublishedCache', 'CurrentBranch', 'Derived', 'DownloadPatchSet', 'Exists', 'GetBranch', 'GetBranches', 'GetCommitRevisionId', 'GetDerivedSubprojects', 'GetRegisteredSubprojects', 'GetRemote', 'GetRevisionId', 'GetUploadableBranch', 'GetUploadableBranches', 'HasChanges', 'IsDirty', 'IsRebaseInProgress', 'MatchesGroups', 'PostRepoUpgrade', 'PrintWorkTreeDiff', 'PrintWorkTreeStatus', 'PruneHeads', 'StartBranch', 'Sync_LocalHalf', 'Sync_NetworkHalf', 'UncommitedFiles', 'UploadForReview', 'UserEmail', 'UserName', 'WasPublished', '_ApplyCloneBundle', '_CheckDirReference', '_CheckForSha1', '_Checkout', '_CherryPick', '_CopyAndLinkFiles', '_ExtractArchive', '_FastForward', '_FetchArchive', '_FetchBundle', '_GetSubmodules', '_GitGetByExec', '_InitAnyMRef', '_InitGitDir', '_InitHooks', '_InitMRef', '_InitMirrorHead', '_InitRemote', '_InitWorkTree', '_IsValidBundle', '_LoadUserIdentity', '_Rebase', '_ReferenceGitDir', '_RemoteFetch', '_ResetHard', '_Revert', '_UpdateHooks', '__class__', '__delattr__', '__dict__', '__doc__', '__format__', '__getattribute__', '__hash__', '__init__', '__module__', '__new__', '__reduce__', '__reduce_ex__', '__repr__', '__setattr__', '__sizeof__', '__str__', '__subclasshook__', '__weakref__', '_allrefs', '_getLogs', '_gitdir_path', '_revlist', '_userident_email', '_userident_name', 'annotations', 'bare_git', 'bare_objdir', 'bare_ref', 'clone_depth', 'config', 'copyfiles', 'dest_branch', 'enabled_repo_hooks', 'getAddedAndRemovedLogs', 'gitdir', 'groups', 'is_derived', 'linkfiles', 'manifest', 'name', 'objdir', 'old_revision', 'optimized_fetch', 'parent', 'rebase', 'relpath', 'remote', 'revisionExpr', 'revisionId', 'shareable_dirs', 'shareable_files', 'snapshots', 'subprojects', 'sync_c', 'sync_s', 'upstream', 'work_git', 'working_tree_dirs', 'working_tree_files', 'worktree']

The relative path for the project:

>>> path = p.relpath
>>> xm.paths[path]

The revision for the project:

>>> p.revisionExpr
u'master'

The remote for the project:

>>> p.GetRemote('origin').url
u'ssh://gerrit.company.com:2537/android/platform/external/lzma'
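
Putting paths and GetRemote() together, for example, you could dump every project's checkout path and remote URL (assuming, as above, a remote named "origin"):

# List each project's relative checkout path and its remote URL.
for rel_path, project in xm.paths.items():
    print('{0} -> {1}'.format(rel_path, project.GetRemote('origin').url))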

You can also get a config object representing the Git config for the bare archive of the project:

>>> p.config


>>> dir(p.config)
['ForRepository', 'ForUser', 'GetBoolean', 'GetBranch', 'GetRemote', 'GetString', 'GetSubSections', 'Global', 'Has', 'HasSection', 'SetString', 'UrlInsteadOf', '_ForUser', '_Global', '_Read', '_ReadGit', '_ReadJson', '_SaveJson', '__class__', '__delattr__', '__dict__', '__doc__', '__format__', '__getattribute__', '__hash__', '__init__', '__module__', '__new__', '__reduce__', '__reduce_ex__', '__repr__', '__setattr__', '__sizeof__', '__str__', '__subclasshook__', '__weakref__', '_branches', '_cache', '_cache_dict', '_do', '_json', '_remotes', '_section_dict', '_sections', 'defaults', 'file']

>>> p.config.file
u'/tree/.repo/projects/external/lzma.git/config'

An example of how to efficiently establish a mapping of project names to checkout paths:

import os
import sys

_MAPPING_CACHE = {}

def get_repo_project_to_path_mapping(root_path):
    # Return a cached mapping if we have already built one for this tree.
    try:
        return _MAPPING_CACHE[root_path]
    except KeyError:
        pass

    repo_meta_path = os.path.join(root_path, '.repo')
    repo_tool_path = os.path.join(repo_meta_path, 'repo')

    # Make the embedded Repo tool importable.
    if repo_tool_path not in sys.path:
        sys.path.insert(0, repo_tool_path)

    import manifest_xml

    xm = manifest_xml.XmlManifest(repo_meta_path)

    # Map each project name to its relative checkout path.
    project_to_path_mapping = {}
    for rel_path, p in xm.paths.items():
        project_to_path_mapping[str(p.name)] = str(rel_path)

    _MAPPING_CACHE[root_path] = project_to_path_mapping
    return project_to_path_mapping
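
A quick usage sketch, assuming the same /tree checkout from the interactive session above:

mapping = get_repo_project_to_path_mapping('/tree')

# Number of projects in the manifest (878 in the tree above).
print(len(mapping))

# Show a few project-name -> checkout-path pairs.
for name, rel_path in sorted(mapping.items())[:5]:
    print('{0} -> {1}'.format(name, rel_path))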

Go: Image-Processing Microservice

Imaginary is a fun little package that can be Dockerized and deployed next to the rest of your services to offload image-processing:

Fast HTTP microservice written in Go for high-level image processing backed by bimg and libvips. imaginary can be used as private or public HTTP service for massive image processing with first-class support for Docker & Heroku. It’s almost dependency-free and only uses net/http native package without additional abstractions for better performance.

Go: Functions that satisfy interfaces

I ran across this while sifting through the http package:

// The HandlerFunc type is an adapter to allow the use of
// ordinary functions as HTTP handlers. If f is a function
// with the appropriate signature, HandlerFunc(f) is a
// Handler that calls f.
type HandlerFunc func(ResponseWriter, *Request)

// ServeHTTP calls f(w, r).
func (f HandlerFunc) ServeHTTP(w ResponseWriter, r *Request) {
    f(w, r)
}

Any ordinary function with the right signature can be converted to an http.HandlerFunc, and because HandlerFunc has a ServeHTTP method defined on it (a method on a function type, who knew), the converted value satisfies the http.Handler interface as well. No struct required.

I looked for a footnote about this in the Go language specification's section on function declarations but was not successful. The receiver is little more than a repositioned first argument, and there is nothing that says it MUST be a struct rather than a function type.

Add Timestamps to Your Bash History

Put this in your profile script (.bashrc):

export HISTTIMEFORMAT="%d/%m/%y %T "

I also find it helpful to increase the size of my command-line history:

export HISTFILESIZE=10000
export HISTSIZE=10000

The first controls how many commands are stored in the history file on disk; the second controls how many are kept in memory for the current session. In recent versions of Bash, you can also set these to a negative value for an effectively unlimited history, if you so wish.