Scriptable C++

What if C++ were a scripting language that you could eval from your native C++?

ChaiScript: http://chaiscript.com

Example (from the homepage):

#include <chaiscript/chaiscript.hpp>

std::string helloWorld(const std::string &t_name)
{
  return "Hello " + t_name + "!";
}

int main()
{
  chaiscript::ChaiScript chai;
  chai.add(chaiscript::fun(&helloWorld), 
           "helloWorld");

  chai.eval("puts(helloWorld(\"Bob\"));");
}

Python: Recursive defaultdict

collections.defaultdict is a fun utility for creating an indexable collection that implicitly creates an entry when a key that doesn’t yet exist is read. The default value is produced by calling the factory passed to the constructor (str, in the example below).

Example:

import collections

c = collections.defaultdict(str)
c['missing_key']
print(dict(c))
#{'missing_key': ''}

What if you want to create a dictionary that recursively and implicitly creates dictionary-type members as far down as you’d like to go? Well, it turns out that you can also pass a factory-function as the argument to collections.defaultdict:

import collections

def dict_maker():
    return collections.defaultdict(dict_maker)

x = dict_maker()
x['a']['b']['c'] = 55
print(x)
#defaultdict(<function dict_maker at 0x10e1dbed8>, {'a': defaultdict(<function dict_maker at 0x10e1dbed8>, {'b': defaultdict(<function dict_maker at 0x10e1dbed8>, {'c': 55})})})

To make the result a little nicer:

import json

print(json.dumps(x))
#{"a": {"b": {"c": 55}}}
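If you would rather end up with ordinary dictionaries (so that reading a missing key raises KeyError again), the structure can be converted once it has been populated. This is a minimal sketch; to_plain_dict is a hypothetical helper name, not part of the collections module:

```python
import collections


def dict_maker():
    return collections.defaultdict(dict_maker)


def to_plain_dict(value):
    # Recursively rebuild every (default)dict level as a plain dict.
    if isinstance(value, dict):
        return {k: to_plain_dict(v) for k, v in value.items()}

    return value


x = dict_maker()
x['a']['b']['c'] = 55

plain = to_plain_dict(x)
print(plain)
#{'a': {'b': {'c': 55}}}
```

After the conversion, plain['missing_key'] no longer creates an entry implicitly.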

Subversion from Python

Generally, it’s preferable to bind to libraries rather than executables when given the option. In my case, I needed SVN access from Python and couldn’t, at that time, find a confidence-inspiring library to work with. So, I wrote svn.

It turns out that there is a Subversion-sponsored Python project. It looks to be SWIG-based.

It is available as the python-svn Apt package under Ubuntu.

The Programmer’s Guide has the following examples, among others:

cat:

import pysvn
client = pysvn.Client()
file_content = client.cat('file.txt')

ls:

import pysvn
client = pysvn.Client()
entry_list = client.ls('.')

info:

import pysvn
client = pysvn.Client()
entry = client.info('.')

Using inotify to watch for directory changes from Python

An inotify project is now available on PyPI. More documentation is available at the project homepage: PyInotify

Though the inotify functionality is uncomplicated to implement in C, it’s stupidly simple to implement in Python using this library.

To install:

$ sudo pip install inotify

This is the principal logic of the example provided in the project documentation:

import logging

import inotify.adapters

_LOGGER = logging.getLogger(__name__)

i = inotify.adapters.Inotify()

i.add_watch('/tmp')

for event in i.event_gen():
    if event is not None:
        (header, type_names, watch_path, filename) = event

        _LOGGER.info("WD=(%d) MASK=(%d) COOKIE=(%d) LEN=(%d) MASK->NAMES=%s "
                     "WATCH-PATH=[%s] FILENAME=[%s]", 
                     header.wd, header.mask, header.cookie, header.len, type_names, 
                     watch_path, filename)

We ran the following operations on /tmp:

$ touch /tmp/aa
$ rm /tmp/aa
$ mkdir /tmp/dir1
$ rmdir /tmp/dir1

This was the corresponding output of the inotify process:

2015-04-24 05:02:06,667 - __main__ - INFO - WD=(1) MASK=(256) COOKIE=(0) LEN=(16) MASK->NAMES=['IN_CREATE'] FILENAME=[aa]
2015-04-24 05:02:06,667 - __main__ - INFO - WD=(1) MASK=(32) COOKIE=(0) LEN=(16) MASK->NAMES=['IN_OPEN'] FILENAME=[aa]
2015-04-24 05:02:06,667 - __main__ - INFO - WD=(1) MASK=(4) COOKIE=(0) LEN=(16) MASK->NAMES=['IN_ATTRIB'] FILENAME=[aa]
2015-04-24 05:02:06,667 - __main__ - INFO - WD=(1) MASK=(8) COOKIE=(0) LEN=(16) MASK->NAMES=['IN_CLOSE_WRITE'] FILENAME=[aa]
2015-04-24 05:02:17,412 - __main__ - INFO - WD=(1) MASK=(512) COOKIE=(0) LEN=(16) MASK->NAMES=['IN_DELETE'] FILENAME=[aa]
2015-04-24 05:02:22,884 - __main__ - INFO - WD=(1) MASK=(1073742080) COOKIE=(0) LEN=(16) MASK->NAMES=['IN_ISDIR', 'IN_CREATE'] FILENAME=[dir1]
2015-04-24 05:02:25,948 - __main__ - INFO - WD=(1) MASK=(1073742336) COOKIE=(0) LEN=(16) MASK->NAMES=['IN_ISDIR', 'IN_DELETE'] FILENAME=[dir1]
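The MASK column is a bitwise OR of the event flags that the library lists under MASK->NAMES. The following sketch decodes it by hand, using flag values from the Linux inotify header; decode_mask and the table name are hypothetical, this is not how the library itself is implemented, and only a subset of flags is listed:

```python
# A subset of the flag values defined in <linux/inotify.h>.
_INOTIFY_FLAGS = [
    (0x00000001, 'IN_ACCESS'),
    (0x00000002, 'IN_MODIFY'),
    (0x00000004, 'IN_ATTRIB'),
    (0x00000008, 'IN_CLOSE_WRITE'),
    (0x00000010, 'IN_CLOSE_NOWRITE'),
    (0x00000020, 'IN_OPEN'),
    (0x00000040, 'IN_MOVED_FROM'),
    (0x00000080, 'IN_MOVED_TO'),
    (0x00000100, 'IN_CREATE'),
    (0x00000200, 'IN_DELETE'),
    (0x40000000, 'IN_ISDIR'),
]


def decode_mask(mask):
    # Collect the name of every flag whose bit is set in the mask.
    return [name for bit, name in _INOTIFY_FLAGS if mask & bit]


print(decode_mask(256))
#['IN_CREATE']
print(decode_mask(1073742080))
#['IN_CREATE', 'IN_ISDIR']
```

The second mask is 0x40000000 + 0x100, which is why the mkdir event reports both IN_ISDIR and IN_CREATE.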

Lastly, this library also provides the ability to recursively watch a given directory. Just use the inotify.adapters.InotifyTree class instead of inotify.adapters.Inotify, and pass a path.

World’s Simplest Python epoll Example For Waiting on File/Socket Readiness

Once upon a time, the only way to wait for read or write readiness on one or more sockets/descriptors in Linux was the select call, which was later superseded by poll, and then epoll. epoll is currently the most popular way to accomplish this. Note that epoll is only available on Linux, and not on Mac (though select and poll are).

In Python, this functionality lives in the standard-library select module. You can use it on any standard system file-descriptor, whether it’s socket-oriented, inotify-related, etc.

import logging
import sys
import socket
import select

_MAX_CONNECTION_BACKLOG = 1
_PORT = 9999
_BINDING = ('0.0.0.0', _PORT)
_EPOLL_BLOCK_DURATION_S = 1

_DEFAULT_LOG_FORMAT = '%(asctime)s - %(name)s - %(levelname)s - %(message)s'

_LOGGER = logging.getLogger(__name__)

_CONNECTIONS = {}

_EVENT_LOOKUP = {
    select.POLLIN: 'POLLIN',
    select.POLLPRI: 'POLLPRI',
    select.POLLOUT: 'POLLOUT',
    select.POLLERR: 'POLLERR',
    select.POLLHUP: 'POLLHUP',
    select.POLLNVAL: 'POLLNVAL',
}

def _configure_logging():
    _LOGGER.setLevel(logging.DEBUG)

    ch = logging.StreamHandler()

    formatter = logging.Formatter(_DEFAULT_LOG_FORMAT)
    ch.setFormatter(formatter)

    _LOGGER.addHandler(ch)

def _get_flag_names(flags):
    names = []
    for bit, name in _EVENT_LOOKUP.items():
        if flags & bit:
            names.append(name)
            flags -= bit

            if flags == 0:
                break

    assert flags == 0, \
           "We couldn't account for all flags: (%d)" % (flags,)

    return names

def _handle_inotify_event(epoll, server, fd, event_type):
    # Common, but we're not interested.
    if (event_type & select.POLLOUT) == 0:
        flag_list = _get_flag_names(event_type)
        _LOGGER.debug("Received (%d): %s", 
                      fd, flag_list)

    # Activity on the master socket means a new connection.
    if fd == server.fileno():
        _LOGGER.debug("Received connection: (%d)", event_type)

        c, address = server.accept()
        c.setblocking(0)

        child_fd = c.fileno()

        # Start watching the new connection.
        epoll.register(child_fd)

        _CONNECTIONS[child_fd] = c
    else:
        c = _CONNECTIONS[fd]

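        # Child connection can read.
        if event_type & select.EPOLLIN:
            b = c.recv(1024)
            if b:
                # recv() returns bytes; decode before writing to stdout.
                sys.stdout.write(b.decode('utf-8', 'replace'))
                sys.stdout.flush()
            else:
                # An empty read means the peer closed the connection.
                epoll.unregister(fd)
                c.close()
                del _CONNECTIONS[fd]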

def _create_server_socket():
    s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    s.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
    s.bind(_BINDING)
    s.listen(_MAX_CONNECTION_BACKLOG)
    s.setblocking(0)

    return s

def _run_server():
    s = _create_server_socket()

    e = select.epoll()

    # If not provided, event-mask defaults to (POLLIN | POLLPRI | POLLOUT). It 
    # can be modified later with modify().
    e.register(s.fileno())

    try:
        while True:
            events = e.poll(_EPOLL_BLOCK_DURATION_S)
            for fd, event_type in events:
                _handle_inotify_event(e, s, fd, event_type)
    finally:
        e.unregister(s.fileno())
        e.close()
        s.close()

if __name__ == '__main__':
    _configure_logging()
    _run_server()

Now, just connect via telnet to port 9999 on localhost. Submitted text in the client will be printed to the screen on the server:

$ python epoll.py 
2015-04-23 08:34:35,104 - __main__ - DEBUG - Received (3): ['POLLIN']
2015-04-23 08:34:35,104 - __main__ - DEBUG - Received connection: (1)
hello
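Since epoll works on any file-descriptor and not just sockets, the same calls can be tried without a network at all. The following is a minimal, Linux-only sketch that waits on the read end of a pipe; the function name wait_for_pipe_data is made up for illustration:

```python
import os
import select


def wait_for_pipe_data(timeout_s=1.0):
    read_fd, write_fd = os.pipe()
    epoll = select.epoll()

    try:
        # Only ask to be told when the read end becomes readable.
        epoll.register(read_fd, select.EPOLLIN)

        # Nothing has been written yet, so a non-blocking poll is empty.
        before = epoll.poll(0)

        os.write(write_fd, b'ping')

        # Now the read end is ready and poll() reports it.
        after = epoll.poll(timeout_s)
        data = os.read(read_fd, 1024)

        return before, after, data
    finally:
        epoll.close()
        os.close(read_fd)
        os.close(write_fd)


before, after, data = wait_for_pipe_data()
print(before, after, data)
```

Passing an explicit event mask to register() avoids the constant stream of write-readiness events that the default mask produces.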

Issues between Vagrant/VirtualBox and your Webserver

It turns out that there can be issues when you’re changing files on your local system and using them from a VirtualBox VM. This can (and probably will) affect you if you’re working with small, static files under Vagrant when using VirtualBox as a provider.

You might make changes that result in unexpected, nonsensical character-encoding issues on the remote system, or even in no updates appearing at all. For me, this affected my JavaScript and CSS files.

To fix this, add “sendfile off;” to the location-blocks (if using Nginx) that are responsible for your static files.

Reference: http://docs.vagrantup.com/v2/synced-folders/virtualbox.html

Brew and PyEnv

PyEnv is a solution, like virtualenv, that helps you maintain parallel environments. PyEnv, however, allows you to maintain parallel versions of Python. It will also expose the matching versions of the Python tools, like pip. As a bonus, all of your pip packages will be installed locally to your user (no more sudo, at all).

Recently, in order to control and debug a series of sudden environmental problems, I upgraded to Yosemite. Unfortunately, Python 2.7.8 came with it.

I manage a number of components that depend on gevent (for the awesome coroutine functionality), and gevent is not Python3 compatible. Unfortunately, gevent is broken in 2.7.8 (the TypeError: __init__() got an unexpected keyword argument 'server_hostname' error: https://github.com/asciimoo/searx/issues/120), and no solid fix is available. You can work around this by hacking-in a no-op parameter to the module on your system, but I’d rather go back to 2.7.6 for all of my local projects, by default, and be running the same thing as the servers.

PyEnv worked great for this:

  1. Install PyEnv:
$ brew install pyenv
  2. Add to your user’s environment script:
$ eval "$(pyenv init -)"
  3. Run the command in (2) directly, or start a new shell.
  4. Download and build 2.7.6. We installed zlib via Brew, but we still had to set the CFLAGS variable to prevent the “The Python zlib extension was not compiled. Missing the zlib?” message:
$ CFLAGS="-I$(xcrun --show-sdk-path)/usr/include" pyenv install 2.7.6
  5. Elect this as the default, system version:
$ pyenv global 2.7.6
  6. Update the current user’s PyEnv configuration to point to the new Python executables:
$ pyenv rehash

Finding the Mime-Type of a File in Subversion

I’m not a fan of Subversion but it exists in my life nonetheless. To that end, sometimes you may need to write tools against it. Sometimes these tools may need to differentiate between binary and text entries. Since SVN needs to know, at the very least, whether a file is text or binary (because most version-control systems depend on taking deltas of text-files), it’s reasonable to think that you can read this information from SVN.

This information may be stored as a property on each entry. Note that though there appears to be no guarantee that this information is available, I consider it to be reasonable to expect that a binary file will always have a non-empty mime-type.

The mime-type of an image:

$ svn propget svn:mime-type image.png
application/octet-stream
$ echo $?
0

The mime-type of a plain-text file:

$ svn proplist config.xml
$ echo $?
0

Notice that you’ll get a successful return (0) whether that property is or is not defined.

You can also read the property off remote files in the same fashion:

$ svn propget svn:mime-type https://subversion.host/image.png
application/octet-stream
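From a tool, the same check can be scripted. The sketch below shells out to the svn client and applies the usual Subversion convention that a missing mime-type, or one starting with text/, means a text file. Both function names are hypothetical, and treating everything outside text/ as binary is an assumption (Subversion itself makes a few exceptions):

```python
import subprocess


def is_binary_mime_type(mime_type):
    # Unset/empty or text/* is treated as text; anything else as binary.
    mime_type = (mime_type or '').strip()
    return bool(mime_type) and not mime_type.startswith('text/')


def get_svn_mime_type(target, run=subprocess.check_output):
    # `run` is injectable so that the parsing can be tried without svn.
    # subprocess.check_output raises if svn exits with a non-zero status.
    output = run(['svn', 'propget', 'svn:mime-type', target])
    return output.decode('utf-8').strip()


print(is_binary_mime_type('application/octet-stream'))
#True
print(is_binary_mime_type(''))
#False
```

A call such as is_binary_mime_type(get_svn_mime_type('image.png')) then combines the two for a working copy path or a remote URL.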

Naming Your Webpage Download

The traditional way that a webpage provides a download for a user is by either opening it in a new window or redirecting to it. It may also choose to set the “Content-Disposition” response-header with a filename:

Content-Disposition: attachment; filename=your_filename.pdf

This is the common way. However, this will force a download. What if you just want to present the document to the browser for it to be displayed to the user? Well, it turns out that RFC 2183 (“The Content-Disposition Header Field”) also provides you the “inline” type:

Content-Disposition: inline; filename=your_filename.pdf

This accomplishes what we want: the document will [probably] open in the browser, but, if the user wants to save it, it’ll default to the given filename.
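On the server side, the header value is just a string. Here is a minimal sketch of a helper that builds it; content_disposition is a hypothetical name, the filename is emitted as a quoted string, and non-ASCII filenames (which need the extended filename* form) are deliberately not handled:

```python
def content_disposition(filename, inline=True):
    disposition = 'inline' if inline else 'attachment'

    # Escape backslashes and double-quotes for the quoted-string form.
    escaped = filename.replace('\\', '\\\\').replace('"', '\\"')

    return '{}; filename="{}"'.format(disposition, escaped)


print(content_disposition('your_filename.pdf'))
#inline; filename="your_filename.pdf"
print(content_disposition('your_filename.pdf', inline=False))
#attachment; filename="your_filename.pdf"
```

The returned value is what you would set as the Content-Disposition response-header in whichever web framework you use.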