Using the “Tig” Git Console UI

At its simplest, Tig lets you navigate your Git projects from the console (internally, it invokes git commands). It offers nearly all of the browsing functionality of GitHub while running entirely locally. At its most complicated, it looks to be as flexible as Git itself.

The two simplest ways to run Tig (from within your Git project):

  • Piping: git log | tig
  • Calling directly: tig

In the case of piping, you’re really just getting colored output run through a pager. If you call Tig directly, the experience is more interactive. The default “view” is the log.

You can also specify other views:

$ tig -h
tig 1.2.1 (Nov 29 2013)

Usage: tig        [options] [revs] [--] [paths]
   or: tig log    [options] [revs] [--] [paths]
   or: tig show   [options] [revs] [--] [paths]
   or: tig blame  [options] [rev] [--] path
   or: tig stash
   or: tig status
   or: tig <      [git command output]

Options:
  +<number>       Select line <number> in the first view
  -v, --version   Show version and exit
  -h, --help      Show help message and exit

An example of the commit browser. I’ve clicked on a commit to show its diffs:

Git "Tig" Commit Browser

An example of blaming:

Git "Tig" Blame Browser

For more information:

  • Screenshots
  • Manual

Using etcd as a Highly Available and Innovative Key-Value Storage

etcd was created as the primary building block of CoreOS. It uses the Raft algorithm to keep changes consistent throughout a cluster by electing a leader and distributing a log of operations (“commands”) from the leader to the other systems. Due to these features and others, etcd can be used for robust service discovery and cluster configuration, replacing ZooKeeper in those roles. Entries are referred to as “nodes”.

 

Distributed Locks

Every update automatically increments the “index”, a global, monotonically increasing value:

c.set('/a/b/c', 5).index
# 66
c.set('/a/b/c', 5).index
# 67

The index increases for every operation, not just those with side-effects. Per the mailing list (2013-11-29), the reason for this is:

That’s a side effect of how Raft works. When new commands come in they get sent to Raft immediately which increments the index. We’re not able to check the current value of the key before insert because Raft batches commands so there may be uncommitted changes between the current state and the state at the time when the command is being committed. That’s also why changes that cause errors can increment the index even though no change was made.

etcd also gives us a “CAS” (“compare and swap”) call (“test_and_set” in the Python client). This allows us to assign a value to a key, but only when the existing entry meets one or more conditions (a short example follows this list):

  1. The existing value is set to something specific (a “previous value” condition).
  2. The existing index is set to something specific (a “previous index” condition).
  3. The key either currently exists or doesn’t (a “previously exists” condition).
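
As a quick, hedged illustration of the “prevValue” condition, here is a minimal sketch using the python-etcd client introduced in the Quick Start below. It assumes a test_and_set(key, value, prev_value, ttl=None) signature; the key and values are made up:

from etcd import Client

c = Client(host='127.0.0.1')
c.set('/config/mode', 'green')

# Succeeds: the current value really is 'green'.
c.test_and_set('/config/mode', 'blue', 'green')

# Raises ValueError: the value is now 'blue', not 'green'.
c.test_and_set('/config/mode', 'red', 'green')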

A monotonic, atomic counter and a CAS operation happen to be exactly the primitives required to establish distributed locking. The process might be the following (sketched in code below):

  1. Initialize a node for the specific lock (“lock node”). Use CAS with a “prevExists” of “false” and a value of “0”.
  2. Assign some value to some dummy key used for the purpose of incrementing and grabbing the index. This index will be used as a unique ID for the current thread/instance (“instance ID”).
  3. Do a CAS on the lock node with a “prevValue” of “0”, a value of the instance-ID, and a TTL of whatever maximum lock time we should allow.
    • If there’s an error, watch the lock node (give the HTTP client a timeout), and try again once the long-poll returns or the timeout hits.
    • If there’s no error, do whatever logic is required, and, to release, use a CAS to set the lock node back to “0” with a “prevValue” of the instance-ID. If this fails (ValueError), the lock timed out and has been re-acquired by another instance.

It’s important to mention that the “test_and_set” operation in the Python client currently supports only the “prevValue” condition. With the “prevValue” condition, you’ll get a KeyError if the key doesn’t exist. If the actual value does not match the expected value, you’ll get a ValueError (which is a standard consideration when using this call).
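
Putting the pieces together, here is a rough sketch of the locking procedure above using python-etcd. It’s a sketch under stated assumptions, not a tested recipe: it assumes the test_and_set(key, value, prev_value, ttl=None) signature, and that an expired TTL removes the lock node entirely (hence the KeyError branch); the key names are made up.

from etcd import Client

LOCK_KEY = '/locks/my_resource'   # hypothetical lock node
COUNTER_KEY = '/locks/counter'    # dummy key, used only to mint a unique index
MAX_LOCK_SECONDS = 30

c = Client(host='127.0.0.1')

# (1) One-time initialization of the lock node to the "released" value. Ideally
#     this would be a CAS with prevExists=false, which the Python client can't
#     express yet (see above), so this is assumed to run before any contention.
try:
    c.get(LOCK_KEY)
except KeyError:
    c.set(LOCK_KEY, '0')

# (2) Mint an instance-ID by bumping the global index on a dummy key.
instance_id = str(c.set(COUNTER_KEY, 'x').index)

# (3) Acquire: CAS the lock node from "0" to our instance-ID, with a TTL.
while True:
    try:
        c.test_and_set(LOCK_KEY, instance_id, '0', ttl=MAX_LOCK_SECONDS)
        break
    except KeyError:
        # The lock node expired and vanished; re-create it as released.
        c.set(LOCK_KEY, '0')
    except ValueError:
        # The lock is held; long-poll until the lock node changes, then retry.
        # (As noted above, also give the HTTP client a timeout.)
        c.watch(LOCK_KEY)

# ... do whatever logic is required ...

# (4) Release: CAS back to "0", but only if we still own the lock. A failure
#     here means the TTL expired and another instance took it over.
try:
    c.test_and_set(LOCK_KEY, '0', instance_id)
except (KeyError, ValueError):
    pass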

 

Additional Features

Aside from its strong consistency and easy access to operations via REST, there are two non-traditional features that you’ll see with etcd but not with [most] other KV solutions:

  1. Entries can be stored in a hierarchy
  2. Long-polling to wait on a change to a key or folder (“watch”)

With (2), you can monitor a key that doesn’t yet exist, or even a folder (in which case, it’ll block until any value inside the folder changes, recursively). You can use this to achieve event-driven scripts (a neat usage mentioned on the mailing list).
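
As a hedged sketch of that event-driven idea, a script can simply loop on the same watch() call shown in the Quick Start below (the key and the reaction are made up):

from etcd import Client

c = Client(host='127.0.0.1')

# React to every change under /config; watch() long-polls, and the folder
# doesn't even need to exist yet.
while True:
    r = c.watch('/config')
    print('%s changed to %s' % (r.key, r.value))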

Lastly, before moving on to the example, note that the cluster should be kept small:

Every command the client sends to the master is broadcast to all of the 
followers. The command is not committed until the majority of the cluster peers 
receive that command.

Because of this majority voting property, the ideal cluster should be kept 
small to keep speed up and be made up of an odd number of peers.

(what size cluster should I use)

etcd is modeled after Google’s Chubby (which uses Paxos rather than Raft).

 

Quick Start

For this example, we’re going to establish and interact with etcd using three different terminals on the same system. etcd requires Go 1.1+. You’ll probably have to build it yourself (a Git clone and a build), as it’s not yet available from many package managers (Ubuntu, specifically).

Run etcd:

$ etcd
[etcd] Nov 28 13:02:20.849 INFO      | Wrote node configuration to 'info'
[etcd] Nov 28 13:02:20.849 INFO      | etcd server [name default-name, listen on 127.0.0.1:4001, advertised url http://127.0.0.1:4001]
[etcd] Nov 28 13:02:20.850 INFO      | raft server [name default-name, listen on 127.0.0.1:7001, advertised url http://127.0.0.1:7001]

Creating a cluster is as easy as launching additional instances of the daemon on new hosts. Now, install Python’s python-etcd:

sudo pip install python-etcd

Connect the client:

from etcd import Client
c = Client(host='127.0.0.1')

Set a value (notice that we have to specify a folder, even if it’s only the root):

c.set('/test_entry', 'abc')

EtcdResult(action=u'SET', index=9, key=u'/test_entry', prevValue=None, value=u'abc', expiration=None, ttl=None, newKey=True)
# Attributes available on EtcdResult: action, count, expiration, index, key, newKey, prevValue, ttl, value

Get the value:

r = c.get('/test_entry')
print(r.value)
# Prints "abc"

In a second terminal, connect the client and run the following to block for a change to the given folder (it doesn’t currently exist):

r = c.watch('/test_folder')

Back in the first terminal, run:

c.set('/test_folder/test_inner_folder/deep_test', 'abc')

The command waiting in the second terminal has now returned. Examine “r”:

print(r)
EtcdResult(action=u'SET', index=15, key=u'/test_folder/test_inner_folder/deep_test', prevValue=None, value=u'abc', expiration=None, ttl=None, newKey=True)

Get a listing of children. This may or may not work on “/”, depending on your python-etcd version:

from pprint import pprint
c.set('/test_folder/entry_1', 'test_value_1')
c.set('/test_folder/entry_2', 'test_value_2')
list_ = c.get('/test_folder')
pprint(list_)
#[EtcdResult(action=u'GET', index=4, key=u'/test_folder/entry_1', prevValue=None, value=u'test_value_1', expiration=None, ttl=None, newKey=None),
# EtcdResult(action=u'GET', index=4, key=u'/test_folder/entry_2', prevValue=None, value=u'test_value_2', expiration=None, ttl=None, newKey=None)]

etcd also allows for TTLs (in seconds) on “put” operations:

from time import sleep
c.set('/disappearing_entry', 'inconsequential_value', ttl=5)
sleep(5)
c.get('/disappearing_entry')

You’ll get the following error (a proper KeyError):

Traceback (most recent call last):
  File "", line 1, in 
  File "/Library/Python/2.7/site-packages/etcd/client.py", line 284, in get
    response = self.api_execute(self.key_endpoint + key, self._MGET)
  File "/Library/Python/2.7/site-packages/etcd/client.py", line 357, in api_execute
    raise error_exception(message)
KeyError: u'Key Not Found : get: /disappearing_entry'

Miscellaneous functions:

c.machines
# ['http://127.0.0.1:4001']
c.leader
# 'http://127.0.0.1:7001'

As a final note, you don’t have to choose between raw cURL requests and the Python client. There’s also etcdctl for command-line control:

$ etcdctl set /foo/bar "Hello world"
Hello world

 

FAQ

Leaders are chosen via elections. However, there’s a chance that a given round won’t produce a leader, and the election will have to be reattempted. From the mailing list (2013-11-29):

Q: What would cause a leader candidate to not receive a majority of votes from nodes, during elections?
A: The common case election failure would be due to either a network partition causing less than a quorum to vote, or another candidate being elected first.

Q: Is there any decision-making involved during elections, such as the consideration of the CPU utilizations of individual machines?
A: Not at this time. It might make sense to add some sort of fitness to the leader proposal decision later.

Using Bitly’s NSQ Job Queue

I’ve recently been impressed by Bitly’s NSQ server, written in Go. Aside from Go capturing my attention, the parts that most interested me were: 1) they claim that it achieves 90,000 messages/second (which is decent), and 2) it’s relatively easy to set up and largely self-managing.

The topology for NSQ is straightforward: N queue servers (nsqd), 0+ lookup servers (nsqlookupd), and an optional admin (dashboard) server (nsqadmin). The lookup servers are optional, but they allow auto-discovery of which hosts are managing which topics. Bitly recommends running a cluster of three in production. To run multiple instances, just launch them. You’ll have to pass a list of nsqlookupd hosts to the consumer client, and a list of nsqd hosts to the producer client.

The message pipeline is intuitive: messages are published to topics, and consumers subscribe to topics via channels. A channel is a named grouping of consumers that work on similar tasks, and the “channel” is presented as a string to the consumer instance. NSQ uses the concepts of topics and channels to drive multicast and distributed delivery.

As far as optimization goes, there are about three dozen parameters for nsqd, but you need not concern yourself with most of them here.

This example resembles the one from the NSQ website, plus some additional info. All four processes can be run from the same system.

Quick Start

Get and build the primary components. $GOPATH needs to be set either to your Go workspace (mine is ~/.go, below) or to an empty directory that will be used for it. $GOPATH/bin needs to be in the path.

go get github.com/kr/godep

godep get github.com/bitly/nsq/nsqd
godep get github.com/bitly/nsq/nsqlookupd
godep get github.com/bitly/nsq/nsqadmin

To start, run each of the following services in a different terminal on the same system.

A lookup server instance:

nsqlookupd

A queue instance:

nsqd --lookupd-tcp-address=127.0.0.1:4160

An admin server instance:

nsqadmin --template-dir=~/.go/src/github.com/bitly/nsq/nsqadmin/templates --lookupd-http-address=127.0.0.1:4161

To push test-items:

curl -d 'hello world 1' 'http://127.0.0.1:4151/put?topic=test'
curl -d 'hello world 2' 'http://127.0.0.1:4151/put?topic=test'
curl -d 'hello world 3' 'http://127.0.0.1:4151/put?topic=test'

The “apps” apparently aren’t built by default. We’ll need them to get a message dumper for testing:

~/.go/src/github.com/bitly/nsq$ make
cd build/apps

To dump data that’s already waiting in the queues:

./nsq_to_file --topic=test --output-dir=/tmp --lookupd-http-address=127.0.0.1:4161

Display queue data:

cat /tmp/test.*.log
hello world 1
hello world 2
hello world 3

Python Library

Matt Reiferson wrote pynsq, a Python client that employs Tornado for its event loops. The gotcha is that both consumers -and- producers require you to use IOLoop, Tornado’s event loop. This is because pynsq not only allows you to define a “receive” callback, but a post-send callback as well. Though you don’t have to define one, there is an obscure but real chance that a send will fail, per Matt, and it should always be checked for.

Because of this design, you should be prepared to put all of your core loop logic into the Tornado loop.

To install the client:

sudo pip install pynsq tornado

A producer example from the “pynsq” website:

import nsq
import tornado.ioloop
import time

def pub_message():
    writer.pub('test', time.strftime('%H:%M:%S'), finish_pub)

def finish_pub(conn, data):
    print data

writer = nsq.Writer(['127.0.0.1:4150'])
tornado.ioloop.PeriodicCallback(pub_message, 1000).start()
nsq.run()

An asynchronous consumer example from the “pynsq” website (doesn’t correspond to the producer example):

import nsq

buf = []

def process_message(message):
    global buf
    message.enable_async()
    # cache the message for later processing
    buf.append(message)
    if len(buf) >= 3:
        for msg in buf:
            print msg
            msg.finish()
        buf = []
    else:
        print 'deferring processing'

r = nsq.Reader(message_handler=process_message,
        lookupd_http_addresses=['http://127.0.0.1:4161'],
        topic='nsq_reader', channel='async', max_in_flight=9)
nsq.run()

Give it a try.

FAQ

(Courtesy of a dialogue with Matt Reiferson)

Q: Most job-queues allow you send messages without imposing a loop. Is the 
   IOLoop required for both receiving -and- sending in pynsq?
A: Yes. pynsq supports the notion of completion-callbacks to signal when a send 
   finishes. Even if you don't use it, it's accounted-for in the mechanics. If 
   you want to send synchronous messages without the loop, hit the HTTP 
   endpoint. However, facilitating both the receive and send IOLoops allows for 
   the fastest possible dialogue, especially when the writers and readers are 
   paired to the same hosts.
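
(For reference, the synchronous HTTP path mentioned in that answer is just the same /put endpoint used with curl earlier; a minimal sketch, assuming nsqd is still listening on its default HTTP port of 4151:)

import urllib2

# POST one message to the "test" topic via nsqd's HTTP interface.
urllib2.urlopen('http://127.0.0.1:4151/put?topic=test', 'hello world 4')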

Q: An IOLoop is even required for asynchronous sends?
A: Yes. If you want to simply send one-off asynchronous messages, 
   consider opening a worker process that manages delivery. It can apply its 
   own callback to catch failures, and transmit successes, failures, etc.. to 
   an IPC queue (if you need this info).

Q: Are there any delivery guarantees (like in ZeroMQ)?
A: No. It's considered good-practice by the NSQ guys to always check the 
   results of message-sends in any situation (in any kind of messaging, in 
   general). You'd do this from the callbacks, with pynsq.

    The reasons that a send would fail are the following:

    1: The topic name is not formatted correctly (to character/length 
       restrictions). There is no official documentation of this, however.
    2: The message is too large (this can be set via a parameter to nsqd).
    3: There is a breakdown related to a race-condition with a publish and a 
       delete happening on a specific topic. This is rare.
    4: Client connection-related failures.

Q: In scenario (3) of the potential reasons for a send-failure, can I mitigate 
   the publish/delete phenomena if I am either not deleting topics or have 
   orchestrated deletions such that writes eliciting topic creations will never 
   be done until a sufficient amount of time has elapsed since a deletion?
A: Largely. Though, if nowhere else, this can also happen internally to NSQ at 
   shutdown.

Q: How are new topics announced to the cluster?
A: The first writer or reader request for a topic will be applied on the 
   upstream nsqd host, and will then propagate to the nsqlookupd hosts. They will 
   eventually spread to the other readers from there. The same thing applies to 
   a new topic, as well as a previously-deleted one.

Manager Namespaces for IPC Between Python Process Pools

Arguably, one of the features that best illustrates why Python has so many multidisciplinary uses is its multiprocessing library. This library lets Python maintain pools of processes and communicate between them with most of the simplicity of a standard multithreaded application (asynchronously invoking a function, locking, and IPC). This is not to say that Python can’t do threads too, but between being able to quickly run map/reduce operations or asynchronous tasks with a very simple set of functions, and the disadvantage of having to consider the GIL in multithreaded development, I believe the multiprocessing design is more popular by a landslide.
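
As a trivial, hedged illustration of that simplicity, here is a minimal sketch of mapping a function across a pool of worker processes:

from multiprocessing import Pool

def _square(x):
    return x * x

if __name__ == '__main__':
    # Map the work across four worker processes and collect the results.
    p = Pool(4)
    print(p.map(_square, range(10)))
    p.close()
    p.join()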

There are mountains of examples for how to use multiprocessing, along with sufficient documentation for most of the IPC mechanisms that can be used to communicate between processes: queues, pipes, “manager”-based shares and proxy objects, shared ctypes types, multiprocessing-based “client” and “listener” sockets, etc..

There is a very subtle IPC mechanism called a “namespace” (which is actually part of Manager), whose presence in the documentation amounts to only a couple of lines out of the thousands that are there. It’s easy, and worth special mention.

from multiprocessing import Pool, Manager
from os import getpid
from time import sleep

def _worker(ns):
    pid = getpid()
    print("%d: Worker started." % (pid))

    while ns.is_running is True:
        sleep(1)

    print("%d: Worker terminating." % (pid))

m = Manager()
ns = m.Namespace()
ns.is_running = True

num_workers = 5
p = Pool(num_workers)

for i in xrange(num_workers):
    p.apply_async(_worker, (ns,))

sleep(10)
print("Shutting down.")

ns.is_running = False
p.close()
p.join()

print("All workers joined.")

The output:

52893: Worker started.
52894: Worker started.
52895: Worker started.
52896: Worker started.
52897: Worker started.
Shutting down.
52894: Worker terminating.
52893: Worker terminating.
52895: Worker terminating.
52896: Worker terminating.
52897: Worker terminating.
All workers joined.

A namespace is very much like a bulletin board, where attributes can be assigned by one process and read by others. This works for immutable values like strings and primitive types. In-place updates to mutable objects, however, can’t be tracked properly:

from multiprocessing import Manager

m = Manager()
ns = m.Namespace()

ns.test_value = 'original value'
ns.test_list = [5]

print("test_value (master, original): %s" % (ns.test_value))
print("test_list (master, original): %s" % (ns.test_list))

ns.test_value = 'new value'
ns.test_list.append(10)

print("test_value (master, updated): %s" % (ns.test_value))
print("test_list (master, updated): %s" % (ns.test_list))

Output:

test_value (master, original): original value
test_list (master, original): [5]
test_value (master, updated): new value
test_list (master, updated): [5]
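
If you do need to publish a mutable container through a namespace, a common workaround is to mutate a local copy and then rebind the attribute, so the manager sees a fresh assignment (a sketch continuing the example above):

# Copy out, mutate locally, and reassign so the change actually propagates.
l = ns.test_list
l.append(10)
ns.test_list = l

print("test_list (master, reassigned): %s" % (ns.test_list))
# test_list (master, reassigned): [5, 10]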

Though they fall short for mutable types, namespaces are a terrific mechanism for sharing counters and flags between processes.

New Tesseract OCR C Library

Tesseract is a terrific, (optionally) trainable OCR library currently maintained by Google. However, the only currently sufficient way to use it from Python is via python-tesseract (a third-party library), and it has two flaws.

The first flaw is that python-tesseract is based on SWIG, and it introduces a lot more code. The second is that its functions aren’t always equivalent to Tesseract’s own. For example, Tesseract will let you iterate through a document by “level” (word, line, paragraph, block, etc.), and allows you to incorporate its layout analysis into your application. This is useful if you need to extract parts of a document based on proximity (or, possibly, location). However, python-tesseract does not currently let you iterate through parts of the document: GetIterator() does not accept a level argument.

So, as a first step to producing a leaner and more analogous Python library, I just released CTesseract: a C-based adapter shared-library that connects to the C++ Tesseract shared-library.

OCR a Document by Section

A previous post described how to use Tesseract to OCR a document to a single string. This post will describe how to take advantage of Tesseract’s internal layout processing to iterate through the document’s sections (as determined by Tesseract).

This is the core logic to iterate through the document. Each section is referred to as a “paragraph”:

if(api->Recognize(NULL) < 0)
{
    api->End();
    pixDestroy(&image);

    return 3;
}

tesseract::ResultIterator *it = api->GetIterator();

do 
{
    if(it->Empty(tesseract::RIL_PARA))
        continue;

    char *para_text = it->GetUTF8Text(tesseract::RIL_PARA);

    // para_text is the recognized text. It [usually] has a
    // newline on the end.

    delete[] para_text;
} while (it->Next(tesseract::RIL_PARA));

delete it;

To add some validation to the recognized content, we’ll also check the recognition confidence. In my experience, anything scoring less than 80% tends to come out as gibberish. In that case, you’ll have to do some preprocessing of the image to reduce the risk.

int confidence = api->MeanTextConf();
printf("Confidence: %d\n", confidence);
if(confidence < 80)
    printf("Confidence is low!\n");

The whole program:

#include <stdio.h>

#include <tesseract/baseapi.h>
#include <tesseract/resultiterator.h>

#include <leptonica/allheaders.h>

int main()
{
    tesseract::TessBaseAPI *api = new tesseract::TessBaseAPI();

    // Initialize tesseract-ocr with English, without specifying tessdata path.
    if (api->Init(NULL, "eng")) 
    {
        api->End();

        return 1;
    }

    // Open input image with leptonica library
    Pix *image;
    if((image = pixRead("receipt.png")) == NULL)
    {
        api->End();

        return 2;
    }
    
    api->SetImage(image);
    
    if(api->Recognize(NULL) < 0)
    {
        api->End();
        pixDestroy(&image);

        return 3;
    }

    int confidence = api->MeanTextConf();
    printf("Confidence: %d\n", confidence);
    if(confidence < 80)
        printf("Confidence is low!\n");

    tesseract::ResultIterator *it = api->GetIterator();

    do 
    {
        printf("=================\n");
        if(it->Empty(tesseract::RIL_PARA))
            continue;

        char *para_text = it->GetUTF8Text(tesseract::RIL_PARA);
        printf("%s", para_text);
        delete[] para_text;
    } while (it->Next(tesseract::RIL_PARA));
  
    delete it;

    api->End();
    pixDestroy(&image);

    return 0;
}

Applying this routine to an invoice (randomly found with Google), it is far easier to identify the address, tax, total, etc. than with the previous method (which was naive about layout):

Confidence: 89
=================
Invoice |NV0010 '
To
Jackie Kensington
18 High St
Wickham
 Electrical
Sevices Limited

=================
Certificate Number CER/123-34 From
  1”” E'e°‘''°‘'’‘'Se”'°‘*’
17 Harold Grove, Woodhouse, Leeds,

=================
West Yorkshire, LS6 2EZ

=================
Email: info@ mj-e|ectrcia|.co.uk

=================
Tel: 441132816781

=================
Due Date : 17th Mar 2012

=================
Invoice Date : 16th Feb 2012

=================
Item Quantity Unit Price Line Price
Electrical Labour 4 £33.00 £132.00

=================
Installation carried out on flat 18. Installed 3 new
brass effect plug fittings. Checked fuses.
Reconnected light switch and fitted dimmer in living
room. 4 hours on site at £33 per hour.

=================
Volex 13A 2G DP Sw Skt Wht Ins BB Round Edge Brass Effect 3 £15.57 £46.71
Volex 4G 1W 250W Dimmer Brushed Brass Round Edge 1 £32.00 £32.00
Subtotal £210.71

=================
VAT £42.14

=================
Total £252.85

=================
Thank you for your business — Please make all cheques payable to ‘Company Name’. For bank transfers ‘HSBC’, sort code
00-00-00, Account no. 01010101.
MJ Electrical Services, Registered in England & Wales, VAT number 9584 158 35
 Q '|'.~..a::

OCR a Document with C++

There is an OCR library developed by HP and maintained by Google called Tesseract. It works immediately, and does not require training.

Building it is trivial. Even more trivial is just installing it from packages:

$ sudo apt-get install libtesseract3 libtesseract-dev
$ sudo apt-get install liblept3 libleptonica-dev
$ sudo apt-get install tesseract-ocr-eng

Note that this installs the data for recognizing English.

Now, go and get the example code from the Google Code wiki for the project, and paste it into a file called ocr-test.cpp. Also, right-click and save the example document image (a random image I found with Google). You don’t have to use this particular document, as long as what you use is sufficiently clear at a high enough resolution (the example is about 1500×2000).

Now, change the location of the file referred to by the example code:

Pix *image = pixRead("letter.jpg");

Compile/link it:

$ g++ -o ocr-test ocr-test.cpp -ltesseract -llept

Run the example:

./ocr-test

You’re done. The following will be displayed:

OCR output:
fie’  1?/2440
Brussels,
BARROSO (2012) 1300171
BARROSO (2012)

Dear Lord Tugendhat.

Thank you for your letter of 29 October and for inviting the European Commission to

contribute in the context of the Economic Aflairs Committee's inquiry into "The

Economic lmplicationsfirr the United Kingdom of Scottish Independence ".

The Committee will understand that it is not the role of the European Commission to

express a position on questions of internal organisation related to the constitutional

arrangements of a particular Member State.

Whilst refraining from comment on possible fitture scenarios. the European Commission

has expressed its views in general in response to several parliamentary questions from

Members of the European Parliament. In these replies the European Commission has 
noted that scenarios such as the separation of one part of a Member State or the creation 
of a new state would not be neutral as regards the EU Treaties. The European 
Commission would express its opinion on the legal consequences under EU law upon ;
requestfiom a Member State detailing a precise scenario. :
The EU is founded on the Treaties which apply only to the Member States who have 3
agreed and ratified them. if part of the territory of a Member State would cease to be ,
part of that state because it were to become a new independent state, the Treaties would

no longer apply to that territory. In other words, a new independent state would, by the
fact of its independence, become a third country with respect to the E U and the Treaties

would no longer apply on its territory. ‘

../.

The Lord TUGENDHAT
Acting Chairman

House of Lords q
Committee Oflice
E-mail: economicaflairs@par1igment.ttk

Cheap and Easy Virtual Machine Containers with LXC

The thing that most virtual-machine solutions have in common is the overhead required to run them. There’s a substantial resource cost just for managing each guest, aside from what the guest itself consumes.

There’s an alternative to all of that, called LXC. In fact, though it used to require an impossible number of steps to configure, it’s now sufficient to just pass in a couple of command-line parameters and leave everything else to defaults.

LXC is somewhere between chroot and QEMU. It imposes resource control using the cgroups functionality that comes packaged in the kernel, which is, essentially, the next evolution of ulimit. Although much of the resource control is left unconfigured by default, you can set limits on everything up to disk I/O rates.

It’s important to know that, though LXC works terrifically, it should only be used on personal systems or on systems that are sufficiently fenced off from outside threats. This is because it doesn’t benefit from the near-total isolation that most VMs provide (a tradeoff for its light weight). For example, the container shares the same sysfs as the host, due to limitations in sysfs; changing sysfs from the container will therefore affect the host.

Though there are security concerns, I have been told authoritatively that a rogue application is less likely to cause issues for the larger host than any of the other critical problems that systems usually encounter in production. So, a couple of built-in security concessions are the only plausible risks.

System Containers

There’s an easy way and a hard way to create system containers. The hard way is to create a container and populate it with all of the configuration that is required of any new system (see here). The easy way is to simply use the “lxc-create” tool and tell it to follow a template.

These are the templates available in my installation:

$ ls -1 /usr/share/lxc/templates
alpine
altlinux
archlinux
busybox
debian
fedora
opensuse
oracle
sshd
ubuntu
ubuntu-cloud

You can only use a template that’s compatible with the system on which you are working. Otherwise, you’ll find that “yum”, for instance, is missing if you try to build a Fedora instance on Ubuntu, and you’ll hit categorically similar issues with the other templates. On my Ubuntu system, I can create containers with the “busybox” (which creates instantaneously), “debian” and “ubuntu” (7 minutes), “ubuntu-cloud” (6 minutes), and “sshd” (see below) templates. Note that any required downloaded images are cached, and subsequent builds only take a minute or two.

The current steps to build an Ubuntu container (from my Ubuntu box, after installing the lxc package):

$ sudo lxc-create -t ubuntu -n <container name>

Depending on the template, you might see something like this upon completion:

# The default user is 'ubuntu' with password 'ubuntu'!
# Use the 'sudo' command to run tasks as root in the container.

I named my container “ubuntu-2”. The container directories get created in /var/lib/lxc, and have a reasonable size:

$ ls -l /var/lib/lxc
total 4
drwxr-xr-x 3 root root 4096 Nov  6 02:32 ubuntu-2

$ sudo du -sh /var/lib/lxc/ubuntu-2/
263M    /var/lib/lxc/ubuntu-2/

To start the container as a daemon:

$ sudo lxc-start -n ubuntu-2 -d

Or, to start the container as a foreground machine, complete with console (using another, BusyBox-based, container, which shows this better):

$ sudo lxc-start -n busybox-1
udhcpc (v1.20.2) started
Sending discover...
Sending select for 10.0.3.79...
Lease of 10.0.3.79 obtained, lease time 3600

Please press Enter to activate this console. 

To list the currently-running containers:

$ sudo lxc-ls --fancy
NAME       STATE    IPV4        IPV6  AUTOSTART  
-----------------------------------------------
busybox-1  STOPPED  -           -     NO         
debian-1   RUNNING  10.0.3.247  -     NO         
ubuntu-1   RUNNING  10.0.3.217  -     NO         
ubuntu-2   RUNNING  10.0.3.249  -     NO         

Very cool. To connect via SSH:

$ ssh ubuntu@10.0.3.249
The authenticity of host '10.0.3.249 (10.0.3.249)' can't be established.
ECDSA key fingerprint is 73:8c:31:53:76:36:93:6e:59:ee:3f:d3:6f:27:13:c7.
Are you sure you want to continue connecting (yes/no)? yes
Warning: Permanently added '10.0.3.249' (ECDSA) to the list of known hosts.
ubuntu@10.0.3.249's password: 

The programs included with the Ubuntu system are free software;
the exact distribution terms for each program are described in the
individual files in /usr/share/doc/*/copyright.

Ubuntu comes with ABSOLUTELY NO WARRANTY, to the extent permitted by
applicable law.

Welcome to Ubuntu 13.04 (GNU/Linux 3.8.0-30-generic i686)

 * Documentation:  https://help.ubuntu.com/
ubuntu@ubuntu-2:~$ 

To stop the container:

$ sudo lxc-stop -n ubuntu-2

If you don’t want, or need, to build a new system, you can also configure an “ssh container”, where a read-only mount of the current filesystem is combined with an SSH server to create the facade of a separate machine instance. It’s unclear whether there’s a provision to allow changes (such as implementing a ramdisk to produce the illusion of a read-write experience similar to a disc-based “live” Linux distribution).

Application Containers

In addition to hosting “system” containers, LXC can also host “application” containers. The latter simply host individual applications, with all of the benefits of the resource control that we’ve already mentioned, along with, most likely, the same security limitations.

$ sudo lxc-execute -n <container name> <command>

You might see an error like the following:

$ sudo lxc-execute -n app_container_1 touch /tmp/aa
lxc-execute: Permission denied - failed to change apparmor profile to lxc-container-default
lxc-execute: invalid sequence number 1. expected 4
lxc-execute: failed to spawn 'app_container_1'

The workaround is:

$ cat > test.conf <<EOF
lxc.aa_profile = unconfined
lxc.rootfs = /
EOF

$ sudo lxc-execute -f test.conf -n app-container-1 touch /tmp/aa

When the application container launches, you’ll be able to see it in the lxc-ls list (above). You’ll also be able to find it in the ps list. Obviously, the command above just touches a file before returning, so it won’t be alive long enough for you to see it running.

Development Support

Naturally, everything we’ve mentioned can be done from code (Python, Lua, and Go, currently). This is a Python example mentioned on the LXC homepage (whose link was at the beginning of the article):

import lxc
container = lxc.Container("p1")
container.create("ubuntu")
container.start()
container.get_ips()
container.stop()

As mentioned, LXC isn’t the right kind of container for serving from the DMZ in a corporate environment, but it is awesome as a fast, easily constructed, no-frills system container when you want to run a dozen on a commodity box with minimal resource consumption.

Better Resource Throttling for Processes with cgroups

Traditionally, the only solution for resource control of processes has been ulimit. It offers a short list of constraints, which apply to every process in the system:

$ ulimit -a
core file size          (blocks, -c) 0
data seg size           (kbytes, -d) unlimited
scheduling priority             (-e) 0
file size               (blocks, -f) unlimited
pending signals                 (-i) 30890
max locked memory       (kbytes, -l) 64
max memory size         (kbytes, -m) unlimited
open files                      (-n) 1024
pipe size            (512 bytes, -p) 8
POSIX message queues     (bytes, -q) 819200
real-time priority              (-r) 0
stack size              (kbytes, -s) 8192
cpu time               (seconds, -t) unlimited
max user processes              (-u) 30890
virtual memory          (kbytes, -v) unlimited
file locks                      (-x) unlimited

However, this does nothing for per-process settings or anything related to throttling rates. This is where the cgroups kernel extension comes in. In addition to the above, it even allows hierarchies of settings to be defined (among many other features).

Though cgroups can be used in everyday system administration, it is also used in projects such as LXC to produce lightweight “system” and “application” containers, where the former simulates a virtual machine with less overhead (and slightly less isolation), and the latter runs individual processes.

As a comparison between the granularity of the ulimit settings (above) and the cgroups settings, here is a list of the latter:

blkio.io_merged
blkio.io_queued
blkio.io_service_bytes
blkio.io_serviced
blkio.io_service_time
blkio.io_wait_time
blkio.reset_stats
blkio.sectors
blkio.throttle.io_service_bytes
blkio.throttle.io_serviced
blkio.throttle.read_bps_device
blkio.throttle.read_iops_device
blkio.throttle.write_bps_device
blkio.throttle.write_iops_device
blkio.time
blkio.weight
blkio.weight_device
cgroup.clone_children
cgroup.event_control
cgroup.procs
cpuacct.stat
cpuacct.usage
cpuacct.usage_percpu
cpu.cfs_period_us
cpu.cfs_quota_us
cpu.rt_period_us
cpu.rt_runtime_us
cpuset.cpu_exclusive
cpuset.cpus
cpuset.mem_exclusive
cpuset.mem_hardwall
cpuset.memory_migrate
cpuset.memory_pressure
cpuset.memory_pressure_enabled
cpuset.memory_spread_page
cpuset.memory_spread_slab
cpuset.mems
cpuset.sched_load_balance
cpuset.sched_relax_domain_level
cpu.shares
cpu.stat
devices.allow
devices.deny
devices.list
hugetlb.2MB.failcnt
hugetlb.2MB.limit_in_bytes
hugetlb.2MB.max_usage_in_bytes
hugetlb.2MB.usage_in_bytes
memory.failcnt
memory.force_empty
memory.limit_in_bytes
memory.max_usage_in_bytes
memory.memsw.failcnt
memory.memsw.limit_in_bytes
memory.memsw.max_usage_in_bytes
memory.memsw.usage_in_bytes
memory.move_charge_at_immigrate
memory.oom_control
memory.soft_limit_in_bytes
memory.stat
memory.swappiness
memory.usage_in_bytes
memory.use_hierarchy
notify_on_release
release_agent
tasks

cgroups just sits behind a filesystem interface (/sys/fs/cgroup): settings are configured and processes are enrolled by simply echoing values into the right interface files (a small sketch follows). A detailed description of how to configure cgroups lies outside the scope of this article; you might go here for more info.
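
Still, to give a flavor of the interface, here is a hedged sketch (assuming a typical cgroup-v1 layout with the memory controller mounted at /sys/fs/cgroup/memory, root privileges, and a made-up group name) that creates a group, sets a memory limit, and enrolls the current process:

import os

CGROUP = '/sys/fs/cgroup/memory/example_group'   # assumed mount point, made-up name

# Creating a directory under a controller creates a new cgroup, complete with
# its own set of interface files.
if not os.path.exists(CGROUP):
    os.mkdir(CGROUP)

# "Echo" a limit into the right interface file (here, 256 MB of memory).
with open(os.path.join(CGROUP, 'memory.limit_in_bytes'), 'w') as f:
    f.write(str(256 * 1024 * 1024))

# Enroll the current process by writing its PID into "tasks".
with open(os.path.join(CGROUP, 'tasks'), 'w') as f:
    f.write(str(os.getpid()))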

Create a Development OpenStack Instance in Two Steps

OpenStack is the result of a collaboration between Rackspace and NASA. As of this year, it’s the hottest cloud platform available. The whole thing is API-driven (including a subset of APIs that are Amazon-compatible). It’s also built on Python, so extensibility comes built in.

Though building a cloud is ordinarily a time-consuming task, developers have DevStack: a rapid utility with which to build a full development (read: non-secure) OpenStack instance on a single box. As of right now, it finishes downloading and configuring in about seven minutes on commodity hardware. Since it works from Git clones rather than packages, there are no/minimal dependency problems, nor are there any compilations. It’s a joy.

Installation steps:

git clone https://github.com/openstack-dev/devstack.git
cd devstack && ./stack.sh

The only input required during the standard setup is a few passwords near the beginning. Officially, Ubuntu 12.04 (Precise), Fedora 18, and CentOS/RHEL 6.4 are supported at this time, but I was effortlessly able to build on Ubuntu 13.04 as well.

Once a few hundred megabytes have been downloaded and configured, the process will output something similar to the following and quit. The platform will be started and good to go. You might try to deploy a machine instance using one of the packaged images. You shouldn’t have any problem, and, if you’re familiar with AWS, you will already know the steps.

Horizon is now available at http://192.168.5.13/
Keystone is serving at http://192.168.5.13:5000/v2.0/
Examples on using novaclient command line is in exercise.sh
The default users are: admin and demo
The password: test
This is your host ip: 192.168.5.13
stack.sh completed in 422 seconds.

Horizon is the dashboard. The web server is going to be running on :80, so if you already have a web server running there, stop it first.

OpenStack - 1 - Login

The dashboard will have two default users: “admin” and “demo”. Their password will be the password you gave during initial startup.

To stop the cloud, run “./unstack.sh”.

For a walkthrough of what the setup (“stack.sh”) script does, go here.

Selected Screenshots

OpenStack - 2 - Hypervisors

OpenStack - 3 - Instances

OpenStack - 3 - Instance Details

OpenStack - 4 - Flavors

OpenStack - 5 - Images

OpenStack - 6 - System Info 3

OpenStack - 6 - System Info 2

OpenStack - 6 - System Info 1