Installing Xcode Command Line Tools for Mavericks (Problems)

I had a perfectly running development environment under Mavericks 10.9.1 . I’m not, by nature, someone who would prefer to use a Mac, but sometimes we have to take what we’re given, and, at least, it’s Unix-based.

I was surviving without having to install Xcode, until recently when I had to investigate Apple’s illegally-modified “pngcrush” utility. I required Xcode in order to get the iPhone optimizations. Otherwise, I just got the standard version of the open-source utility. So, I installed it.

Yesterday, I had to install/build the Python “cryptography” module, which requires a C build-environment. Now, I had some cc/gcc discrepancies, and one unsupported command-line argument. Obviously, it’s Xcode. So, I uninstalled it. I innocently also installed the 10.9.1->10.9.2 Mavericks update at the same time.

Catastrophe. Now, the same stuff is broken, and I get warnings every single time I invoke Brew:

Warning: No developer tools installed.
You should install the Command Line Tools.
Run `xcode-select --install` to install them.

When I run xcode-select, I get a dialog that says the command-line tools are required, one button for installing them, and another for the full Xcode install. When I click to install the tools, I got the EULA and then a progress-bar that said “Finding Software”, only to give me a message:

Can't install the software because it is not currently available from the Software Update server.

I had to physically go and download the dmg package: http://developer.apple.com/downloads

However, when I dragged the pkg file into Applications and ran it, I inevitably ran into the following message, every time:

The installation failed.

The Installer can't locate the data it needs to install the software. Check your install media or Internet connection and try again, or contact the software manufacturer for assistance.

It turns out that it expected to be run directly from the dmg container. It looks like everything is working now (with only the command-line tools, and not requiring the whole Xcode install).

Though I’m still investigating the build errors I now having, the emphasis of this post is how to remedy the Xcode/tools errors that I was seeing.

Tool to Quickly Create Upstart Jobs

Upstart is a monumental improvement over the classical SysV mechanism for Unix/Linux process/daemon management. Still, it’s a somewhat manual process to create jobs. I’ve previously written about the Upstart library that provides the ability to start and stop jobs (using D-Bus), as well as build jobs.

However, the Upstart library also provides two command-line tools:

  • upstart-create: Create Upstart jobs using reasonable defaults.
  • upstart-reload: Send a signal to Upstart to reload jobs.

Of particular note is the first tool. It’ll take a couple of options, and write a new job file (in /etc/init). The example from the project website (which displays to the screen rather than write a job file):

$ upstart-create test-job /bin/sh -j -d "some description" -a "some author "
description "some description"
author "some author "
exec /bin/sh
start on runlevel [2345]
stop on runlevel [016]
respawn 

Reading Keypresses Under Python

An elegant solution for reading a individual keypresses under Python.

import termios, sys, os

def read_keys():
    fd = sys.stdin.fileno()
    old = termios.tcgetattr(fd)
    new = termios.tcgetattr(fd)
    new[3] = new[3] & ~termios.ICANON & ~termios.ECHO
    new[6][termios.VMIN] = 1
    new[6][termios.VTIME] = 0
    termios.tcsetattr(fd, termios.TCSANOW, new)
    try:
        while 1:
            yield os.read(fd, 1)
    finally:
        termios.tcsetattr(fd, termios.TCSAFLUSH, old)

Example:

>>> for key in read_keys():
...   print("KEY: %s" % (key))
... 
KEY: g
KEY: i
KEY: f
KEY: d
KEY: s
KEY: w
KEY: e

Inspired by this.

Simplified Protocol Buffers for Socket Communication

Protocol Buffers (“protobuf”) is a Google technology that lets you define messages declaratively, and then build library code for a myriad of different programming-languages. The way that messages are serialized is efficient and effortless, and protobuf allows for simple string assignment (without predefining a length), arrays and optional values, and sub-messages.

The only tough part comes during implementation. As protobuf is only concerned with serialization/unserialization, it’s up to you to deal with the logistics of sending the message, and this means that, for socket communication, you often have to:

  1. Copy and paste the code to prepend a length.
  2. Copy/paste/adapt existing code that embeds a type-identifier on outgoing requests, and reads the type-identifier on incoming requests in order to automatically handle/route messages (if this is something that you want, which I often do).

This quickly becomes redundant and mundane, and it’s why we’re about to introduce protobufp (“Protocol Buffers Processor”).

We can’t improve on the explanation on the project-page. Therefore, we’ll just provide the example.

We’re going to build some messages, push into a StringIO-based byte-stream (later to be whatever type of stream you wish), read them into the protobufp “processor” object, and retrieve one fully-unserialized message at a time until depleted:

from test_msg_pb2 import TestMsg

from protobufp.processor import Processor

def get_random_message():
    rand = lambda: randint(11111111, 99999999)

    t = TestMsg()
    t.left = rand()
    t.center = "abc"
    t.right = rand()

    return t

messages = [get_random_message() for i in xrange(5)]

Create an instance of the processor, and give it a list of valid message-types (the order of this list should never change, though you can append new types to the end):

msg_types = [TestMsg]
p = Processor(msg_types)

Use the processor to serialize each message and push them into the byte-stream:

s = StringIO()

for msg in messages:
    s.write(p.serializer.serialize(msg))

Feed the data from the byte stream into the processor (normally, this might be chunked-data from a socket):

p.push(s.getvalue())

Pop one decoded message at a time:

j = 0
while 1:
    in_msg = p.read_message()
    if in_msg is None:
        break

    assert messages[j].left == in_msg.left
    assert messages[j].center == in_msg.center
    assert messages[j].right == in_msg.right

    j += 1

Now there’s one less annoying task to distract you from your critical path.

Creating and Controlling OS Services from Python

One important deployment task of server software is to not only deploy the software and then start it, but to enable it to be automatically started and monitored by the OS at future reboots. The most modern solution for this type of management is Upstart. You access Upstart every time you call “sudo service apache2 restart”, and whatnot. Upstart is sponsored by Ubuntu (more specifically, Canonical).

Upstart configs are located in /etc/init (we’re slowly, slowly approaching the point where we might one day be able to get rid of the System-V init scripts, in /etc/init.d). To create a job, you drop a “xyz.conf” file into /etc/init, and Upstart should automatically become aware of it via inotify. To query Upstart (including starting and stopping jobs), you emit a D-Bus message.

So, what about elegantly automating the creation of a job for the service from your Python deployment code? There is exactly one solution for doing so, and it’s a Swiss Army Knife for such a task.

We’re going to use the Python upstart library to build a job and then write it (in fact, we’re just going to share one of their examples, for your convenience). The library also allows for listing the jobs on the system, getting statuses, and starting/stopping jobs, among other things, but we’ll leave it to you to experiment with this, when you’re ready.

Build a job that starts and stops on the normal run-levels, respawns when it terminates, and runs a single command (a non-forking process, otherwise we’d have to add the ‘expect’ stanza as well):

from upstart.job import JobBuilder

jb = JobBuilder()

# Build the job to start/stop with default runlevels to call a command.
jb.description('My test job.').\
   author('Dustin Oprea <dustin@nowhere.com>').\
   start_on_runlevel().\
   stop_on_runlevel().\
   run('/usr/bin/my_daemon')

with open('/etc/init/my_daemon.conf', 'w') as f:
    f.write(str(jb))

Remember to run this as root. The job output looks like this:

description "My test job."
author "Dustin Oprea <dustin@nowhere.com>"
start on runlevel [2345]
stop on runlevel [016]
respawn 
exec /usr/bin/my_daemon

Inspecting JSON at the Command-Line

This is a simple tool to pull specific values out of JSON, or to pull JSON from JSON, at the command-line:

JsonPare

It’s useful to pull configuration values from within a Bash script.

Example data:

{"a": [9, 6, {"b": [99, 88, 77, "text", 55]}]}

Example commands:

$ cat example.json | jp a.2.b.3
"text"

$ cat example.json | jp a.2 | jp b.3
"text"

$ cat example.json | jp a.2 | jp -p b.3
text

ZFS for Volume Management and RAID

ZFS is an awesome filesystem, developed by Sun and ported to Linux. Although not distributed, it emphasizes durability and simplicity. It’s essentially an alternative to the common combination of md and LVM.

I’m not going to actually go into a RAID configuration, here, but the following should be intuitive-enough to send you on your way. I’m using Ubuntu 13.10 .

$ sudo apt-get install zfs-fuse 
Reading package lists... Done
Building dependency tree       
Reading state information... Done
Suggested packages:
  nfs-kernel-server kpartx
The following NEW packages will be installed:
  zfs-fuse
0 upgraded, 1 newly installed, 0 to remove and 34 not upgraded.
Need to get 1,258 kB of archives.
After this operation, 3,302 kB of additional disk space will be used.
Get:1 http://us.archive.ubuntu.com/ubuntu/ saucy/universe zfs-fuse amd64 0.7.0-10.1 [1,258 kB]
Fetched 1,258 kB in 1s (750 kB/s)   
Selecting previously unselected package zfs-fuse.
(Reading database ... 248708 files and directories currently installed.)
Unpacking zfs-fuse (from .../zfs-fuse_0.7.0-10.1_amd64.deb) ...
Processing triggers for ureadahead ...
Processing triggers for man-db ...
Setting up zfs-fuse (0.7.0-10.1) ...
 * Starting zfs-fuse zfs-fuse                                                                                               [ OK ] 
 * Immunizing zfs-fuse against OOM kills and sendsigs signals...                                                            [ OK ] 
 * Mounting ZFS filesystems...                                                                                              [ OK ] 
Processing triggers for ureadahead ...

$ sudo zpool list
no pools available

$ dd if=/dev/zero of=/home/dustin/zfs1.part bs=1M count=64
64+0 records in
64+0 records out
67108864 bytes (67 MB) copied, 0.0588473 s, 1.1 GB/s

$ sudo zpool create zfs_test /home/dustin/zfs1.part 

$ sudo zpool list
NAME       SIZE  ALLOC   FREE    CAP  DEDUP  HEALTH  ALTROOT
zfs_test  59.5M    94K  59.4M     0%  1.00x  ONLINE  -

$ sudo dd if=/dev/zero of=/zfs_test/dummy_file bs=1M count=10
10+0 records in
10+0 records out
10485760 bytes (10 MB) copied, 1.3918 s, 7.5 MB/s

$ ls -l /zfs_test/
total 9988
-rw-r--r-- 1 root root 10485760 Mar  7 21:51 dummy_file

$ sudo zpool list
NAME       SIZE  ALLOC   FREE    CAP  DEDUP  HEALTH  ALTROOT
zfs_test  59.5M  10.2M  49.3M    17%  1.00x  ONLINE  -

$ sudo zpool status zfs_test
  pool: zfs_test
 state: ONLINE
 scrub: none requested
config:

	NAME                      STATE     READ WRITE CKSUM
	zfs_test                  ONLINE       0     0     0
	  /home/dustin/zfs1.part  ONLINE       0     0     0

errors: No known data errors

So, now we have one pool with one disk. However, ZFS also allows hot reconfiguration. Add (stripe) another disk to the pool:

$ dd if=/dev/zero of=/home/dustin/zfs2.part bs=1M count=64
64+0 records in
64+0 records out
67108864 bytes (67 MB) copied, 0.0571095 s, 1.2 GB/s

$ sudo zpool add zfs_test /home/dustin/zfs2.part 
$ sudo zpool status zfs_test
  pool: zfs_test
 state: ONLINE
 scrub: none requested
config:

	NAME                      STATE     READ WRITE CKSUM
	zfs_test                  ONLINE       0     0     0
	  /home/dustin/zfs1.part  ONLINE       0     0     0
	  /home/dustin/zfs2.part  ONLINE       0     0     0

errors: No known data errors

$ sudo dd if=/dev/zero of=/zfs_test/dummy_file2 bs=1M count=70
70+0 records in
70+0 records out
73400320 bytes (73 MB) copied, 12.4728 s, 5.9 MB/s

$ sudo zpool list
NAME       SIZE  ALLOC   FREE    CAP  DEDUP  HEALTH  ALTROOT
zfs_test   119M  80.3M  38.7M    67%  1.00x  ONLINE  -

I should mention that there is some diskspace overhead, or, at least, some need for explicitly optimizing the disk (if possible). Though I assigned two 64M “disks” to the pool, I received “out of space” errors when I first wrote a 10M file and then attempted to write a 80M file. It was successful when I chose to write a 70M file, instead.

You can also view IO stats:

$ sudo zpool iostat -v zfs_test
                             capacity     operations    bandwidth
pool                      alloc   free   read  write   read  write
------------------------  -----  -----  -----  -----  -----  -----
zfs_test                  80.5M  38.5M      0     11    127   110K
  /home/dustin/zfs1.part  40.4M  19.1M      0      6    100  56.3K
  /home/dustin/zfs2.part  40.1M  19.4M      0      5     32  63.0K
------------------------  -----  -----  -----  -----  -----  -----

For further usage examples, look at these tutorials: