Using etcd as a Clusterized, Immediately-Consistent Key-Value Storage

The etcd project was one of the first popular, public platforms built on the Raft algorithm (a relatively simple consensus algorithm, used to allow several nodes to remain in sync). Raft represents a shift away from its predecessor, Paxos, which is considerably more difficult to understand, and usually requires shortcuts to implement. As an added bonus, etcd is also implemented in Go.

etcd looks and smells like every other KV store, with three especially notable differences:

  • You can maintain a hierarchy of keys.
  • You can long-poll for changes on keys.
  • Distributed locks are built in.

We’re going to use Python’s etcd package (project is here). This package presents a very intuitive interface that completely manages responses from the server, and it is built so that future API changes should remain backward-compatible (within reason). This matters because other clients have historically given the application too much direct access to the raw server requests and left too much of the response interpretation to the application.

To connect the client (assuming the same machine with the default port):

from etcd import Client

c = Client()

To set a value:

c.node.set('/test/key', 5)

To get a value:

r = c.node.get('/test/key')
print(r.node.value)

Which outputs:

5

To wait for a value to change, run this from another terminal (after creating a client there the same way as above):

r = c.node.wait('/test/key')

Try setting the node to something else using a command similar to the one before. The wait call will then return the same result that the client making the change received.
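If you'd rather not juggle two terminals, here is a minimal sketch that runs the wait in a background thread and then changes the key from the main thread. It relies only on the calls already shown (c.node.wait, c.node.set, and r.node.value). The helper names watch_once and set_and_observe are made up for illustration, and the short sleep is a crude assumption about how long the watcher needs to register.

```python
import threading
import time


def watch_once(client, key, seen):
    """Block until `key` changes, then record the value the watcher received."""
    response = client.node.wait(key)  # same call as shown above
    seen.append(response.node.value)


def set_and_observe(client, key, new_value, settle_seconds=0.5):
    """Start a watcher thread, change `key`, and return what the watcher saw.

    `client` is expected to be an etcd Client like `c` above. The sleep is a
    crude way to let the watcher register before the write happens.
    """
    seen = []
    watcher = threading.Thread(
        target=watch_once, args=(client, key, seen), daemon=True
    )
    watcher.start()
    time.sleep(settle_seconds)
    client.node.set(key, new_value)
    watcher.join(timeout=10)  # give up rather than hang if nothing arrives
    return seen
```

Against a live server, calling set_and_observe(c, '/test/key', 6) should come back with a one-element list holding whatever value the watcher was handed.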

To work with distributed locks, just wrap the code that needs to be synchronized in a with statement:

with c.module.lock.get_lock('test_lock_1', ttl=10):
    print("In lock 1.")
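One thing to keep in mind is how long the protected work takes compared to the ttl. The sketch below wraps the same get_lock call in a small helper and reports whether the work ran longer than the ttl. The helper name run_exclusively is hypothetical, and it assumes that ttl is the lock's lifetime in seconds.

```python
import time


def run_exclusively(client, lock_name, work, ttl=10):
    """Run `work()` while holding the named lock.

    Returns (result, overran), where `overran` is True if the work took longer
    than `ttl` seconds. Assumption: `ttl` is the lock's lifetime in seconds.
    """
    with client.module.lock.get_lock(lock_name, ttl=ttl):
        started = time.monotonic()
        result = work()
        overran = (time.monotonic() - started) > ttl
    return result, overran
```

If overran comes back True, the lock may no longer have been held when the work finished, so either raise the ttl or shorten the critical section.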

It’s worth mentioning that the response objects have a consistent and informative interface no matter what the operation. You can see a number of properties just by printing one. This is from the set operation above:

<RESPONSE: <NODE(ResponseV2AliveNode) [set] [/test/key] IS_HID=[False] IS_DEL=[False] IS_DIR=[False] IS_COLL=[False] TTL=[None] CI=(2) MI=(2)>>

This is from the get operation:

<RESPONSE: <NODE(ResponseV2AliveNode) [get] [/test/key] IS_HID=[False] IS_DEL=[False] IS_DIR=[False] IS_COLL=[False] TTL=[None] CI=(2) MI=(2)>>

I’ll omit the examples of working with hierarchical keys because the functionality is every bit as intuitive as it should be.
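Part of the reason is that a key is just a path. As a rough sketch, this is how key paths map onto URLs in etcd's v2 HTTP API, which is what a client like this talks to. This is not the Python package's interface: key_url is a made-up helper, and the base address is an assumption (port 4001 was the usual default for early etcd releases, so adjust it for your server).

```python
from urllib.parse import quote, urlencode

# Assumption: a local etcd answering on the early default client port.
BASE_URL = "http://127.0.0.1:4001"


def key_url(key, **params):
    """Build the v2 keys-API URL for `key`, with optional query parameters."""
    path = quote(key.strip("/"))  # "/" is kept, so the hierarchy survives
    url = f"{BASE_URL}/v2/keys/{path}"
    if params:
        url += "?" + urlencode(sorted(params.items()))
    return url


print(key_url("/test/key"))                # a single key
print(key_url("/test", recursive="true"))  # everything under /test
print(key_url("/test/key", wait="true"))   # long-poll for the next change
```

The second URL is the hierarchy at work: a GET on the parent path with recursive=true lists the keys beneath it.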

There’s a lot of functionality in the Python etcd package, but it’s built to be lightweight and obvious. The GitHub page is extremely thorough, and the API is also completely documented at ReadTheDocs.