Doing Proper Encryption with Rijndael using [Pure] Python

When it comes to writing code that portably encrypts and decrypts without concern for the platform, you’ll have to overcome a couple of small factors with Python:

  • You’ll have to install a cryptography library
  • You’ll need code that doesn’t need a build environment (no C-code)

I personally prefer Rijndael (also called “AES” when you constrain yourself to certain key-sizes and/or block-sizes) as an affordable, ubiquitous algorithm. There was a remarkable amount of difficulty in finding a pure-Python algorithm that worked in both Python 2 and Python 3.

By mentioning “proper” encryption in the title, I was referring to “key derivation” (also called “key expansion”). It’s not enough to have a password and an algorithm. Unless you already have a sufficiently random key of exactly the right number of bits, you’ll need to submit the password to a key-expansion algorithm in order to generate a key of the right length. This is often done using the PBKDF2 algorithm.

It was difficult to find a Python 2 and 3 version of PBKDF2 as well.

I’ve posted the pprp package, which packages Python 2 and Python 3 versions of Rijndael and PBKDF2 and automatically chooses the right version during module-load. As I implement PKCS7 for padding, I also include a utility function to trim the padding.

The main documentation provides the following as an example:

import io
import os.path
import hashlib

import pprp

def trans(text):
    return text.encode('ASCII') if sys.version_info[0] >= 3 else text

passphrase = trans('password')
salt = trans('salt')
block_size = 16
key_size = 32
data = "this is a test" * 100

key = pprp.pbkdf2(passphrase, salt, key_size)

def source_gen():
    for i in range(0, len(data), block_size):
        block = data[i:i + block_size]
        len_ = len(block)

        if len_ > 0:
            yield block.encode('ASCII')

        if len_ < block_size:
            break

# Pump the encrypter output into the decrypter.
encrypted_gen = pprp.rjindael_encrypt_gen(key, source_gen(), block_size)

# Run, and sink the output into an IO stream. Trim the padding off the last 
# block.

s = io.BytesIO()
ends_at = 0
for block in pprp.rjindael_decrypt_gen(key, encrypted_gen, block_size):
    ends_at += block_size
    if ends_at >= len(data):
        block = pprp.trim_pkcs7_padding(block)

    s.write(block)

decrypted = s.getvalue()
assert data == decrypted.decode('ASCII')

Generating Signatures with 25519 Curves in Python

It’s not very often that you need to do encryption in pure Python. Obviously, you’d want to have this functionality off-loaded to C-routines whenever possible. However, in some situations, often when you require code portability or that there not be any C-compilation phase, a pure-Python implementation can be convenient.

The 25519 elliptic curve is a special curve defined by Bernstein, rather than NIST. It is fairly well-embraced as an alternative to NIST-generated curves, which are assumed to be either influenced or compromised by the NSA. See Aris’ notable 25519-patch contribution to OpenSSH.

There exists a pure-Python implementation of 25519 at this site, but it’s too expensive to use in a high-volume platform (because of the math involved). However, it’s perfect if it’s to be driven by infrequent events, and a delay of a few seconds is fine. The benefits of an EC, if you don’t already know, is that the factors are usually considerably smaller for strength comparable to traditional, non-EC algorithms. Therefore, your trade-off for a couple of more seconds of computation is an enigmatically-shorter signature.

To use it, grab the Python source, included here for posterity:

import hashlib

b = 256
q = 2**255 - 19
l = 2**252 + 27742317777372353535851937790883648493

def H(m):
  return hashlib.sha512(m).digest()

def expmod(b,e,m):
  if e == 0: return 1
  t = expmod(b,e/2,m)**2 % m
  if e & 1: t = (t*b) % m
  return t

def inv(x):
  return expmod(x,q-2,q)

d = -121665 * inv(121666)
I = expmod(2,(q-1)/4,q)

def xrecover(y):
  xx = (y*y-1) * inv(d*y*y+1)
  x = expmod(xx,(q+3)/8,q)
  if (x*x - xx) % q != 0: x = (x*I) % q
  if x % 2 != 0: x = q-x
  return x

By = 4 * inv(5)
Bx = xrecover(By)
B = [Bx % q,By % q]

def edwards(P,Q):
  x1 = P[0]
  y1 = P[1]
  x2 = Q[0]
  y2 = Q[1]
  x3 = (x1*y2+x2*y1) * inv(1+d*x1*x2*y1*y2)
  y3 = (y1*y2+x1*x2) * inv(1-d*x1*x2*y1*y2)
  return [x3 % q,y3 % q]

def scalarmult(P,e):
  if e == 0: return [0,1]
  Q = scalarmult(P,e/2)
  Q = edwards(Q,Q)
  if e & 1: Q = edwards(Q,P)
  return Q

def encodeint(y):
  bits = [(y >> i) & 1 for i in range(b)]
  return ''.join([chr(sum([bits[i * 8 + j] << j for j in range(8)])) for i in range(b/8)])

def encodepoint(P):
  x = P[0]
  y = P[1]
  bits = [(y >> i) & 1 for i in range(b - 1)] + [x & 1]
  return ''.join([chr(sum([bits[i * 8 + j] << j for j in range(8)])) for i in range(b/8)])

def bit(h,i):
  return (ord(h[i/8]) >> (i%8)) & 1

def publickey(sk):
  h = H(sk)
  a = 2**(b-2) + sum(2**i * bit(h,i) for i in range(3,b-2))
  A = scalarmult(B,a)
  return encodepoint(A)

def Hint(m):
  h = H(m)
  return sum(2**i * bit(h,i) for i in range(2*b))

def signature(m,sk,pk):
  h = H(sk)
  a = 2**(b-2) + sum(2**i * bit(h,i) for i in range(3,b-2))
  r = Hint(''.join([h[i] for i in range(b/8,b/4)]) + m)
  R = scalarmult(B,r)
  S = (r + Hint(encodepoint(R) + pk + m) * a) % l
  return encodepoint(R) + encodeint(S)

def isoncurve(P):
  x = P[0]
  y = P[1]
  return (-x*x + y*y - 1 - d*x*x*y*y) % q == 0

def decodeint(s):
  return sum(2**i * bit(s,i) for i in range(0,b))

def decodepoint(s):
  y = sum(2**i * bit(s,i) for i in range(0,b-1))
  x = xrecover(y)
  if x & 1 != bit(s,b-1): x = q-x
  P = [x,y]
  if not isoncurve(P): raise Exception("decoding point that is not on curve")
  return P

def checkvalid(s,m,pk):
  if len(s) != b/4: raise Exception("signature length is wrong")
  if len(pk) != b/8: raise Exception("public-key length is wrong")
  R = decodepoint(s[0:b/8])
  A = decodepoint(pk)
  S = decodeint(s[b/8:b/4])
  h = Hint(encodepoint(R) + pk + m)
  if scalarmult(B,S) != edwards(R,scalarmult(A,h)):
    raise Exception("signature does not pass verification")

These routines are not Python 3 compatible.

For reference, other implementations in other languages can be found at the Wikipedia page, under “External links”.

To implement the Bernstein version above, you must first generate a signing (private) key. This comes from a random 32-bit number with some bits flipped:

import os

signing_key_bytes = [ord(os.urandom(1)) for i in range(32)]

signing_key_bytes[0] &= 248
signing_key_bytes[31] &= 127
signing_key_bytes[31] |= 64

signing_key = ''.join([chr(x) for x in signing_key_bytes])

To produce the signature, and then check it:

import ed25519

message = "some data"

public_key = ed25519.publickey(signing_key)

signature = ed25519.signature(message, signing_key, public_key)
ed25519.checkvalid(signature, message, public_key)

The signing and the ensuing check take about six-seconds, together.

The ed25519 module is the one whose source is displayed above. The name “Ed25519” (the official name of the algorithm) stands for “Edwards Curve”, the mathematical foundation on which Bernstein designed the 25519 curve.

Python Function Annotations

Python 3.x introduced a relatively unknown feature called “function annotations”. This introduces a way to tag parameters and your return value with arbitrary information at the function-definition level.

You can annotate using strings or any other type that you’d like:

>>> def some_function(parm1: "Example parameter"):
...   pass
... 
>>> some_function.__annotations__
{'parm1': 'Example parameter'}
>>> x = 5
>>> def some_function_2(parm1: x * 20):
...   pass
... 
>>> some_function_2.__annotations__
{'parm1': 100}

You can also annotate the return:

>>> def some_function_3() -> 'return-value tag':
...   pass
... 
>>> some_function_3.__annotations__
{'return': 'return-value tag'}

It’s important to note that there are already strong conventions in how to document your parameters, thanks to Sphinx. Therefore, the utility of annotations will most likely be entirely in terms of functionality. For example, you can annotate closures on-the-fly:

import random

c_list = []
for i in range(10):
    def closure_() -> random.random():
        pass

    c_list.append(closure_)

list(map(lambda x: print(x.__annotations__), c_list))

This is the output:

{'return': 0.9644971188983055}
{'return': 0.8639746158842893}
{'return': 0.18610468531065305}
{'return': 0.8528801446167985}
{'return': 0.3022338513329076}
{'return': 0.6455491244718428}
{'return': 0.09106740460937834}
{'return': 0.16987808849543917}
{'return': 0.9136478506241527}
{'return': 0.41691681086623544}

Absolutely nothing in the Python language or library is dependent on annotations, so they’re yours to play with or implement as you see fit.

Creating Those Neat OpenSSH RSA Public Keys in OpenSSL (along with DSA and ECDSA Keys)

Not too long ago, I wanted to know how to generate the public-keys that OpenSSH generates, as a transitional step to understanding how to decode them. In this fashion, I might apply the knowledge, some day, to a dynamically configurable SSH platform to allow a client to connect via SSH and to authenticate against a database, such as with Github and Bitbucket.

What I did not learn while painfully trolling and reverse-engineering all of the key-generation code for ssh-keygen was that this representation is described by PKCS8. Yes, more than a little time was unnecessarily wasted. In fact, you can generate PKCS8-formatted public keys with OpenSSL.

Needless to say, I never finished reverse-engineering the code. This wasn’t so much because it became a monster, but rather the code was unfathomably ugly. It would’ve taken an enormous amount of time to hack something together, and the same amount of time to refactor it into being something elegant. It required too much of a commitment for something without an immediate need. It would’ve been quicker (and funnier) to write a FUSE wrapper to read the database, and just point OpenSSH to the mountpoint.

Generating an RSA public-key and converting it to PKCS8:

$ openssl genrsa -out mykey.pem 2048
Generating RSA private key, 2048 bit long modulus
.................+++
.........................................................................+++
e is 65537 (0x10001)

$ openssl rsa -in mykey.pem -pubout | ssh-keygen -f /dev/stdin -i -m PKCS8
writing RSA key
ssh-rsa AAAAB3NzaC1yc2EAAAADAQABAAABAQCb7G51EdPBy8Np3x84DoWB5moC/ijLqDKr7ipvajNs6rICIEgX70D5S5jPAiC7HCcubHbxAEV3rRlQJpsKyiy85vKHL787IRhmHu9ILFRm+7OxYY783QCBNP+NcZJVFWsHASxm05hOB8Z4pC0d2ql/eajqUOvk83TdiJnJA3lg+3/LCTd6mfQAKybycllBZxE8W6FLXlpsva2J1hW3bxs5FcvsoqzVjboPmbXNuC0sKk5gWPq/Rrpc7E9N8bEPt/jk/CAUtUPxFtFMe48odHvXnb4YAkt2sTKHzH9BWgn4jSjPsjwYBiG2xWk0n5RcAi2GoPHCwveD2rArY+fQW8+5

The parameters to ssh-keygen simply indicate that the data is coming from STDIN, we’re doing an import, and the key-format is PKCS8.

Also, with DSA:

$ openssl dsaparam -out dsaparam.pem 2048
Generating DSA parameters, 2048 bit long prime
This could take some time
..............+.........+.................+...........+++++++++++++++++++++++++++++++++++++++++++++++++++*
.......+....+...........+......+...+................+.....+.....................+...+...................................+..........+....+..........+.........+..+.+...........+.....+..+......+.......+.....+........+.............+.....+.....+...+..+..............+++++++++++++++++++++++++++++++++++++++++++++++++++*

$ openssl gendsa dsaparam.pem -out dsa_priv.pem
Generating DSA key, 2048 bits

$ openssl dsa -in dsa_priv.pem -pubout | ssh-keygen -f /dev/stdin -i -m PKCS8
read DSA key
writing DSA key
ssh-dss AAAAB3NzaC1kc3MAAAEBAMq+cozwNedgZotyF85hu2yTtdm2EPQhZkSR3iC1qsidkO99HTt4Gpx5vrkdPs13D/RSJj7EJYwvsWSOI/OgY+OsqymR7Erqpf0NGdwzHpTz8nYe2oQNZgpaxh9iTMsfCQwdfchGT4QwewVf4fAGXRZRtmTSbNZIXOJMBiI9h6qW29DvI1TH/x4iWL3j1XA6NDGbVECYU4CE3bQUt/F3wuYTcy6JZU9JzbCLwO6lIkFQmVk4HyMODwM5YKectFh38a7r18oTosnewu9bKoiXCVmOK6RoEgiz/+Sqjxgi8GZ/LeJtLJGxPZxuUmRe8HGMprNW4VmaxoRkNgVL5LoyBNkAAAAVANmGyUL5j1iOp5ABkqKJuiq1sqEXAAABADzHG6NVAzVvZ8xFHc4G2gs56/HvkmvsyKyyQE5o1sV5Yp9yZq9LDW6egrCSRi1Ny6PeBCuSBPU1piZwObX7ZDJCvMvQZf55EONuZYVYBhYvV0Cxw0aet94Qz/u9RA9h1CKl7321d8SDWw4DDNAmQ7nS+NuS6bkAV9djpzK8WFh6PHjARF4DdefbjaYIVig8iWxxu3qju1iQQta2/XZ23OJxZb8eUE+Dgeq5Qdosb508VIRPByovRL3o9RbQ2y50Zp2OXmbib94nW1CpFzo/84RQfnP3RG2LPpmeK+KBTqbctj43B+ATvuIw5vHYj67o3P1MgOHvFpGnDt/McdUo1BoAAAEACqjSYFC9Zj5w5P2uMRhj+o9Gc7z3bYC/OphTcWhjZSs2OAF8+PGjpEjQYSpHPmbLhdbPsZpf2NAnAtz1WxCRVvOFeuSl9zdp4/JeRj3SKrxk0L/r5ZPGCkx+FdWV65sSwK8usaFHyKXeges1N8nmzTegphW3xVb4qI9OazanQ5sxqr+TzDa2yRAMK/PwW8mxyaYkofyct7uHOR0CD6/WjkYHHOEDzubPo1/nh2oKGC2eQnluNP8xBAg7Z5sqm6ZBSIWmOhpaOgIDNcwYla9gJJHrMecgbHkb6HuPcpaJvd2dCZpujFfB0+2jFUgPUsgygIMfV5mjUI3Zajmgmwtqiw==

I included DSA because generating a key-pair with DSA for general PKE is a bit more obscure. I don’t know why. It is still possible, however.

Lastly, ECDSA (elliptic-curve DSA), for completeness. This is really the one that you want to be using, if given a choice. See how cute/short the public-keys are for an algorithm with considerably-more strength? A 256-bit curve is equivalent to a traditional 3072-bit RSA key.

$ openssl ecparam -genkey -name prime256v1 -noout -out prime256v1.key.pem
$ openssl ec -in prime256v1.key.pem -pubout | ssh-keygen -f /dev/stdin -i -m PKCS8
read EC key
writing EC key
ecdsa-sha2-nistp256 AAAAE2VjZHNhLXNoYTItbmlzdHAyNTYAAAAIbmlzdHAyNTYAAABBBC3FrhznL2pQ/titzgnWrbznR3ve2eNEgevog/aS7SszS9Vkq0uFefavBF4M2Txc34sIQ5TPiZxYdm9uO1siXSw=

Make Subversion Tags in Record Time

I, like most developers nowadays, despise Subversion. This is in no small part due to how much of a manual effort it is to properly tag your software. I still do it, but it makes me seeth with rage.

The svncl tool streamlines this. You simply give it the working-directory and the URL of the tags path within the repository, and it gives you a bulleted list of commit-messages that have occurred between the last tag and the current HEAD.

For example:

$ svncl . https://svnserver.com/tags/project
- Setup fix.
- Removed obsolete references to collections package.
- Updates are now stored in S3.
- Added parallel S3 downloading.

Fast, Multipart Downloads from S3

Using the “Range” header and greenlets, you can do very fast downloads from S3. The speed improvements are considerably higher than with uploads. With a 130M file, it was more than three-times faster to parallelize a download.

I’ve put the code in RandomUtility.

Usage:

$ python s3_parallel_download.py (access key) (secret key) (bucket name) (key name)
...
2014-06-23 11:45:06,896 - __main__ - DEBUG -  19%   6%  13%   6%   6%  13%  27%
2014-06-23 11:45:16,896 - __main__ - DEBUG -  52%  26%  26%  26%  39%  26%  68%
2014-06-23 11:45:26,897 - __main__ - DEBUG -  85%  32%  52%  39%  52%  45% 100%
2014-06-23 11:45:36,897 - __main__ - DEBUG - 100%  78%  78%  59%  65%  65% 100%
2014-06-23 11:45:46,897 - __main__ - DEBUG - 100% 100% 100%  78%  91%  91% 100%
Downloaded: /var/folders/qk/t5991kt11cb2y6qgmzrzm_g00000gp/T/tmpU7pL8I

Doing Fast Multipart Uploads to S3 Using Greenlets

S3 allows you to upload pieces of large files in parallel. Unfortunately, most/all of the examples that I’ve seen online are inefficient or inconvenient. For example:

  • Physical file splits of the original file: If you couldn’t guess that S3 would have a way to work off a single copy of the source file, than you probably shouldn’t be using this functionality.
  • Threading: Threads don’t truly run in parallel (in Python).
  • Function-based designs (as opposed to class-based): I’ve never been a fan of this in Python. Too much context info has to be curried.
  • Using multiprocessing: For every upload, you’ll have a number of processes, and all will still be in competition for the network device.

None of these strategies hold a candle to Greenlets (running off different file-pointers to the same physical copy of the file).

This example is located at RandomUtility: s3_parallel.

This is the principal class. Go to the original source for the imports and the couple module-level constants.

class ParallelUpload(object):
    def __init__(self, ak, sk, bucket_name, filepath, 
                 chunk_size_b=_DEFAULT_CHUNK_SIZE_B,
                 monitor_interval_s=_DEFAULT_MONITOR_INTERVAL_S):
        self.__ak = ak
        self.__sk = sk
        self.__bucket_name = bucket_name
        self.__filepath = filepath
        self.__s3_key_name = os.path.basename(filepath)
        self.__chunk_size_b = chunk_size_b
        self.__coverage = 0.0
        self.__monitor_interval_s = _DEFAULT_MONITOR_INTERVAL_S

        self.__filesize_b = os.path.getsize(self.__filepath)
        self.__chunks = int(math.ceil(float(self.__filesize_b) / 
                                      float(self.__chunk_size_b)))

        self.__progress = [0.0] * self.__chunks

    def __get_bucket(self, bucket_name):
        conn = boto.s3.connection.S3Connection(self.__ak, self.__sk)
        return conn.lookup(bucket_name)

    def __standard_upload(self):
        bucket = self.__get_bucket(self.__bucket_name)
        new_s3_item = bucket.new_key(self.__s3_key_name)
        new_s3_item.set_contents_from_filename(
            self.__filepath, 
            cb=self.__standard_cb, 
            num_cb=20)

    def __standard_cb(self, current, total):
        _logger.debug("Status: %.2f%%", float(current) / float(total) * 100.0)

    def __multipart_cb(self, i, current, total):
        self.__progress[i] = float(current) / float(total) * 100.0

    def __transfer_part(self, (mp_info, i, offset)):
        (mp_id, mp_key_name, mp_bucket_name) = mp_info

        bucket = self.__get_bucket(mp_bucket_name)
        mp = boto.s3.multipart.MultiPartUpload(bucket)
        mp.key_name = mp_key_name
        mp.id = mp_id

        # At any given time, this will describe the farther percentage into the 
        # file that we're actively working on.
        self.__coverage = max(
                            (float(offset) / float(self.__filesize_b) * 100.0), 
                            self.__coverage)

        # The last chunk might be shorter than the rest.
        eff_chunk_size = min(offset + self.__chunk_size_b, 
                             self.__filesize_b) - \
                         offset

        with open(filepath, 'rb') as f:
            f.seek(offset)
            mp.upload_part_from_file(
                f, 
                i + 1, 
                size=eff_chunk_size, 
                cb=functools.partial(self.__multipart_cb, i), 
                num_cb=100)

    def __mp_show_progress(self):
        while 1:
            columns = [("%3d%% " % self.__progress[i]) 
                       for i 
                       in range(self.__chunks)]

            pline = ' '.join(columns)
            _logger.debug(pline)

            gevent.sleep(self.__monitor_interval_s)

    def __multipart_upload(self):
        bucket = self.__get_bucket(self.__bucket_name)

        mp = bucket.initiate_multipart_upload(self.__s3_key_name)
        mp_info = (mp.id, mp.key_name, mp.bucket_name)
        chunk_list = range(0, self.__filesize_b, self.__chunk_size_b)

        try:
            gen = ((mp_info, i, offset) 
                   for (i, offset) 
                   in enumerate(chunk_list))

            f = functools.partial(gevent.spawn, self.__transfer_part)

            if self.__monitor_interval_s > 0:
                p = gevent.spawn(self.__mp_show_progress)

            g_list = map(f, gen)

            gevent.joinall(g_list)

            if self.__monitor_interval_s > 0:
                p.kill()
                p.join()
        except:
            mp.cancel_upload()
            raise
        else:
            mp.complete_upload()

    def start(self):
        if self.__filesize_b < _MIN_MULTIPART_SIZE_B:
            self.__standard_upload()
        else:
            self.__multipart_upload()

The output when called as a command will look like this:

$ python s3_parallel.py (access key) (secret key) (bucket name) (file-path)
2014-06-17 10:16:48,458 - __main__ - DEBUG -   0%    0%    0%    0%    0%    0%    0% 
2014-06-17 10:16:58,459 - __main__ - DEBUG -   3%    3%    2%    2%    2%    1%    7% 
2014-06-17 10:17:08,460 - __main__ - DEBUG -   6%    5%    5%    4%    5%    4%   14% 
2014-06-17 10:17:18,461 - __main__ - DEBUG -  10%    7%    8%    8%    7%    6%   18% 
2014-06-17 10:17:28,461 - __main__ - DEBUG -  16%   10%   13%   11%   10%    8%   26% 
2014-06-17 10:17:38,462 - __main__ - DEBUG -  21%   14%   20%   15%   14%   12%   35% 
2014-06-17 10:17:48,462 - __main__ - DEBUG -  26%   17%   27%   19%   19%   15%   48% 
2014-06-17 10:17:58,463 - __main__ - DEBUG -  32%   20%   33%   24%   24%   18%   59% 
2014-06-17 10:18:08,463 - __main__ - DEBUG -  37%   24%   39%   29%   28%   22%   70% 
2014-06-17 10:18:18,464 - __main__ - DEBUG -  43%   28%   44%   34%   32%   26%   82% 
2014-06-17 10:18:28,464 - __main__ - DEBUG -  48%   31%   50%   39%   36%   31%   91% 
2014-06-17 10:18:38,465 - __main__ - DEBUG -  52%   35%   55%   44%   43%   36%  100% 
2014-06-17 10:18:48,465 - __main__ - DEBUG -  60%   39%   63%   47%   47%   40%  100% 
2014-06-17 10:18:58,466 - __main__ - DEBUG -  68%   44%   69%   53%   53%   45%  100% 
2014-06-17 10:19:08,466 - __main__ - DEBUG -  77%   49%   75%   58%   57%   49%  100% 
2014-06-17 10:19:18,467 - __main__ - DEBUG -  83%   54%   84%   65%   62%   52%  100% 
2014-06-17 10:19:28,467 - __main__ - DEBUG -  88%   58%   90%   71%   69%   58%  100% 
2014-06-17 10:19:38,468 - __main__ - DEBUG -  96%   61%   96%   77%   74%   63%  100% 
2014-06-17 10:19:48,468 - __main__ - DEBUG - 100%   67%  100%   83%   83%   70%  100% 
2014-06-17 10:19:58,469 - __main__ - DEBUG - 100%   73%  100%   93%   93%   76%  100% 
2014-06-17 10:20:08,469 - __main__ - DEBUG - 100%   83%  100%  100%  100%   86%  100% 
2014-06-17 10:20:18,470 - __main__ - DEBUG - 100%   95%  100%  100%  100%  100%  100%