# Doing Proper Encryption with Rijndael using [Pure] Python

When it comes to writing code that portably encrypts and decrypts without concern for the platform, you’ll have to overcome a couple of small obstacles with Python:

• You’ll have to install a cryptography library
• You’ll need code that doesn’t need a build environment (no C-code)

I personally prefer Rijndael (also called “AES” when you constrain yourself to certain key-sizes and/or block-sizes) as an affordable, ubiquitous algorithm. I had a remarkable amount of difficulty finding a pure-Python implementation that worked in both Python 2 and Python 3.

By mentioning “proper” encryption in the title, I was referring to “key derivation” (also called “key expansion”). It’s not enough to have a password and an algorithm. Unless you already have a sufficiently random key of exactly the right number of bits, you’ll need to submit the password to a key-expansion algorithm in order to generate a key of the right length. This is often done using the PBKDF2 algorithm.
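
If you can rely on a sufficiently recent standard library (Python 2.7.8+ or 3.4+), hashlib ships a PBKDF2 implementation; a minimal sketch (the passphrase, salt, and iteration count here are arbitrary examples):

```
import hashlib

# Derive a 32-byte (256-bit) key from a passphrase. In practice the salt
# should be random per user; the iteration count is tuned to your hardware.
key = hashlib.pbkdf2_hmac('sha256', b'my passphrase', b'some salt', 100000,
                          dklen=32)

assert len(key) == 32
```

The derivation is deterministic for a given passphrase, salt, and iteration count, which is exactly what lets both ends of a conversation arrive at the same key.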

It was difficult to find a Python 2 and 3 version of PBKDF2 as well.

I’ve posted the pprp package, which packages Python 2 and Python 3 versions of Rijndael and PBKDF2 and automatically chooses the right version during module-load. As I implement PKCS7 for padding, I also include a utility function to trim the padding.

The main documentation provides the following as an example:

```
import sys
import io

import pprp

def trans(text):
    return text.encode('ASCII') if sys.version_info[0] >= 3 else text

passphrase = trans('password')
salt = trans('salt')
block_size = 16
key_size = 32
data = "this is a test" * 100

key = pprp.pbkdf2(passphrase, salt, key_size)

def source_gen():
    for i in range(0, len(data), block_size):
        block = data[i:i + block_size]
        len_ = len(block)

        if len_ > 0:
            yield block.encode('ASCII')

        if len_ < block_size:
            break

# Pump the encrypter output into the decrypter.
encrypted_gen = pprp.rjindael_encrypt_gen(key, source_gen(), block_size)

# Run, and sink the output into an IO stream.
s = io.BytesIO()
for block in pprp.rjindael_decrypt_gen(key, encrypted_gen, block_size):
    s.write(block)

# Trim the PKCS7 padding off the end: the pad length is given by the last
# byte (pprp also includes a utility function for this).
decrypted = s.getvalue()
decrypted = decrypted[:-ord(decrypted[-1:])]

assert data == decrypted.decode('ASCII')
```

# Generating Signatures with 25519 Curves in Python

It’s not very often that you need to do encryption in pure Python. Obviously, you’d want to have this functionality off-loaded to C-routines whenever possible. However, in some situations, often when you require code portability or that there not be any C-compilation phase, a pure-Python implementation can be convenient.

The 25519 elliptic curve is a special curve defined by Bernstein, rather than NIST. It is fairly well-embraced as an alternative to the NIST-generated curves, which are assumed by some to be either influenced or compromised by the NSA. See Aris Adamantiadis’ notable 25519 patch contribution to OpenSSH.

There exists a pure-Python implementation of 25519 at this site, but it’s too expensive to use in a high-volume platform (because of the math involved). However, it’s perfect if it’s driven by infrequent events and a delay of a few seconds is fine. The benefit of an elliptic curve, if you don’t already know, is that the factors are usually considerably smaller for strength comparable to traditional, non-EC algorithms. Therefore, your trade-off for a couple more seconds of computation is a dramatically shorter signature.

To use it, grab the Python source, included here for posterity:

```
import hashlib

b = 256
q = 2**255 - 19
l = 2**252 + 27742317777372353535851937790883648493

def H(m):
  return hashlib.sha512(m).digest()

def expmod(b,e,m):
  if e == 0: return 1
  t = expmod(b,e/2,m)**2 % m
  if e & 1: t = (t*b) % m
  return t

def inv(x):
  return expmod(x,q-2,q)

d = -121665 * inv(121666)
I = expmod(2,(q-1)/4,q)

def xrecover(y):
  xx = (y*y-1) * inv(d*y*y+1)
  x = expmod(xx,(q+3)/8,q)
  if (x*x - xx) % q != 0: x = (x*I) % q
  if x % 2 != 0: x = q-x
  return x

By = 4 * inv(5)
Bx = xrecover(By)
B = [Bx % q,By % q]

def edwards(P,Q):
  x1 = P[0]
  y1 = P[1]
  x2 = Q[0]
  y2 = Q[1]
  x3 = (x1*y2+x2*y1) * inv(1+d*x1*x2*y1*y2)
  y3 = (y1*y2+x1*x2) * inv(1-d*x1*x2*y1*y2)
  return [x3 % q,y3 % q]

def scalarmult(P,e):
  if e == 0: return [0,1]
  Q = scalarmult(P,e/2)
  Q = edwards(Q,Q)
  if e & 1: Q = edwards(Q,P)
  return Q

def encodeint(y):
  bits = [(y >> i) & 1 for i in range(b)]
  return ''.join([chr(sum([bits[i * 8 + j] << j for j in range(8)])) for i in range(b/8)])

def encodepoint(P):
  x = P[0]
  y = P[1]
  bits = [(y >> i) & 1 for i in range(b - 1)] + [x & 1]
  return ''.join([chr(sum([bits[i * 8 + j] << j for j in range(8)])) for i in range(b/8)])

def bit(h,i):
  return (ord(h[i/8]) >> (i%8)) & 1

def publickey(sk):
  h = H(sk)
  a = 2**(b-2) + sum(2**i * bit(h,i) for i in range(3,b-2))
  A = scalarmult(B,a)
  return encodepoint(A)

def Hint(m):
  h = H(m)
  return sum(2**i * bit(h,i) for i in range(2*b))

def signature(m,sk,pk):
  h = H(sk)
  a = 2**(b-2) + sum(2**i * bit(h,i) for i in range(3,b-2))
  r = Hint(''.join([h[i] for i in range(b/8,b/4)]) + m)
  R = scalarmult(B,r)
  S = (r + Hint(encodepoint(R) + pk + m) * a) % l
  return encodepoint(R) + encodeint(S)

def isoncurve(P):
  x = P[0]
  y = P[1]
  return (-x*x + y*y - 1 - d*x*x*y*y) % q == 0

def decodeint(s):
  return sum(2**i * bit(s,i) for i in range(0,b))

def decodepoint(s):
  y = sum(2**i * bit(s,i) for i in range(0,b-1))
  x = xrecover(y)
  if x & 1 != bit(s,b-1): x = q-x
  P = [x,y]
  if not isoncurve(P): raise Exception("decoding point that is not on curve")
  return P

def checkvalid(s,m,pk):
  if len(s) != b/4: raise Exception("signature length is wrong")
  if len(pk) != b/8: raise Exception("public-key length is wrong")
  R = decodepoint(s[0:b/8])
  A = decodepoint(pk)
  S = decodeint(s[b/8:b/4])
  h = Hint(encodepoint(R) + pk + m)
  if scalarmult(B,S) != edwards(R,scalarmult(A,h)):
    raise Exception("signature does not pass verification")
```

These routines are not Python 3 compatible.

For reference, other implementations in other languages can be found at the Wikipedia page, under “External links”.
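
If you just want to sanity-check the curve constants under Python 3, here is a standalone sketch of mine that swaps the recursive expmod() for the built-in three-argument pow() and uses explicit integer division; it recovers the base point and confirms that it lies on the Edwards curve:

```
q = 2**255 - 19

def inv(x):
    # Modular inverse via Fermat's little theorem (q is prime).
    return pow(x, q - 2, q)

d = (-121665 * inv(121666)) % q
I = pow(2, (q - 1) // 4, q)

def xrecover(y):
    # Same logic as the listing above, with explicit integer division.
    xx = (y * y - 1) * inv(d * y * y + 1)
    x = pow(xx, (q + 3) // 8, q)
    if (x * x - xx) % q != 0:
        x = (x * I) % q
    if x % 2 != 0:
        x = q - x
    return x

By = (4 * inv(5)) % q
Bx = xrecover(By)

# The base point must satisfy -x^2 + y^2 = 1 + d*x^2*y^2 (mod q).
assert (-Bx * Bx + By * By - 1 - d * Bx * Bx * By * By) % q == 0
```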

To implement the Bernstein version above, you must first generate a signing (private) key. This comes from a random 32-byte string with some bits flipped:

```
import os

signing_key_bytes = [ord(os.urandom(1)) for i in range(32)]

signing_key_bytes[0] &= 248
signing_key_bytes[31] &= 127
signing_key_bytes[31] |= 64

signing_key = ''.join([chr(x) for x in signing_key_bytes])
```

To produce the signature, and then check it:

```
import ed25519

message = "some data"

public_key = ed25519.publickey(signing_key)

signature = ed25519.signature(message, signing_key, public_key)
ed25519.checkvalid(signature, message, public_key)
```

The signing and the ensuing check take about six seconds, together.

The ed25519 module is the one whose source is displayed above. The “Ed” in “Ed25519” (the official name of the algorithm) refers to the Edwards-curve form of Bernstein’s 25519 curve, which is the mathematical foundation of the signature scheme.

# Python Function Annotations

Python 3.x introduced a relatively unknown feature called “function annotations”. These provide a way to tag parameters and the return value with arbitrary information at the function-definition level.

You can annotate using strings or any other type that you’d like:

```
>>> def some_function(parm1: "Example parameter"):
...   pass
...
>>> some_function.__annotations__
{'parm1': 'Example parameter'}
>>> x = 5
>>> def some_function_2(parm1: x * 20):
...   pass
...
>>> some_function_2.__annotations__
{'parm1': 100}
```

You can also annotate the return:

```
>>> def some_function_3() -> 'return-value tag':
...   pass
...
>>> some_function_3.__annotations__
{'return': 'return-value tag'}
```

It’s important to note that there are already strong conventions in how to document your parameters, thanks to Sphinx. Therefore, the utility of annotations will most likely be entirely in terms of functionality. For example, you can annotate closures on-the-fly:

```
import random

c_list = []
for i in range(10):
    def closure_() -> random.random():
        pass

    c_list.append(closure_)

list(map(lambda x: print(x.__annotations__), c_list))
```

This is the output:

```
{'return': 0.9644971188983055}
{'return': 0.8639746158842893}
{'return': 0.18610468531065305}
{'return': 0.8528801446167985}
{'return': 0.3022338513329076}
{'return': 0.6455491244718428}
{'return': 0.09106740460937834}
{'return': 0.16987808849543917}
{'return': 0.9136478506241527}
{'return': 0.41691681086623544}
```

Absolutely nothing in the Python language or library is dependent on annotations, so they’re yours to play with or implement as you see fit.

# Creating Those Neat OpenSSH RSA Public Keys in OpenSSL (along with DSA and ECDSA Keys)

Not too long ago, I wanted to know how to generate the public-keys that OpenSSH generates, as a transitional step to understanding how to decode them. In this fashion, I might apply the knowledge, some day, to a dynamically configurable SSH platform to allow a client to connect via SSH and to authenticate against a database, such as with Github and Bitbucket.

What I did not learn while painfully trawling through and reverse-engineering all of the key-generation code for ssh-keygen was that this representation is described by PKCS8. Yes, more than a little time was unnecessarily wasted. In fact, you can generate PKCS8-formatted public keys with OpenSSL.

Needless to say, I never finished reverse-engineering the code. This wasn’t so much because it became a monster, but rather because the code was unfathomably ugly. It would’ve taken an enormous amount of time to hack something together, and the same amount again to refactor it into something elegant. It required too much of a commitment for something without an immediate need. It would’ve been quicker (and funnier) to write a FUSE wrapper to read the database, and just point OpenSSH at the mountpoint.

Generating an RSA public-key and converting it to PKCS8:

```
\$ openssl genrsa -out mykey.pem 2048
Generating RSA private key, 2048 bit long modulus
.................+++
.........................................................................+++
e is 65537 (0x10001)

\$ openssl rsa -in mykey.pem -pubout | ssh-keygen -f /dev/stdin -i -m PKCS8
writing RSA key
```

The parameters to ssh-keygen simply indicate that the data is coming from STDIN, we’re doing an import, and the key-format is PKCS8.

Also, with DSA:

```
\$ openssl dsaparam -out dsaparam.pem 2048
Generating DSA parameters, 2048 bit long prime
This could take some time
..............+.........+.................+...........+++++++++++++++++++++++++++++++++++++++++++++++++++*
.......+....+...........+......+...+................+.....+.....................+...+...................................+..........+....+..........+.........+..+.+...........+.....+..+......+.......+.....+........+.............+.....+.....+...+..+..............+++++++++++++++++++++++++++++++++++++++++++++++++++*

\$ openssl gendsa dsaparam.pem -out dsa_priv.pem
Generating DSA key, 2048 bits

\$ openssl dsa -in dsa_priv.pem -pubout | ssh-keygen -f /dev/stdin -i -m PKCS8
writing DSA key
```

I included DSA because generating a key-pair with DSA for general PKE is a bit more obscure. I don’t know why. It is still possible, however.

Lastly, ECDSA (elliptic-curve DSA), for completeness. This is really the one that you want to be using, if given a choice. See how cute/short the public keys are for an algorithm with considerably more strength? A 256-bit curve is equivalent to a traditional 3072-bit RSA key.

```
\$ openssl ecparam -genkey -name prime256v1 -noout -out prime256v1.key.pem
\$ openssl ec -in prime256v1.key.pem -pubout | ssh-keygen -f /dev/stdin -i -m PKCS8
writing EC key
ecdsa-sha2-nistp256 AAAAE2VjZHNhLXNoYTItbmlzdHAyNTYAAAAIbmlzdHAyNTYAAABBBC3FrhznL2pQ/titzgnWrbznR3ve2eNEgevog/aS7SszS9Vkq0uFefavBF4M2Txc34sIQ5TPiZxYdm9uO1siXSw=
```
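
Incidentally, the base64 blob in the key above is just a sequence of length-prefixed fields: a four-byte big-endian length followed by that many bytes, per the SSH “string” encoding of RFC 4251. A small sketch to split them (the helper name is mine):

```
import base64
import struct

def unpack_openssh_fields(b64):
    # Split an OpenSSH public-key blob into its length-prefixed fields.
    blob = base64.b64decode(b64)
    fields = []
    while blob:
        (length,) = struct.unpack('>I', blob[:4])
        fields.append(blob[4:4 + length])
        blob = blob[4 + length:]
    return fields

key_b64 = ('AAAAE2VjZHNhLXNoYTItbmlzdHAyNTYAAAAIbmlzdHAyNTYAAABBBC3FrhznL2pQ'
           '/titzgnWrbznR3ve2eNEgevog/aS7SszS9Vkq0uFefavBF4M2Txc34sIQ5TPiZxY'
           'dm9uO1siXSw=')

fields = unpack_openssh_fields(key_b64)
assert fields[0] == b'ecdsa-sha2-nistp256'
assert fields[1] == b'nistp256'
assert len(fields[2]) == 65    # an uncompressed P-256 point
```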

# Make Subversion Tags in Record Time

I, like most developers nowadays, despise Subversion. This is in no small part due to how much of a manual effort it is to properly tag your software. I still do it, but it makes me seethe with rage.

The svncl tool streamlines this. You simply give it the working-directory and the URL of the tags path within the repository, and it gives you a bulleted list of commit-messages that have occurred between the last tag and the current HEAD.

For example:

```
\$ svncl . https://svnserver.com/tags/project
- Setup fix.
- Removed obsolete references to collections package.
- Updates are now stored in S3.
```

# Doing Fast Parallel Downloads from S3 Using Greenlets

Using the “Range” header and greenlets, you can do very fast downloads from S3. The speed improvement is considerably higher than with uploads. With a 130M file, it was more than three times faster to parallelize the download.

I’ve put the code in RandomUtility.
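
The core of the ranged approach is just carving the object’s size into “bytes=start-end” windows, one per greenlet. A minimal sketch of that splitting logic (not the RandomUtility code itself):

```
def byte_ranges(total_size, chunk_size):
    # Yield inclusive (start, end) offsets suitable for an HTTP Range
    # header of the form "bytes=start-end".
    for start in range(0, total_size, chunk_size):
        yield (start, min(start + chunk_size, total_size) - 1)

# A 10-byte object split into 4-byte ranges:
print(list(byte_ranges(10, 4)))    # [(0, 3), (4, 7), (8, 9)]
```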

Usage:

```
\$ python s3_parallel_download.py (access key) (secret key) (bucket name) (key name)
...
2014-06-23 11:45:06,896 - __main__ - DEBUG -  19%   6%  13%   6%   6%  13%  27%
2014-06-23 11:45:16,896 - __main__ - DEBUG -  52%  26%  26%  26%  39%  26%  68%
2014-06-23 11:45:26,897 - __main__ - DEBUG -  85%  32%  52%  39%  52%  45% 100%
2014-06-23 11:45:36,897 - __main__ - DEBUG - 100%  78%  78%  59%  65%  65% 100%
2014-06-23 11:45:46,897 - __main__ - DEBUG - 100% 100% 100%  78%  91%  91% 100%
```

# Doing Fast Multipart Uploads to S3 Using Greenlets

S3 allows you to upload pieces of large files in parallel. Unfortunately, most/all of the examples that I’ve seen online are inefficient or inconvenient. For example:

• Physical file splits of the original file: If you couldn’t guess that S3 would have a way to work off a single copy of the source file, then you probably shouldn’t be using this functionality.
• Function-based designs (as opposed to class-based): I’ve never been a fan of this in Python. Too much context info has to be curried.
• Using multiprocessing: For every upload, you’ll have a number of processes, and all will still be in competition for the network device.

None of these strategies hold a candle to Greenlets (running off different file-pointers to the same physical copy of the file).

This example is located at RandomUtility: s3_parallel.

This is the principal class. Go to the original source for the imports and the couple of module-level constants.

```
class ParallelUpload(object):
    def __init__(self, ak, sk, bucket_name, filepath,
                 chunk_size_b=_DEFAULT_CHUNK_SIZE_B,
                 monitor_interval_s=_DEFAULT_MONITOR_INTERVAL_S):
        self.__ak = ak
        self.__sk = sk
        self.__bucket_name = bucket_name
        self.__filepath = filepath
        self.__s3_key_name = os.path.basename(filepath)
        self.__chunk_size_b = chunk_size_b
        self.__coverage = 0.0
        self.__monitor_interval_s = monitor_interval_s

        self.__filesize_b = os.path.getsize(self.__filepath)
        self.__chunks = int(math.ceil(float(self.__filesize_b) /
                                      float(self.__chunk_size_b)))

        self.__progress = [0.0] * self.__chunks

    def __get_bucket(self, bucket_name):
        conn = boto.s3.connection.S3Connection(self.__ak, self.__sk)
        return conn.lookup(bucket_name)

    def __standard_upload(self):
        bucket = self.__get_bucket(self.__bucket_name)
        new_s3_item = bucket.new_key(self.__s3_key_name)
        new_s3_item.set_contents_from_filename(
            self.__filepath,
            cb=self.__standard_cb,
            num_cb=20)

    def __standard_cb(self, current, total):
        _logger.debug("Status: %.2f%%", float(current) / float(total) * 100.0)

    def __multipart_cb(self, i, current, total):
        self.__progress[i] = float(current) / float(total) * 100.0

    def __transfer_part(self, (mp_info, i, offset)):
        (mp_id, mp_key_name, mp_bucket_name) = mp_info

        bucket = self.__get_bucket(mp_bucket_name)

        mp = boto.s3.multipart.MultiPartUpload(bucket)
        mp.key_name = mp_key_name
        mp.id = mp_id

        # At any given time, this will describe the farthest percentage into
        # the file that we're actively working on.
        self.__coverage = max(
            (float(offset) / float(self.__filesize_b) * 100.0),
            self.__coverage)

        # The last chunk might be shorter than the rest.
        eff_chunk_size = min(offset + self.__chunk_size_b,
                             self.__filesize_b) - \
                         offset

        with open(self.__filepath, 'rb') as f:
            f.seek(offset)
            mp.upload_part_from_file(
                f,
                i + 1,
                size=eff_chunk_size,
                cb=functools.partial(self.__multipart_cb, i),
                num_cb=100)

    def __mp_show_progress(self):
        while 1:
            columns = [("%3d%% " % self.__progress[i])
                       for i
                       in range(self.__chunks)]

            pline = ' '.join(columns)
            _logger.debug(pline)

            gevent.sleep(self.__monitor_interval_s)

    def __multipart_upload(self):
        bucket = self.__get_bucket(self.__bucket_name)

        mp = bucket.initiate_multipart_upload(self.__s3_key_name)
        mp_info = (mp.id, mp.key_name, mp.bucket_name)
        chunk_list = range(0, self.__filesize_b, self.__chunk_size_b)

        try:
            gen = ((mp_info, i, offset)
                   for (i, offset)
                   in enumerate(chunk_list))

            f = functools.partial(gevent.spawn, self.__transfer_part)

            if self.__monitor_interval_s > 0:
                p = gevent.spawn(self.__mp_show_progress)

            g_list = map(f, gen)

            gevent.joinall(g_list)

            if self.__monitor_interval_s > 0:
                p.kill()
                p.join()
        except:
            mp.cancel_upload()
            raise
        else:
            mp.complete_upload()

    def start(self):
        if self.__filesize_b < _MIN_MULTIPART_SIZE_B:
            self.__standard_upload()
        else:
            self.__multipart_upload()
```

The output when called as a command will look like this:

```
\$ python s3_parallel.py (access key) (secret key) (bucket name) (file-path)
2014-06-17 10:16:48,458 - __main__ - DEBUG -   0%    0%    0%    0%    0%    0%    0%
2014-06-17 10:16:58,459 - __main__ - DEBUG -   3%    3%    2%    2%    2%    1%    7%
2014-06-17 10:17:08,460 - __main__ - DEBUG -   6%    5%    5%    4%    5%    4%   14%
2014-06-17 10:17:18,461 - __main__ - DEBUG -  10%    7%    8%    8%    7%    6%   18%
2014-06-17 10:17:28,461 - __main__ - DEBUG -  16%   10%   13%   11%   10%    8%   26%
2014-06-17 10:17:38,462 - __main__ - DEBUG -  21%   14%   20%   15%   14%   12%   35%
2014-06-17 10:17:48,462 - __main__ - DEBUG -  26%   17%   27%   19%   19%   15%   48%
2014-06-17 10:17:58,463 - __main__ - DEBUG -  32%   20%   33%   24%   24%   18%   59%
2014-06-17 10:18:08,463 - __main__ - DEBUG -  37%   24%   39%   29%   28%   22%   70%
2014-06-17 10:18:18,464 - __main__ - DEBUG -  43%   28%   44%   34%   32%   26%   82%
2014-06-17 10:18:28,464 - __main__ - DEBUG -  48%   31%   50%   39%   36%   31%   91%
2014-06-17 10:18:38,465 - __main__ - DEBUG -  52%   35%   55%   44%   43%   36%  100%
2014-06-17 10:18:48,465 - __main__ - DEBUG -  60%   39%   63%   47%   47%   40%  100%
2014-06-17 10:18:58,466 - __main__ - DEBUG -  68%   44%   69%   53%   53%   45%  100%
2014-06-17 10:19:08,466 - __main__ - DEBUG -  77%   49%   75%   58%   57%   49%  100%
2014-06-17 10:19:18,467 - __main__ - DEBUG -  83%   54%   84%   65%   62%   52%  100%
2014-06-17 10:19:28,467 - __main__ - DEBUG -  88%   58%   90%   71%   69%   58%  100%
2014-06-17 10:19:38,468 - __main__ - DEBUG -  96%   61%   96%   77%   74%   63%  100%
2014-06-17 10:19:48,468 - __main__ - DEBUG - 100%   67%  100%   83%   83%   70%  100%
2014-06-17 10:19:58,469 - __main__ - DEBUG - 100%   73%  100%   93%   93%   76%  100%
2014-06-17 10:20:08,469 - __main__ - DEBUG - 100%   83%  100%  100%  100%   86%  100%
2014-06-17 10:20:18,470 - __main__ - DEBUG - 100%   95%  100%  100%  100%  100%  100%
```

# Using a REST-Based Pipe to Keep Systems Connected

You’ll eventually need a bidirectional pipe/bridge between environments/subnets, and an infrastructure-level pipe/VPN connection would be overkill. You might consider using SSH multiplexing, which allows you to:

• Track the state of a named SSH connection.
• Reuse the same connection for subsequent calls into the same server.
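
For reference, multiplexing is typically configured with the ControlMaster family of options; a sketch (the host and socket path are placeholders):

```
Host gateway.example.com
    ControlMaster auto
    ControlPath ~/.ssh/cm-%r@%h:%p
    ControlPersist 10m
```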

However, multiplexing has two fairly large disadvantages:

• In order to get bidirectional communication, you’ll have to start stacking forward- and reverse-tunnels on top of the connection, and this gets complicated.
• If you need to access the pipe from an application, then there’s a degree of risk in depending on an elaborately-configured console utility in order for your application to work correctly. There is no API.

To a lesser degree, you might also have to adhere to certain security restrictions. For example, you might only be allowed to connect in one direction, to one port.

Instead of writing your own socket server, forming your own socket protocol, writing your own heartbeat mechanism, and writing adapters for your applications on both the client and server systems, you might consider RestPipe.

### RestPipe

RestPipe is a solution that aggressively maintains a bidirectional connection from one or more client machines to a single server. If the client needs to talk to the server, the client talks to a local webserver that translates the request to a message over an SSL-authenticated socket (written using coroutines/greenlets and Protocol Buffers). The server passes the request to your event-handler, and the response is forwarded back as the response to the original web-request. The same process also works in reverse if the server wants to talk to the client; in that case, the client’s hostname is provided as part of the URL.

### Setup

The documentation is fairly complete. To get it going quickly on a development system:

1. Use CaKit to generate a CA identity, server identity, and client identity.
2. Install the restpipe package using PyPI.
3. Start the server.
4. Start the client.
5. Use cURL to make a request to either the server (which will query the client), or the client (which will query the server).

### Example Queries (Available by Default)

```
\$ curl http://rpclient.local/server/time && echo
{"time_from_server": 1402897823.882672}

\$ curl http://rpserver.local/client/localhost/time && echo
{"time_from_client": 1402897843.879908}

\$ curl http://rpclient.local/server/cat//hello%20/world && echo
{"result_from_server": "hello world"}

\$ curl http://rpserver.local/client/localhost/cat//hello%20/world && echo
{"result_from_client": "hello world"}
```

# Using Route53 for Cheap and Easy Dynamic DNS

The awsdd Python package allows you to use a free third-party public-IP lookup and a cheap Route53 zone to always keep your domain-name pointing to an accurate IP.

```
\$ sudo pip install awsdd
\$ awsdd -d <domain name> -z <zone ID> -a <access key> -s <secret key>
2014-06-15 12:09:30,836 - add.aws - INFO - IP update complete.
```