The Numb-Nuts Tutorial to the Celery Distributed Task Queue (using Python)

Celery is a distributed task queue that is very easy to pick up. I’ll do two quick examples: one that sends a job and returns, and another that sends a job and then retrieves a result. I’m going to use SQLite for this example (which is interfaced via SQLAlchemy). Since Celery seems to have some issues importing SQLite under Python 3, we’ll use Python 2.7. Make sure that you install the “celery” and “sqlalchemy” Python packages.

Without Results

Define the tasks module and save it as sqlite_queue_without_results.py:

import celery

_BACKEND_URI = 'sqla+sqlite:///tasks.sqlite'
_APP = celery.Celery('sqlite_queue_without_results', broker=_BACKEND_URI)

@_APP.task
def some_task(incoming_message):
    return "ECHO: {0}".format(incoming_message)

Start the server:

$ celery -A sqlite_queue_without_results worker --loglevel=info

Execute the following Python to submit a job:

import sqlite_queue_without_results

_ARG1 = "Test message"
sqlite_queue_without_results.some_task.delay(_ARG1)

That’s it. You’ll see something similar to the following in the server window:

[Screenshot: Celery worker output (without result)]

With Results

This time, when we define the tasks module, we’ll provide Celery a results backend. Call this module sqlite_queue_with_results.py:

import celery

_RESULT_URI = 'db+sqlite:///results.sqlite'
_BACKEND_URI = 'sqla+sqlite:///tasks.sqlite'

_APP = celery.Celery('sqlite_queue_with_results', broker=_BACKEND_URI, backend=_RESULT_URI)

@_APP.task
def some_task(incoming_message):
    return "ECHO: {0}".format(incoming_message)

Start the server:

$ celery -A sqlite_queue_with_results worker --loglevel=info

Execute the following Python to submit a job:

import sqlite_queue_with_results

_ARG1 = "Test message"
r = sqlite_queue_with_results.some_task.delay(_ARG1)
value = r.get(timeout=2)

print("Result: [{0}]".format(value))

Since we’re using a traditional DBMS (albeit a fast, local one) to store our results, we’ll be internally polling for a state change and then fetching the result. Therefore, it is a more costly operation and I’ve used a two-second timeout to accommodate this.
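To make the cost of that polling concrete, here is a minimal, standard-library-only sketch of the pattern. It is not Celery's implementation: the `results` table, its columns, and the helper names (`fake_worker`, `poll_for_result`, `demo`) are all hypothetical, and a background thread stands in for the worker.

```python
import os
import sqlite3
import tempfile
import threading
import time

# Hypothetical schema for illustration; Celery's real result tables differ.
_SCHEMA = "CREATE TABLE results (task_id TEXT PRIMARY KEY, status TEXT, result TEXT)"


def fake_worker(db_path, task_id, message, delay_s=0.3):
    """Stand-in for a worker: wait a moment, then store the result."""
    time.sleep(delay_s)
    conn = sqlite3.connect(db_path)
    with conn:  # Commits the insert on success.
        conn.execute(
            "INSERT INTO results VALUES (?, ?, ?)",
            (task_id, 'SUCCESS', "ECHO: {0}".format(message)))
    conn.close()


def poll_for_result(db_path, task_id, timeout_s=2.0, interval_s=0.1):
    """Re-query the table until the task succeeds or the timeout elapses."""
    deadline = time.time() + timeout_s
    conn = sqlite3.connect(db_path)
    try:
        while time.time() < deadline:
            row = conn.execute(
                "SELECT status, result FROM results WHERE task_id = ?",
                (task_id,)).fetchone()
            if row is not None and row[0] == 'SUCCESS':
                return row[1]
            time.sleep(interval_s)
    finally:
        conn.close()
    raise RuntimeError("Timed out waiting for task: {0}".format(task_id))


def demo():
    """Create the table, start the fake worker, and poll for its result."""
    db_path = os.path.join(tempfile.mkdtemp(), 'results.sqlite')
    conn = sqlite3.connect(db_path)
    with conn:
        conn.execute(_SCHEMA)
    conn.close()

    t = threading.Thread(
        target=fake_worker, args=(db_path, 'task-1', "Test message"))
    t.start()
    value = poll_for_result(db_path, 'task-1')
    t.join()
    return value


print("Result: [{0}]".format(demo()))
```

Every pass through the loop is another query against the database, which is why a short timeout can expire before a slow task's row ever shows up.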

The server output will be similar to the following:

[Screenshot: Celery worker output (with result)]

The client output will look like:

Result: [ECHO: Test message]

Celery has many more features not explored by this tutorial, including:

  • exception propagation
  • custom task states (including providing metadata that can be read by the client)
  • task ignore/reject/retry responses
  • HTTP-based tasks (for calling your tasks in another place or language)
  • task routing
  • periodic/scheduled tasks
  • workflows
  • drawing visible graphs in order to inspect behavior

For more information, see the user guide.

Create a Video From Your Processing Sketch (Using the IDE)

A quick example to show how to create a video from Processing 2.0. Here, I’m using the Python mode. We write each frame out to a file and then use the built-in Movie Maker to create a QuickTime video (you can also attach audio if you desire).

Example code:

def setup():
    size(500, 500)
    fill(0)

def draw():
    background(255, 255, 255)

    if mousePressed:
        ellipse(mouseX, mouseY, 80, 80)

    saveFrame("frames/frame-#####.png")

For those who aren’t familiar, this just configures the canvas and then clears it on every redraw. If you’re pressing the mouse button, it’ll draw a circle wherever the cursor is. After each redraw, it’ll capture one PNG image. It’ll implicitly create the “frames” folder if it doesn’t already exist.
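As a rough illustration of how the saveFrame() filename pattern behaves, the run of # characters is replaced with the zero-padded frame number. The helper below (`expand_frame_pattern`, a hypothetical name) mimics that in plain Python; it is a sketch of the naming scheme, not Processing's own code.

```python
import re


def expand_frame_pattern(pattern, frame_number):
    """Replace the first run of '#' with the zero-padded frame number."""
    def _pad(match):
        width = len(match.group(0))
        return str(frame_number).zfill(width)

    return re.sub(r'#+', _pad, pattern, count=1)


for n in (1, 2, 150):
    print(expand_frame_pattern("frames/frame-#####.png", n))
```

Zero-padded names sort in frame order, which is convenient when a whole folder of frames is later handed to another tool.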

Now, we open Movie Maker:

[Screenshot: Open Movie Maker]

Click on the top “Choose…” button to select your “frames” directory (or whatever you called it):

[Screenshot: Dialog]

Click “Create Movie…”, select your video-file name/path, and watch it go:

[Screenshot: Make Movie]

The final result (in my case):

[Video: Final Result]

For a library-based approach, look into GSVideo.

Drawing to a Video Using OpenCV and Python

I ran into a considerable amount of difficulty writing a video-file using OpenCV (under Python). Almost every video-writing example on the Internet is only concerned with capturing from a webcam, and, even for the relevant examples, I kept getting an empty/insubstantial file.

In order to write a video-file, you need to declare the FOURCC code that you require. I prefer H.264, so I [unsuccessfully] gave it “H264”. I also heard somewhere that since H.264 is actually the standard, I needed to use “X264” to refer to the codec. This didn’t work either. I also tried “XVID” and “DIVX”. I eventually resorted to trying to pass (-1), as this will allegedly prompt you to make a choice (thereby showing you what options are available). Naturally, no prompt was given and yet it still seemed to execute to the end. There doesn’t appear to be a way to show the available codecs. I was out of options.

It turns out that you still have one or more raw-format codecs available. For example, “8BPS” and “IYUV” are available. MJPEG (“MJPG”) ended up working, too. This is the best option of the three, since it gives us compression.
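For context, a FOURCC is just four ASCII characters packed into a 32-bit integer, lowest byte first. The sketch below reproduces that packing with the standard library only (the `fourcc` and `fourcc_to_text` helper names are mine, not OpenCV's), which can be handy for checking which integer a given code corresponds to.

```python
import struct


def fourcc(code):
    """Pack a four-character code into a little-endian 32-bit integer."""
    if len(code) != 4:
        raise ValueError("A FOURCC must be exactly four characters.")
    return struct.unpack('<I', code.encode('ascii'))[0]


def fourcc_to_text(value):
    """Inverse of fourcc(): recover the four characters from the integer."""
    return struct.pack('<I', value).decode('ascii')


for name in ('MJPG', 'XVID', 'IYUV'):
    print("{0} -> 0x{1:08X}".format(name, fourcc(name)))
```

Note that the codes are case-sensitive: 'iYUV' and 'IYUV' pack to different integers.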

It’s important to note that the nicer codecs might not have been available simply due to missing dependencies. At one point, I reinstalled OpenCV (using Brew) with the “--with-ffmpeg” option. This seemed to pull down XVID and other codecs. However, I still had the same problems. Note that, since this was installed at the time that I tested “MJPG”, the latter may actually require the former.

Code, using MJPEG:

import cv2
import cv
import numpy as np

_CANVAS_WIDTH = 500
_CANVAS_HEIGHT = 500
_COLOR_DEPTH = 3
_CIRCLE_RADIUS = 40
_STROKE_THICKNESS = -1
_VIDEO_FPS = 1

def _make_image(x, y, b, g, r):
    # NumPy image arrays are indexed (rows, columns), i.e. (height, width).
    img = np.zeros((_CANVAS_HEIGHT, _CANVAS_WIDTH, _COLOR_DEPTH), np.uint8)
    position = (x, y)
    color = (b, g, r)
    cv2.circle(img, position, _CIRCLE_RADIUS, color, _STROKE_THICKNESS)

    return img

def _make_video(filepath):
    # Works without FFMPEG.
    #fourcc = cv.FOURCC('8', 'B', 'P', 'S')

    # Works, but we don't have a viewer for it.
    #fourcc = cv.CV_FOURCC('i','Y','U', 'V')

    # Works (but might require FFMPEG).
    fourcc = cv.CV_FOURCC('M', 'J', 'P', 'G')

    # Prompt. This never works, though (the prompt never shows).
    #fourcc = -1

    w = cv2.VideoWriter(
            filepath,
            fourcc,
            _VIDEO_FPS,
            (_CANVAS_WIDTH, _CANVAS_HEIGHT))

    img = _make_image(100, 100, 0, 0, 255)
    w.write(img)

    img = _make_image(200, 200, 0, 255, 0)
    w.write(img)

    img = _make_image(300, 300, 255, 0, 0)
    w.write(img)

    w.release()

if __name__ == '__main__':
    _make_video('video.avi')

Processing Text for Sentiment and Other Good Stuff

textblob integrates nltk and pattern. It allows you to easily extract and derive information from a passage of text.

To install:

$ sudo pip install textblob

Based on the example here:

import textblob
import pprint

text = '''
The titular threat of The Blob has always struck me as the ultimate movie monster: an insatiably hungry, amoeba-like mass able to penetrate virtually any safeguard, capable of--as a doomed doctor chillingly describes it--"assimilating flesh on contact. Snide comparisons to gelatin be darned, it's a concept with the most devastating of potential consequences, not unlike the grey goo scenario proposed by technological theorists fearful of artificial intelligence run rampant.
'''

blob = textblob.TextBlob(text)

# Get parts of speech.
blob.tags

# Get list of individual noun-phrases.
blob.noun_phrases

# Print sentence and sentiment polarity:
for sentence in blob.sentences:
    print(sentence)
    print('')
    print(sentence.sentiment.polarity)
    print('')
    print('--')
    print('')

# Convert to Spanish.
blob.translate(to="es")

Output:

>>> import textblob
>>> import pprint
>>> 
>>> text = '''
... The titular threat of The Blob has always struck me as the ultimate movie monster: an insatiably hungry, amoeba-like mass able to penetrate virtually any safeguard, capable of--as a doomed doctor chillingly describes it--"assimilating flesh on contact. Snide comparisons to gelatin be darned, it's a concept with the most devastating of potential consequences, not unlike the grey goo scenario proposed by technological theorists fearful of artificial intelligence run rampant.
... '''
>>> 
>>> blob = textblob.TextBlob(text)
>>>
>>> # Get parts of speech.
>>> blob.tags
[(u'The', u'DT'), (u'titular', u'JJ'), (u'threat', u'NN'), (u'of', u'IN'), (u'The', u'DT'), (u'Blob', u'NNP'), (u'has', u'VBZ'), (u'always', u'RB'), (u'struck', u'VBD'), (u'me', u'PRP'), (u'as', u'IN'), (u'the', u'DT'), (u'ultimate', u'JJ'), (u'movie', u'NN'), (u'monster', u'NN'), (u'an', u'DT'), (u'insatiably', u'RB'), (u'hungry', u'JJ'), (u'amoeba-like', u'JJ'), (u'mass', u'NN'), (u'able', u'JJ'), (u'to', u'TO'), (u'penetrate', u'VB'), (u'virtually', u'RB'), (u'any', u'DT'), (u'safeguard', u'VB'), (u'capable', u'JJ'), (u'of--as', u'JJ'), (u'a', u'DT'), (u'doomed', u'VBN'), (u'doctor', u'NN'), (u'chillingly', u'RB'), (u'describes', u'VBZ'), (u'it', u'PRP'), (u'assimilating', u'VBG'), (u'flesh', u'NN'), (u'on', u'IN'), (u'contact', u'NN'), (u'Snide', u'NNP'), (u'comparisons', u'NNS'), (u'to', u'TO'), (u'gelatin', u'NN'), (u'be', u'VB'), (u'darned', u'JJ'), (u'it', u'PRP'), (u"'", u'POS'), (u's', u'PRP'), (u'a', u'DT'), (u'concept', u'NN'), (u'with', u'IN'), (u'the', u'DT'), (u'most', u'RBS'), (u'devastating', u'JJ'), (u'of', u'IN'), (u'potential', u'JJ'), (u'consequences', u'NNS'), (u'not', u'RB'), (u'unlike', u'IN'), (u'the', u'DT'), (u'grey', u'JJ'), (u'goo', u'NN'), (u'scenario', u'NN'), (u'proposed', u'VBN'), (u'by', u'IN'), (u'technological', u'JJ'), (u'theorists', u'NNS'), (u'fearful', u'JJ'), (u'of', u'IN'), (u'artificial', u'JJ'), (u'intelligence', u'NN'), (u'run', u'VB'), (u'rampant', u'JJ')]
>>>
>>> # Get list of individual noun-phrases.
>>> blob.noun_phrases
WordList([u'titular threat', 'blob', u'ultimate movie monster', u'amoeba-like mass', 'snide', u'potential consequences', u'grey goo scenario', u'technological theorists fearful', u'artificial intelligence run rampant'])
>>>
>>> # Print sentence and sentiment polarity:
>>> for sentence in blob.sentences:
...     print(sentence)
...     print('')
...     print(sentence.sentiment.polarity)
...     print('')
...     print('--')
...     print('')
... 

The titular threat of The Blob has always struck me as the ultimate movie monster: an insatiably hungry, amoeba-like mass able to penetrate virtually any safeguard, capable of--as a doomed doctor chillingly describes it--"assimilating flesh on contact.

0.06

--

Snide comparisons to gelatin be darned, it's a concept with the most devastating of potential consequences, not unlike the grey goo scenario proposed by technological theorists fearful of artificial intelligence run rampant.


-0.341666666667

--

>>>
>>> # Convert to Spanish.
>>> blob.translate(to="es")
TextBlob("La amenaza titular de The Blob siempre me ha parecido como el último monstruo de la película: una, la masa insaciablemente hambriento ameba capaz de penetrar prácticamente cualquier salvaguardia, capaz de - como médico condenado escalofriantemente lo describe - "asimilar carne en contacto. comparaciones Snide a la gelatina ser condenados, es un concepto con el más devastador de las posibles consecuencias, no muy diferente del escenario plaga gris propuesta por los teóricos tecnológicos temerosos de la inteligencia artificial ejecutar rampante.")

Awesome, right?

Build an R-Tree in Python for Fun and Profit

There might come a time when you will prefer to stylishly load spatial data into a memory-structure rather than clumsily integrating a database just to quickly answer a question over a finite amount of data. You can use an R-tree by way of the rtree Python package that wraps the libspatialindex native library.

It’s both Python 2 and 3 compatible.

Building libspatialindex:

  1. Download it (using either GitHub or an archive).
  2. Configure, build, and install it (the shared library won’t be created unless you do the install):
$ ./configure
$ make
$ sudo make install
  3. Install the Python package:
$ sudo pip install rtree
  4. Run the example code, which is based on their example code:
import rtree.index

idx2 = rtree.index.Rtree()

locs = [
    (14, 10, 14, 10),
    (16, 10, 16, 10),
]

for i, (minx, miny, maxx, maxy) in enumerate(locs):
    idx2.add(i, (minx, miny, maxx, maxy), obj={'a': 42})

for num_results in (1, 2):
    print("Nearest ({0}) result(s):".format(num_results))
    print('')

    # The second argument to nearest() is the number of results to
    # return, not a distance.
    r = [
        (i.id, i.object)
        for i
        in idx2.nearest((13, 10, 13, 10), num_results, objects=True)
    ]

    print(r)
    print('')

Output:

Nearest (1) result(s):

[(0, {'a': 42})]

Nearest (2) result(s):

[(0, {'a': 42}), (1, {'a': 42})]

NOTE: You need to represent your points as bounding boxes, which are the basic structure of an R-tree (rectangles nested inside of rectangles inside of rectangles). A point is simply a box whose minimum and maximum corners are the same.

In this case, we assign arbitrary objects that are associated with each bounding box. When we do a search, we get the objects back, too.
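To see what the index is answering, here is a brute-force sketch in plain Python (no rtree required) that stores the same points as degenerate bounding boxes and returns the closest entries along with their attached objects. The helper names (`point_to_bbox`, `bbox_distance`, `nearest`) are hypothetical; a real R-tree avoids scanning every entry by descending through nested boxes.

```python
import math


def point_to_bbox(x, y):
    """A point is a bounding box whose min and max corners coincide."""
    return (x, y, x, y)


def bbox_distance(a, b):
    """Smallest Euclidean distance between two (minx, miny, maxx, maxy) boxes."""
    dx = max(a[0] - b[2], b[0] - a[2], 0)
    dy = max(a[1] - b[3], b[1] - a[3], 0)
    return math.hypot(dx, dy)


def nearest(entries, query_bbox, num_results):
    """Linear scan: sort every entry by distance and keep the closest ones."""
    ranked = sorted(entries, key=lambda e: bbox_distance(e[1], query_bbox))
    return [(id_, obj) for (id_, bbox, obj) in ranked[:num_results]]


# Each entry is (id, bounding box, attached object).
entries = [
    (0, point_to_bbox(14, 10), {'a': 42}),
    (1, point_to_bbox(16, 10), {'a': 42}),
]

query = point_to_bbox(13, 10)
print(nearest(entries, query, 1))
print(nearest(entries, query, 2))
```

The linear scan is fine for a handful of entries; the tree structure is what keeps lookups cheap as the data grows.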