Installing the Nginx Long-Polling/Comet Module on a Mac Using Homebrew

If you’re developing software on a Mac that’s targeted for use in a Linux environment, you’re not alone. You might be lucky enough to be working on a project written in a scripting language, so the difference between environments isn’t nearly as brutal as it would be if you actually had to perform builds. Still, there is the occasional environmental difference.

One such difference is which Nginx modules come baked in on an Ubuntu host versus your Mavericks host. What if you need the same push/Comet module on your Mac that you get when you install the nginx-extras package? This is nginx-push-stream-module (there’s at least one other module with a similar name, which is a common source of confusion).

You might think that you need to download the source for that module, possibly deal with build dependencies, clone the standard nginx Homebrew formula, modify it, and build. You’d be wrong.

It’s very simple. After you’ve uninstalled your previous nginx formula, run:

$ brew install nginx-full --with-push-stream-module
==> Installing nginx-full from homebrew/homebrew-nginx
==> Installing nginx-full dependency: push-stream-nginx-module
==> Downloading https://github.com/wandenberg/nginx-push-stream-module/archive/0.4.1.tar.gz
Already downloaded: /Library/Caches/Homebrew/push-stream-nginx-module-0.4.1.tar.gz
🍺  /usr/local/Cellar/push-stream-nginx-module/0.4.1: 75 files, 1.1M, built in 2 seconds
==> Installing nginx-full
==> Downloading http://nginx.org/download/nginx-1.6.2.tar.gz
######################################################################## 100.0%
==> ./configure --prefix=/usr/local/Cellar/nginx-full/1.6.2 --with-http_ssl_module --with-pcre --with-ipv6 --sbin-path=/
==> make
==> make install
...

This requires the nginx-full formula because, besides push-stream-nginx-module, it has a massive number of definitions for third-party modules, whereas the normal nginx formula has very few. Your configuration will be largely preserved from the old Nginx to the new (only the launch configs should change).

For reference, this is the list of modules that nginx-full is configured for:

  def self.third_party_modules
    {
      "lua" => "Compile with support for LUA module",
      "echo" => "Compile with support for Echo Module",
      "auth-digest" => "Compile with support for Auth Digest Module",
      "set-misc" => "Compile with support for Set Misc Module",
      "redis2" => "Compile with support for Redis2 Module",
      "array-var" => "Compile with support for Array Var Module",
      "accept-language" => "Compile with support for Accept Language Module",
      "accesskey" => "Compile with support for HTTP Access Key Module",
      "auth-ldap" => "Compile with support for Auth LDAP Module",
      "auth-pam" => "Compile with support for Auth PAM Module",
      "cache-purge" => "Compile with support for Cache Purge Module",
      "ctpp2" => "Compile with support for CT++ Module",
      "headers-more" => "Compile with support for Headers More Module",
      "tcp-proxy" => "Compile with support for TCP proxy",
      "dav-ext" => "Compile with support for HTTP WebDav Extended Module",
      "eval" => "Compile with support for Eval Module",
      "fancyindex" => "Compile with support for Fancy Index Module",
      "mogilefs" => "Compile with support for HTTP MogileFS Module",
      "mp4-h264" => "Compile with support for HTTP MP4/H264 Module",
      "notice" => "Compile with support for HTTP Notice Module",
      "subs-filter" => "Compile with support for Substitutions Filter Module",
      "upload" => "Compile with support for Upload module",
      "upload-progress" => "Compile with support for Upload Progress module",
      "php-session" => "Compile with support for Parse PHP Sessions module",
      "anti-ddos" => "Compile with support for Anti-DDoS module",
      "captcha" => "Compile with support for Captcha module",
      "autols" => "Compile with support for Flexible Auto Index module",
      "auto-keepalive" => "Compile with support for Auto Disable KeepAlive module",
      "ustats" => "Compile with support for Upstream Statistics (HAProxy style) module",
      "extended-status" => "Compile with support for Extended Status module",
      "upstream-hash" => "Compile with support for Upstream Hash Module",
      "consistent-hash" => "Compile with support for Consistent Hash Upstream module",
      "healthcheck" => "Compile with support for Healthcheck Module",
      "log-if" => "Compile with support for Log-if Module",
      "txid" => "Compile with support for Sortable Unique ID",
      "upstream-order" => "Compile with support for Order Upstream module",
      "unzip" => "Compile with support for UnZip module",
      "var-req-speed" => "Compile with support for Var Request-Speed module",
      "http-flood-detector" => "Compile with support for Var Flood-Threshold module",
      "http-remote-passwd" => "Compile with support for Remote Basic Auth password module",
      "realtime-req" => "Compile with support for Realtime Request module",
      "counter-zone" => "Compile with support for Realtime Counter Zone module",
      "mod-zip" => "Compile with support for HTTP Zip Module",
      "rtmp" => "Compile with support for RTMP Module",
      "dosdetector" => "Compile with support for detecting DoS attacks",
      "push-stream" => "Compile with support for http push stream module",
    }
  end

As a result, you can use a similar command to install each of these modules.
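As a rough illustration, the command line for any combination of modules can be assembled from the keys in the list above. This is only a sketch: the helper name brewInstallCommand is made up for this example, and the --with-<key>-module pattern is inferred from the single push-stream command shown earlier, so check the exact option names with `brew options nginx-full` before relying on it.

```javascript
// Builds a `brew install` command line for the given module keys.
// Assumption: every key maps to a `--with-<key>-module` option,
// the way "push-stream" maps to `--with-push-stream-module`.
function brewInstallCommand(moduleKeys) {
    const flags = moduleKeys.map(function (key) {
        return '--with-' + key + '-module';
    });
    return ['brew', 'install', 'nginx-full'].concat(flags).join(' ');
}

console.log(brewInstallCommand(['push-stream', 'echo']));
```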

The guy who’s responsible for this is easily worth his weight in donations.

Using Nginx for Long-Polling (Comet)

Long-polling is the strategy of checking for updates or messages from a server by letting a client connect and then block until data is available. Once data arrives, the client processes it and reads again, potentially blocking again. This is considerably more efficient than polling at regular intervals, for the same reason that blocking beats busy-waiting: no requests are wasted while there is no data.
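To put rough numbers on that efficiency claim, here is a back-of-the-envelope sketch. The workload (three messages in one minute, polling once per second) is invented for illustration, and the long-poll count ignores any server-side timeouts that would force extra reconnects.

```javascript
// Regular polling: one request per interval, whether or not data is waiting.
function pollingRequestCount(durationMs, intervalMs) {
    return Math.floor(durationMs / intervalMs);
}

// Long-polling: one request is always outstanding; each delivered message
// ends the current request and starts the next one.
function longPollRequestCount(messageCount) {
    return messageCount + 1;
}

// Hypothetical workload: 3 messages arrive during one minute.
const pollingRequests = pollingRequestCount(60000, 1000);
const longPollRequests = longPollRequestCount(3);
console.log(pollingRequests, longPollRequests); // prints: 60 4
```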

Although it’s not complicated to implement this on your own, it can potentially introduce complexity to what might otherwise be a simple website. For example, to implement this, you might have to provide the following features yourself:

  • A server process that manages messaging.
  • A connection-management framework that maps each mailbox to the list of connections waiting on it.
  • The accounting needed to queue incoming messages so that returning clients won’t miss any, plus a way for clients to determine which messages they have already seen.
  • All of the thread-safety required for managing connections and exchanging messages.
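To give a feel for that bookkeeping, here is a minimal in-memory sketch of the mailbox dictionary and message queue. Everything in it (the ChannelHub class, its method names, the sequence numbers) is hypothetical and exists only for illustration. It also sidesteps the thread-safety item entirely, because JavaScript runs this code on a single thread.

```javascript
// Hypothetical in-memory hub: channel ID -> queued messages + waiting subscribers.
class ChannelHub {
    constructor(maxQueued) {
        this.maxQueued = maxQueued;
        this.channels = new Map();
    }

    // Look up a channel, creating it on first use.
    channel(id) {
        if (!this.channels.has(id)) {
            this.channels.set(id, { messages: [], nextSeq: 1, waiters: [] });
        }
        return this.channels.get(id);
    }

    // Queue the message, then hand it to every subscriber waiting on the channel.
    publish(id, body) {
        const ch = this.channel(id);
        const message = { seq: ch.nextSeq++, body: body };
        ch.messages.push(message);
        if (ch.messages.length > this.maxQueued) {
            ch.messages.shift(); // drop the oldest queued message
        }
        const waiters = ch.waiters;
        ch.waiters = [];
        waiters.forEach(function (deliver) { deliver(message); });
        return message.seq;
    }

    // Deliver the first queued message newer than lastSeenSeq, or wait for one.
    subscribe(id, lastSeenSeq, deliver) {
        const ch = this.channel(id);
        const pending = ch.messages.find(function (m) { return m.seq > lastSeenSeq; });
        if (pending) {
            deliver(pending);
        } else {
            ch.waiters.push(deliver);
        }
    }
}

const hub = new ChannelHub(10);
const received = [];
hub.subscribe('asdf', 0, function (m) { received.push(m.body); }); // nothing queued: waits
hub.publish('asdf', 'first');   // wakes the waiting subscriber
hub.publish('asdf', 'second');  // queued; nobody is waiting
hub.subscribe('asdf', 1, function (m) { received.push(m.body); }); // served from the queue
console.log(received);
```

Even this toy version needs sequence numbers, a queue limit, and a waiter list per channel, which is the kind of plumbing the rest of this post hands over to Nginx.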

Enter the all-powerful, all-seeing, all-caching Nginx web server. It has a couple of modules that reduce the factors above to a couple of API calls to Nginx: HttpPushStreamModule and HttpPushModule.

Though HttpPushStreamModule is, reportedly, the newer of the two modules, only HttpPushModule is available with Ubuntu (as of 13.04). So, that’s the one that we’ll work with here.

 

Nginx Configuration

To install HttpPushModule, install the nginx-extras package (again, as of 13.04).

Configuration is very straightforward. We’ll define two location blocks: one for publishers and one for subscribers. In the common scenario, the publisher endpoint is what your application code pushes messages to, and the subscriber endpoint is what your JavaScript reads from (and will regularly block on). When publisher and subscriber requests are received, Nginx expects an ID that indicates which “channel” should be used. A channel is just another name for a mailbox and, by default, doesn’t have to exist already.

The endpoints defined in our example (taken from here):

location /publish {
    set $push_channel_id $arg_id;      # The channel ID is expected as "id".
    push_publisher;

    push_store_messages on;            # enable message queueing
    push_message_timeout 2h;           # messages expire after 2 hours, set to 0 to never expire
    push_message_buffer_length 10;     # store 10 messages
}

location /subscribe {
    push_subscriber;

    # Any number of clients can listen.
    push_subscriber_concurrency broadcast;

    set $push_channel_id $arg_id;
    default_type  text/plain;
}
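On the application side, publishing amounts to an HTTP POST against the publisher location, with the request body as the message. Here is a minimal Node.js sketch of that, under some loud assumptions: Nginx is reachable at localhost on port 80, and buildPublishRequest and publish are helper names invented for this example.

```javascript
const http = require('http');

// Describe the POST that publishes `message` to a channel.
// Assumption: Nginx is listening on localhost:80; adjust for your setup.
function buildPublishRequest(channelId, message) {
    return {
        host: 'localhost',
        port: 80,
        method: 'POST',
        path: '/publish?id=' + encodeURIComponent(channelId),
        headers: {
            'Content-Type': 'text/plain',
            'Content-Length': Buffer.byteLength(message)
        }
    };
}

// Send the message and report the HTTP status code to the callback.
function publish(channelId, message, callback) {
    const request = http.request(buildPublishRequest(channelId, message), function (response) {
        response.resume(); // discard the response body; only the status matters here
        callback(null, response.statusCode);
    });
    request.on('error', callback);
    request.end(message);
}

// Usage (requires a running Nginx with the /publish location above):
// publish('asdf', 'hello', function (err, status) { console.log(err || status); });
```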

 

JavaScript Code

In our simple example, we’ll play the parts of both the publisher and subscriber. We’ll wait on messages from the subscriber endpoint, while allowing the user to publish messages into the publisher endpoint.

The example also accounts for which messages are too old. If we were to just naively start reading messages, two things would happen:

  • We’ll see the first message that Nginx has knowledge of, for the given channel.
  • We’ll see the same message repeatedly.

What’s happening here is that Nginx relies on the client to keep track of what messages it has already seen, so, unless given parameters, Nginx will always start at the beginning.

Our JavaScript takes care of this. On each response, we grab the values of the “ETag” and “Last-Modified” response headers and pass them into future requests as the “If-None-Match” and “If-Modified-Since” request headers, respectively. Notice that if we were to set the initial value of the last-modified timestamp to the epoch (midnight on January 1, 1970, GMT), we’d initially receive all queued messages. We chose to set it to the “now” timestamp so that we’d only see messages from the point that we loaded the webpage.

That’s all.
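The header bookkeeping can also be sketched on its own, away from any DOM or jQuery code. The three function names below are invented for this illustration, and advanceCursor assumes the response headers arrive as a plain object with lower-cased keys.

```javascript
// Cursor describing the last message this client has seen.
// Starting at "now" skips queued messages; new Date(0) (the epoch) would replay them.
function initialCursor(startDate) {
    return { etag: 0, lastModified: startDate.toUTCString() };
}

// Request headers that tell Nginx where to resume.
function cursorHeaders(cursor) {
    return {
        'If-None-Match': String(cursor.etag),
        'If-Modified-Since': cursor.lastModified
    };
}

// Advance the cursor from a response's headers (keys assumed lower-cased).
function advanceCursor(cursor, responseHeaders) {
    return {
        etag: responseHeaders['etag'] || 0,
        lastModified: responseHeaders['last-modified'] || cursor.lastModified
    };
}

const replayAll = initialCursor(new Date(0));
console.log(cursorHeaders(replayAll));
```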

Example (based on the same reference, above, but refactored for jQuery):

<html>
<head> 
    <script src="http://code.jquery.com/jquery-1.10.1.min.js"></script>
    <script type="text/javascript">
var channelId = "asdf";

// We use these to tell Nginx which messages we've seen.
var etag = 0;
var lm = (new Date()).toUTCString();   // toGMTString() is deprecated

function add_message(msg) {
     var d = new Date();
     var line = d.toString() + ": " + msg;
     // Append as a text node so message content is never parsed as HTML.
     $('#data').append(document.createTextNode(line), "<br />");
}

function do_request() {
    add_message("Doing long-poll: (" + etag + ") [" + lm + "]");
    $.ajax('/subscribe?id=' + channelId, {
        type: 'GET',
        success: handle_response,
        error: handle_error,
        headers: {
            'If-None-Match': etag,
            'If-Modified-Since': lm
        }
    });
}

function handle_response(txt, textStatus, response) {
     add_message('Long-poll has returned.');
     add_message(txt);
     
     etag = response.getResponseHeader("Etag") || 0;
     lm = response.getResponseHeader("Last-Modified") || lm;
    
     do_request();
}

function handle_error(response, textStatus, errorThrown) {
     add_message(errorThrown);
}

function publish_message() {
    var txt = $.trim($('#message').val());
    if (txt.length === 0) {
        alert("You must enter text to publish");
    } else {
        // jQuery form-encodes this object, so the request body is "data=<text>".
        $.post('/publish?id=' + channelId, {
            data: txt
        });
    }
}
    </script>
</head>
<body>
    Messages:
    <div id="data">
    </div>

    <input type="text" id="message" />
    <input type="button" id='send' value="Send Message" />
    <script type="text/javascript">
function boot_page()
{
    $('#send').click(publish_message);
    do_request();
}

$(boot_page);
    </script>
</body>
</html>