Beautiful Python Charts Using Seaborn

Of the five or six most well-known charting packages, none really impressed me (being a devoted user of Highcharts, in Javascript). The exception to this is plot.py but it’s a remote service and I’d rather not couple myself to a service.

In the end, I went with Seaborn (Stanford). This seemed to look the best of all of the options, even if it was tough to get some of the features working right. They recently added a chart for the exclusive purpose of plotting categorical/factor-based data: factorplot.

An example of Seaborn’s factorplot:

Seaborn - factorplot

I put together an example of a bar-chart using data from data.gov. I used pandas to read the CSV data. Since Seaborn is built on top of matplotlib and matplotlib appears to have issues displaying graphics in my local environment, I had to rely on running the example via ipython using the matplotlib magic function (which loads everything necessary and worked as expected).

To run the example, you’ll need the following packages (in addition to the ipython environment):

  • pandas
  • numpy
  • seaborn
  • matplotlib

The code:

# Tell ipython to load the matplotlib environment.
%matplotlib

import itertools

import pandas
import numpy
import seaborn
import matplotlib.pyplot

_DATA_FILEPATH = 'datagovdatasetsviewmetrics.csv'
_ROTATION_DEGREES = 90
_BOTTOM_MARGIN = 0.35
_COLOR_THEME = 'coolwarm'
_LABEL_X = 'Organizations'
_LABEL_Y = 'Views'
_TITLE = 'Organizations with Most Views'
_ORGANIZATION_COUNT = 10
_MAX_LABEL_LENGTH = 20

def get_data():
    # Read the dataset.

    d = pandas.read_csv(_DATA_FILEPATH)

    # Group by organization.

    def sum_views(df):
        return sum(df['Views per Month'])

    g = d.groupby('Organization Name').apply(sum_views)

    # Sort by views (descendingly).

    g.sort(ascending=False)

    # Grab the first N to plot.

    items = g.iteritems()
    s = itertools.islice(items, 0, _ORGANIZATION_COUNT)

    s = list(s)

    # Sort them in ascending order, this time, so that the larger ones are on 
    # the right (in red) in the chart. This has a side-effect of flattening the 
    # generator while we're at it.
    s = sorted(s, key=lambda (n, v): v)

    # Truncate the names (otherwise they're unwieldy).

    distilled = []
    for (name, views) in s:
        if len(name) > (_MAX_LABEL_LENGTH - 3):
            name = name[:17] + '...'

        distilled.append((name, views))

    return distilled

def plot_chart(distilled):
    # Split the series into separate vectors of labels and values.

    labels_raw = []
    values_raw = []
    for (name, views) in distilled:
        labels_raw.append(name)
        values_raw.append(views)

    labels = numpy.array(labels_raw)
    values = numpy.array(values_raw)

    # Create one plot.

    seaborn.set(style="white", context="talk")

    (f, ax) = matplotlib.pyplot.subplots(1)

    b = seaborn.barplot(
        labels, 
        values,
        ci=None, 
        palette=_COLOR_THEME, 
        hline=0, 
        ax=ax,
        x_order=labels)

    # Set labels.

    ax.set_title(_TITLE)
    ax.set_xlabel(_LABEL_X)
    ax.set_ylabel(_LABEL_Y)

    # Rotate the x-labels (otherwise they'll overlap). Seaborn also doesn't do 
    # very well with diagonal labels so we'll go vertical.
    b.set_xticklabels(labels, rotation=_ROTATION_DEGREES)

    # Add some margin to the bottom so the labels aren't cut-off.
    matplotlib.pyplot.subplots_adjust(bottom=_BOTTOM_MARGIN)

distilled = get_data()
plot_chart(distilled)

To run the example, save it to a file and load it into the ipython environment. If you were to save it as “barchart.ipy” (using the “ipy” extension so it processes the %matplotlib directive properly) and then start ipython using the “ipython” executable, you’d load it like this:

%run barchart.ipy

The graphic will be displayed in another window:

Seaborn barchart

I should also mention that I really like pygal but didn’t consider it an option because I wanted a traditional, flat image, not an SVG. Even so, here’s an example from their website:

import pygal                                                       # First import pygal
bar_chart = pygal.Bar()                                            # Then create a bar graph object
bar_chart.add('Fibonacci', [0, 1, 1, 2, 3, 5, 8, 13, 21, 34, 55])  # Add some values
bar_chart.render_to_file('bar_chart.svg')                          # Save the svg to a file

Output:

pygal SVN Chart

Notice that the resulting SVG even has hover effects.

Advertisements