Converting Infix Expressions to Postfix in Python

A simplified Python implementation of Dijkstra’s “shunting-yard” algorithm for converting infix expressions to postfix. We omit support for functions and their arguments, but parentheses are handled as expected. For the purpose of this example, we support simple mathematical expressions with single-character operands.

OP_LPAREN = '('
OP_RPAREN = ')'
OP_MULTIPLY = '*'
OP_DIVIDE = '/'
OP_ADD = '+'
OP_SUBTRACT = '-'

OPERATORS_S = {
    OP_MULTIPLY,
    OP_DIVIDE,
    OP_ADD,
    OP_SUBTRACT,
    OP_LPAREN,
    OP_RPAREN,
}

PRECEDENCE = {
    OP_MULTIPLY: 7,
    OP_DIVIDE: 7,
    OP_ADD: 5,
    OP_SUBTRACT: 5,
    OP_LPAREN: 1,
    OP_RPAREN: 1,
}

LEFT_ASSOCIATIVE_S = {
    OP_MULTIPLY,
    OP_DIVIDE,
    OP_ADD,
    OP_SUBTRACT,
    OP_LPAREN,
    OP_RPAREN,
}

def _convert(expression_phrase):
    expression_phrase = expression_phrase.replace(' ', '')

    stack = []
    output = []
    for c in expression_phrase:
        if c not in OPERATORS_S:
            # It's an operand.
            output += [c]
        elif c not in (OP_LPAREN, OP_RPAREN):
            # It's an operator. Pop-and-add all recent operators that win over
            # the current operator via precedence/associativity.

            current_prec = PRECEDENCE[c]
            is_left_assoc = c in LEFT_ASSOCIATIVE_S
            while stack:
                top_value = stack[-1]
                top_prec = PRECEDENCE[top_value]

                if (is_left_assoc and current_prec <= top_prec) or \
                   (not is_left_assoc and current_prec < top_prec):
                    stack.pop()
                    output += [top_value]
                else:
                    break

            stack.append(c)

        elif c == OP_LPAREN:
            # It's a left paren.

            stack.append(c)
        else:  # c == OP_RPAREN
            # It's a right paren. Pop-and-add everything since the last left 
            # paren.

            found = False
            while stack:
                top_value = stack.pop()
                if top_value == OP_LPAREN:
                    found = True
                    break

                output += [top_value]

            if not found:
                raise ValueError("Mismatched parentheses (1).")

    # Flush everything left on the stack. Any parenthesis still in the stack
    # at this point indicates an unmatched left paren.
    while stack:
        c = stack.pop()
        if c in (OP_LPAREN, OP_RPAREN):
            raise ValueError("Mismatched parentheses (2).")

        output += [c]

    return ' '.join(output)

def _main():
    infix_phrase = 'a * (b * c) / d * (e + f + g) * h - i'
    print(infix_phrase)

    postfix_phrase = _convert(infix_phrase)
    print(postfix_phrase)

if __name__ == '__main__':
    _main()

Output:

a * (b * c) / d * (e + f + g) * h - i
a b c * * d / e f + g + * h * i -
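
As a sanity check, the postfix output can be evaluated with a single stack pass: operands are pushed, and each operator pops its two arguments and pushes the result. Below is a minimal sketch of such an evaluator (a hypothetical helper, not part of the code above; operand values are supplied by the caller):

def evaluate_postfix(postfix_phrase, values):
    # Evaluate the space-delimited output of _convert(). `values` maps the
    # single-character operands to numbers.
    stack = []
    for token in postfix_phrase.split(' '):
        if token not in OPERATORS_S:
            stack.append(values[token])
            continue

        right = stack.pop()
        left = stack.pop()

        if token == OP_ADD:
            stack.append(left + right)
        elif token == OP_SUBTRACT:
            stack.append(left - right)
        elif token == OP_MULTIPLY:
            stack.append(left * right)
        else:
            stack.append(left / right)

    return stack.pop()

For example, evaluate_postfix('a b +', {'a': 1, 'b': 2}) returns 3.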

You may download the code here.

SQLite as an Archive (or an Easy, Compressed Filesystem)

The sqlar project was an experiment by the lead SQLite developer to test the practicality of a SQLite-based archive format. I say “was” because, like SQLite itself, it seldom needs modification and appears to be stable.

What makes it fun is that the result is not some opaque, internal format. Rather, it’s an intact SQLite database that you can readily inspect from the SQLite client. The compression is per-blob, and the purpose of the tool is merely to make it convenient to add records corresponding to files. The reverse is true as well: if you create a SQLite DB with the expected schema and populate it with zlib-compressed data, you should simply be able to dump it using the sqlar tool.
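
For example, here is a minimal sketch of writing an archive-compatible database directly from Python, assuming the sqlar schema shown in the inspection step below (the database filename, entry name, and mode are illustrative):

import sqlite3
import time
import zlib

db = sqlite3.connect('test_root2.sqlar')
db.execute("CREATE TABLE IF NOT EXISTS sqlar ("
           "name TEXT PRIMARY KEY, mode INT, mtime INT, sz INT, data BLOB)")

def add_file(name, content):
    # `sz` is the original size. Per sqlar's convention, the blob is stored
    # compressed only when that actually makes it smaller.
    compressed = zlib.compress(content)
    data = compressed if len(compressed) < len(content) else content

    db.execute("INSERT OR REPLACE INTO sqlar "
               "(name, mode, mtime, sz, data) VALUES (?, ?, ?, ?, ?)",
               (name, 0o100644, int(time.time()), len(content), data))

add_file('test_root/test_new', b'test data\n' * 100)
db.commit()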

To test sqlar out, either check out or download a copy of the source (which embeds SQLite within it). You may download it here: SQLite Archiver.

Extract it and build:

$ mkdir sqlar
$ cd sqlar/
$ tar xzf ../sqlar-src-15adeb2f9a.tar.gz 
$ cd sqlar-src-15adeb2f9a/

$ make
gcc -g -I. -D_FILE_OFFSET_BITS=64 -Wall -Werror -c sqlite3.c -DSQLITE_THREADSAFE=0 -DSQLITE_OMIT_LOAD_EXTENSION sqlite3.c
gcc -g -I. -D_FILE_OFFSET_BITS=64 -Wall -Werror -o sqlar sqlar.c sqlite3.o -lz

Run a test:

$ mkdir test_root
$ echo "test1" > test_root/test1
$ echo "test2" > test_root/test2
$ echo "test3" > test_root/test3

$ ./sqlar -v test_root.sqlar test_root
  added: test_root
  added: test_root/test1
  added: test_root/test3
  added: test_root/test2

$ mkdir output
$ cd output/

$ ../sqlar -x -v ../test_root.sqlar 
test_root
test_root/test1
test_root/test3
test_root/test2

Visually inspect the database file:

$ cd ..

$ sqlite3 test_root.sqlar
SQLite version 3.8.2 2013-12-06 14:53:30
Enter ".help" for instructions
Enter SQL statements terminated with a ";"

sqlite> .schema
CREATE TABLE sqlar(
  name TEXT PRIMARY KEY,
  mode INT,
  mtime INT,
  sz INT,
  data BLOB
);

sqlite> select * from sqlar;
test_root|16893|1455064403|0|
test_root/test1|33204|1455064393|6|test1

test_root/test3|33204|1455064403|6|test3

test_root/test2|33204|1455064399|6|test2

sqlite>

Notice that no compression was performed, because the files are so trivial that compressing them would not save any space. zlib compression is attempted on every file unless you explicitly turn it off using “-n”, but the blob is only stored compressed when that makes it smaller.
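
Correspondingly, pulling a file back out without the sqlar tool just requires comparing sz to the stored blob’s length to decide whether to inflate it. A minimal sketch:

import sqlite3
import zlib

db = sqlite3.connect('test_root.sqlar')
for name, sz, data in db.execute("SELECT name, sz, data FROM sqlar"):
    if not data:
        # Directories (and empty files) carry no content.
        continue

    # The blob was stored compressed only if that made it smaller.
    content = zlib.decompress(data) if sz != len(data) else data
    print(name, content)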

You can also mount the archive using FUSE:

$ # This requires libfuse-dev to be installed.
$ make sqlarfs
gcc -g -I. -D_FILE_OFFSET_BITS=64 -Wall -Werror -o sqlarfs sqlarfs.c sqlite3.o -lz -lfuse
$ mkdir mount
$ ./sqlarfs test_root.sqlar `pwd`/mount

In another terminal, list the contents:

$ ls -l
total 0
dr-xr-xr-x 1 dustin dustin 0 Feb  9 19:51 test_root

$ cd test_root/

$ ls -l
total 0
-r--r--r-- 1 dustin dustin   6 Feb  9 19:33 test1
-r--r--r-- 1 dustin dustin   6 Feb  9 19:33 test2
-r--r--r-- 1 dustin dustin   6 Feb  9 19:33 test3
-r--r--r-- 1 dustin dustin 576 Feb  9 19:51 test4

$ cat test4 
Lorem Ipsum is simply dummy text of the printing and typesetting industry. Lorem Ipsum has been the industry's standard dummy text ever since the 1500s, when an unknown printer took a galley of type and scrambled it to make a type specimen book. It has survived not only five centuries, but also the leap into electronic typesetting, remaining essentially unchanged. It was popularised in the 1960s with the release of Letraset sheets containing Lorem Ipsum passages, and more recently with desktop publishing software like Aldus PageMaker including versions of Lorem Ipsum.

This is a later version of the earlier archive, to which I added a highly compressible text file (test4).

It’s a fun little tool.

Accessing Sales History in Magento From Python With XML-RPC

Install the python-magento package and simply pass in database table columns with some criteria. I had difficulty figuring out the structure of the filters because there is so little documentation and there are so few examples anywhere online, and, where there are, they all differ in format/version.

So, here’s an example:

import magento
import json

_USERNAME = "apiuser"
_PASSWORD = "password"
_HOSTNAME = "dev.bugatchi.com"
_PORT = 80

m = magento.MagentoAPI(_HOSTNAME, _PORT, _USERNAME, _PASSWORD)

filters = {
    'created_at': {
        'from': '2013-05-29 12:38:43',
        'to': '2013-05-29 12:55:33',
    },
}

l = m.sales_order_invoice.list(filters)
print(json.dumps(l))

Output:

[
    {
        "created_at": "2013-05-29 12:38:43",
        "grand_total": "447.1400",
        "increment_id": "100000036",
        "invoice_id": "38",
        "order_currency_code": "USD",
        "order_id": "186",
        "state": "2"
    },
    {
        "created_at": "2013-05-29 12:52:44",
        "grand_total": "333.2100",
        "increment_id": "100000037",
        "invoice_id": "39",
        "order_currency_code": "USD",
        "order_id": "185",
        "state": "2"
    },
    {
        "created_at": "2013-05-29 12:55:33",
        "grand_total": "432.2600",
        "increment_id": "100000038",
        "invoice_id": "40",
        "order_currency_code": "USD",
        "order_id": "184",
        "state": "2"
    }
]

Note that debugging is fairly simple, since a failure will return the failed query (unless the server is configured not to). You can use that to feel out many of the column names and comparison operators.
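
For example, other comparison operators appear to follow the same nested-dict shape as the from/to range above; they map through to Magento’s addFieldToFilter(). A hedged sketch, reusing the connection from above (the 'eq' operator and the state value are illustrative assumptions):

# Filter on an exact match instead of a range. Assumes the 'eq' operator
# accepted by Magento's addFieldToFilter(); the value is illustrative.
filters = {
    'state': {
        'eq': '2',
    },
}

l = m.sales_order_invoice.list(filters)
print(json.dumps(l))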

Efficiently Processing GPX Files in Go

Use gpxreader to process a GPX file of any size without reading the whole thing into memory. This also sidesteps an awkwardness in Go’s streaming XML decoder: it can decode one node at a time, but, when you do that, it implicitly discards all child nodes (it automatically seeks to the matching close tag for validation, with no way to disable that behavior).

An excerpt of the test-script from the project:

//...

func (gv *gpxVisitor) GpxOpen(gpx *gpxreader.Gpx) error {
    fmt.Printf("GPX: %s\n", gpx)

    return nil
}

func (gv *gpxVisitor) GpxClose(gpx *gpxreader.Gpx) error {
    return nil
}

func (gv *gpxVisitor) TrackOpen(track *gpxreader.Track) error {
    fmt.Printf("Track: %s\n", track)

    return nil
}

func (gv *gpxVisitor) TrackClose(track *gpxreader.Track) error {
    return nil
}

func (gv *gpxVisitor) TrackSegmentOpen(trackSegment *gpxreader.TrackSegment) error {
    fmt.Printf("Track segment: %s\n", trackSegment)

    return nil
}

func (gv *gpxVisitor) TrackSegmentClose(trackSegment *gpxreader.TrackSegment) error {
    return nil
}

func (gv *gpxVisitor) TrackPointOpen(trackPoint *gpxreader.TrackPoint) error {
    return nil
}

func (gv *gpxVisitor) TrackPointClose(trackPoint *gpxreader.TrackPoint) error {
    fmt.Printf("Point: %s\n", trackPoint)

    return nil
}

//...

func main() {
    var gpxFilepath string

    o := readOptions()

    gpxFilepath = o.GpxFilepath

    f, err := os.Open(gpxFilepath)
    if err != nil {
        panic(err)
    }

    defer f.Close()

    gv := newGpxVisitor()
    gp := gpxreader.NewGpxParser(f, gv)

    err = gp.Parse()
    if err != nil {
        print("Error: %s\n", err.Error())
        os.Exit(1)
    }
}

Output:

$ gpxreadertest -f 20140909.gpx
GPX: GPX
Track: Track
Track segment: TrackSegment
Point: TrackPoint
Point: TrackPoint
Point: TrackPoint

Go Offline Documentation

Go comes with a built-in documentation server. Its homepage actually resembles the main Go homepage, but it pulls most of that front-page content from external sources. Most, if not all, of the documentation, however (even documentation related to old releases), is hosted locally.

go1.5.3/bin$ ./godoc -http=:6060

Then, visit it at http://127.0.0.1:6060.

Split a Media File by a List of Time Offsets

We’ll split a single audio file containing the whole Quake soundtrack using SplitMedia.

The list file:

0:00:00 Quake Theme
0:05:08 Aftermath
0:07:34 The Hall of Souls
...
1:08:21 Scourge of Armagon 4
1:11:34 Scourge of Armagon 5
...
1:39:57 Dissolution of Eternity 6 
1:43:01 Dissolution of Eternity 7
1:46:07 Dissolution of Eternity 8

The command:

$ splitmedia Quake\ Soundtrack.m4a list_file.quake quake_output
OFF 000:00:00.000 DUR 000308.000 01_QuakeTheme.m4a
OFF 000:05:08.000 DUR 000146.000 02_Aftermath.m4a
OFF 000:07:34.000 DUR 000500.000 03_TheHallofSouls.m4a
...
OFF 001:08:21.000 DUR 000193.000 14_ScourgeofArmagon4.m4a
OFF 001:11:34.000 DUR 000193.000 15_ScourgeofArmagon5.m4a
...
OFF 001:39:57.000 DUR 000184.000 24_DissolutionofEternity6.m4a
OFF 001:43:01.000 DUR 000186.000 25_DissolutionofEternity7.m4a
OFF 001:46:07.000 DUR 000000.000 26_DissolutionofEternity8.m4a
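
Each track’s duration is just the next track’s offset minus its own; the final track has no following offset, which is why it reports a duration of zero above. A minimal sketch of that bookkeeping (not SplitMedia’s actual code):

def parse_offsets(list_filepath):
    # Parse "H:MM:SS Title" lines into (seconds, title) tuples.
    entries = []
    with open(list_filepath) as f:
        for line in f:
            line = line.strip()
            if not line:
                continue

            timestamp, title = line.split(' ', 1)
            h, m, s = (int(part) for part in timestamp.split(':'))
            entries.append((h * 3600 + m * 60 + s, title))

    return entries

def durations(entries):
    # A track ends where the next one begins; the last track's duration is
    # unknown without the total length of the source file.
    for (offset, title), (next_offset, _) in zip(entries, entries[1:]):
        yield offset, next_offset - offset, title

for offset, duration, title in durations(parse_offsets('list_file.quake')):
    print(offset, duration, title)

For example, “Aftermath” starts at 308 seconds and “The Hall of Souls” at 454, giving the 146-second duration shown in the output.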

Calculating a Hash for a Path (Recursively)

PathFingerprint allows you to recursively generate hashes for a directory structure. While doing this, it builds a catalog in a separate directory to serve as a cache, so subsequent runs over large directories complete much more quickly. You can also do simple lookups against an existing catalog and generate/print a report of what has changed since the last run.

Build a test directory:

$ mkdir -p scan_path/subdir1
$ mkdir -p scan_path/subdir2
$ touch scan_path/subdir1/aa
$ touch scan_path/subdir1/bb

Calculate the hash (with reporting enabled):

$ pfhash -s scan_path -c catalog_path -R - 
create file subdir1/aa
create file subdir1/bb
create path subdir1
create path subdir2
create path .
0df9bc5a7657b7d481c219656441f10d21fd5668

Run again with a couple of changes (with reporting enabled):

$ touch scan_path/subdir1/aa
$ touch scan_path/subdir2/new_file

$ pfhash -s scan_path -c catalog_path -R - 
update file subdir1/aa
create file subdir2/new_file
update path subdir2
update path .
e700843c1b5c2f40a68098e1df96ef08b6081fe8

Look up the hash using the lookup tool:

$ pflookup -c catalog_path
e700843c1b5c2f40a68098e1df96ef08b6081fe8

$ pflookup -c catalog_path -r subdir1
426a98d313a0a740b8445daa5102b3ed6dd7f4ed

$ pflookup -c catalog_path -r subdir1/aa
da39a3ee5e6b4b0d3255bfef95601890afd80709
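
Incidentally, the hash reported for subdir1/aa above is the SHA-1 of zero bytes, which hints at the underlying digest. Below is a minimal sketch of the general technique, hashing files directly and deriving each directory’s hash from its sorted entries; it is not pfhash’s exact recipe (so the directory digests will not match the ones above), and it omits the catalog cache entirely.

import hashlib
import os

def hash_file(filepath):
    h = hashlib.sha1()
    with open(filepath, 'rb') as f:
        for chunk in iter(lambda: f.read(65536), b''):
            h.update(chunk)

    return h.hexdigest()

def hash_path(path):
    # A directory's hash covers the names and hashes of its entries, in a
    # deterministic (sorted) order.
    h = hashlib.sha1()
    for entry in sorted(os.listdir(path)):
        entry_filepath = os.path.join(path, entry)
        if os.path.isdir(entry_filepath):
            entry_hash = hash_path(entry_filepath)
        else:
            entry_hash = hash_file(entry_filepath)

        h.update(entry.encode('utf8') + b'\0' + entry_hash.encode('ascii'))

    return h.hexdigest()

print(hash_path('scan_path'))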