Infinite, Secure, and Distributed Backups Using Tahoe

Thanks to zooko for this one: a secure, distributed storage service built on top of S3 that uses Tahoe (see previous article) as its client. Your data is fully encrypted locally before being pumped into S3. It’s called S4.

It’s $25/month for infinite storage. For those of us with mountains of data to back up, it’s a deal (S3 currently costs about $30/TB, and even Glacier is $10/TB).

Once you set up your Tahoe client with the right introducer and share configuration (which is trivial), all you do is call the “backup” subcommand with the path that you want to back up.

Done (well, after potentially several weeks of backing up, it will be).
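
If you’d rather kick the backup off from a script (say, from cron), here is a minimal sketch that shells out to the same “backup” subcommand. The alias name “tahoe” and the target directory name “backups” are assumptions for illustration, as is the example source path; the alias has to exist already on your node.

```python
import subprocess


def build_backup_command(source_dir, alias="tahoe", target="backups"):
    """Return the argv list for `tahoe backup SOURCE ALIAS:TARGET`.

    `alias` and `target` are illustrative defaults, not anything Tahoe mandates.
    """
    return ["tahoe", "backup", source_dir, f"{alias}:{target}"]


def run_backup(source_dir, alias="tahoe", target="backups"):
    """Run the backup; raises CalledProcessError if tahoe exits non-zero."""
    return subprocess.run(build_backup_command(source_dir, alias, target), check=True)


if __name__ == "__main__":
    # Only prints the command; call run_backup() to actually start it.
    print(" ".join(build_backup_command("/home/dustin/documents")))
```

Building the argument list separately from running it makes it easy to log or dry-run the command before committing to a multi-week upload.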

Very Easy, Pleasant, Secure, and Python-Accessible Distributed Storage With Tahoe LAFS

Tahoe is a file-level distributed filesystem, and it’s a joy to use. “LAFS” stands for “Least Authority Filesystem”. According to the homepage:

Even if some of the servers fail or are taken over by an attacker, the 
entire filesystem continues to function correctly, preserving your privacy 
and security.

Tahoe comes with a beautiful built-in UI, and can be accessed via its CLI (using a syntax similar to SCP), via REST (that’s right), or from Python using pyFilesystem (an abstraction layer that also works with SFTP, S3, FTP, and many others). It gives you very direct control over how files are sharded/replicated. The shards are referred to as shares.

Tahoe requires an “introducer” node that announces nodes. You can easily do a one-node cluster by installing the node in the default ~/.tahoe directory, the introducer in another directory, and dropping the “share” configurables down to 1.


Just install the package:

$ sudo apt-get install tahoe-lafs

You might also be able to install directly using pip (this is what the Apt version does):

$ sudo pip install allmydata-tahoe

Configuring as Client

  1. Provision the client:
    $ tahoe create-client
  2. Update ~/.tahoe/tahoe.cfg:
    # Identify the local node.
    nickname = 
    # This is the furl for the public TestGrid.
    introducer.furl = pb://,
  3. Start node:
    $ bin/tahoe start
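
Step 2 can also be scripted. The following is a minimal sketch using Python’s configparser, assuming that nickname lives in the [node] section and introducer.furl in the [client] section of tahoe.cfg (check your generated file). The helper name and the placeholder furl are hypothetical.

```python
import configparser


def update_client_config(cfg_path, nickname, introducer_furl):
    """Set nickname and introducer.furl in an existing tahoe.cfg.

    Assumes nickname belongs under [node] and introducer.furl under [client].
    """
    # interpolation=None keeps any '%' characters in values literal.
    parser = configparser.ConfigParser(interpolation=None)
    parser.optionxform = str  # keep option names exactly as written
    parser.read(cfg_path)
    for section in ("node", "client"):
        if not parser.has_section(section):
            parser.add_section(section)
    parser.set("node", "nickname", nickname)
    parser.set("client", "introducer.furl", introducer_furl)
    with open(cfg_path, "w") as fh:
        parser.write(fh)


if __name__ == "__main__":
    import os
    # "pb://EXAMPLE-FURL" is a placeholder; use the furl for your grid.
    update_client_config(os.path.expanduser("~/.tahoe/tahoe.cfg"),
                         "my-node", "pb://EXAMPLE-FURL")
```

Note that configparser drops comments when it rewrites a file, so keep a copy of the original tahoe.cfg if you want its inline documentation.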

Web Interface (WUI):

The UI is available at

To change the UI to bind on all interfaces, update web.port:

web.port = tcp:3456:interface=

CLI Interface (CLI):

To start manipulating files with tahoe, we need an alias. Aliases are similar to anonymous buckets: when you create an alias, you create a bucket. If you misplace the alias (or the directory URI that it represents), you’re up the creek. It’s standard operating procedure to copy the private/aliases file (in your main Tahoe directory) between the various nodes of your cluster.

  1. Create an alias (bucket):
    $ tahoe create-alias tahoe

    We use “tahoe” since that’s the conventional default.

  2. Manipulate it:

    $ tahoe ls tahoe:

The tahoe command is similar to scp, in that you pass the standard file management calls and use the standard “colon” syntax to interact with the remote resource.

If you’d like to view this alias/directory/bucket in the WUI, run “tahoe list-aliases” to dump your aliases:

# tahoe list-aliases
  tahoe: URI:DIR2:xyzxyzxyzxyzxyzxyzxyzxyz:abcabcabcabcabcabcabcabcabcabcabc

Then, take the whole URI string (“URI:DIR2:xyzxyzxyzxyzxyzxyzxyzxyz:abcabcabcabcabcabcabcabcabcabcabc”), plug it into the input field beneath “OPEN TAHOE-URI:”, and click “View file or Directory”.
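
The same lookup can be done over REST instead of the WUI form. Below is a rough sketch that parses the “tahoe list-aliases” output shown above and builds a URL for the directory’s JSON listing. It assumes the web API serves directories at /uri/<cap>?t=json; both function names are made up for this example.

```python
from urllib.parse import quote


def parse_aliases(output):
    """Turn `tahoe list-aliases` output into a {name: directory URI} dict."""
    aliases = {}
    for line in output.splitlines():
        line = line.strip()
        if not line:
            continue
        # Lines look like "tahoe: URI:DIR2:...:..."; split on the first ": ".
        name, sep, cap = line.partition(": ")
        if sep:
            aliases[name] = cap.strip()
    return aliases


def directory_json_url(webapi_url, dir_uri):
    """Build the (assumed) web-API URL for a directory's JSON listing."""
    # The cap contains colons, so percent-encode it fully.
    return f"{webapi_url.rstrip('/')}/uri/{quote(dir_uri, safe='')}?t=json"


if __name__ == "__main__":
    sample = "  tahoe: URI:DIR2:xyzxyzxyzxyzxyzxyzxyzxyz:abcabcabcabcabcabcabcabcabcabcabc\n"
    caps = parse_aliases(sample)
    print(directory_json_url("http://yourserver:3456", caps["tahoe"]))
```

Keep in mind that the directory URI is the only credential: anyone who sees the resulting URL can read (and, for a writeable cap, modify) the directory.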

Configuring as Peer (Client and Server)

First, an introducer has to be created to announce the nodes.

Creating the Introducer

$ mkdir tahoe_introducer
$ cd tahoe_introducer/
~/tahoe_introducer$ tahoe create-introducer .

Introducer created in '/home/dustin/tahoe_introducer'

$ ls -l
total 8
-rw-rw-r-- 1 dustin dustin 520 Sep 16 13:35 tahoe.cfg
-rw-rw-r-- 1 dustin dustin 311 Sep 16 13:35 tahoe-introducer.tac

# This is an introducer-specific tahoe.cfg. Set the nickname.
~/tahoe_introducer$ vim tahoe.cfg 

~/tahoe_introducer$ tahoe start .
STARTING '/home/dustin/tahoe_introducer'

~/tahoe_introducer$ cat private/introducer.furl 

Configuring Client/Server Peer

  1. Create the node:
    $ tahoe create-node
  2. Update configuration (~/.tahoe/tahoe.cfg).
    • Set nickname, and set introducer.furl to the furl of the introducer, printed just above.
    • Set the shares config. We’ll only have one node for this example, so every value is 1. needed is the number of pieces required to rebuild a file, happy is the number of pieces/nodes required for a write to succeed, and total is the number of pieces that get created:
      shares.needed = 1
      shares.happy = 1
      shares.total = 1

      You may also wish to set the web.port item as we did in the client section, above.

  3. Start the node:

    $ tahoe start
    STARTING '/home/dustin/.tahoe'
  4. Test a file-operation:
    $ tahoe create-alias tahoe
    Alias 'tahoe' created
    $ tahoe ls
    $ tahoe cp /etc/fstab tahoe:
    Success: files copied
    $ tahoe ls
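
The shares settings in step 2 trade storage overhead against fault tolerance. Here is a small sketch of the arithmetic, assuming the usual erasure-coding model where any “needed” of the “total” shares can rebuild a file. The function name and the ordering check (needed <= happy <= total) are assumptions of this sketch, and 3/7/10 is used only as an illustrative multi-node configuration.

```python
def share_summary(needed, happy, total):
    """Summarize the cost and resilience of a needed/happy/total setting."""
    if not (1 <= needed <= happy <= total):
        raise ValueError("expected 1 <= needed <= happy <= total")
    return {
        # Bytes stored across the grid per byte uploaded.
        "expansion": total / needed,
        # Shares that can be lost while the file is still recoverable.
        "tolerated_share_losses": total - needed,
    }


if __name__ == "__main__":
    print(share_summary(1, 1, 1))   # the one-node setup from this article
    print(share_summary(3, 7, 10))  # an illustrative multi-node setup
```

With everything set to 1, as in this article, there is no storage overhead and no redundancy: lose the one share and the file is gone.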

Accessing From Python

  1. Install the Python package:
    $ sudo pip install fs
  2. List the files:
    import fs.contrib.tahoelafs
    dir_uri = 'URI:DIR2:um3z3xblctnajmaskpxeqvf3my:fevj3z54toroth5eeh4koh5axktuplca6gfqvht26lb2232szjoq'
    webapi_url = 'http://yourserver:3456'
    t = fs.contrib.tahoelafs.TahoeLAFS(dir_uri, webapi=webapi_url)
    files = t.listdir()

    This returns a list of strings (filenames). If you don’t provide webapi, the local system and default port are assumed.


If the logo in the upper-lefthand corner of the UI doesn’t load, try doing the following, making whatever path adjustments are necessary in your environment:

$ cd /usr/lib/python2.7/dist-packages/allmydata/web/static
$ sudo mkdir img && cd img
$ sudo wget
$ tahoe restart

This is a bug, where the image isn’t being included in the Python package:

logo.png is not found in allmydata-tahoe as installed via easy_install and pip

If you’re trying to do a copy and you get an AssertionError, this is likely a known bug in 1.10.0:

# tahoe cp tahoe:fake_data .
Traceback (most recent call last):
  File "/usr/lib/python2.7/dist-packages/allmydata/scripts/", line 156, in run
    rc = runner(sys.argv[1:], install_node_control=install_node_control)
  File "/usr/lib/python2.7/dist-packages/allmydata/scripts/", line 141, in runner
    rc = cli.dispatch[command](so)
  File "/usr/lib/python2.7/dist-packages/allmydata/scripts/", line 551, in cp
    rc = tahoe_cp.copy(options)
  File "/usr/lib/python2.7/dist-packages/allmydata/scripts/", line 770, in copy
    return Copier().do_copy(options)
  File "/usr/lib/python2.7/dist-packages/allmydata/scripts/", line 451, in do_copy
    status = self.try_copy()
  File "/usr/lib/python2.7/dist-packages/allmydata/scripts/", line 512, in try_copy
    return self.copy_to_directory(sources, target)
  File "/usr/lib/python2.7/dist-packages/allmydata/scripts/", line 672, in copy_to_directory
    self.copy_files_to_target(self.targetmap[target], target)
  File "/usr/lib/python2.7/dist-packages/allmydata/scripts/", line 703, in copy_files_to_target
    self.copy_file_into(source, name, target)
  File "/usr/lib/python2.7/dist-packages/allmydata/scripts/", line 748, in copy_file_into
    target.put_file(name, f)
  File "/usr/lib/python2.7/dist-packages/allmydata/scripts/", line 156, in put_file
    precondition(isinstance(name, unicode), name)
  File "/usr/lib/python2.7/dist-packages/allmydata/util/", line 39, in precondition
    raise AssertionError, "".join(msgbuf)
AssertionError: precondition: 'fake_data' <type 'str'>

Try using a destination filename/filepath rather than just a dot.

See Inconsistent ‘tahoe cp’ behavior for more information.
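
If you script your copies, the workaround above can be baked in. This is a sketch of a hypothetical helper that always spells out the destination filename instead of passing a bare directory such as “.”:

```python
import os
import posixpath


def explicit_cp_command(source, dest_dir="."):
    """Build a `tahoe cp` argv with an explicit local destination filename.

    `source` is an alias path such as "tahoe:fake_data".
    """
    # Everything after the first colon is the path inside the alias.
    _, _, remote_path = source.partition(":")
    filename = posixpath.basename(remote_path)
    if not filename:
        raise ValueError("source must name a file, e.g. 'tahoe:fake_data'")
    return ["tahoe", "cp", source, os.path.join(dest_dir, filename)]


if __name__ == "__main__":
    print(" ".join(explicit_cp_command("tahoe:fake_data")))
```

Pass the resulting list to subprocess.run() to perform the copy.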