A. Jesse Jiryu Davis

Category: Python

Wasp's Nest: The Read-Copy-Update Pattern In Python

In recent work on PyMongo, I used a concurrency-control pattern that solves a variety of reader-writer problem without mutexes. It's similar to the read-copy-update technique used extensively in the Linux kernel. I'm dubbing it the [...]

In recent work on PyMongo, I used a concurrency-control pattern that solves a variety of reader-writer problem without mutexes. It's similar to the read-copy-update technique used extensively in the Linux kernel. I'm dubbing it the Wasp's Nest. Stick with me—by the end of this post you'll know a neat concurrency pattern, and have a good understanding of how PyMongo handles replica set failovers.

Update: In this post's first version I didn't know how close my code is to "ready-copy-update". Robert Moore schooled me in the comments. I also named it "a lock-free concurrency pattern" and Steve Baptiste pointed out that I was using the term wrong. My algorithm merely solves a race condition without adding a mutex, it's not lock-free. I love this about blogging: in exchange for a little humility I get a serious education.


Paper Wasp © MzePhotos.com, Some Rights Reserved

The Mission

MongoDB is deployed in "replica sets" of identical database servers. A replica set has one primary server and several read-only secondary servers. Over time a replica set's state can change. For example, if the primary's cooling fans fail and it bursts into flames, a secondary takes over as primary a few seconds later. Or a sysadmin can add another server to the set, and once it's synced up it becomes a new secondary.

I help maintain PyMongo, the Python driver for MongoDB. Its MongoReplicaSetClient is charged with connecting to the members of a set and knowing when the set changes state. Replica sets and PyMongo must avoid any single points of failure in the face of unreliable servers and networks—we must never assume any particular members of the set are available.

Consider this very simplified sketch of a MongoReplicaSetClient:

class Member(object):
    """Represents one server in the set."""
    def __init__(self, pool):
        # The connection pool.
        self.pool = pool

class MongoReplicaSetClient(object):
    def __init__(self, seeds):
        self.primary = None
        self.members = {}
        self.refresh()

        # The monitor calls refresh() every 30 sec.
        self.monitor = MonitorThread(self)

    def refresh(self):
        # If we're already connected, use our list of known
        # members. Otherwise use the passed-in list of
        # possible members, the 'seeds'.
        seeds = self.members.keys() or self.seeds

        # Try seeds until first success.
        ismaster_response = None
        for seed in seeds:
            try:
                # The 'ismaster' command gets info
                # about the whole set.
                ismaster_response = call_ismaster(seed)
                break
            except socket.error:
                # Host down / unresolvable, try the next.
                pass

        if not ismaster_response:
            raise ConnectionFailure()

        # Now we can discover the whole replica set.
        for host in ismaster_response['hosts']:
            pool = ConnectionPool(host)
            member = Member(pool)
            self.members[host] = member

        # Remove down members from dict.
        for host in self.members.keys():
            if host not in ismaster_response['hosts']:
                self.members.pop(host)

        self.primary = ismaster_response.get('primary')

    def send_message(self, message):
        # Send an 'insert', 'update', or 'delete'
        # message to the primary.
        if not self.primary:
            self.refresh()

        member = self.members[self.primary]
        pool = member.pool
        try:
            send_message_with_pool(message, pool)
        except socket.error:
            self.primary = None
            raise AutoReconnect()

We don't know which members will be available when our application starts, so we pass a "seed list" of hostnames to the MongoReplicaSetClient. In refresh, the client tries them all until it can connect to one and run the isMaster command, which returns information about all the members in the replica set. The client then makes a connection-pool for each member and records which one is the primary.

Once refresh finishes, the client starts a MonitorThread which calls refresh again every 30 seconds. This ensures that if we add a secondary to the set it will be discovered soon and participate in load-balancing. If a secondary goes down, refresh removes it from self.members. In send_message, if we discover the primary's down, we raise an error and clear self.primary so we'll call refresh the next time send_message runs.

The Bugs

PyMongo 2.1 through 2.5 had two classes of concurrency bugs: race conditions and thundering herds.

The race condition is easy to see. Look at the expression self.members[self.primary] in send_message. If the monitor thread runs refresh and pops a member from self.members while an application thread is executing the dictionary lookup, the latter could get a KeyError. Indeed, that is exactly the bug report we received that prompted my whole investigation and this blog post.

The other bug causes a big waste of effort. Let's say the primary server bursts into flames. The client gets a socket error and clears self.primary. Then a bunch of application threads all call send_message at once. They all find that self.primary is None, and all call refresh. This is a duplication of work that only one thread need do. Depending how many processes and threads we have, it has the potential to create a connection storm in our replica set as a bunch of heavily-loaded applications lurch to the new primary. It also compounds the race condition because many threads are all modifying the shared state. I'm calling this duplicated work a thundering herd problem, although the official definition of thundering herd is a bit different.

Fixing With A Mutex

We know how to fix race conditions: let's add a mutex! We could lock around the whole body of refresh, and lock around the expression self.members[self.primary] in send_message. No thread sees members and primary in a half-updated state.

...and why it's not ideal

This solution has two problems. The first is minor: the slight cost of acquiring and releasing a lock for every message sent to MongoDB, especially since it means only one thread can run that section of send_message at a time. A reader-writer lock alleviates the contention by allowing many threads to run send_message as long as no thread is running refresh, in exchange for greater complexity and cost for the single-threaded case.

The worse problem is the behavior such a mutex would cause in a very heavily multithreaded application. While one thread is running refresh, all threads running send_message will queue on the mutex. If the load is heavy enough our application could fail while waiting for refresh, or could overwhelm MongoDB once they're all simultaneously unblocked. Better under most circumstances for send_message to fail fast, saying "I don't know who the primary is, and I'm not going to wait for refresh to tell me." Failing fast raises more errors but keeps the queues small.

The Wasp's Nest Pattern

There's a better way, one that requires no locks, is less error-prone, and fixes the thundering-herd problem too. Here's what I did for PyMongo 2.5.1, which we'll release next week.

First, all information about the replica set's state is pulled out of MongoReplicaSetClient and put into an RSState object:

class RSState(object):
    def __init__(self, members, primary):
        self.members = members
        self.primary = primary

MongoReplicaSetClient gets one RSState instance that it puts in self.rsstate. This instance is immutable: no thread is allowed to change the contents, only to make a modified copy. So if the primary goes down, refresh doesn't just set primary to None and pop its hostname from the members dict. Instead, it makes a deep copy of the RSState, and updates the copy. Finally, it replaces the old self.rsstate with the new one.

Each of the RSState's attributes must be immutable and cloneable, too, which requires a very different mindset. For example, I'd been tracking each member's ping time using a 5-sample moving average and updating it with a new sample like so:

class Member(object):
    def add_sample(self, ping_time):
        self.samples = self.samples[-4:]
        self.samples.append(ping_time)
        self.avg_ping = sum(self.samples) / len(self.samples)

But if Member is immutable, then adding a sample means cloning the whole Member and updating it. Like this:

class Member(object):
    def clone_with_sample(self, ping_time):
        # Make a new copy of 'samples'
        samples = self.samples[-4:] + [ping_time]
        return Member(samples)

Any method that needs to access self.rsstate more than once must protect itself against the state being replaced concurrently. It has to make a local copy of the reference. So the racy expression in send_message becomes:

rsstate = self.rsstate  # Copy reference.
member = rsstate.members[rsstate.primary]

Since the rsstate cannot be modified by another thread, send_message knows its local reference to the state is safe to read.

A few summers ago I was on a Zen retreat in a rural house. We had paper wasps building nests under the eaves. The wasps make their paper from a combination of chewed-up plant fiber and saliva. The nest hangs from a single skinny petiole. It's precarious, but it seems to protect the nest from ants who want to crawl in and eat the larvae. The queen periodically spreads an ant-repellant secretion around the petiole; its slenderness conserves her ant-repellant, and concentrates it in a small area.

Wasp's Nest in Situ [Source]

I think of the RSState like a wasp's nest: it's an intricate structure hanging off the MongoReplicaSetClient by a single attribute, self.rsstate. The slenderness of the connection protects send_message from race conditions, just as the thin petiole protects the nest from ants.

Since I was fixing the race condition I fixed the thundering herd as well. Only one thread should run refresh after a primary goes down, and all other threads should benefit from its labor. I nominated the monitor to be that one thread:

class MonitorThread(threading.Thread):
    def __init__(self, client):
        threading.Thread.__init__(self)
        self.client = weakref.proxy(client)
        self.event = threading.Event()
        self.refreshed = threading.Event()

    def schedule_refresh(self):
        """Refresh immediately."""
        self.refreshed.clear()
        self.event.set()

    def wait_for_refresh(self, timeout_seconds):
        """Block until refresh completes."""
        self.refreshed.wait(timeout_seconds)

    def run(self):
        while True:
            self.event.wait(timeout=30)
            self.event.clear()

            try:
                try:
                    self.client.refresh()
                finally:
                    self.refreshed.set()
            except AutoReconnect:
                pass
            except:
                # Client was garbage-collected.
                break

(The weakref proxy prevents a reference cycle and lets the thread die when the client is deleted. The weird try-finally syntax is necessary in Python 2.4.)

The monitor normally wakes every 30 seconds to notice changes in the set, like a new secondary being added. If send_message discovers that the primary is gone, it wakes the monitor early by signaling the event it's waiting on:

rsstate = self.rsstate
if not rsstate.primary:
    self.monitor.schedule_refresh()
    raise AutoReconnect()

No matter how many threads call schedule_refresh, the work is only done once.

Any MongoReplicaSetClient method that needs to block on refresh can wait for the "refreshed" event:

rsstate = self.rsstate
if not rsstate.primary:
    self.monitor.schedule_refresh()
    self.monitor.wait_for_refresh(timeout_seconds=5)

# Get the new state.
rsstate = self.rsstate
if not rsstate.primary:
    raise AutoReconnect()

# Proceed normally....

This pattern mitigates the connection storm from a heavily-loaded application discovering that the primary has changed: only the monitor thread goes looking for the new primary. The others can abort or wait.

The wasp's nest pattern is a simple and high-performance solution to some varieties of reader-writer problem. Compared to mutexes it's easy to understand, and most importantly it's easy to program correctly. For further reading see my notes in the source code.

Paper wasp and nest [Source]

Another Thing About Python's Threadlocals

As the maintainer of the connection pool for PyMongo, the official MongoDB driver for Python, I've gotten far more intimate knowledge of Python threads than I'd ever wanted. One of the challenges I face is: if the connect pool assigns a [...]

Dammit

As the maintainer of the connection pool for PyMongo, the official MongoDB driver for Python, I've gotten far more intimate knowledge of Python threads than I'd ever wanted.

One of the challenges I face is: if the connect pool assigns a socket to a thread and the thread dies, how do we reclaim the socket for the general pool? I thought I nailed it last year, using a weakref callback to a threadlocal, but there's a bug in that method. Justin Patrin of Idle Games discovered it while testing a PyMongo contribution he's making. I'm going to describe the bug, its impact, the cause, and the fix. I'll conclude by kvetching about supporting archaic versions of Python.

The Bug

Here's some code to start 1000 threads and register to be notified when they're kaput. At the end I assert no thread has died unmourned:

import threading
import weakref

nthreads = 10000
ncallbacks = 0
ncallbacks_lock = threading.Lock()
local = threading.local()
refs = set()

class Vigil(object):
    pass

def run():
    def on_thread_died(ref):
        global ncallbacks
        ncallbacks_lock.acquire()
        ncallbacks += 1
        ncallbacks_lock.release()

    vigil = Vigil()
    local.vigil = vigil
    refs.add(weakref.ref(vigil, on_thread_died))

threads = [threading.Thread(target=run)
           for _ in range(nthreads)]
for t in threads: t.start()
for t in threads: t.join()
getattr(local, 'c', None)  # Trigger cleanup in <= 2.7.0
assert ncallbacks == nthreads, \
    'only %d callbacks run' % ncallbacks

This is the method I presented in "Knowing When A Python Thread Has Died". Each thread creates a "vigil" object and sticks it in a threadlocal. Since only the threadlocal refers to the vigil, the vigil should be destroyed when the thread dies. I make a weakref to the vigil and register a weakref callback. If all goes well, the callback is run as the thread dies. A quirk of Python 2.7.0 or lesser is that the callback is run when the next thread accesses the threadlocal. This oddity is a consequence of Python Issue 1868, fixed by Antoine Pitrou in late 2010 and released in Python 2.7.1.

Note also that I synchronize ncallbacks += 1 with a mutex, since += is not atomic in Python. This innocent-looking mutex harbors a dark intent, as we shall soon discover.

In Python 2.7.1 and newer, the code above works as expected: ncallbacks is equal to 1000 immediately after all the threads are joined. In Python 2.7.0, ncallbacks should be 999 after the threads are joined, and then 1000 after the main thread does the final getattr to trigger cleanup.

The bug is: In Python 2.7.0 and older, ncallbacks is sometimes a few callbacks shy of a thousand. A few threads have been buried in unmarked graves....

Its Impact

I found that an application running Python 2.7.0 or older, if it creates and destroys very large numbers of threads continuously for a long time, and if each thread calls end_request at least once and start_request more times than end_request, will occasionally leave a socket tied to a dead thread. These sockets will eventually exceed the process's ulimit or MongoDB's.

This application pattern would be as weird and unusual as it sounds, which is why no one's complained of the bug.

The Fix

Once I'd written the test code above, I spent a few hours futzing with it—Dammit, I thought this worked! I tried various techniques to force Python 2.7.0 to run the callback a thousand times reliably. Late in the day a divine voice intoned, "synchronize assignment to the threadlocal." So I added a lock:

local_lock = threading.Lock()
# ...
    vigil = Vigil()
    local_lock.acquire()
    local.vigil = vigil
    local_lock.release()
    refs.add(weakref.ref(vigil, on_thread_died))

It worked! Now I was angrier. How can assigning to a threadlocal not be thread-safe?

The Cause

Let's again consider the example code above. The bytecode for assigning vigil to local.vigil is:

28 LOAD_FAST        1 (vigil)
31 LOAD_GLOBAL      3 (local)
34 STORE_ATTR       4 (vigil)

STORE_ATTR calls PyObject_SetAttr, which calls local_setattro, defined in Modules/threadmodule.c:

static int
local_setattro(localobject *self, PyObject *name, PyObject *v)
{
    PyObject *ldict;

    ldict = _ldict(self);
    if (ldict == NULL)
        return -1;

    return PyObject_GenericSetAttr((PyObject *)self, name, v);
}

At the highlighted line it calls _ldict. The _ldict function is, as I've known for some time, a pathetic piece of poo in Python 2.7.0 and older. Here's the turd, edited down a bit:

static PyObject *
_ldict(localobject *self)
{
    PyObject *tdict, *ldict;

    tdict = PyThreadState_GetDict();
    ldict = PyDict_GetItem(tdict, self->key);
    if (ldict == NULL) {
        ldict = PyDict_New(); /* we own ldict */

        PyDict_SetItem(tdict, self->key, ldict);
        Py_DECREF(ldict); /* now ldict is borrowed */
        if (i < 0)
            return NULL;

        Py_CLEAR(self->dict);
        Py_INCREF(ldict);
        self->dict = ldict; /* still borrowed */
    }

    /* The call to tp_init above may have caused
       another thread to run.
       Install our ldict again. */
    if (self->dict != ldict) {
        Py_CLEAR(self->dict);
        Py_INCREF(ldict);
        self->dict = ldict;
    }

    return ldict;
}

We haven't seen any use of the Py_BEGIN_ALLOW_THREADS macro, so one thread's had the GIL the whole time. Locking around the assignment shouldn't have any effect, right?

Well, take a look at the highlighted Py_CLEAR(self->dict) statement—there's the perpetrator. That statement gets the ldict of the last thread that accessed this threadlocal, swaps it with NULL and decrefs it. If this is the last reference to ldict (because the last thread has died) then decref'ing destroys it, and the weakref callback to vigil runs. The callback does ncallbacks_lock.acquire, which releases the GIL before trying to get the mutex.

So here's the kind of scenario I prevented by locking around assignment to the threadlocal:

  1. Thread A starts, assigns to the threadlocal, dies.
  2. Thread A's ldict is now the threadlocal's self->dict and has a refcount of 1.
  3. Thread B starts, begins assigning to the threadlocal, enters the _ldict function.
  4. _ldict sets self->dict to NULL and decrefs Thread A's ldict, which runs on_thread_died, which calls ncallbacks_lock.acquire and releases the GIL.
  5. Now Thread C starts, begins assigning to the threadlocal, enters _ldict.
  6. Thread C finds self->dict is NULL, increfs its own local ldict and assigns it to self->dict. It exits _ldict.
  7. Thread B resumes at Py_CLEAR(self->dict), increfs its own ldict and assigns it to self->dict.

Thread B has now replaced a pointer to Thread C's ldict with a pointer to its own, but it didn't decref Thread C's ldict first. (_ldict wasn't written to survive interruption during Py_CLEAR.) Thread C's ldict will never be destroyed, and a weakref callback to its vigil attribute will never be called.

Locking around assignment to the threadlocal prevents _ldict from running concurrently for any one threadlocal object, and prevents the refleak. In Python 2.7.1 and newer, the whole misguided self->dict system is removed from threadlocals and the lock's not needed.

This scenario applies to PyMongo's connection pool because the pool does need to acquire a lock in its weakref callback. Even if it didn't, there's a possibility of interruption whenever a thread is running Python code.

A Kvetch

This testing, the bug it revealed, the investigation, the fix: all this effort was spent to support entirely obsolete versions of Python. The Python core developers stopped maintaining them years ago, but PyMongo supports all Pythons going back to 2.4, mainly because there are "long-term support" Linux distros like Ubuntu and RHEL that once shipped with them. I have very savvy friends writing new applications on Python 2.6. Our children will have flying cars before we're done debugging these steam-powered versions of Python.

It's particularly frustrating because there's no point even filing bugs against Pythons before 2.7. "We fixed it," the developers will reply. "Upgrade." In Python 2.6, no one can hear you scream.

The Green Matrix

For a year and a half I've been part of the team maintaining PyMongo, the Python MongoDB driver. It's one of the most widely used Python packages with 1.5 million lifetime downloads. The code itself is only moderately complex; about 8300 [...]

For a year and a half I've been part of the team maintaining PyMongo, the Python MongoDB driver. It's one of the most widely used Python packages with 1.5 million lifetime downloads. The code itself is only moderately complex; about 8300 source lines. What makes it a tiny horror to work on is the range of environments we support. Here's our test matrix in Jenkins:

PyMongo test matrix

That's 72 test configurations. (It looks like more than that, but we don't test Jython and PyPy with C extensions compiled since that currently doesn't make sense.) The dimensions are:

  • Python version: We support CPython 2.4 through 3.3. On each commit we test just the highlight versions: 2.4, 2.7, and 3.3. We also support the latest Jython and PyPy. We test the intermediate versions like 2.5 and 2.6 before a release.

  • C extensions: we have a few key parts of PyMongo implemented in C for speed, with pure-Python versions as a fallback. We test both modes.

  • MongoDB Version: We test the latest development branch of MongoDB (2.5) plus the last two production versions.

  • MongoDB Configuration: We set up a single server, a master-slave pair, and a three-node replica set, and run mostly the same tests against all.

In each test configuration, PyMongo's test suite has about 430 individual test functions.

This covers the main test matrix, but there are some auxiliary tests we run in Jenkins on every commit. We have a mod_wsgi test that runs a few thousand web requests (first serial, then parallel) against a web app using mod_wsgi in a range of configurations:

  • Python 2.4, 2.5, 2.6, and 2.7

  • mod_wsgi 2.8, 3.2, and 3.3

  • The latest production MongoDB as a single server or replica set

The mod_wsgi tests are there to ensure we never recreate a connection leak like the apocalyptic "unbounded connection growth with Apache mod_wsgi 2.x" bug to which I lost some of the best weeks of my life.

I've also set up some tests for Motor, my non-blocking MongoDB driver for Tornado: I run in Python 2.6, 2.7, and 3.3 against a single MongoDB server and a replica set, running the three most recent versions of MongoDB. I have a separate Motor test that connects to MongoDB over SSL, and finally I have a test of "Synchro," which wraps Motor inside a resynchronization layer and checks it can pass all the same tests as PyMongo. In all, Jenkins runs 33 test configurations for each Motor commit.

Jenkins automatically tests our main configurations, but we periodically hand-test some additional configurations, like sharded clusters, beta releases of Jython and PyPy, and Windows. We'll put some of these in Jenkins too.

For a team of three people to build and maintain this volume of test infrastructure is a huge effort. It's clearly worth it, because the test matrix is so large. But it's not much fun.

Lessons learned:

  • Test code is a liability: Too much testing code is as bad as too much of any other kind of code. Write as few tests as possible to cover the cases you need to test. Over-testing comforts the novice but impedes agility. For example, when we renamed PyMongo's Connection class to MongoClient, I had to change over 1000 lines in 32 files in the test suite. A commit that huge is a barrier in the repository's history, across which no commit can be moved without conflicts. I hope to never do anything like it again. The test suite should be smaller and better factored.

  • Tests must be very reliable: It needs to be not only minimal but also very reliable. Tests should fail if and only if the behavior they test breaks. When I joined the team, PyMongo's tests often failed "just cuz." Fixing them all took months: We'd observe an intermittent failure in Jenkins due to some race condition that we couldn't reproduce on our laptops (an EC2 "medium" instance runs a three-node MongoDB cluster slower than you could possibly imagine). We'd think real hard and finally understand and fix the failure. Then we'd do the same for some other test. It was a costly exercise but necessary: It's not until our tests always passed that we took them seriously when they didn't.

There are other dicta that I find negotiable: tests should be fast, sure, but I can live with a test suite that takes a few minutes to run per configuration. Perhaps test methods should include only one assert, but I can live with several asserts in some methods.

I'm implacably opposed to mocking when it comes to testing PyMongo: what our tests verify is primarily our understanding of how to talk to MongoDB. If we mocked any aspect whatsoever of the MongoDB server, our tests would be worse than useless. Virtually every test of PyMongo is an integration test, so we make no distinction between "unit tests" and "integration tests."

I'm curious what others have learned from maintaining a driver's test suite. It seems to be a lot of hard work no matter what.

Slides From My Talk On Python Coroutines

Here's the slides from tonight's NYC Python Meetup talk on coroutines in Tornado and Tulip. The slides are a bit inscrutable on their own—it's my style to just show code, then talk a lot to explain the code. Still, if you were there [...]

Here's the slides from tonight's NYC Python Meetup talk on coroutines in Tornado and Tulip. The slides are a bit inscrutable on their own—it's my style to just show code, then talk a lot to explain the code. Still, if you were there tonight you may find these useful.

Python Coroutines, Present and Future from emptysquare

Toro Rewritten for Tornado 3.0

Speaking of my package Toro, I've just released version 0.5. Toro provides semaphores, queues, and so on, for advanced control flows with Tornado coroutines. Version 0.5 is a rewrite, motivated by two recent events. First, the release [...]

Toro

Speaking of my package Toro, I've just released version 0.5. Toro provides semaphores, queues, and so on, for advanced control flows with Tornado coroutines.

Version 0.5 is a rewrite, motivated by two recent events. First, the release of Tornado 3.0 has introduced a much more convenient coroutine API, and I wanted Toro to support the modern style. Second, I contributed a version of Toro's queues to Tulip, and the queues changed a bit in the process. As much as possible, I updated Toro to match the API of Tulip's locks and queues, for consistency's sake.

In previous versions, most Toro methods had to be wrapped in gen.Task, which made for weird-looking code. But using Toro is now quite graceful. For example, a producer-consumer pair:

q = toro.Queue()

@gen.coroutine
def producer():
    for item in range(5):
        print 'Sending', item
        yield q.put(item)

@gen.coroutine
def consumer():
    while True:
        item = yield q.get()
        print '\t\t', 'Got', item

consumer()
producer()
IOLoop.current().start()

Another nice new feature: Semaphore.acquire and Lock.acquire can be used with the with statement:

lock = toro.Lock()

@gen.coroutine
def f():
   with (yield lock.acquire()):
       print "We're in the lock"

   print "Out of the lock"

More examples are in the docs. Enjoy!

My PyCon Lightning Talk About Toro

The lightning talk I gave at PyCon is now online. I talked for 4½ minutes on Toro, the package I wrote to provide locks, events, conditions, semaphores, and queues for Tornado. Watch for a quick intro on advanced control flow with [...]

The lightning talk I gave at PyCon is now online. I talked for 4½ minutes on Toro, the package I wrote to provide locks, events, conditions, semaphores, and queues for Tornado. Watch for a quick intro on advanced control flow with coroutines:

reStructuredText in PyCharm, Firefox, and Anger

I spend a lot of time writing Python package documentation in reST. Nevertheless, I find reST's markup permanently unlearnable, so I format docs by trial and error: I type a few backticks and colons and angle-brackets and random crap, [...]

I spend a lot of time writing Python package documentation in reST. Nevertheless, I find reST's markup permanently unlearnable, so I format docs by trial and error: I type a few backticks and colons and angle-brackets and random crap, sphinx-build the docs as HTML, and see if they look okay.

Here's some tools to support this expert workflow.

PyCharm: My favorite Python IDE has basic syntax-highlighting and auto-completion for reST. It's not much, but it far exceeds the amount of reStructuredText syntax that can fit in my tiny brain. It really shines when I'm embedding Python code examples in my docs: PyCharm gives me full IDE support, including automatically adding imports, auto-completing method names and parameters, and nearly all the help I get when editing normal Python files.

There's a file-watcher plugin for PyCharm that seems like a nice way to rebuild docs when the source files change, but it's not yet compatible with the latest version of PyCharm. So instead:

Watchdog: I install the watchdog Python package, which watches files and directories for changes. Watchdog gives me a command-line tool called watchmedo. (I find this fact unlearnable, too; why isn't the tool called watchdog the same as the package?) I tell it to watch my package's files for changes and rebuild the docs whenever I save a file:

watchmedo shell-command --command="sphinx-build doc build" .

Now that I can regenerate HTML automatically, I need a way to reload the browser window automatically:

auto-reload is a Firefox extension that detects any tab with a file:// URL and reloads it when the file changes. In my testing it seems to detect changes in linked files (CSS and Javascript) too. A nice little bar slides down to tell me when it's reloading. That way I know that the reason the page is still a mess is because my reST is still wrong, not because it hasn't reloaded:

Auto reload

This little suite of tools deals well with invoking Sphinx and reloading my web page, so I can focus on the task at hand: trying to write reStructuredText, which is a loathsome afterbirth expelled from the same womb as XML and TeX.

Review of "MongoDB Applied Design Patterns" by Rick Copeland

There's a lot of bad advice out there regarding MongoDB. As I wrote in my last review, even smart sources can encourage risky methods. Soon, I hope, there will be as much good MongoDB instruction from experts outside 10gen as there is good [...]

There's a lot of bad advice out there regarding MongoDB. As I wrote in my last review, even smart sources can encourage risky methods. Soon, I hope, there will be as much good MongoDB instruction from experts outside 10gen as there is good third-party SQL instruction. For now, know that you can trust Rick Copeland.

Copeland's new O'Reilly book on MongoDB complements O'Reilly's other five: the majestic Definitive Guide (due for a second edition in June), Scaling MongoDB, 50 Tips and Tricks, and the MongoDB books for Python and PHP.

After you've read the Definitive Guide, a good candidate for your second MongoDB book is Applied Design Patterns. (Disclosure: I was paid to critique an early draft.) Copeland's intended audience has basic MongoDB competence and wants application examples that optimize either for scalability or maintainability, plus the principles to guide new designs. Copeland also assumes basic SQL knowledge, and presents most examples in contrast to conventional SQL solutions, a method I find distracting and irrelevant. He identifies some common application types (product catalog, CMS, analytics, etc.) and provides for each a schema and application logic. He goes far beyond prior works when he discusses performance, consistency guarantees, and sharding considerations for every application.

MongoDB Applied Design Patterns

In Part 1, Copeland discusses the basic questions about MongoDB schemas. Right away, he identifies what makes nonrelational design different:

There is no longer a "garden path" of normalized database design to go down, and the go-to answer when faced with general schema design problems in MongoDB is "it depends".

MongoDB requires optimization up front, more often than SQL schema design does. (Armin Ronacher noticed this too a few months ago.) Most often the question is whether to embed or to link, and what data should be normalized or denormalized. Copeland uses an extensive description of disk seek times to explain the motivations for embedding and denormalization, better than prior MongoDB schema-design materials have.

Many presentations, my own included, have claimed that you can migrate your schema lazily with MongoDB: your application can start writing data in a new format, and read data in both new and old formats, while a batch job slowly migrates old data. MongoDB Applied Design Patterns finally presents a complete example of lazy migration, including example code (in Python) for reading data in both formats while the migration is in progress.

Without general-purpose transactions, MongoDB requires new techniques to guarantee that a series of changes is atomic: that is, to guarantee that in the long run your data either reflects all the changes or none of them. The simple approach is to put all related data in one document and use update operators to modify all the data in one shot. If there's no way to restrict your atomic operation to one document, your next best bet is optimistic concurrency control: try to complete the operation, check if another process overwrote your changes, and if so retry them. There are a number of examples of this in the wild (the MongoDB Manual, Dan Crosta, Scott Hernandez); Copeland's contribution is unusually complete, with example code for handling every case that can arise.

Part 2 of the book is much longer, and covers six kinds of application in depth, both conventional (a social network) and unusual (a role-playing game). Here Copeland excels. Where he covers well-tread ground his designs are more detailed and better thought out than prior authors', and where he innovates he chooses interesting problems to solve. In the Operational Intelligence chapter he explains compound indexes clearly and correctly. He presents a complete design for an analytics application using the MongoDB aggregation framework, and covers the interactions between aggregation, indexes, and sharding.

The final example of the book is an online Zork-style game. This is less widely applicable than E-Commerce or content management, but way more fun. Copeland chooses to radically denormalize his schema: when a player enters a room, the room's entire data structure is copied into the player's document so the game can display the player's state without querying for the room again. As with the other examples, this application is considered in depth: each query is carefully indexed, and when a player picks up an item, Copeland's code prevents another player from picking it up concurrently. Most of the game's intelligence is expressed in Python code rather than in MongoDB queries. Developers using Oracle or Microsoft SQL Server tend to push all the logic and complexity into their schema, their queries, and stored procedures. With MongoDB's simpler feature-set, coders have to move more logic out of the database and into their application. If a SQL refugee hasn't yet learned this lesson, the gaming chapter will drive it home.

Slides from my PyCon lightning talk on Toro

Here's the 8 slides for my 4½-minute talk on Toro this morning. Toro is a package I wrote last year that provides objects something like locks, events, conditions, semaphores, and queues for Tornado coroutines. PyCon lightning [...]

Here's the 8 slides for my 4½-minute talk on Toro this morning. Toro is a package I wrote last year that provides objects something like locks, events, conditions, semaphores, and queues for Tornado coroutines.

PyCon lightning talk on my Toro module for Tornado from emptysquare