How To Do An Isolated Install of Brubeck

Brubeck

I wanted to install James Dennis’s Brubeck web framework, but lately I’ve become fanatical about installing nothing, nothing, in the system-wide directories. A simple rm -rf brubeck/ should make it like nothing ever happened.

So that I remember this for next time, here’s how I did an isolated install of Brubeck and all its dependencies on Mac OS X Lion.

Install virtualenv and virtualenvwrapper (but of course you’ve already done this, because you’re elite like me).

Make a virtualenv

mkvirtualenv brubeck; cdvirtualenv

ZeroMQ

wget http://download.zeromq.org/historic/zeromq-2.1.9.tar.gz
tar zxf zeromq-2.1.9.tar.gz
cd zeromq-2.1.9
./autogen.sh
./configure --prefix="$VIRTUAL_ENV" # Don't install system-wide, just in your virtualenv's directory (configure wants an absolute path)
make
make install
cd ..

Mongrel2

git clone https://github.com/zedshaw/mongrel2.git
cd mongrel2
emacs Makefile

Add a line like this to the top of the Makefile, so the compiler can find where you’ve installed ZeroMQ’s header and lib files:

OPTFLAGS += -I/Users/emptysquare/.virtualenvs/brubeck/include -L/Users/emptysquare/.virtualenvs/brubeck/lib

and replace PREFIX?=/usr/local with something like: PREFIX?=/Users/emptysquare/.virtualenvs/brubeck

(If you can get this to work with relative instead of absolute paths, please tell me in the comments!)

make
make install
cd ..

Brubeck

git clone https://github.com/j2labs/brubeck.git
cd brubeck

I plan to do a little hacking on Brubeck itself soon, so rather than running python setup.py install here, I’m simply including my copy of Brubeck’s source code on my PYTHONPATH.

Python Packages

Now we need our isolated include/ and lib/ directories to be visible to the compiler when we install Brubeck’s Python package dependencies. Specifically, the gevent_zeromq package has some C code that needs to find zmq.h and libzmq in order to compile. We’ll do that by setting the LIBRARY_PATH and C_INCLUDE_PATH environment variables, still inside the brubeck checkout:

export LIBRARY_PATH=/Users/emptysquare/.virtualenvs/brubeck/lib
export C_INCLUDE_PATH=/Users/emptysquare/.virtualenvs/brubeck/include
pip install -I -r ./envs/brubeck.reqs
pip install -I -r ./envs/gevent.reqs

How nice is that?
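Before continuing, it can be worth confirming that the isolated prefix really holds ZeroMQ’s header and library, since those are what the compiler has to find. A minimal Python sketch, assuming the layout used above (include/zmq.h and lib/libzmq.*); the helper name check_zeromq_prefix is made up for illustration:

```python
import glob
import os


def check_zeromq_prefix(prefix):
    """Report whether zmq.h and a libzmq file exist under prefix.

    Hypothetical helper: assumes headers live in <prefix>/include and
    libraries in <prefix>/lib, as in the steps above.
    """
    header = os.path.join(prefix, 'include', 'zmq.h')
    libs = glob.glob(os.path.join(prefix, 'lib', 'libzmq*'))
    return {
        'header': os.path.isfile(header),
        'library': len(libs) > 0,
    }


if __name__ == '__main__':
    # virtualenv's activate script sets VIRTUAL_ENV
    prefix = os.environ.get('VIRTUAL_ENV', '.')
    print(check_zeromq_prefix(prefix))
```

If either value comes back False, the ZeroMQ install step probably didn’t land in the virtualenv.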

Next

Once you’re here, you have a completely isolated install of ZeroMQ, Mongrel2, Brubeck, and all its package dependencies. Continue with James’s Brubeck installation instructions at the “A Demo” portion.

Photos of Old Animals

The Times the other day linked to an excellent photo project, Isa Leshko’s “Elderly Animals”. The Times says,

Ms. Leshko was inspired to carry out her project after spending a year caring for her mother, who has Alzheimer’s disease and is now in a nursing home. She considered documenting the experience through pictures but soon decided against it. “A number of fine-art photographers have gone that route and produced really powerful work,” she said. “It just didn’t feel like the appropriate response for me. I didn’t think my mother could provide consent, and I wanted to be present as her daughter and caregiver.”

Richard Avedon’s upsetting photos of his sick father come to mind, and Phillip Toledano’s “Days With My Father”, which began as an award-winning website and was published last year as a book.

I’m particularly interested in Leshko’s “Elderly Animals” because it’s a novel subject, and because I typically find animals childlike, and the old animals in her photos are both recognizably old and still cute and innocent.

Zencation

Winter Ango

Photo: A. Jesse Jiryu Davis

I’m back in NYC tonight. I spent a few days in Chicago with my girlfriend’s family for Christmas, and we’re leaving tomorrow for our habitual weeklong Zen retreat with the Village Zendo. I’ve had a pretty good year: I ran my first half-marathon, I did a street retreat, I spent August as assistant Zen cook. I had my photos exhibited in two shows, and I took my first regular full-time job in years, the most exciting job I’ve ever had, as Python Evangelist and software developer for 10gen / MongoDB. My girlfriend and I are finishing our second year together, a relationship of remarkably uninterrupted serenity and love.

This week I’m going to put it all down. I want to just disappear into zazen. My only effort will be the effort I make to make no effort at all. I’ll pick it all up again in the new year. ;-)

Tornado Unittesting: Eventually Correct

Time was, time is ...

Photo: Tim Green

I’m a fan of Tornado, one of the major async web frameworks for Python, but unittesting async code is a total pain. I’m going to review what the problem is, look at some klutzy solutions, and propose a better way. If you don’t care what I have to say and you just want to steal my code, get it on GitHub.

The problem

Let’s say you’re working on some profoundly complex library that performs a time-consuming calculation, and you want to test its output:

# test_sync.py
import time
import unittest

def calculate():
    # Do something profoundly complex
    time.sleep(1)
    return 42

class SyncTest(unittest.TestCase):
    def test_find(self):
        result = calculate()
        self.assertEqual(42, result)

if __name__ == '__main__':
    unittest.main()

See? You do an operation, then you check that you got the expected result. No sweat.

But what about testing an asynchronous calculation? You’re going to have some troubles. Let’s write an asynchronous calculator and test it:

# test_async.py
import time
import unittest
from tornado import ioloop

def async_calculate(callback):
    """
    @param callback:    A function taking params (result, error)
    """
    # Do something profoundly complex requiring non-blocking I/O, which
    # will complete in one second
    ioloop.IOLoop.instance().add_timeout(
        time.time() + 1,
        lambda: callback(42, None)
    )

class AsyncTest(unittest.TestCase):
    def test_find(self):
        def callback(result, error):
            print 'Got result', result
            self.assertEqual(42, result)

        async_calculate(callback)
        ioloop.IOLoop.instance().start()

if __name__ == '__main__':
    unittest.main()

Huh. If you run python test_async.py, you see the expected result is printed to the console:

Got result 42

… and then the program hangs forever. The problem is that ioloop.IOLoop.instance().start() starts an infinite loop. You have to stop it explicitly before the call to start() will return.

A Klutzy Solution

Let’s stop the loop in the callback:

        def callback(result, error):
            ioloop.IOLoop.instance().stop()
            print 'Got result', result
            self.assertEqual(42, result)

Now if you run python test_async.py everything’s copacetic:

$ python test_async.py
Got result 42
.
----------------------------------------------------------------------
Ran 1 test in 1.001s

OK

Let’s see if our test will actually catch a bug. Change the async_calculate() function to produce the number 17 instead of 42:

def async_calculate(callback):
    """
    @param callback:    A function taking params (result, error)
    """
    # Do something profoundly complex requiring non-blocking I/O, which
    # will complete in one second
    ioloop.IOLoop.instance().add_timeout(
        time.time() + 1,
        lambda: callback(17, None)
    )

And run the test:

$ python foo.py
Got result 17
ERROR:root:Exception in callback 
Traceback (most recent call last):
  File "/Users/emptysquare/.virtualenvs/blog/lib/python2.7/site-packages/tornado/ioloop.py", line 396, in _run_callback
    callback()
  File "foo.py", line 14, in <lambda>
    lambda: callback(17, None)
  File "foo.py", line 22, in callback
    self.assertEqual(42, result)
  File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/unittest/case.py", line 494, in assertEqual
    assertion_func(first, second, msg=msg)
  File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/unittest/case.py", line 487, in _baseAssertEqual
    raise self.failureException(msg)
AssertionError: 42 != 17
.
----------------------------------------------------------------------
Ran 1 test in 1.002s

OK

An AssertionError is raised, but the test still passes! Alas, Tornado’s IOLoop suppresses all exceptions. The exceptions are printed to the console, but the unittest framework thinks the test has passed.
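To see why the test passes, it helps to strip the situation down. The sketch below is a toy stand-in, not Tornado’s code: ToyLoop is a hypothetical class that runs callbacks and, like the loop described above, logs exceptions instead of propagating them. A unittest run against it reports success even though the assertion inside the callback failed.

```python
import logging
import unittest


class ToyLoop(object):
    """Hypothetical stand-in for an event loop that swallows exceptions."""

    def __init__(self):
        self.callbacks = []

    def add_callback(self, callback):
        self.callbacks.append(callback)

    def run(self):
        while self.callbacks:
            callback = self.callbacks.pop(0)
            try:
                callback()
            except Exception:
                # Log and carry on: the caller of run() never sees the error
                logging.exception('Exception in callback')


class SwallowedTest(unittest.TestCase):
    def test_find(self):
        loop = ToyLoop()
        loop.add_callback(lambda: self.assertEqual(42, 17))
        loop.run()


def run_swallowed_test():
    """Run SwallowedTest and return the unittest result object."""
    suite = unittest.TestLoader().loadTestsFromTestCase(SwallowedTest)
    result = unittest.TestResult()
    suite.run(result)
    return result
```

Calling run_swallowed_test().wasSuccessful() returns True here, because the AssertionError never escapes run(), so the test method returns normally.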

A Better Way

We’re going to perform some minor surgery on Tornado to fix this up, by creating and installing our own IOLoop which re-raises all exceptions in callbacks. Luckily, Tornado makes this easy. Add import sys to the top of test_async.py, and paste in the following:

class PuritanicalIOLoop(ioloop.IOLoop):
    """
    A loop that quits when it encounters an Exception.
    """
    def handle_callback_exception(self, callback):
        exc_type, exc_value, tb = sys.exc_info()
        raise exc_value

Now add a setUp() method to AsyncTest which will install our puritanical loop:

    def setUp(self):
        super(AsyncTest, self).setUp()

        # So any function that calls IOLoop.instance() gets the
        # PuritanicalIOLoop instead of the default loop.
        if not ioloop.IOLoop.initialized():
            loop = PuritanicalIOLoop()
            loop.install()
        else:
            loop = ioloop.IOLoop.instance()
            self.assert_(
                isinstance(loop, PuritanicalIOLoop),
                "Couldn't install PuritanicalIOLoop"
            )

This is a bit over-complicated for our simple case—a call to PuritanicalIOLoop().install() would suffice—but this will all come in handy later. In our simple test suite, setUp() is only run once, so the check for IOLoop.initialized() is unnecessary, but you’ll need it if you run multiple tests. The call to super() will be necessary if we inherit from a TestCase with a setUp() method, which is exactly what we’re going to do below. For now, just run python test_async.py and observe that we get a proper failure:

$ python foo.py
Got result 17
F
======================================================================
FAIL: test_find (__main__.SyncTest)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "foo.py", line 49, in test_find
    ioloop.IOLoop.instance().start()
  File "/Users/emptysquare/.virtualenvs/blog/lib/python2.7/site-packages/tornado/ioloop.py", line 263, in start
    self._run_callback(timeout.callback)
  File "/Users/emptysquare/.virtualenvs/blog/lib/python2.7/site-packages/tornado/ioloop.py", line 398, in _run_callback
    self.handle_callback_exception(callback)
  File "foo.py", line 25, in handle_callback_exception
    raise exc_value
AssertionError: 42 != 17

----------------------------------------------------------------------
Ran 1 test in 1.002s

FAILED (failures=1)

Lovely. Change async_calculate() back to the correct version that produces 42.

An Even Better Way

So we’ve verified that our test catches bugs in the calculation. But what if we have a bug that prevents our callback from ever being called? Add a return statement at the top of async_calculate() so we don’t execute the callback:

def async_calculate(callback):
    """
    @param callback:    A function taking params (result, error)
    """
    # Do something profoundly complex requiring non-blocking I/O, which
    # will complete in one second
    return
    ioloop.IOLoop.instance().add_timeout(
        time.time() + 1,
        lambda: callback(42, None)
    )

Now if we run the test, it hangs forever, because IOLoop.stop() is never called. How can we write a test that asserts that the callback is eventually executed? Never fear, I’ve written some code:

class AssertEventuallyTest(unittest.TestCase):
    def setUp(self):
        super(AssertEventuallyTest, self).setUp()

        # Callbacks registered with assertEventuallyEqual()
        self.assert_callbacks = set()

    def assertEventuallyEqual(
        self, expected, fn, msg=None, timeout_sec=None
    ):
        if timeout_sec is None:
            timeout_sec = 5
        timeout_sec = max(timeout_sec, int(os.environ.get('TIMEOUT_SEC', 0)))
        start = time.time()
        loop = ioloop.IOLoop.instance()

        def callback():
            try:
                self.assertEqual(expected, fn(), msg)
                # Passed
                self.assert_callbacks.remove(callback)
                if not self.assert_callbacks:
                    # All asserts have passed
                    loop.stop()
            except AssertionError:
                # Failed -- keep waiting?
                if time.time() - start < timeout_sec:
                    # Try again in about 0.1 seconds
                    loop.add_timeout(time.time() + 0.1, callback)
                else:
                    # Timeout expired without passing test
                    loop.stop()
                    raise

        self.assert_callbacks.add(callback)

        # Run this callback on the next I/O loop iteration
        loop.add_callback(callback)

This class lets us register any number of functions which are called periodically until they equal their expected values, or time out. The last function that succeeds or times out stops the IOLoop, so your test definitely finishes. The timeout is configurable, either as an argument to assertEventuallyEqual() or as an environment variable TIMEOUT_SEC. Setting a very large timeout value in your environment is useful for debugging a misbehaving unittest—set it to a million seconds so you don’t time out while you’re stepping through the code.

(My code’s inspired by the Scala world’s “eventually” test, which Brendan W. McAdams showed me.)
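The same poll-until-deadline idea can be sketched without an IOLoop. The following is a simplified, blocking analogue rather than the class above: eventually_equal is a hypothetical helper that re-evaluates fn until it returns the expected value or the timeout passes, sleeping between attempts instead of scheduling a callback.

```python
import threading
import time


def eventually_equal(expected, fn, timeout_sec=5, interval_sec=0.1):
    """Poll fn() until it returns expected; raise AssertionError on timeout.

    Hypothetical blocking helper, for illustration only.
    """
    deadline = time.time() + timeout_sec
    while True:
        actual = fn()
        if actual == expected:
            return
        if time.time() >= deadline:
            raise AssertionError('%r != %r after %s seconds' % (
                expected, actual, timeout_sec))
        time.sleep(interval_sec)


if __name__ == '__main__':
    results = []
    # Deliver the result from another thread after 0.3 seconds
    timer = threading.Timer(0.3, lambda: results.append(42))
    timer.start()
    eventually_equal(42, lambda: results and results[0], timeout_sec=2)
    timer.join()
    print('passed')
```

The blocking version is only useful when the result arrives from another thread or process. Inside a single-threaded IOLoop the check has to be rescheduled as a callback, as AssertEventuallyTest does, or the loop never gets a chance to run.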

Paste AssertEventuallyTest into test_async.py, add import os to the top of the file (assertEventuallyEqual() reads os.environ), and fix up your test case to inherit from it:

class AsyncTest(AssertEventuallyTest):
    def setUp(self):
        < ... snip ... >

    def test_find(self):
        results = []
        def callback(result, error):
            print 'Got result', result
            results.append(result)

        async_calculate(callback)

        self.assertEventuallyEqual(
            42,
            lambda: results and results[0]
        )

        ioloop.IOLoop.instance().start()

The call to IOLoop.stop() is gone from the callback, and we’ve added a call to assertEventuallyEqual() just before starting the IOLoop.

There are two details to note about this code:

Detail the First: assertEventuallyEqual()’s first argument is the expected value, and its second argument is a function that should eventually equal the expected value. Hence the lambda.

Detail the Second: callback() needs a place to store its result so that the lambda can find it, but here we run into a nasty peculiarity of Python. Python functions can assign to variables in their own scope, or the global scope (with the global keyword), but inner functions can’t rebind variables in an outer function’s scope. Python 3 introduces a nonlocal keyword to solve this, but meanwhile we can hack around the problem by creating a results list in the outer function and appending to it in the inner function. This is a common idiom that you’ll use a lot when you write callbacks in asynchronous unittests.
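The scoping rule can be seen side by side in a short sketch. It is written for Python 3, since nonlocal does not exist in Python 2, and the function names are made up for illustration.

```python
def collect_with_list():
    results = []

    def callback(result, error):
        # Mutating the list is fine: the name `results` is never rebound
        results.append(result)

    callback(42, None)
    return results[0]


def collect_with_nonlocal():
    result_holder = None

    def callback(result, error):
        # Python 3 only: rebind the variable in the enclosing scope
        nonlocal result_holder
        result_holder = result

    callback(42, None)
    return result_holder


def collect_broken():
    result_holder = None

    def callback(result, error):
        # Without nonlocal this creates a new local variable instead
        result_holder = result

    callback(42, None)
    return result_holder


print(collect_with_list())      # 42
print(collect_with_nonlocal())  # 42
print(collect_broken())         # None
```

The third function is the trap: the assignment succeeds silently, but the outer variable is untouched.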

Conclusion

I’ve packed up PuritanicalIOLoop and AssertEventuallyTest on GitHub; go grab the code. Your test cases can choose to inherit from PuritanicalTornadoTest, AssertEventuallyTest, or both. Just make sure your setUp methods call super(MyTestCaseClass, self).setUp(). Go forth and test!

The Rise of Developeronomics

Server Room

I have some thoughts about Venkatesh Rao’s Forbes article, “The Rise of Developeronomics”. The article, in brief, argues that “software is now the core function of every company, no matter what it makes,” and that, as “software eats the world,” maintaining relationships with excellent software developers is a prerequisite for survival for all firms.

One of the article’s insights is, “while other industries have come up with systems to (say) systematically use mediocre chemists or accountants in highly leveraged ways, the software industry hasn’t.” This is certainly true, and the most successful firms realize it. Again and again, I’ve worked for companies that try to save money, or accelerate development, by adding teams of mediocre (typically offshore) developers to a staff of great hackers. It almost never works, either because very few managers know how to use mediocre developers efficiently, or because it’s impossible.

But I’m not sure that’s always going to be so. I kept thinking as I read this article, “each year we write software that prevents us from having to write more software.” WordPress means we don’t have to make CMSes any more. Hadoop means we don’t need to spend months writing ETLs like we did a few years ago. MongoDB makes it much easier to create and deploy a scalable data store. The list goes on—won’t there come an inflection point when we’ve made so much software that the need for new code levels off?

And yet each time we discover a new thing software can do (mobile apps, social networks, big data, …) it accelerates the growth of demand for software. I think this article might be roughly right about the trends for the foreseeable future. Carlo Cabanilla pointed out to me on Facebook that “as more and more software exists to solve common problems, Ops will become more and more valuable because you’ll always need a scalable, cost efficient way to manage these things. You can have the best app in the world, but if it’s always going down, it’s like it doesn’t exist.” He should know, since he works at DataDog, which is trying to solve this problem.

Ken Young, who’s solving big-data problems over at Mortar Data, thinks that “until the world has been faithfully modeled in software to the last degree there will be new need to predict and manipulate the real world in all its complexity. And since we are no closer to understanding the world than we were in Newton’s time (or so it seems)….”

Right, and even if we did model the whole world, we’d need another system to model all the software we’ve written so we know whether it’s running correctly, and so we can keep it running correctly. And as Turing proved, we can only get asymptotically close to that goal.

Save the Monkey: Reliably Writing to MongoDB

Photo: Kevin Jones

MongoDB replica sets claim “automatic failover” when a primary server goes down, and they live up to the claim, but handling failover in your application code takes some care. I’ll walk you through writing a failover-resistant application in Python using a new feature in PyMongo 2.1: the ReplicaSetConnection.

Setting the Scene

Mabel the Swimming Wonder Monkey is participating in your cutting-edge research on simian scuba diving. To keep her alive underwater, your application must measure how much oxygen she consumes each second and pipe the same amount of oxygen to her scuba gear. In this post, I’ll only cover writing reliably to Mongo. I’ll get to reading later.

MongoDB Setup

Since Mabel’s life is in your hands, you want a robust Mongo deployment. Set up a 3-node replica set. We’ll do this on your local machine using three TCP ports, but of course in production you’ll have each node on a separate machine:

$ mkdir db0 db1 db2
$ mongod --dbpath db0 --logpath db0/log --pidfilepath db0/pid --port 27017 --replSet foo --fork
$ mongod --dbpath db1 --logpath db1/log --pidfilepath db1/pid --port 27018 --replSet foo --fork
$ mongod --dbpath db2 --logpath db2/log --pidfilepath db2/pid --port 27019 --replSet foo --fork

(Make sure you don’t have any mongod processes running on those ports first.)

Now connect up the nodes in your replica set. My machine’s hostname is ‘emptysquare.local’; replace it with yours when you run the example:

$ hostname
emptysquare.local
$ mongo
> rs.initiate({
    _id: 'foo',
    members: [
        {_id: 0, host:'emptysquare.local:27017'},
        {_id: 1, host:'emptysquare.local:27018'},
        {_id: 2, host:'emptysquare.local:27019'}
    ]
})

The first _id, ‘foo’, must match the name you passed with --replSet on the command line, otherwise Mongo will complain. If everything’s correct, Mongo replies with, “Config now saved locally. Should come online in about a minute.” Run rs.status() a few times until you see that the replica set has come online: the first member’s stateStr will be “PRIMARY” and the other two members’ stateStrs will be “SECONDARY”. On my laptop this takes about 30 seconds.
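If you would rather script that check than eyeball it, the condition can be written as a small predicate over the status document. A sketch, assuming the document has the shape rs.status() prints (a members list whose entries carry stateStr); replica_set_ready is a hypothetical helper and the sample documents are made up:

```python
def replica_set_ready(status):
    """True when exactly one member is PRIMARY and the rest are SECONDARY."""
    states = [member.get('stateStr') for member in status.get('members', [])]
    return (
        len(states) > 0
        and states.count('PRIMARY') == 1
        and states.count('SECONDARY') == len(states) - 1
    )


# Made-up sample documents, trimmed to the fields the check needs
starting = {'set': 'foo', 'members': [
    {'_id': 0, 'stateStr': 'STARTUP2'},
    {'_id': 1, 'stateStr': 'STARTUP2'},
    {'_id': 2, 'stateStr': 'STARTUP2'},
]}
online = {'set': 'foo', 'members': [
    {'_id': 0, 'stateStr': 'PRIMARY'},
    {'_id': 1, 'stateStr': 'SECONDARY'},
    {'_id': 2, 'stateStr': 'SECONDARY'},
]}

print(replica_set_ready(starting))  # False
print(replica_set_ready(online))    # True
```

From a driver, the same document is what the replSetGetStatus command returns when run against the admin database, so a script could poll that and feed the result to the predicate.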

Voilà: a bulletproof 3-node replica set! Let’s start the Mabel experiment.

Definitely Writing

Install PyMongo 2.1 and create a Python script called mabel.py with the following:

import datetime, random, time
import pymongo
mabel_db = pymongo.ReplicaSetConnection(
    'localhost:27017,localhost:27018,localhost:27019',
    replicaSet='foo'
).mabel
while True:
    time.sleep(1)
    mabel_db.breaths.insert({
        'time': datetime.datetime.utcnow(),
        'oxygen': random.random()
    }, safe=True)
    print 'wrote'

mabel.py will record the amount of oxygen Mabel consumes (or, in our test, a random amount) and insert it into Mongo once per second. Run it:

$ python mabel.py
wrote
wrote
wrote

Now, what happens when our good-for-nothing sysadmin unplugs the primary server? Let’s simulate that in a separate terminal window by grabbing the primary’s process id and killing it:

$ kill `cat db0/pid`

Switching back to the first window, all is not well with our Python script:

Traceback (most recent call last):
  File "mabel.py", line 10, in <module>
    'oxygen': random.random()
  File "/Users/emptysquare/.virtualenvs/pymongo/mongo-python-driver/pymongo/collection.py", line 310, in insert
    continue_on_error, self.__uuid_subtype), safe)
  File "/Users/emptysquare/.virtualenvs/pymongo/mongo-python-driver/pymongo/replica_set_connection.py", line 738, in _send_message
    raise AutoReconnect(str(why))
pymongo.errors.AutoReconnect: [Errno 61] Connection refused

This is terrible. WTF happened to “automatic failover”? And why does PyMongo raise an AutoReconnect error rather than actually automatically reconnecting?

Well, automatic failover does work, in the sense that one of the secondaries will quickly take over as a new primary. Do rs.status() in the mongo shell to confirm that:

$ mongo --port 27018 # connect to one of the surviving mongod's
PRIMARY> rs.status()
// edited for readability ...
{
    "set" : "foo",
    "members" : [ {
            "_id" : 0,
            "name" : "emptysquare.local:27017",
            "stateStr" : "(not reachable/healthy)",
            "errmsg" : "socket exception"
        }, {
            "_id" : 1,
            "name" : "emptysquare.local:27018",
            "stateStr" : "PRIMARY"
        }, {
            "_id" : 2,
            "name" : "emptysquare.local:27019",
            "stateStr" : "SECONDARY"
        }
    ]
}

Depending on which mongod took over as the primary, your output could be a little different. Regardless, there is a new primary, so why did our write fail? The answer is that PyMongo doesn’t try repeatedly to insert your document—it just tells you that the first attempt failed. It’s your application’s job to decide what to do about that. To explain why, let us indulge in a brief digression.

Brief Digression: Monkeys vs. Kittens

If what you’re inserting is voluminous but no single document is very important, like pictures of kittens or web analytics, then in the extremely rare event of a failover you might prefer to discard a few documents, rather than blocking your application while it waits for the new primary. Throwing an exception if the primary dies is often the right thing to do: You can notify your user that he should try uploading his kitten picture again in a few seconds once a new primary has been elected.

But if your updates are infrequent and tremendously valuable, like Mabel’s oxygen data, then your application should try very hard to write them. Only you know what’s best for your data, so PyMongo lets you decide. Let’s return from this digression and implement that.

Trying Hard to Write

Let’s bring up the mongod we just killed:

$ mongod --dbpath db0 --logpath db0/log --pidfilepath db0/pid --port 27017 --replSet foo --fork

And update mabel.py with the following armor-plated loop:

while True:
    time.sleep(1)
    data = {
        'time': datetime.datetime.utcnow(),
        'oxygen': random.random()
    }
    # Try for five minutes to recover from a failed primary
    for i in range(60):
        try:
            mabel_db.breaths.insert(data, safe=True)
            print 'wrote'
            break # Exit the retry loop
        except pymongo.errors.AutoReconnect, e:
            print 'Warning', e
            time.sleep(5)

Now run python mabel.py, and again kill the primary. Do either “kill `cat db1/pid`” or “kill `cat db2/pid`”, depending on which mongod is the primary right now. mabel.py’s output will look like:

wrote
Warning [Errno 61] Connection refused
Warning emptysquare.local:27017: [Errno 61] Connection refused, emptysquare.local:27019: [Errno 61] Connection refused, emptysquare.local:27018: [Errno 61] Connection refused
Warning emptysquare.local:27017: not primary, emptysquare.local:27019: [Errno 61] Connection refused, emptysquare.local:27018: not primary
wrote
wrote
.
.
.

mabel.py goes through a few stages of grief when the primary dies, but in a few seconds it finds a new primary, inserts its data, and continues happily.
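The retry loop is easy to factor out if several writes need the same treatment. A sketch, not part of PyMongo: retry is a hypothetical helper, and FlakyError stands in for pymongo.errors.AutoReconnect so the example runs without a database.

```python
import time


class FlakyError(Exception):
    """Stand-in for pymongo.errors.AutoReconnect in this sketch."""


def retry(operation, retriable, attempts=60, delay_sec=5, sleep=time.sleep):
    """Call operation() until it succeeds or attempts run out.

    Re-raises the last retriable exception if every attempt fails.
    """
    for attempt in range(attempts):
        try:
            return operation()
        except retriable as e:
            if attempt == attempts - 1:
                raise
            print('Warning %s' % e)
            sleep(delay_sec)


if __name__ == '__main__':
    calls = []

    def insert_breath():
        # Fail twice, as if a new primary were still being elected
        calls.append(1)
        if len(calls) < 3:
            raise FlakyError('not primary')
        return 'wrote'

    # sleep is replaced with a no-op so the demo finishes immediately
    print(retry(insert_breath, FlakyError, sleep=lambda sec: None))
```

Unlike the loop above, this version re-raises after the final attempt instead of quietly moving on to the next measurement. Which behavior is right depends on how much a lost document matters.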

What About Duplicates?

Leaving monkeys and kittens aside, another reason PyMongo doesn’t automatically retry your inserts is the risk of duplication: If the first attempt caused an error, PyMongo can’t know if the error happened before Mongo wrote the data, or after. What if we end up writing Mabel’s oxygen data twice? Well, there’s a trick you can use to prevent this: generate the document id on the client.

Whenever you insert a document, Mongo checks if it has an “_id” field and if not, it generates an ObjectId for it. But you’re free to choose the new document’s id before you insert it, as long as the id is unique within the collection. You can use an ObjectId or any other type of data. In mabel.py you could use the timestamp as the document id, but I’ll show you the more generally applicable ObjectId approach:

from pymongo.objectid import ObjectId
while True:
    time.sleep(1)
    data = {
        '_id': ObjectId(),
        'time': datetime.datetime.utcnow(),
        'oxygen': random.random()
    }
    # Try for five minutes to recover from a failed primary
    for i in range(60):
        try:
            mabel_db.breaths.insert(data, safe=True)
            print 'wrote'
            break # Exit the retry loop
        except pymongo.errors.AutoReconnect, e:
            print 'Warning', e
            time.sleep(5)
        except pymongo.errors.DuplicateKeyError:
            # It worked the first time, so stop retrying
            break

We set the document’s id to a newly generated ObjectId in our Python code, before entering the retry loop. If our insert succeeds just before the primary dies and we catch the AutoReconnect exception, then the next time we try to insert the document we’ll catch a DuplicateKeyError, and we’ll know for sure that the insert succeeded. You can use this technique for safe, reliable writes in general.
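The ambiguity, and how a client-generated id resolves it, can be simulated without a server. In this sketch everything is a stand-in: FakeCollection, ConnectionLost, and DuplicateKey are hypothetical names playing the roles of a collection, AutoReconnect, and DuplicateKeyError, and uuid4 plays the role of ObjectId. The fake commits the first write and then reports a lost connection, which is the worst case for a retrying client.

```python
import uuid


class ConnectionLost(Exception):
    """Plays the role of pymongo.errors.AutoReconnect."""


class DuplicateKey(Exception):
    """Plays the role of pymongo.errors.DuplicateKeyError."""


class FakeCollection(object):
    """Stores documents by _id; the first insert succeeds, then errors."""

    def __init__(self):
        self.docs = {}
        self.fail_next = True

    def insert(self, doc):
        if doc['_id'] in self.docs:
            raise DuplicateKey(doc['_id'])
        self.docs[doc['_id']] = doc
        if self.fail_next:
            self.fail_next = False
            # The write landed, but the client never hears about it
            raise ConnectionLost('connection reset')


def insert_once(collection, doc, attempts=5):
    """Retry an insert; a duplicate-key error means an earlier try worked."""
    for _ in range(attempts):
        try:
            collection.insert(doc)
            return 'wrote'
        except ConnectionLost:
            continue
        except DuplicateKey:
            return 'already written'
    raise ConnectionLost('gave up after %d attempts' % attempts)


breaths = FakeCollection()
# The id is chosen before the first attempt, so every retry reuses it
doc = {'_id': uuid.uuid4().hex, 'oxygen': 0.5}
print(insert_once(breaths, doc))  # already written
print(len(breaths.docs))          # 1
```

Had the id been generated afresh on each attempt, the second try would have stored a second copy of the same measurement.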


Bibliography

Apocryphal story of Mabel, the Swimming Wonder Monkey

More likely true, very brutal story of 3 monkeys killed by a computer error

August Sander and Seydou Keïta

August Sander

Photo: August Sander

I trudged to Chelsea through the disgusting rain tonight for a lecture on August Sander and Seydou Keïta at The Walther Collection. Art historians Shelley Rice and Lisa Binder gave a quick, entertaining intro to the two photographers.

August Sander, who worked in Germany in the 1910s and 20s, set out to be a social documentarian, but as Shelley Rice pointed out, his photos have a majesty beyond a mere census-taking. Each archetypal subject—the doctor, the bricklayer, the revolutionary—presents himself to the camera as a representative of his class, and ultimately the German civilization. The farmer or librarian in the photo is practically anonymous, but he trades his personal identity for the force of his whole nation.

Although Sander was constantly on my mind when I made Strangers, I hadn’t seen his original prints before. Old prints are so small! But the tonality of the ancient negatives and papers is astonishing.

Seydou Keïta

Photo: Seydou Keïta

Seydou Keïta was a commercial photographer in Mali in the 1940s and 50s. Only in the last two decades have his portraits been promoted to fine-art status. Lisa Binder delightfully summarized Keïta’s mythos as a “discovered” artist and hinted at the reality behind the myths, though she didn’t have time tonight to give the details. Each time I encounter Keïta I seem to hear the same conversation about the blurry distinctions between European art and African commercial photography, and I hear the same open-ended questions about whether we’re honoring Keïta or in some insidious way re-colonizing him. Regardless, Lisa Binder provided more interesting ideas about Keïta’s subjects’ effortless blending of African tradition and global modernity. The gallery walls displayed recent reprints of his negatives in the contemporary style: high-contrast, clean, and large. A small room on the side showed a handful of Keïta’s original contact prints for his clients. The originals were small, cracked, yellowed, and heart-breakingly intimate.

When I took the “Photographing Communities” class at ICP a few years ago with Dina Kantor, day one was August Sander and Seydou Keïta. If you want to photograph people and their culture, these two photographers are the starting point. To see them juxtaposed in a white Chelsea gallery will make your head spin—it’s worth the trip.

Walther Collection’s essay on Sander and Keïta

Book: MongoDB In Action

MongoDB in Action

My colleague at 10gen Kyle Banker has published his book MongoDB in Action:

MongoDB in Action is a comprehensive guide to MongoDB for application developers. The book begins by explaining what makes MongoDB unique and describing its ideal use cases. A series of tutorials designed for MongoDB mastery then leads into detailed examples for leveraging MongoDB in e-commerce, social networking, analytics, and other common applications.

Get it from Manning Publications.