The Trouble with Chernoff

Above: From Howard Wainer (1979).

In 1973, applied statistician Herman Chernoff proposed one of the strangest and most ingenious ideas in the history of information visualization: symbolizing data using faces.

In the intervening years, the so-called ‘Chernoff Face’ has become one of the most curious artifacts in the world of information visualization. Despite the best efforts of academics to find a way to make the faces ‘work’, the usual response to Chernoff faces from the cartographic community is a combination of fascination and loathing.

Let’s back up though, and look at why on earth you would want to plaster a statistical map with cartoon faces.

Chernoff’s idea was certainly novel. Human beings have some ridiculously strong hardwiring for identifying faces. We sport an entire part of our brain dedicated to processing images of faces, which snaps on faster than you can probably blink when a face comes into view. Our ability to recognize faces is famously overzealous: ‘pareidolia’ is the phenomenon of seeing faces in inanimate objects (and can make for some pretty amusing photographs). Think of the hundreds of different faces of friends, family members, celebrities, etc. that you can differentiate and recognize in an instant. Chernoff was curious whether he could co-opt our powerful face-spotting abilities to help humanity deal with something our brains are notoriously bad at comprehending: numbers. Thus, the Chernoff face: a face-like symbol where the various proportions of the facial features each correspond to some sort of data item.
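To make the mechanics concrete, here is a minimal Python sketch of the basic idea: each data value is rescaled into the range of one facial feature. Everything in it is an assumption for illustration (the `Face` fields, the three indicators, and the numeric ranges); it is not Chernoff's original parameterization, which used many more features.

```python
from dataclasses import dataclass


@dataclass
class Face:
    """Drawing parameters for one face symbol (all hypothetical)."""
    mouth_curve: float    # -1.0 = full frown, +1.0 = full grin
    eyebrow_slant: float  # degrees; negative = drooping, positive = scowling
    head_width: float     # relative width; 0.6 = gaunt, 1.0 = full


def rescale(value, lo, hi, out_lo, out_hi):
    """Linearly map value from [lo, hi] to [out_lo, out_hi], clamping at the ends."""
    if hi == lo:
        raise ValueError("lo and hi must differ")
    t = (value - lo) / (hi - lo)
    t = max(0.0, min(1.0, t))
    return out_lo + t * (out_hi - out_lo)


def make_face(unemployment, stress, affluence):
    """Build a face from three indicators, each already scaled to 0..1."""
    return Face(
        # Low unemployment maps to a grin, high unemployment to a frown.
        mouth_curve=rescale(unemployment, 0.0, 1.0, 1.0, -1.0),
        # Low stress maps to drooping brows, high stress to a scowl.
        eyebrow_slant=rescale(stress, 0.0, 1.0, -20.0, 20.0),
        # Low affluence maps to a narrow head, high affluence to a full one.
        head_width=rescale(affluence, 0.0, 1.0, 0.6, 1.0),
    )


face = make_face(unemployment=0.0, stress=1.0, affluence=0.5)
print(face)
```

Note that each feature is computed independently of the others, so nothing stops one record from producing a grinning mouth under scowling eyebrows.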

image

Above: From Herman Chernoff (1973)

The immediate issue with Chernoff faces is one of “correspondence”, if you want to use academic terms. Correspondence is simply how intuitive the relationship is between any given symbol and the thing it’s symbolizing. In other words, we don’t associate human faces with anything other than, well, human faces, so using them for numerical data seems silly. The faces above are Chernoff’s original formulation, which was used to visualize the qualities of various minerals, if you can believe it. Authors have since wised up and now usually (… usually) stick to using Chernoff faces for socio-economic data. It makes a lot more sense to see a human face stand in for actual humans, or at least for data about them. In Eugene Turner’s famous Los Angeles Chernoff map, he carefully constructed his face symbols such that poor neighborhoods are represented with emaciated, scowling faces, and wealthy neighborhoods are represented by grinning ones. The result is actually relatively intuitive to read.

image

Above: Eugene Turner - Life in Los Angeles (1977)

For other uses, though, good correspondence is elusive. Take a look at the example legend for faces below, dug up by colleague Daniel Huffman: noses and ears are used to represent divorce rates and crime rates, respectively; with perhaps one exception, I can think of no one who believes that the size and shape of someone’s ear can speak to their propensity for crime. There are more symbolization issues here, as well. It’s sensible that low unemployment is represented by a happy grin, and high unemployment with a frown. But why are areas with a high proportion of women in the workforce represented with angry eyes? Are they supposing that workplace equality should frustrate us?

image

Above: From Joseph Spinelli and Yu Zhou (2004)

Chernoff faces hold a unique fascination for me. Essentially, they’re the combination of two of my most arcane bits of knowledge: cartooning, which I’ve studied and practiced for as long as I can remember, and multivariate data symbolization, which was the subject of my Master’s thesis. With those two bases, here are a couple more suggestions for why even the most carefully designed Chernoff faces get things fundamentally wrong:

  1. Facial features are not (usually) ordinal.

In data theory, something is ordinal if it can be arranged into meaningfully orderable groups. “High crime” is more than “medium crime” is more than “low crime”. Contrast that with categorical data like “British”, “German”, “Australian”… which can’t be intuitively arranged into any sort of sliding scale.

Anyway, when visualizing data, ordinal data (like high/medium/low crime) should be represented in a way that is also ordinal (like a color ramp of black/gray/white). Do Chernoff maps meet this criterion? Not always. Let’s look at how Chernoff maps like to use eyebrows. Usually Chernoff maps vary the angle of the eyebrows as one of their data representations. In Turner’s L.A. map, the angle of the eyebrows represents the degree of urban stress, for example.

Unfortunately, the apparent emotions created by varying eyebrow slant are not particularly ordinal. Look at this example legend:

The apparent faces being made by these symbols are angry, neutral/bored, and sad. These emotions aren’t orderable: an angry person is not “more bored” than a sad person. A sad person is not “less angry” than a person with a blank expression. These emotions don’t fit along a sliding scale, so pairing them with data that does (e.g., the high/medium/low urban stresses in Turner’s map) is inappropriate at best and misleading at worst.

  2. Faces are, visually, more than the sum of their parts.

Quick, what does this face look like to you?

It looks like somebody with a mischievous grin, right? But not if you’re following the logic of a Chernoff map. Such maps presume, albeit implicitly, that this face is happy for one reason but angry for another. If this were on Turner’s map, for instance, it would represent high urban stress but low unemployment. (Serendipitously for Turner, this ‘mixing of opposites’ never actually occurs in any of the data points on his map, so he sidestepped the issue. Most other Chernoff makers wouldn’t be so lucky.)

The point is: facial expressions are, visually, more than the sum of their parts. Place a happy grin and angry eyebrows onto a face, and the result is not “both happy and angry”, but rather an entirely new emotion (impishness) that is not relatable to the components that created it. This creates problems in the context of the map, because mixed-up facial expressions like these A) just plain look silly and B) obscure the actual goings-on of the data.

Louisville is looking a little tipsy. Must be because of its satisfactory employment in the information sector.
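One way to find out in advance whether a dataset will produce these mixed-up expressions is to scan it for records whose emotive features pull in opposite directions. The Python sketch below assumes, as in the Turner example, that the mouth encodes unemployment and the eyebrows encode urban stress, both scaled to 0..1 with higher meaning worse. The thresholds, the function names, and the four sample places are made up for illustration and do not come from any real map.

```python
def reading(value, low=1 / 3, high=2 / 3):
    """Classify a 0..1 'higher is worse' indicator by how its feature reads.

    Returns +1 for a pleasant reading, -1 for an unpleasant one, 0 for neutral.
    The one-third cutoffs are an arbitrary assumption.
    """
    if value < low:
        return 1
    if value > high:
        return -1
    return 0


def has_mixed_expression(unemployment, stress):
    """True when the mouth and the eyebrows read as opposite emotions."""
    mouth = reading(unemployment)  # grin (+1) or frown (-1)
    brows = reading(stress)        # relaxed (+1) or scowling (-1)
    return mouth * brows < 0


# Hypothetical places as (unemployment, stress); not real data.
places = {
    "A": (0.1, 0.9),  # grin + scowl: the 'impish' face
    "B": (0.2, 0.1),  # grin + relaxed brows
    "C": (0.8, 0.9),  # frown + scowl
    "D": (0.5, 0.9),  # neutral mouth + scowl
}

mixed = [name for name, (u, s) in places.items() if has_mixed_expression(u, s)]
print(mixed)
```

If the resulting list is empty, the dataset has the same lucky property as Turner's; if not, those are the faces most likely to be misread.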

  3. Only some facial features carry emotion.

Why articulate this point when I can let Ren & Stimpy creator John K. do it for me:

image

Above: From John Kricfalusi (2009)

“Eyes
Eyebrows
Mouth Shape
Cheeks

Those 4 basic parts can add up to an infinite amount of expressions.”

Eyes, eyebrows, and mouths are the predominant facial features that convey emotion. It’s no surprise that text-based emoticons (think ‘>:D’ or ‘^_^’ or ‘;(’), for instance, will always represent the eyes and mouth, but not noses or ears or hairlines. Those latter features rarely provide information about how a person is feeling. This fact is literally cartooning 101, but Chernoff makers are still trying to come to terms with it, as evidenced by the fact that I’ve read published infovis literature on the subject.

When a feature doesn’t convey emotion, it’s hard for it to achieve a meaningful correspondence to any set of data. This is why it seems so silly when ear or nose size is used to represent things like crime, as mentioned previously. There are a few non-emotive facial features that provide an exception: looking at Turner again, the emaciated face shapes provide a sensible proxy for low affluence. For the most part, though, relying on any facial feature that isn’t the eyes, eyebrows, or mouth is not going to be an intuitive way to represent any sort of real-world phenomenon.

That was a lot of ultra-specific griping, so let’s try to sum things up. What’s the trouble with Chernoff faces? I would argue that their trouble stems from their origin as a mashing together of facial features and data items into a rigid, highly schematized classification structure. The biggest fallacy of Chernoff faces is believing that a meaningful illustration of a human face can be constructed, piecemeal, out of prearranged parts. Think of it as a Mr. Potato Head style approach to representation. With a graphical approach like that, is it any surprise that the results feel so silly?

The question of the hour, then, is whether these troubles can be overcome, and whether we can craft Chernoff faces that actually meet expectations. To be sure, there are a lot of pitfalls to avoid: as we’ve seen, the data probably needs to be at least a) socio-economic in nature, b) devoid of ‘mixed’ attributes (item 2), and c) plottable to germane facial features (eyebrows, eyes, mouth, and maybe head silhouette). That’s a lot of barriers to overcome, and even if they are, the product is still going to have to tangle with all the emotional, cultural, and biological baggage that humans have when it comes to looking at faces. I honestly don’t know if truly good Chernoff faces can ever be pulled off: maybe some brilliant alternative to the ‘Mr. Potato Head’ modus will solve all the issues, or maybe those silly little faces will forever be more trouble than they’re worth.