From measures to metrics: A fresh look at automated facial coding

March 4, 2015

Recently one of my favorite neuromarketing methodologies, automated facial coding (AFC), seems to have burst into the research mainstream. Within one week in January, Rana el Kaliouby, co-founder of Affectiva, was the recipient of a glowing profile in the New Yorker and Paul Ekman, undisputed guru of facial coding and scientific advisor to Emotient, received similar treatment in the Wall Street Journal. All of a sudden, facial coding is everywhere.

This isn’t all that surprising to those of us who have been predicting for a while that AFC and its kissing cousin, automated eye-tracking, would eventually inflict a classic disruptive innovation on traditional neuromarketing vendors. The combination of scalability, intuitiveness, low cost, and quantitative panache these techniques offer was always going to threaten vendors who insisted on clinging to their multi-million dollar Neuro-Labs (as described here, for example) in the hopes that these “toy” techniques would go away. Disruptors don’t go away; they just get better until they bypass the incumbents. So it seems to be happening in neuromarketing.

We can expect AFC to continue to grow in popularity as the technology improves and research buyers become comfortable with the lower price point, and adjust their cost-benefit expectations accordingly.

As automated facial coding grows, it also provides an excellent case study for addressing another important lesson that neuromarketing needs to learn – about the relationship between basic and applied science and the difference between selling measures and selling metrics.

There is a big debate going on in the facial analysis world these days about whether emotions are really universal and objectively identifiable across cultures, as Ekman and many others have claimed, or whether they are categories we construct and learn as a result of experience, a position taken most prominently by psychologist Lisa Barrett. The arguments and evidence offered by both sides, as well as some positions in between, are quite illuminating to anyone interested in appreciating the complexities of emotion and its impact on human behavior, including but hardly limited to consumer behavior.

This debate is currently animating a lot of discussion about facial coding methodologies, both manual and automatic, because it relates to the question of whether you (or a piece of software) can accurately infer someone’s “true” emotional state by viewing their facial expression. This is a variation of the classic “reverse inference” problem that looms over any effort to connect brain states to body states (see neuroscientist Russell Poldrack’s classic 2006 paper, “Can cognitive processes be inferred from neuroimaging data?”). In this case, we are essentially asking the reverse inference question:

If your face is smiling, does that mean your brain is happy?

I discussed reverse inference and its role in bridging the divide between basic and applied research in an earlier post, and I think those comments apply here as well. Asking whether facial expressions are hard-wired to emotional primitives in the brain or are learned and deployed strategically to facilitate social interaction is an important question worth arguing over.

But I would submit that assuming the emotional categories output by an AFC tool (e.g., “surprise” or “joy” or “disgust”) are necessarily measures of innate basic emotions is neither required nor recommended for the kinds of questions marketers and market researchers usually ask, and is probably not the best way to interpret these findings for commercial research buyers. So I would offer the following commentary on the above reverse inference question:

If your face is smiling, I don’t care what your brain is doing. I only care whether that particular behavior, perhaps when combined with other behaviors, makes it easier for me to predict what you’re going to do next.

Here we get to the distinction between measures and metrics. What is great about AFC is that it provides an objective measure of facial expressions. If you twitch those muscles, in just that way, the software is going to recognize this as an expression of “surprise,” and it’s going to make that designation every time your face moves those muscles in that way – that is, it’s going to have high reliability. Does the software do this with 100% accuracy? No. Is it going to get better as it classifies more and more faces, as networks get faster and broader, and as computing power continues to grow exponentially? You betcha.

What is important for applied market research at this point in the game is not the construct validity of the measure – whether you were “really” surprised when you made that face – but rather the predictive validity of the measure – whether, for example, people who make that expression in the first five seconds of watching an online ad are more likely to watch that ad to completion. Commercial vendors should leave the first question to the academic researchers, but can find real competitive differentiation by focusing on the second.
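To make the predictive validity question concrete, here is a minimal sketch in Python of the comparison described above. Everything in it is an assumption for illustration: the `Viewing` record, its field names, and the made-up sample are hypothetical, not the output format of any real AFC tool. It simply asks whether viewers who showed the expression early completed the ad more often than those who did not.

```python
from dataclasses import dataclass


@dataclass
class Viewing:
    """One viewer's session for one ad (hypothetical record layout)."""
    early_surprise: bool  # AFC flagged a surprise expression in the first 5 s
    completed: bool       # viewer watched the ad to the end


def completion_rate(viewings):
    """Share of viewings that ran to completion; None if the group is empty."""
    if not viewings:
        return None
    return sum(v.completed for v in viewings) / len(viewings)


def completion_lift(viewings):
    """Completion rate with early surprise minus the rate without it."""
    with_expr = [v for v in viewings if v.early_surprise]
    without_expr = [v for v in viewings if not v.early_surprise]
    rate_with = completion_rate(with_expr)
    rate_without = completion_rate(without_expr)
    if rate_with is None or rate_without is None:
        return None  # cannot compare if either group is empty
    return rate_with - rate_without


# Made-up sample, for illustration only.
sample = [
    Viewing(True, True), Viewing(True, True), Viewing(True, False),
    Viewing(False, True), Viewing(False, False), Viewing(False, False),
]
lift = completion_lift(sample)
print(f"Completion lift for early surprise: {lift:.2f}")
```

A positive lift on a handful of made-up rows proves nothing, of course. A real test would need a proper sample and a significance check, but the question being asked has exactly this shape, and it never requires knowing whether the viewer was “really” surprised.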

Metrics are built out of combinations of measures to achieve predictive results. Once you have a reliable measure, you can start looking for ways to turn it into a valid metric. A metric, unlike a measure, is a calculation or formula using one or more measures that relates to performance of some kind. Metrics an AFC vendor could test as predictors of ad likability, for example, include:

  • Presence of a surprise expression in the first five seconds of the ad.
  • The number of surprise expressions during the ad.
  • The average intensity of surprise expression times the number of smile expressions during the ad.
  • The combination of surprise in the first 10 seconds plus at least one smile in the last five seconds of the ad.
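The four candidate metrics above can be sketched as simple calculations over a stream of detected expressions. This is only an illustration under stated assumptions: the `Expression` record, its field names, the label strings, and the 0-to-1 intensity scale are all hypothetical, and a real AFC tool will have its own output format.

```python
from dataclasses import dataclass


@dataclass
class Expression:
    """One expression detected by an AFC tool (hypothetical record layout)."""
    label: str        # e.g. "surprise" or "smile"
    time_s: float     # seconds from the start of the ad
    intensity: float  # classifier intensity, assumed to lie between 0 and 1


def candidate_metrics(events, ad_length_s):
    """Compute the four candidate metrics for one viewer of one ad."""
    surprises = [e for e in events if e.label == "surprise"]
    smiles = [e for e in events if e.label == "smile"]

    # Average surprise intensity, or 0.0 if no surprise was detected.
    mean_surprise = (
        sum(e.intensity for e in surprises) / len(surprises) if surprises else 0.0
    )

    return {
        # 1. Presence of a surprise expression in the first five seconds.
        "early_surprise": any(e.time_s < 5.0 for e in surprises),
        # 2. Number of surprise expressions during the ad.
        "surprise_count": len(surprises),
        # 3. Average surprise intensity times the number of smiles.
        "surprise_x_smiles": mean_surprise * len(smiles),
        # 4. Surprise in the first 10 s plus a smile in the last 5 s.
        "surprise_then_smile": (
            any(e.time_s < 10.0 for e in surprises)
            and any(e.time_s >= ad_length_s - 5.0 for e in smiles)
        ),
    }


# Made-up events for a 30-second ad, for illustration only.
events = [
    Expression("surprise", 2.0, 0.8),
    Expression("smile", 12.0, 0.5),
    Expression("surprise", 14.0, 0.4),
    Expression("smile", 27.5, 0.9),
]
metrics = candidate_metrics(events, ad_length_s=30.0)
```

Each entry in the returned dictionary is a metric in the sense used here: a formula over measures. Whether any of them actually predicts likability is an empirical question that has to be tested against outcomes.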

Most of the best examples of using AFC to predict consumer behavior have developed new metrics, often combining facial expression measures with other types of measures, to make successful predictions. For example, in a study of online “ad zapping” (abandoning a pre-roll ad before completion), Teixeira, Wedel, and Pieters found that higher rates of viewer retention were achieved by a combination of the magnitude of surprise expressions, the pattern of change in joy expressions, and the concentration of visual attention (as measured with eye-tracking). Does it matter whether those “surprise expressions” represented “true” surprise? I think not, at least for answering the questions this study set out to answer.

Metrics can also be derived from machine learning algorithms that don’t easily translate into simple formulas. Rossi, Fasel, and Sanfey, for example, trained a facial coding system to predict accept-or-reject decisions in a simple economic game with over 75% accuracy based on facial responses to offers. Machine learning is a tool behind all the top AFC vendors, and its outputs are only going to get more precise as the learning algorithms gorge on larger and larger datasets of facial responses to all kinds of marketing stimuli. As both the New Yorker and Wall Street Journal articles make clear, AFC is becoming the focal point where Big Data and neuromarketing are coming together.
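For readers who want a feel for what a learned metric looks like, here is a minimal sketch in Python. It is not a reconstruction of the system Rossi, Fasel, and Sanfey trained; it is plain logistic regression fitted by gradient descent, swapped in because it is about the simplest learner that fits on a page. The two features (smile and frown intensity in response to an offer), the function names, and the training rows are all made up for illustration.

```python
import math


def sigmoid(z):
    """Map a real-valued score to a probability between 0 and 1."""
    return 1.0 / (1.0 + math.exp(-z))


def train_logistic(rows, labels, lr=0.5, epochs=5000):
    """Fit logistic regression with batch gradient descent from zero weights."""
    n_features = len(rows[0])
    weights = [0.0] * n_features
    bias = 0.0
    for _ in range(epochs):
        grad_w = [0.0] * n_features
        grad_b = 0.0
        for x, y in zip(rows, labels):
            # Prediction error for this example (predicted probability minus label).
            err = sigmoid(bias + sum(w * xi for w, xi in zip(weights, x))) - y
            for j, xi in enumerate(x):
                grad_w[j] += err * xi
            grad_b += err
        # Step against the average gradient of the log loss.
        weights = [w - lr * g / len(rows) for w, g in zip(weights, grad_w)]
        bias -= lr * grad_b / len(rows)
    return weights, bias


def predict_accept(weights, bias, x):
    """Predict True (accept) when the modeled probability is at least 0.5."""
    return sigmoid(bias + sum(w * xi for w, xi in zip(weights, x))) >= 0.5


# Hypothetical features per offer: (smile intensity, frown intensity).
rows = [
    (0.9, 0.1), (0.8, 0.2), (0.7, 0.1), (0.6, 0.3),  # offers that were accepted
    (0.2, 0.8), (0.1, 0.9), (0.3, 0.7), (0.2, 0.6),  # offers that were rejected
]
labels = [1, 1, 1, 1, 0, 0, 0, 0]

weights, bias = train_logistic(rows, labels)
training_hits = sum(
    predict_accept(weights, bias, x) == bool(y) for x, y in zip(rows, labels)
)
```

The learned weights are the metric: a formula nobody wrote by hand, judged only by how well it predicts. Accuracy on the rows used for training says little, so a real evaluation would score the model on responses it has never seen.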

The point I want to make about measures and metrics is this:

Neuromarketing today has too many measures, and not enough metrics.

This is something I hope to see change as vendors begin to compete on the quality of their results, not just the sophistication of their methods. Most neuromarketing vendors, whatever technology they deploy, offer measures of what we called in Neuromarketing for Dummies the “three master variables” of neuromarketing: Attention, Emotion, and Memory. But they too often fail to tell clients what these measures mean – that is, they don’t build them into performance-predicting metrics. Attention, for example, may be good in some circumstances, but not good in others (as Robert Heath has been arguing for over a decade). As we see from studies like Teixeira’s, maybe it’s concentration of attention, or timing of attention, or duration of attention, or pulsing of attention, or even absence of attention, that predicts the results that research buyers care most about. Neuromarketers have to start competing on their metrics, not their measures, if they want to attract the mainstream buyers who so far have awarded the field less than one percent of their marketing research budgets.

I believe web-based facial coding, along with web-based eye tracking, and perhaps web-based implicit association testing (a subject for another post), may be emerging as the most fertile grounds for developing, testing, and deploying predictive metrics in neuromarketing.

Image 1 from

Image 2 from, (c) David Matsumoto, 2008.

About the Author:

Steve is a writer, speaker, researcher, and marketing consultant. He is author of Intuitive Marketing (2019), a study of persuasion and influence in marketing theory and practice, and co-author of Neuromarketing for Dummies (2013), a comprehensive overview of neuromarketing science, applications, methodologies, and ethics. He is Managing Partner at Intuitive Consumer Insights, where he focuses on marketing education and consulting.

Comments (2)

  1. Thomas Stewart says:

    This article is informative as well as insightful.

  2. Kőszegi Bálint says:

    Awesome article!
    I subscribed to the blog.
