Here is a question for you: can public opinion data be used to make predictions about Oscar winners? Is it possible that the opinions of the masses could somehow match up with the opinions of the much-vaunted and oft-thanked 'Members Of The Academy'? The answer, it seems, is yes -- with caveats.
I became inadvertently caught up in the Oscars this year. It is far from an area of expertise for me, and I've only seen one of the nominated films. Yet we did some polling on it, and now I sit here a few days after the big day and find myself totally engrossed in the manner and measure of predicting Oscar winners.
As you probably know, the Oscars are voted on by members of the Academy of Motion Picture Arts and Sciences, a group of a few thousand individuals who -- to my knowledge -- cannot practically or ethically be 'polled' or surveyed in advance of the big day. So how do we make guesses about winners, should we care to?
There are two main sources of Oscar prediction data that I can find -- and here I'm focused on hard data rather than intuition, guesses or 'expert opinion.' These two data sources are general public polling and predictive modeling based on prior information ('priors'). For the former I will, naturally, use Ipsos/Reuters data, and for the latter I turn to data wizard Nate Silver.
A brief explanation of the two approaches:
Ipsos undertook public opinion polling for our client Thomson Reuters, who wanted to know more about which Oscar-nominated films the public had seen, who they wanted to win, and who they thought would win. Of course, since Oscar winners are decided by Academy members, with no input from the public, our polls are a simple exercise rather than any attempt to predict or project outcomes. As such, we did not try to pick winners; rather, the data was used by Reuters' entertainment journalists to help contextualize the pre-Oscars buzz. Any accuracy on the part of our polls would be -- I assumed -- pure luck.
Nate Silver of the New York Times' FiveThirtyEight blog had a different approach: he developed a prediction model that looks at all the other film awards preceding the Oscars, and identifies which have the best historic success rate at predicting an Oscar win. For example, the Directors Guild award has a very high probability of predicting an Oscar winner, while the Golden Globes have a much lower probability. This, Silver points out, is not mere coincidence; many of the other awards that precede the Oscars are also voted on by people who are members of the Academy, who of course vote again for the Oscars.
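In spirit, Silver's approach boils down to a hit-rate calculation over historical data: for each precursor award, count how often its winner went on to win the Oscar. The sketch below illustrates that idea only -- the award records and films are invented, and this is not Silver's actual model, which weights and combines awards in more sophisticated ways.

```python
# Toy sketch of a precursor-award hit-rate calculation.
# All records below are invented for illustration; they are not
# real historical results, nor Nate Silver's actual model.
from collections import defaultdict

# Each record: (year, precursor_award, precursor_winner, oscar_winner)
history = [
    (2010, "Directors Guild", "Film A", "Film A"),
    (2010, "Golden Globes",   "Film B", "Film A"),
    (2011, "Directors Guild", "Film C", "Film C"),
    (2011, "Golden Globes",   "Film C", "Film C"),
    (2012, "Directors Guild", "Film D", "Film D"),
    (2012, "Golden Globes",   "Film E", "Film D"),
]

def hit_rates(records):
    """Fraction of years in which each precursor award's winner
    went on to win the corresponding Oscar."""
    hits, totals = defaultdict(int), defaultdict(int)
    for _, award, precursor_winner, oscar_winner in records:
        totals[award] += 1
        if precursor_winner == oscar_winner:
            hits[award] += 1
    return {award: hits[award] / totals[award] for award in totals}

print(hit_rates(history))
# In this made-up history, the Directors Guild picks the Oscar winner
# 3 years out of 3, the Golden Globes only 1 out of 3.
```

A model like Silver's would then lean most heavily on the precursors with the strongest track records when projecting the next year's winners.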
So how do the two approaches compare on their ability to pick the correct winner?
Aside from Best Picture, the findings are nearly identical! The two approaches worked by very different means and analyzed completely different audiences, yet yielded similar outcomes.
This is interesting because, by any accounting, Silver's model should be by far the more accurate: his data contains very strong priors about the success of the films and actors at other awards events. More importantly, these prior inputs contain data collected among actual members of the Academy, who would never consent to be polled about their Oscars votes before the big day. Our polling, on the other hand, is merely a survey of the general public who have no input at all into the Oscars voting process.
So why does our poll data -- based on a sample of nationally representative Americans who almost certainly are not members of the Academy -- come so close to Silver's more sophisticated model? And why is there such a disparity when it comes to predicting Best Picture as opposed to actors/actresses?
One could argue that Academy members and the public aren't that different when it comes to what movies they like. If this is the case, proxies for a film's public popularity (such as box office success) should give strong guidance in predicting a winner. Or perhaps Academy members' choices are (consciously or unconsciously) influenced by the box office success of these films. This would mean that the views of the American public -- by virtue of 'voting' with their wallets by paying to see films -- are influencing this Oscar voting process, however tangentially. To test these ideas, I conducted a simple rank-order assessment of the past three Best Picture winners, comparing each winner's box office success against that of its fellow nominees.
In the last three years, the Best Picture Oscar winner has been ranked either 4th or 7th (out of 10) among fellow nominees when it comes to box office success. So we can't assume that public opinion -- or, more accurately, public pocketbook -- is a key influencing factor on Academy members' decisions for Best Picture.
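The rank-order check above can be sketched in a few lines. The grosses and film names here are made up for illustration (the real check used actual nominee box-office figures); the point is simply that the winner lands mid-pack rather than on top.

```python
# Toy sketch of the box-office rank-order check.
# Grosses (in millions) and films are invented, not real figures.
nominees = {
    "Film A": 540, "Film B": 380, "Film C": 250, "Film D": 190,
    "Film E": 140, "Film F": 120, "Film G": 90, "Film H": 60,
    "Film I": 45, "Film J": 30,
}
winner = "Film D"  # hypothetical Best Picture winner

# Rank nominees by box office (1 = highest-grossing), find the winner's rank.
ranked = sorted(nominees, key=nominees.get, reverse=True)
winner_rank = ranked.index(winner) + 1
print(f"Best Picture winner ranked {winner_rank} of {len(ranked)} by box office")
```

In this invented example the winner ranks 4th of 10 -- a mid-pack position like the real results, which is what undercuts the idea that box office drives the Academy's Best Picture choice.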
The Reuters/Ipsos poll and the FiveThirtyEight analysis are different animals entirely: one measured public opinion, and the other measured award outcomes based on votes cast by tiny, elite industry groups. So why are the two metrics so well aligned for the Director/Actor/Actress categories -- and so poorly aligned for Best Picture?
I have a little theory: that the public and members of the Academy are driven by a similar set of decision parameters when choosing a person, compared to when they choose a film.
It makes sense that the public and the Academy members would agree more when it comes to picking a person; even under the auspices of Art and Science, choosing humans in almost any talent-based competition comes down in large part to popularity. Recent articles detailing the 'lobbying' process that goes on within Hollywood to secure a win have reiterated this point. Nominees show up at events, make calls, 'kiss the hand', and so on -- to the point that when a nominee declines to do these things, it is worthy of notice.
It is also possible -- and likely -- that the 'lobbying' process and accompanying PR activities that go into trying to secure an Oscar win impact the public and members in the same way. More events and paparazzi shots mean more blog entries, articles, and magazine covers. It is simple self-promotion.
So why is Silver's model so much more accurate than a public poll when it comes to Best Picture? My theory is that choosing a film involves a different set of decision-making parameters, and here expert opinion diverges more from lay opinion. Or perhaps the type of 'lobbying' undertaken for this particular award differs, resonating with members' finely attuned ears while falling on deaf ears among the general public, who use simple proximity heuristics -- e.g. familiarity and favorability ("Did I see the film? Did I like the film?") -- to choose a winner.
As a final check, I also looked for nationally representative polling data from previous Oscar years. I didn't find much, although I'm sure there is more out there. The problem with the old polls I found is that they asked who Americans thought should win (rather than who would win), so the comparison to my findings above is less useful, because the polls are measuring something slightly different. These older polls were also conducted by phone, while the Reuters/Ipsos polling is done online -- again, this affects comparability (perhaps the online population is also more 'tuned in' to entertainment and Hollywood than the offline one?).
Regardless, however, the findings aren't bad: in 2007 the Best Actor and Best Actress awards went to the nominees ranked second in the Harris polls, and in 2005 the same awards went to the nominees ranked first. So while the comparison is imperfect, at least it doesn't wildly contradict my theory above.
What does this tell us about how to place your Oscar bets next year? It is fairly straightforward: public polling (or models) will get you most of the way there for the awards for people -- allowing for a few wildcards -- but Best Picture is best predicted via the use of strong priors rather than public polling data.
However, the point of this article is not necessarily to help Oscar punters, or to question the process; I know basically nothing about this industry apart from what I have gleaned from my lunchtime perusal of a few gossip blogs. The point to underline is that it is possible to predict or model event outcomes with wide-ranging sets of data, and to do so fairly accurately -- as long as you understand the parameters and limitations of each input to your model.
When, for example, Ipsos conducts online polls for our client Thomson Reuters, we are fully aware that the 10-15 percent of Americans who aren't online cannot -- by definition -- be included in our poll. But we've also learned a lot about how our data behaves, and what the parameters are for political decision-making by the public and by the people we interview online. It is this process that allows us to make accurate predictions despite -- as some critics would claim -- the lack of a probability-based sample.
Neither polling nor modeling is the sole future of election (and Oscar!) predictions; it will be an evolving marriage between these two approaches. Both are now essential in the Brave New World of Big Data and advanced analytics.
For now, the next question is... how can we use this information to predict the next Pope?!