02/04/2013 02:58 pm ET | Updated Apr 06, 2013

What Is Predictive of the Oscars?

I spent several weeks this winter immersed in spreadsheets full of historical Oscar data to explore methods of using fundamentals to predict Oscar winners. Fundamental models work really well in forecasting political elections, where significant categories of data include past election results, incumbency, presidential approval, ideology, economic indicators, and biographical data. Yet fundamental models are much less effective in forecasting awards shows, where they would include categories such as studio inputs, box office success, subjective ratings, Oscar nominations, and biographical data. The reason is simple: prior to the other awards shows, there is a dearth of variables that properly identify individual award categories, as most of the data is movie-specific.

But there are two goals of fundamental models: forecasting and determining which variables have predictive power. While fundamental models do not make great forecasts for the Oscars relative to other data, including prediction markets, they can still provide insight into which variables we should follow.

All of the insights in this column concern the predictive power of variables at the time of the nomination, conditional on a movie getting a nomination for an Oscar. How well a movie does at the box office, especially after a few weeks, its popular ratings, and how many nominations it receives are all significant predictive variables.

Studio Inputs: This category includes variables like budget, release date, genre, and when the movie goes to wide release. Some of these variables correlate strongly with whether a movie gets a nomination, but conditional on being a nominee, they are not predictive of the eventual winner. For example, movies released late in the year are more likely to get a nomination for an Oscar than movies released in the spring, but, conditional on getting a nomination, they are no more likely to win the Oscar.

Box Office Success: This category includes variables like gross revenue, screens, average gross revenue per screen, these values for the first week of wide release and the first four weeks of wide release, and many other combinations. Between gross revenue and number of screens, there are some really interesting variables to consider here. This is further complicated by the staggered opening of many Oscar-nominated movies. After much investigation, the predictive power in this category is highly correlated with the change that happens over the first few weeks. A key inflection point appears to be between weeks four and five. For Best Picture I follow this variable closely: Gross Week 5 -- Gross Week 4.

From week four to week five, Argo went from $13.3 million to $9.0 million, while Lincoln went from $18.0 million to $12.4 million. Thus, by this rubric, Lincoln has a slightly healthier $6.8 million to $4.7 million, but this is not a significant difference.
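As a rough illustration of the week four to week five comparison, the sketch below computes two simple measures from the grosses quoted above: the absolute change and the share of the week-four gross retained in week five. Both measures, and the helper name `week_change`, are assumptions made for illustration; the column does not spell out the exact formula behind its rubric, so this is not an attempt to reproduce its derived figures.

```python
def week_change(week4, week5):
    """Return (absolute change, retention ratio) from week 4 to week 5.

    Hypothetical helper: the two measures are illustrative assumptions,
    not the exact variable used in the column.
    """
    return week5 - week4, week5 / week4


# Weekly grosses in millions of dollars, as quoted in the text.
grosses = {"Argo": (13.3, 9.0), "Lincoln": (18.0, 12.4)}

for title, (w4, w5) in grosses.items():
    change, retention = week_change(w4, w5)
    print(f"{title}: change {change:+.1f}M, retained {retention:.1%}")
```

Under these assumptions, Lincoln retains roughly 69 percent of its week-four gross against roughly 68 percent for Argo, which fits the reading that Lincoln looks slightly healthier but not by a meaningful margin.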

Subjective Rating: This category includes variables like popular and critical ratings, along with the MPAA rating. In the battle between popular and critical ratings, the people win! Popular ratings dwarf the critical ratings in predictive power.

Interestingly, Lincoln and Argo are tied in critical ratings, but Argo leads Lincoln 93 to 86 in popular ratings.

Oscar Nominations: It is no surprise that the Oscar voters value their own judgment, and movies with more nominations tend to do well in winning Oscars! There is significant and meaningful predictive power in the number of Oscar nominations a movie receives.

In this category, Lincoln dominates with 12 nominations to Argo's 7 nominations.

Biographical Data: This category includes variables like age, previous nominations, previous wins, and lifetime wins. Nominations and wins certainly have predictive power in the four main categories: actor, actress, supporting actor, and supporting actress. For these categories, more nominations are a positive predictive sign. While not the case in the main categories, in less well-known categories repeated victories by the same people are more common and correlate significantly with victory.

This column syndicates with my personal website: