Co-authored with Gregory Macnamara
The madness of March is once again upon us, with Harvard, yet again, busting many brackets and Mercer possibly spoiling many others' billion-dollar dreams. This March, my research and teaching in the area of bracketology got picked up by the national press, with articles and interviews by the New York Times, CBS Evening News, USA Today, ESPN, Bloomberg Business Week, The Atlantic, the Associated Press, and National Public Radio, among others. My research is often done collaboratively, almost always with Amy Langville of the College of Charleston and generally with students and alums of Davidson College. For instance, this article is coauthored with Greg Macnamara, a Davidson alum who now works as an Economic Consultant at Analysis Group in Washington, DC. Greg was instrumental, along with Andrew Liu, a senior at Davidson, in getting our website March MATHness up and ready for public use.
With the media attention on radio, on television, and in print, the website received great national and, to our surprise, international attention. Google Analytics showed the bracket software ran not only in the United States but in over 25 countries, including Canada, the United Kingdom, Germany, Australia, India, Venezuela, China, and Japan. In all, the code was run over 16,000 times, comprising over 2,000 unique rankings. As an educator, I find it delightful to see work from my research and code created collaboratively with current and former students used so broadly.
The purpose of the code is to allow others to explore the mathematics of ranking and possibly, and hopefully, improve their brackets. Remember, some of the brackets came from people, like my daughter, who is in early elementary school, running the code for the first time and experimenting with the parameters. As people ran the code, we kept track of the parameters so we could look at the resulting brackets. What can we learn? More to the point, what can we learn as people learned with the code? Anything?
In all, we constructed 5 meta-brackets combining the results of everyone's rankings. On the website, there are two ranking methods to choose from: Colley, a linear system that incorporates only wins and losses, and Massey, a linear system that includes the scores of games. Interestingly, Massey was chosen about twice as often as Colley. The methods provide both ratings and rankings, so our 5 meta-brackets consist of an average of the ratings and an average of the rankings for each method, plus an overall bracket that takes the average ranking from all Massey and Colley runs.
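To make the two kinds of averaging concrete, here is a minimal sketch. It is not the website's code: the team names "A", "B", and "C", the ratings, and the function names are all made up for illustration. It shows that averaging ratings and averaging rankings over the same runs can order teams differently.

```python
from statistics import mean

def ranking_from_ratings(ratings):
    """Turn {team: rating} into {team: rank}, where rank 1 is the highest rating."""
    ordered = sorted(ratings, key=ratings.get, reverse=True)
    return {team: position + 1 for position, team in enumerate(ordered)}

def average_ratings(runs):
    """Average each team's rating over all runs."""
    return {team: mean(run[team] for run in runs) for team in runs[0]}

def average_rankings(runs):
    """Rank the teams within each run, then average each team's rank."""
    rankings = [ranking_from_ratings(run) for run in runs]
    return {team: mean(r[team] for r in rankings) for team in runs[0]}

# Hypothetical ratings from three user runs (invented numbers).
runs = [
    {"A": 0.90, "B": 0.50, "C": 0.30},
    {"A": 0.55, "B": 0.60, "C": 0.30},
    {"A": 0.50, "B": 0.55, "C": 0.30},
]

avg_rating = average_ratings(runs)
avg_rank = average_rankings(runs)

# Higher average rating is better; lower average rank is better.
order_by_rating = sorted(avg_rating, key=avg_rating.get, reverse=True)
order_by_rank = sorted(avg_rank, key=avg_rank.get)

print("By average rating: ", order_by_rating)
print("By average ranking:", order_by_rank)
```

In this toy data, one run rates A far above B while the other two narrowly prefer B, so A leads on average rating but B leads on average ranking.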
How are brackets created? Keep in mind that we create rankings of the teams, and the higher-ranked team is predicted to win. So, while two methods may produce the same bracket, they may not have the same underlying rankings of the teams. Further, the closer the ratings of two teams, the closer the game is predicted to be.
What do we find? First, all 5 brackets picked the same 16 teams. Keep in mind that aggregated information, in a certain sense, finds the trend in the data.
On the first evening of games, the meta-brackets accurately predicted 13 of the 16 games. As such, they beat more than 90 percent of the more than 11 million brackets submitted to ESPN. They missed the three big upsets: ND State, Dayton, and Harvard. Did every bracket miss these upsets? No. Some of my students and researchers got the upsets, just not all of them. For example, 1.9 percent picked Dayton, 0.2 percent picked Harvard, and 9.5 percent picked ND State. Overall, there was a 0.0002 percent chance of picking a perfect bracket, based on the percentage of brackets that picked each team.
On the second day of games, the 5 brackets differed on Tennessee/Massachusetts and Memphis/GW. The bracket incorporating both Massey and Colley rankings correctly chose both of those games, helping it to correctly predict 13 games again for a total of 26 correct in the first round, which was good enough to beat 97.7 percent of brackets on ESPN. The other 4 meta-brackets correctly chose 12 games for a total of 25 correct predictions in the first round, which was still good enough to beat 91.8 percent of brackets on ESPN. Once again, while no bracket picked them all correctly, some did pick the big upsets: 0.04 percent picked Mercer and 8.1 percent picked Stephen F. Austin. In a win for math, almost 70 percent of brackets correctly picked Tennessee!
Suppose you could be 90 percent certain who would win any given game. Even then, with 16 games on the first day, you'd only have a perfect first day of predictions about once every 5 years! With 32 games in the first round, you'd only get the first round perfect about once every 30 years!
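The arithmetic behind those waiting times can be checked in a few lines. This sketch assumes the games are independent and that there is one tournament per year, with 16 games on the first day and 32 in the full first round. The variable names are ours.

```python
# Assumed chance of correctly calling any single game.
p_single_game = 0.9

first_day_games = 16    # games on the first day
first_round_games = 32  # games in the whole first round

# With independent games, the chance of a perfect slate is the product
# of the per-game chances, i.e. p raised to the number of games.
p_perfect_first_day = p_single_game ** first_day_games
p_perfect_first_round = p_single_game ** first_round_games

# With one tournament per year, the expected wait is 1 / probability.
years_first_day = 1 / p_perfect_first_day
years_first_round = 1 / p_perfect_first_round

print(f"Perfect first day:   {p_perfect_first_day:.3f}, "
      f"about once every {years_first_day:.1f} years")
print(f"Perfect first round: {p_perfect_first_round:.4f}, "
      f"about once every {years_first_round:.1f} years")
```

The first-day probability comes out near 0.185 and the first-round probability near 0.034, which is where the rounded figures of 5 and 30 years come from.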
Suppose we take our ratings and treat them as probabilities. The bigger the difference in ratings, the bigger the probability that the higher-rated team will win. Doing so, we find that the probability of getting the first round right is 1 in 12.7 million using the crowd-sourced Massey ratings and 1 in over 150 million using the crowd-sourced Colley ratings. Our probability estimates put it at 1 in about 7.5 million.
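One way to picture this step is sketched below. It is only an illustration, not the calculation behind the numbers above: the logistic curve, the `scale` parameter, and the ratings in `matchups` are all assumptions of ours. The idea is to map each rating gap to a win probability and then multiply the favorites' probabilities across games, treating the games as independent.

```python
import math

def win_probability(rating_a, rating_b, scale=1.0):
    """Assumed logistic map from a rating gap to P(team a beats team b).

    `scale` is a hypothetical tuning parameter: larger values flatten the curve.
    """
    return 1.0 / (1.0 + math.exp(-(rating_a - rating_b) / scale))

def perfect_round_probability(matchups, scale=1.0):
    """Chance that the higher-rated team wins every game, assuming independence."""
    probability = 1.0
    for rating_a, rating_b in matchups:
        p = win_probability(rating_a, rating_b, scale)
        # The bracket always picks the higher-rated team, so use its probability.
        probability *= max(p, 1.0 - p)
    return probability

# Hypothetical (rating, rating) pairs for three games (invented numbers).
matchups = [(2.0, 0.0), (1.0, 0.5), (0.3, 0.3)]

chance = perfect_round_probability(matchups)
print(f"Chance of a perfect slate: {chance:.4f}, or 1 in about {1 / chance:.1f}")
```

Every game with closely matched ratings contributes a factor near one half, which is why the odds of a perfect round shrink so quickly over 32 games.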
For comparison, the National Bracket on ESPN (crowdsourcing non-math brackets) gives odds of picking a perfect first round at 1 in about 9.2 million. However, the best probabilities we've seen belong to Ken Massey himself: the odds of picking a perfect bracket through the first round, based on the probabilities on his website, were 1 in roughly 2.4 million.
This weekend, we begin Round 2, and we are almost certain to see more exciting and, for those with not-yet-busted brackets, possibly quite maddening or most exhilarating results. For our group in quaint Davidson, NC, there is the athletic drama and also an opportunity to continue to learn and search for the wisdom in the crowd of March MATHness.