11/10/2014 03:31 pm ET Updated Jan 10, 2015

Dubious Digits: Is This Data Really That Accurate?

When numbers of any sort are presented, whether in mathematics, science, business, government or finance, the default assumption is that the data presented are reasonably reliable to the last digit presented. Thus, if a light bulb is listed as using 3.14 watts, then its actual usage is presumably between 3.13 and 3.15 watts, and certainly not 2.8 or 4.2 watts. Or if the average interest rate paid on a set of securities is listed as 2.718 percent, then a reasonable reader presumes that the actual figure is between 2.717 and 2.719 percent.
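The convention just described can be sketched in a few lines of Python. The helper name `implied_range` is hypothetical, not taken from any library. It assumes the figure arrives as a decimal string and, following the reading used here, treats the last printed digit as good to within one unit.

```python
from decimal import Decimal

def implied_range(figure: str):
    """Return the (low, high) range a printed figure implies, assuming the
    last digit shown is reliable to within one unit in that place."""
    value = Decimal(figure)
    # Exponent of the last printed digit, e.g. -2 for "3.14".
    exponent = value.as_tuple().exponent
    # One unit in the last printed place, e.g. 0.01 for "3.14".
    unit = Decimal(1).scaleb(exponent)
    return value - unit, value + unit

# The light bulb and interest rate examples from the text.
for figure in ("3.14", "2.718"):
    low, high = implied_range(figure)
    print(f"{figure} implies a value between {low} and {high}")
```

Working from the printed string rather than a float is deliberate: the string records how many digits the source chose to show, which is exactly the information at issue.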

The total number of significant digits can vary widely, depending on context. Some studies require enormous precision -- the present authors have published research studies requiring numbers to be computed to tens of thousands of digits. In other contexts, only one- or two-digit accuracy is appropriate. In all these fields, presenting data to more digits of accuracy than the context warrants can be deeply misleading, conveying much more reliability than is really present in the data.

Oil prices in 2040

The current authors got a good chuckle recently at a November 6, 2014 press report on the latest release of OPEC's World Oil Outlook. This press report noted that

By 2025, the nominal price will have hit $123.90, rising steadily to $177.40 by 2040. In real or inflation-adjusted terms the price will fall to $95.40 by 2020 and hit $101.60 by 2040, OPEC predicts.

Similarly over-precise figures from the OPEC report were presented in some other press reports, including at least one in a U.S. source. To double-check these figures, we checked the actual 2014 OPEC report, World Oil Outlook. Sure enough, these figures (including those in the quote above) are given there, on page 32.

With all due respect to the researchers in the OPEC organization, and acknowledging the considerable effort they have devoted to their analyses, it seems to us that such impressive precision in prices projected to 2020, 2025 and 2040 is simply not defensible. It is hard enough to anticipate the price of oil even a few months ahead -- for example, few if any analysts foresaw the huge drop in oil prices in October 2014. Any such predictions of future commodity prices (or stock prices, for that matter) are dependent on a large number of factors, from costs of exploration, refining and shipping, to hard-to-quantify effects such as natural disasters, international political events and economic reversals.

What's more, the technology of energy generation is changing rapidly and could drastically affect future oil prices. Already, fracking technology has dramatically increased U.S. oil and natural gas output, and is a major factor in the recent paradoxical drop in oil prices, in spite of horrific political developments in the Middle East. And if any of the incipient developments in fusion energy pan out commercially, which now appears significantly more likely than even a few months ago, then all bets are off -- by 2040, consumption of oil and other fossil fuels may be much lower than today.

In any event, we question the wisdom of repeating these figures, to the same precision, in press reports. An honest and well-trained journalist would write something like

By 2025, the nominal price may exceed $120, rising steadily to over $170 by 2040. In real or inflation-adjusted terms the price is projected to fall to about $90 by 2020 and be over $100 by 2040, according to OPEC's estimates.
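The coarser figures in this rewording happen to be what one gets by rounding each OPEC projection down to the nearest ten dollars. The sketch below illustrates that one possible rule. The function name `round_down` and the choice of a ten-dollar step are our assumptions for illustration, not a standard that journalists or OPEC follow.

```python
import math

def round_down(value: float, step: int = 10) -> int:
    """Round a value down to a multiple of `step` (assumed: 10 dollars)."""
    return math.floor(value / step) * step

# Projections quoted from the OPEC report, in dollars per barrel.
opec_projections = {
    "nominal, 2025": 123.90,
    "nominal, 2040": 177.40,
    "real, 2020": 95.40,
    "real, 2040": 101.60,
}

for label, price in opec_projections.items():
    print(f"{label}: ${price:.2f} reported, roughly ${round_down(price)} or more")
```

Rounding down rather than to the nearest ten fits phrasings such as "may exceed" and "over", since the stated floor is never above the projection.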

Press reports of government data

Another reminder of the vagaries of data in the national and international arena was the November 7, 2014 monthly release of employment data by the U.S. Bureau of Labor Statistics. Lost in the good news of the addition of 214,000 nonfarm jobs to the U.S. economy in October, as well as the drop in the unemployment rate to just 5.8 percent (in stark contrast to much of Europe, by the way), was the fact, given at the bottom of the report, that the nonfarm employment figure for August had been revised from 180,000 to 203,000, and the figure for September had been revised from 248,000 to 256,000. In other words, an additional 31,000 persons (23,000 in August plus 8,000 in September) had found work, a fact not mentioned in most press reports we have read. While such adjustments are routine for U.S. employment reports, they underscore the futility of reading too much precision into the monthly released figures -- they invariably will be further refined. So shouldn't this fact be more clearly communicated by the press?
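The arithmetic behind the revisions quoted above is easy to check. The short sketch below uses only the figures cited from the report. The variable names are ours, and comparing the combined revision with the October headline is an illustrative choice rather than anything the Bureau of Labor Statistics publishes.

```python
# (initial estimate, revised estimate) of monthly nonfarm job gains.
revisions = {
    "August": (180_000, 203_000),
    "September": (248_000, 256_000),
}

# Change introduced by each revision.
changes = {month: revised - initial
           for month, (initial, revised) in revisions.items()}
total_revision = sum(changes.values())

october_headline = 214_000  # jobs added in October, as first reported

for month, change in changes.items():
    print(f"{month}: revised by {change:+,}")
print(f"Combined revision: {total_revision:+,}")
print(f"Share of the October headline: {total_revision / october_headline:.0%}")
```

A combined revision of roughly 14 percent of the headline figure suggests that the trailing digits of a first-release number carry little information.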

Along this line, on November 4, 2014 the U.S. Bureau of Economic Analysis announced that in the three months ending in September, exports of goods and services averaged $197.4 billion. Perhaps we do not fully understand the U.S. government's methodology here, but four significant figures for something as fluid as exports of goods and services seems a bit more than what can be statistically justified, particularly in a brief press report that lacks the full context of these figures.

National and international budgets

Citing figures to more precision than is justified, particularly in public news releases that can never be placed in proper context, is hardly a disease limited to North America. For example, the European Commission's financial framework for 2014-2020 lists its total budget as 959,988 million Euros, the sum of similarly precise figures for each of the seven years from 2014 to 2020. These figures are compared with similar figures for the period 2007-2013, also given to five- and six-digit precision. Is this data ever really reliable to this level of accuracy? And even if it is (which we doubt), what is the point of presenting such data to six-digit accuracy in a public overview statement?

Similarly, The Australian recently reported a bank's chief economist as predicting that GDP growth in Australia in the second half of 2014 would average "0.45 per cent" per quarter. Or, with a wink of the eye, should that be 0.46 percent?...

Other dubious digits in the news

It is not hard to find other examples. Here are a few:
  1. On 14 Aug 2012, the U.S. Census Bureau soberly reported that at 2:29 pm, the U.S. population had reached 314,159,265 residents. While on the one hand the present authors were pleased to see the number Pi (= 3.1415926535...) once again in the news, it is nonetheless completely unrealistic to think that the American population can truly be pinned down even to within one million persons, much less to a single soul. Census figures are notoriously disputable, due to factors ranging from the influx or outflow of undocumented workers to the reluctance of some ethnic groups to respond to any census data collection.
  2. Similarly, on 31 Oct 2011, the New York Times reported that the seven billionth human had arrived on the planet, according to the United Nations Statistics Division. At least the Times report quickly added that other organizations, such as the U.S. Census Bureau, had pegged this milestone four months later, roughly in February 2012.
  3. On 5 Nov 2014, the California Policy Center, in a review of several financially stressed California cities, stated that the median income of Ione, California is $34,514. Are they sure this isn't $34,515 (smile)?
  4. On 4 Nov 2014, the Dallas Morning News, quoting the National Association of Realtors, reported that there would be 5.38 million sales of preowned homes in the U.S. in 2016. Or should that be 5.37 million?...
  5. On 5 Nov 2014, the Center for Responsive Politics reported that $113,479,706 was spent in the North Carolina Senate race. Maybe that is the current total, but why present it to nine significant digits?
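As a rough check on examples like these, one can count how many digits a published figure actually displays. The helper below, with the hypothetical name `digits_shown`, is a minimal sketch. It assumes every printed digit after any leading zeros is meant to be significant, which overcounts when trailing zeros are mere placeholders.

```python
def digits_shown(figure: str) -> int:
    """Count the digits printed in a figure, ignoring currency symbols,
    thousands separators, decimal points and leading zeros.
    Assumption: trailing zeros are treated as significant."""
    digits = "".join(ch for ch in figure if ch.isdigit())
    return len(digits.lstrip("0"))

# Figures quoted in the list above.
for figure in ("314,159,265", "$34,514", "5.38", "$113,479,706"):
    print(f"{figure}: {digits_shown(figure)} digits shown")
```

Comparing that count with the number of digits the underlying method can plausibly support is then a judgment call for the writer, not something code can settle.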

Summary

While we can all be amused by examples such as those listed above, presenting data to appropriate levels of precision is serious business, particularly in contexts, such as public press communications, science policy, business or finance, where readers may not understand the full context, and might reasonably be misled. In particular, falsely precise predictions and/or projections undermine the whole rationale of scientific estimation. It is thus incumbent on the authors and producers of such data to only present data to levels of accuracy that can truly be rigorously justified. Otherwise, one is engaging in pseudoscience, at the least, or worse.