Impact of Software Language Choice and Architecture on Energy Efficiency
Abstract
Hidden in the discussion of rising energy costs and consumption in datacenters is the choice of software language. A physical server is an easy concept for the general public to understand, since it compares so closely with a personal computer. Even virtualization, in which a host operating system such as OS X runs a guest operating system such as Windows, is somewhat intuitive, since it lets a "Mac person" run their Windows software on their Mac. Virtualization of servers in datacenters is not a big leap from this concept, since it allows multiple operating systems to run on one physical server. 1
Energy efficiency gains from virtualization in a datacenter seem logical, since an operator can place multiple guest operating systems onto a physical server until that server is running at close to 100% utilization. Conceptually this is not much different from carpooling, in which a driver adds three passengers to a four-seat car. At a finer-grained level than the operating system, however, software languages and software architecture present a complex set of tradeoffs in which efficiency is often unknowingly compromised by the default choice of an architecture or language.
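The consolidation idea described above can be sketched as a small packing problem. The following is a minimal illustration, not how any particular hypervisor schedules guests: the function name `consolidate`, the use of CPU load as the only resource, and the sample loads are all assumptions made for the example. It uses a first-fit-decreasing heuristic, which is one common way to approach this kind of packing.

```python
def consolidate(guest_loads, host_capacity=100):
    """Pack guest CPU loads (percent of one host) onto as few hosts as the
    first-fit-decreasing heuristic finds. Hypothetical model: CPU load only."""
    hosts = []  # each host is the list of guest loads placed on it
    for load in sorted(guest_loads, reverse=True):
        if load > host_capacity:
            raise ValueError(f"guest load {load}% exceeds host capacity")
        for host in hosts:
            # Place the guest on the first host that still has room.
            if sum(host) + load <= host_capacity:
                host.append(load)
                break
        else:
            # No existing host had room, so power on another one.
            hosts.append([load])
    return hosts


if __name__ == "__main__":
    loads = [50, 20, 30, 70, 10, 20]  # made-up guest loads, in percent
    for number, host in enumerate(consolidate(loads), start=1):
        print(f"host {number}: guests {host}, utilization {sum(host)}%")
```

With these made-up loads, six guests that might otherwise occupy six lightly used servers fit onto two fully utilized hosts. The sketch says nothing about how efficiently the software inside each guest uses the cores it is given, which is the subject of the rest of this article.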
Amdahl's Law: Virtualization's "Inconvenient Truth"
While virtualization is a significant energy-efficiency-friendly technology, it is only one part of the story in the datacenter, and it can be a very coarse, imprecise solution to software efficiency. In 1967, at the AFIPS Spring Joint Computer Conference, Gene Amdahl presented a way to find the maximum expected improvement for a system using multiple processors. 2 In the diagram below, 3 Amdahl's Law shows how the mix of parallel and sequential portions of a program leads to different levels of efficiency, depending on the ratio: the sequential portion caps the achievable speedup no matter how many processors are added. In a more technical article, I go into some of the details of how this affects the choices a software developer makes. 4 In layman's terms, energy efficiency in software is much more complex than simply virtualizing servers in a datacenter.
In the real world, outside of virtualization's ivory tower, this creates an energy efficiency problem that virtualization cannot completely solve. A very simple example is Amazon EC2, which sells computing resources by "computing units". There are very few single-processor choices, and the few offered often lack the clock speed, memory, and disk size (other important resources needed to run software) that an application requires. 5 The end result is that it may not be possible to buy a single-CPU server that fits the performance characteristics an application needs. A web application that could in theory run on a machine with only one core can in practice only run on a machine with two or more processors. A larger energy footprint is therefore used than is necessary, because of constraints imposed by the real world and by the language itself.
To add even more complexity, scripting languages can waste energy in other ways as well. One problem is that they often do not have true operating system threads, which means that they cannot easily dispatch the parallel portions of their work to other cores. Another problem is that they often use event loop architectures; with an event loop, the skill level required of a developer to do distributed computing is so high that, in practice, efficiency is wasted. 4 Finally, high-level, interpreted scripting languages often run orders of magnitude slower than compiled languages.
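Amdahl's Law is usually written as speedup = 1 / ((1 - P) + P / N), where P is the fraction of the program that can run in parallel and N is the number of processors. A minimal sketch of that arithmetic follows; the function names and the sample fractions are assumptions chosen for illustration, not figures from the cited sources.

```python
def amdahl_speedup(parallel_fraction: float, processors: int) -> float:
    """Maximum speedup predicted by Amdahl's Law: 1 / ((1 - P) + P / N)."""
    if not 0.0 <= parallel_fraction <= 1.0:
        raise ValueError("parallel_fraction must be between 0 and 1")
    if processors < 1:
        raise ValueError("processors must be at least 1")
    serial_fraction = 1.0 - parallel_fraction
    return 1.0 / (serial_fraction + parallel_fraction / processors)


def parallel_efficiency(parallel_fraction: float, processors: int) -> float:
    """Speedup per processor; 1.0 means every processor does useful work."""
    return amdahl_speedup(parallel_fraction, processors) / processors


if __name__ == "__main__":
    # Illustrative values only: how much of the program is parallel (P)
    # and how many processors it is given (N).
    for p in (0.0, 0.50, 0.95):
        for n in (2, 8, 64):
            print(f"P={p:.2f} N={n:2d} "
                  f"speedup={amdahl_speedup(p, n):6.2f} "
                  f"efficiency={parallel_efficiency(p, n):.2f}")
```

For a program that is half sequential, two processors give a speedup of at most about 1.33, so a third of the added capacity is already idle. A fully sequential program (P = 0) gains nothing from a second processor, which is why extra cores that cannot be used translate into wasted energy.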
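As one concrete illustration of the threading limitation, the standard CPython interpreter uses a global interpreter lock, so CPU-bound work spread across threads does not normally run on several cores at once. A common workaround is to hand chunks of work to separate processes. The sketch below uses Python's standard `concurrent.futures` module; the function names, the sum-of-squares workload, and the chunking scheme are assumptions made for the example.

```python
from concurrent.futures import ProcessPoolExecutor, ThreadPoolExecutor


def sum_of_squares(bounds):
    """CPU-bound work on one chunk: sum of i*i for start <= i < stop."""
    start, stop = bounds
    return sum(i * i for i in range(start, stop))


def split_range(n, parts):
    """Split range(n) into at most `parts` contiguous (start, stop) chunks."""
    if parts < 1:
        raise ValueError("parts must be at least 1")
    if n <= 0:
        return []
    step = -(-n // parts)  # ceiling division
    return [(lo, min(lo + step, n)) for lo in range(0, n, step)]


def parallel_sum_of_squares(n, workers=4, executor_factory=ProcessPoolExecutor):
    """Dispatch each chunk to a pool worker and combine the partial sums.

    The default pool uses separate processes, so chunks can run on separate
    cores. Passing ThreadPoolExecutor instead gives the same answer, but in
    standard CPython the threads take turns under the interpreter lock.
    """
    chunks = split_range(n, workers)
    with executor_factory(max_workers=workers) as pool:
        return sum(pool.map(sum_of_squares, chunks))
```

When the process pool is used from a script, the call to `parallel_sum_of_squares` should sit under an `if __name__ == "__main__":` guard so that worker processes can import the module safely. The extra processes, chunking, and result merging are exactly the kind of developer effort the paragraph above describes: the parallelism is possible, but it is not the default.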
Erlang: An Inherently Parallel Language
One potential solution to the inefficiency of web applications written in scripting languages is to use a language that is inherently parallel. Erlang was originally developed by Ericsson to support "distributed, fault-tolerant, soft-real-time, non-stop applications". 6 One of the accidental successes of the language is that these original design features enable it to scale transparently across all of the processors in a server and to distribute work to the processors on all of the servers in a datacenter. The practical result of using Erlang in lieu of Python, Ruby, or Node.js is more efficient utilization of all of the processors in a distributed system.
Companies such as WhatsApp, AdRoll, Spilgames, Cloud Mafia, Facebook, Klarna, T-Mobile, and Sqor have caught on to Erlang and use the language in their backends. Sqor, a social media app and platform, measures the social media ranking of athletes through their content creation, curation, and engagement. Sqor has used Erlang since the beginning because of Erlang's scalability, fault tolerance, faster results, and reliability. Erlang is definitely not a magic bullet, and it won't be the best solution for many software applications. But in the niche of multi-server distributed applications and controlling multi-node networks, Erlang rules.
The photo above shows the Sqor software platform running in a virtualized datacenter.
In the short term, there is an energy efficiency problem in using high-level scripting languages and in some software architectures, which virtualization alone does not solve. A potential solution, if energy efficiency is desired in large-scale distributed systems, is to use the Erlang language. Alternatively, the .NET runtime and Java offer some, but not all, of the same advantages because they provide frameworks for distributing parallel tasks.
Longer term, newer languages such as Scala and Go aim to emulate some of the advantages of Erlang's inherent parallelism and possibly improve upon it. Additionally, better utilization of multi-core systems by future software languages seems a compelling avenue for research into energy efficiency, considering the widespread growth of software into almost every area of modern existence. If the very language we use is grossly inefficient and uses the wrong paradigms, then the world may be optimizing, futilely, at the wrong level.
Sqor: https://sqor.com
Reduce Power Consumption: http://www.vmware.com/files/pdf/WhitePaper_ReducePowerConsumption.pdf
Build a scalable architecture in the cloud: http://www.ibm.com/developerworks/cloud/library/cl-optimizepythoncloud2/index.html
Amazon EC2 Instance Types: http://aws.amazon.com/ec2/instance-types/
Erlang Language: http://en.wikipedia.org/wiki/Erlang_(programming_language)