The performance advances of supercomputers in these past decades have been remarkable. The machines I used as a student in the 1960s probably had a peak performance of a few million calculations per second, or megaflops. Gigaflop (billions) peak speeds were achieved in 1985, teraflop (trillions) in 1997, and petaflop (a quadrillion, a 1 followed by fifteen zeros) in 2008.
The supercomputing community is now aiming for exascale computing - 1,000,000,000,000,000,000 calculations per second. The pursuit of exascale-class systems was a hot topic at the recent SC09 supercomputing conference.

In the quest for the fastest machines, supercomputers have always been at the leading edge of advances in IT, identifying the key barriers to overcome and experimenting with technologies and architectures that generally then appear in more commercial products a few years later.
Through the 1970s and 1980s, the fastest supercomputers were based on vector architectures and used highly sophisticated technologies and liquid cooling methods to remove the large amounts of heat they generated.
By the late 1980s, these complex and expensive technologies ran out of gas. As the microprocessors used in personal computers and technical workstations were becoming increasingly powerful, you could now build supercomputers using these CMOS micros and parallel architectures at a much lower price than the previous generations of vector machines. A similar transition to microprocessor components and parallel architectures took place in the mainframes used in commercial applications.

Massively parallel architectures, using tens to hundreds of thousands of processors from the PC and Unix markets, have dominated supercomputing over the past twenty years. They got us into the terascale and petascale ranges. But they will not get us to exascale. Another massive technology and architectural transition now looms for supercomputing and the IT industry in general.
Anticipating the major challenges involved in the transition to exascale, the Department of Energy (DOE) and DARPA launched a series of activities around three years ago to start planning for such systems.
This DARPA ExaScale Computing Study provides a very good overview of the key technology challenges. The study identified four major challenges where current trends are insufficient, and disruptive technology breakthroughs are needed to make exascale computing a reality.
The Energy and Power Challenge is pervasive, affecting every part of the system. Today's leading edge petascale systems consume between 2 and 3 megawatts (MW) per petaflop. It is generally agreed that an exaflop system should consume no more than around 20 MW, otherwise its operating costs would be prohibitively expensive. The 1000-fold increase in performance from petascale to exascale must thus be achieved at no more than a 10-fold increase in power consumption.
Such stretch targets were actually achieved in the transition from the terascale systems of the late 1990s to the petascale systems of today. But no one believes it can be done again with today's technologies, hence the assumption that a technology and architectural transition as profound as the one two decades ago is now required.
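To make the arithmetic behind these targets concrete, here is a minimal back-of-the-envelope sketch in Python. The 2.5 MW per petaflop figure is simply the midpoint of the 2 to 3 MW range quoted above; everything else follows from the 20 MW budget.

# Back-of-the-envelope power arithmetic for the exascale target
# (illustrative numbers only, taken from the ranges quoted in the text).

PETAFLOP = 1e15          # calculations per second
EXAFLOP = 1e18

mw_per_petaflop_today = 2.5          # midpoint of the 2-3 MW range
exascale_power_budget_mw = 20.0      # generally agreed upper limit

# Power an exaflop system would need if we simply scaled today's technology
naive_exaflop_power_mw = mw_per_petaflop_today * (EXAFLOP / PETAFLOP)
print(f"Naive scaling: {naive_exaflop_power_mw:,.0f} MW")    # ~2,500 MW

# Energy per calculation implied by the 20 MW budget
joules_per_flop = (exascale_power_budget_mw * 1e6) / EXAFLOP
print(f"Required efficiency: {joules_per_flop * 1e12:.0f} picojoules per calculation")  # ~20 pJ

# Required improvement over simply scaling today's technology
improvement = naive_exaflop_power_mw / exascale_power_budget_mw
print(f"Efficiency improvement needed: about {improvement:.0f}x")  # ~125x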
The Memory and Storage Challenge is a major consequence of the power challenge. The currently available main memories (DRAM) and disk drives (HDD) that have dominated computing in the last decade consume way too much power. New technologies are needed.
The Concurrency and Locality Challenge is another consequence of the power challenge. Over the past twenty years we have been able to achieve performance increases through a combination of faster processors and higher levels of parallelism. But we are no longer able to increase the performance of a single processing element by turning up the clock rate, due to power and cooling issues. We now have to rely solely on increased concurrency.
The top terascale systems of ten years ago had roughly 10,000 processing elements. Today's petascale systems are up in the low 100,000s. But because the only way to increase performance toward an exascale system is now massive parallelism, an exaflop supercomputer might have hundreds of millions of processing elements, or cores. Such massive parallelism will require major innovations in the architecture, software and applications for exascale systems. This DARPA Exascale Software Study provides a good overview of the software breakthroughs required.
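To see where the hundreds-of-millions figure comes from, here is a minimal sketch; the per-core rates are my own illustrative assumptions for low-power cores, not figures taken from the studies.

# Rough core-count arithmetic for an exaflop machine
# (the per-core rates below are illustrative assumptions, not measured values).

EXAFLOP = 1e18  # calculations per second

per_core_rates = {
    "1 GFlops per core": 1e9,
    "4 GFlops per core": 4e9,
    "10 GFlops per core": 1e10,
}

for label, rate in per_core_rates.items():
    cores = EXAFLOP / rate
    print(f"{label}: ~{cores:,.0f} cores needed")
# 1 GFlops  -> 1,000,000,000 cores
# 4 GFlops  -> 250,000,000 cores
# 10 GFlops -> 100,000,000 cores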
Finally, we have the Resiliency Challenge, that is, the ability of a system with such a huge number of components to continue operating in the presence of faults. An exascale system must be truly autonomic in nature, constantly aware of its status, and optimizing and adapting itself to rapidly changing conditions, including failures of its individual components. The exascale resiliency challenges are discussed in this DARPA report on System Resiliency at Extreme Scales.
There are vast business implications to such a massive technology and architectural transition. For one, the ecosystem of the past twenty years, where PCs have provided the components for parallel supercomputers, is now giving way to a new business ecosystem. Consumer electronics, mobile devices and embedded sensors are now the new partners of the extreme scale supercomputing community, because they share the same requirements for plentiful, powerful and inexpensive components that consume little power.
This transition to a new ecosystem already started about five years ago. IBM's Blue Gene family uses relatively low-power embedded cores as its processing elements, and Roadrunner's hybrid design includes the Cell processors originally developed for Sony's PlayStation 3.
The most powerful supercomputers in the US have generally been developed for, and first installed at, DOE's national labs, either as part of its Advanced Scientific Computing Research (ASCR) program in support of energy and environmental research, or the Advanced Simulation and Computing program in support of nuclear weapons research. These DOE labs typically work closely with the vendors on the requirements and design of such leading edge supercomputers.
To begin to understand the requirements for exascale machines, ASCR sponsored a series of town hall meetings, which held open discussions on the most critical and challenging problems in energy, the environment and basic science. These meetings were then followed by a series of technical workshops, each focusing on a specific scientific domain.
The DOE town halls and workshops have identified the opportunity for exascale computing to revolutionize the way we approach challenges in energy research, environmental sustainability and national security. They also identified the impact of exascale computing on key science areas like biology, astrophysics, climate science and nuclear physics.
One of their most compelling conclusions is that with exascale computing, we are reaching a tipping point in predictive science, an area with potentially major implications across a whole range of new, massively complex problems. Let me explain.
High end supercomputers are generally designed for either capability or capacity computing. Capability supercomputers dedicate the whole (or most of the) machine to solve a very large problem in the shortest amount of time. Capacity supercomputers, on the other hand, support large numbers of users solving different kinds of problems simultaneously.
While both kinds of supercomputing are very important, initiatives designed to push the envelope, like DOE's exascale project, tend to focus on the development of capability machines to address Grand Challenge problems like those mentioned above, which could not be solved in any other way.
Capability computing has been primarily applied to what is sometimes referred to as heroic computations, where just about the whole machine is applied to a single task. And, without a doubt, there are quite a number of problems that we will be able to address with machines 1000 times more powerful.
But at least as exciting is the potential for exascale computing to address a class of highly complex problems that have been beyond our reach, not just because of their sheer size, but because of their inherent uncertainties and unpredictability. The way to deal with such uncertainty is to simultaneously run multiple ensembles, or copies of the same application, using many different combinations of parameters, and thus explore the solution space of these otherwise unpredictable problems. This will let us search for optimal solutions to many problems in science and engineering, as well as calculate the probabilities of extreme events.
This new style of predictive modeling will help us apply more scientific methodologies to many kinds of problems, from climate studies to the design of safe nuclear reactors. Beyond science and engineering, there are many disciplines that will benefit from such predictive capabilities, from economics and medicine to business and government.
Ensemble computing has attributes of both capability and capacity computing. It devotes the whole machine to one problem, but it does so by running many copies of the problem in parallel with different initial conditions. Innovative techniques are already emerging to help developers better program and manage such ensemble-oriented applications.
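As a toy illustration of this ensemble style, here is a minimal Python sketch. The model function and its parameters are hypothetical stand-ins for a real simulation code; the point is only to show how many independent runs with perturbed inputs yield a distribution of outcomes, from which means and extreme-event probabilities can be estimated.

import random
from concurrent.futures import ProcessPoolExecutor

def run_member(seed):
    """Hypothetical stand-in for one ensemble member: a full simulation
    run with its own perturbed input parameters."""
    rng = random.Random(seed)
    forcing = rng.gauss(1.0, 0.2)   # perturbed input parameter
    noise = rng.gauss(0.0, 0.5)     # internal variability of the model
    return 10.0 * forcing + noise   # the quantity of interest

if __name__ == "__main__":
    n_members = 10_000  # a capability machine could run far more, far larger members
    with ProcessPoolExecutor() as pool:
        outcomes = list(pool.map(run_member, range(n_members)))

    mean = sum(outcomes) / n_members
    p_extreme = sum(1 for x in outcomes if x > 14.0) / n_members  # arbitrary threshold
    print(f"ensemble mean: {mean:.2f}")
    print(f"estimated probability of an extreme outcome (>14.0): {p_extreme:.3f}")

On a real capability machine each member would of course be a full application occupying thousands of nodes rather than a one-line formula, but the overall pattern, many independent runs followed by statistics over their results, is the same.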
Finally, it is important not to underestimate the impact of exascale breakthroughs on more capacity-oriented machines, as well as on smaller machines that share the same technologies, architecture, software and applications. Many of the innovations that will enable us to develop exascale class supercomputers will yield relatively inexpensive petascale class systems as well as smaller ones. The wider the access to such families of systems, the richer the overall ecosystem, including applications, users and technologies.

In addition, these same exascale innovations will find wide usage in the more commercially oriented cloud computing systems. The technology requirements are quite similar, especially the need for low power, low cost components. They also share similar requirements for highly efficient, autonomic system management. One can actually view cloud-based systems as a kind of exascale class supercomputer designed to support embarrassingly parallel workloads, such as massive information analysis or huge numbers of sensors and mobile devices.
In its Strategy for American Innovation, the Obama administration listed exascale computing among the Grand Challenges of the 21st century in science, technology and innovation, that “will allow the nation to set and meet ambitious goals that will improve our quality of life and establish the foundation for the industries and jobs of the future.” It explicitly called for
“An exascale supercomputer capable of a million trillion calculations per second – dramatically increasing our ability to understand the world around us through simulation and slashing the time needed to design complex products such as therapeutics, advanced materials, and highly-efficient autos and aircraft.”
This is a truly exciting and important challenge.
Some quick thoughts. The number of processors being talked about here is getting ever closer to human neuronal levels. So we should expect to take leads from knowledge in that area. Our brains work slowly with very high interconnectivity. Thus, getting really low power processors harnessed in their billions seems to be something to explore. Perhaps the nodes need to store little apart from state information, and the explicit representations of data that we are used to in today's digital systems can give way to a more distributed 'system' memory. One can also envisage hybrid situations with a bit of both. The organisation and reorganisability of such machines itself needs exploration, and perhaps genetic algorithms have an important role here. They can drive solution methods from the simple to the complex without humans being directly involved, although explaining how results were obtained could be interesting. One would think that the reproducibility of results, and their sensitivity to variations in input, might be important criteria for system utility value.
Posted by: Jon Duke | February 12, 2010 at 05:01 PM
Interesting comments from Jon Duke. But human neurons number in the multi-billions rather than the hundreds of millions.
But it is an interesting model to consider. Neurons aren't very fast. Yet what our brains can do cannot be matched by the largest numbers of the fastest supercomputers.
Are we pursuing the wrong goal? Your excellent post talks about developing faster processors that consume very little power.
Should we instead be focusing on the software rather than the speed of the individual processors?
We could build a "neuron-scale" supercomputer running at just 2,000 calories a day that is 1,000 times as powerful as an exascale supercomputer. That's a big power saving :)
Posted by: Tomforemski | February 14, 2010 at 12:48 AM
There are about 100 bn neurons in a human brain and maybe a quadrillion synapses. I doubt that the brain is a very good model for a supercomputer, any more than a horse is a good model for a sports car. The design constraints for a brain -- such as being part of a mobile self-reproducing system -- are very different than for a supercomputer.
Posted by: Bernard Finucane | February 14, 2010 at 04:27 AM
Clearly, a horse is nothing like a sports car, but a brain is very much like a computer. Surely, we can learn a lot from hundreds of millions of years of evolution/experimentation.
Posted by: Tomforemski | February 15, 2010 at 12:48 PM
Science is all about quantities and measures, and the brain is poor at both. However, if a problem needs exascale computing, then only a human brain can find a smarter work-around to avoid it. Some of the numerical techniques invented during the infancy of computing are being given a go-by in the rush to increase computing power.
Posted by: Ravindra Bachalli | April 15, 2010 at 02:01 AM
Our ideas are too often over-connected with our jobs. It is very hard to separate them while at the same time keeping the connection between the two domains. Anyway, here we are talking about two different things. One is "supercomputing", and the other is artificial intelligence (AI). Although those two concepts are close in many details, they are not the same thing. Supercomputing is a domain where we can achieve an extreme number of symbolic (mathematical) operations, and AI is a domain where we look for creativity and new ways of processing information. If we compare any computer with the human brain, we can easily find that the human brain is not a computer (meaning it does not operate at a formal symbolic level of mnemonics). So, a possible integration or convergence of those two concepts will arrive, but only if we accept the fact that speed is not the absolute criterion of functioning. There are many more criteria. E.g., the human is not the fastest biological being (that is a particular fly), but he can move faster than any other biological creature. Think of it.
Posted by: dbonacin | May 10, 2010 at 07:07 PM
"Consumer electronics, mobile devices and embedded sensors are now the new partners of the extreme scale supercomputing community, because they share the same requirements for plentiful, powerful and inexpensive components that consume little power."
I wonder whether this community, aiming to perform embarrassingly parallel workloads in the exascale range before the end of the decade, will focus its efforts in the first place on mobile chips over graphics chips. While graphics chips consume a considerable amount of power and cost in the range of $600 per board, they can perform an amazing number of FLOPs.
The AMD HD 5970 has a processing power of 4.64 TFlops (single precision) and 0.93 TFlops (double precision), with a maximum power consumption of 294W.
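A quick back-of-the-envelope check of those numbers against the 20 MW exaflop target mentioned in the post (my own rough arithmetic, nothing official):

# Rough flops-per-watt check (numbers from the specs above and the post's 20 MW target)
single_tflops, double_tflops, board_watts = 4.64, 0.93, 294.0

print(f"single precision: {single_tflops * 1000 / board_watts:.1f} GFlops/W")   # ~15.8
print(f"double precision: {double_tflops * 1000 / board_watts:.1f} GFlops/W")   # ~3.2

# An exaflop (1e9 GFlops) within a 20 MW (2e7 W) budget implies:
required = 1e9 / 2e7
print(f"required for exascale: {required:.0f} GFlops/W")                        # 50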
Reading quotes like
"It is completely clear that GPUs are now general purpose parallel computing processors with amazing graphics, and not just graphics chips anymore" (Jen-Hsun Huang, co-founder and CEO of NVIDIA)
I wonder how ARM chipsets (do they even support double precision?) will be able to come even close to these numbers when considering power efficiency and costs.
Posted by: Maximilian | May 12, 2010 at 12:37 PM
Dr. Wladawsky-Berger,
I'm no expert in this field, but maybe that is an advantage. Here is a blog post of mine that was largely inspired as a response to yours:
"ExaFLOPS Computing, Is it a Foolhardy Pursuit Headed for a Painful Belly-FLOPS?"
http://qbnets.wordpress.com/2010/06/24/exaflops-computing-is-it-a-foolhardy-pursuit-headed-for-a-painful-belly-flops/
Posted by: Dr. Robert R. Tucci | June 25, 2010 at 05:35 PM
It makes you wonder whether the current pace of scaling will ever level off. There have been such vast advancements over the past 20 years that it almost seems impossible to keep that pace up. Do you think it will level off at some point?
Posted by: Rich | July 26, 2010 at 10:03 AM
As long as the market demand for computing continues, I think that the hardware and software technologies will be able to keep up. The architectures will likely evolve, becoming more centralized or more distributed depending on the problems and economics. But for the foreseeable future, I believe that one way or another the current pace of scaling will continue.
Posted by: Irving Wladawsky-Berger | July 27, 2010 at 06:57 AM
I have to agree that computing has become very complex. But it's what makes the world go round, and it is something that society has become dependent on. The complex task is left to those who can keep up with this demand for higher technologies as we become more dependent than we have ever been. Look at 10 years ago and where we are today. Just imagine what 10 years from now will deliver.
Posted by: Shawn Montgomery | July 30, 2010 at 04:23 PM