Yellowstone Supercomputer Sports Massive Xeon E5 Array

A supercomputer with a massive array of Intel Xeon E5 cores will debut next month, delivering 29 times the computing power currently available at the National Center for Atmospheric Research (NCAR).

Called Yellowstone, the 1.5 petaFLOPS supercomputer at NCAR’s Wyoming Supercomputing Center (NWSC, Cheyenne) will be used by the Computational and Information Systems Lab for weather forecasting, the development of detailed climate models and other critical environmental research.

Yellowstone has been under development for several years at NCAR–long before Intel announced its Xeon Phi 50+ multi-core coprocessors due this fall–but already NCAR is planning to evaluate Intel’s Many Integrated Core (MIC) architecture for follow-on systems to Yellowstone.

“Yellowstone is based on our InfiniBand connected network of 72,288 Xeon E5 cores,” said Anke Kamrath, National Center for Atmospheric Research director for computing operations and services. “In 2013 we will be evaluating a standalone MIC based system using Xeon Phi’s, which could form the basis for post-Yellowstone follow-on systems circa 2015.”

Yellowstone’s petaFLOPS-scale high-performance computing (HPC) applications will have access to over 144 terabytes of main memory, and the system will be integrated into NCAR’s Globally Accessible Data Environment (GLADE), which boasts a 12-fold expansion in total storage capacity and a 15-fold improvement in sustained input/output (I/O) bandwidth over the current Bluefire supercomputer. A 20-fold increase in data analysis and visualization resources will be provided by two subsystems named after volcanic geological features at nearby Yellowstone National Park: Geyser and Caldera. Geyser consists of 16 IBM x3850 nodes, each with 40 Intel Westmere-EX cores, one NVIDIA Tesla graphics processing unit (GPU) and a terabyte of memory. Caldera will use 16 IBM x3650 M4 nodes, each with 16 Intel Sandy Bridge EP/AVX cores, 64 Gbytes of memory and two NVIDIA GPUs.

Yellowstone is based on IBM’s iDataPlex architecture, with 4,518 nodes, each of which harnesses dual 2.6-GHz Intel Xeon E5s with eight cores per socket. Each core will have access to 2 Gbytes of memory (32 Gbytes per node), all of it 1.6-GHz double data rate type three synchronous dynamic random access memory (DDR3 SDRAM). Interconnection among nodes is handled by a fourteen data rate (FDR) InfiniBand full fat-tree switch fabric from Mellanox Technologies (Sunnyvale, Calif.), with over 13 Gbytes per second of bidirectional bandwidth per node, over 31 terabytes per second of peak bidirectional bisection bandwidth, and 2.5 microseconds maximum latency.
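The node-level figures can be cross-checked against the system totals quoted earlier. The short Python sketch below multiplies them out; the function and constant names are illustrative only, and the value of eight double-precision floating-point operations per core per clock cycle is an assumption about AVX on this processor generation, not a number given in the article.

```python
# Figures quoted in the article.
NODES = 4518
SOCKETS_PER_NODE = 2
CORES_PER_SOCKET = 8
CLOCK_GHZ = 2.6
MEM_PER_CORE_GB = 2

# ASSUMPTION (not stated in the article): each core retires 8
# double-precision floating-point operations per cycle using AVX.
FLOPS_PER_CYCLE = 8


def yellowstone_totals():
    """Derive system-wide totals from the per-node figures."""
    cores_per_node = SOCKETS_PER_NODE * CORES_PER_SOCKET
    total_cores = NODES * cores_per_node
    mem_per_node_gb = cores_per_node * MEM_PER_CORE_GB
    total_mem_gb = total_cores * MEM_PER_CORE_GB
    # cores * GHz * FLOPs/cycle gives gigaFLOPS; divide by 1e6 for petaFLOPS.
    peak_pflops = total_cores * CLOCK_GHZ * FLOPS_PER_CYCLE / 1e6
    return {
        "cores_per_node": cores_per_node,
        "total_cores": total_cores,
        "mem_per_node_gb": mem_per_node_gb,
        "total_mem_gb": total_mem_gb,
        "peak_pflops": peak_pflops,
    }


if __name__ == "__main__":
    for name, value in yellowstone_totals().items():
        print(f"{name}: {value}")
```

Under these assumptions the totals come to 72,288 cores, roughly 144.6 terabytes of memory (32 Gbytes per node) and about 1.5 petaFLOPS of theoretical peak, in line with the figures NCAR quotes.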

The GLADE storage subsystem will use 76 IBM DCS3700s each with 90 drives for a total of 6,840 2-terabyte disk drives. With a total capacity of over 13 petabytes, GLADE will offer more than 91 Gbytes per second of aggregate I/O bandwidth. Dual StorageTek SL8500 tape libraries will offer the ability to house massive data sets in excess of 100 petabytes.
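As a rough check on the storage numbers, the Python sketch below recomputes the drive count and raw capacity and spreads the quoted aggregate bandwidth evenly over the hardware. It assumes decimal units (1 petabyte = 1,000 terabytes), treats the capacity as raw space before any RAID or file-system overhead, and assumes bandwidth divides evenly across enclosures and drives; none of those assumptions comes from the article, and the names are illustrative.

```python
# Figures quoted in the article.
ENCLOSURES = 76            # IBM DCS3700 units
DRIVES_PER_ENCLOSURE = 90
DRIVE_CAPACITY_TB = 2
AGGREGATE_GB_PER_S = 91    # quoted lower bound on aggregate I/O bandwidth


def glade_totals():
    """Recompute GLADE drive count, raw capacity and average bandwidth shares."""
    drives = ENCLOSURES * DRIVES_PER_ENCLOSURE
    # ASSUMPTION: decimal units, raw capacity before RAID overhead.
    raw_capacity_pb = drives * DRIVE_CAPACITY_TB / 1000
    # ASSUMPTION: bandwidth is spread evenly over enclosures and drives.
    gb_per_s_per_enclosure = AGGREGATE_GB_PER_S / ENCLOSURES
    mb_per_s_per_drive = AGGREGATE_GB_PER_S * 1000 / drives
    return drives, raw_capacity_pb, gb_per_s_per_enclosure, mb_per_s_per_drive


if __name__ == "__main__":
    drives, pb, per_enclosure, per_drive = glade_totals()
    print(f"drives: {drives}")
    print(f"raw capacity: {pb:.2f} PB")
    print(f"per enclosure: {per_enclosure:.2f} GB/s")
    print(f"per drive: {per_drive:.1f} MB/s")
```

The drive count and the 13-plus petabytes of raw capacity follow directly from the quoted figures; the per-enclosure and per-drive shares are only averages and say nothing about how GLADE actually stripes data.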

The standalone MIC-based system, to debut in November 2012, will use 16 dual-socket Xeon Phi processor nodes (eight cores per processor) with 64 Gbytes of 1.6-GHz DDR3 memory per node, interconnected with Mellanox’s FDR full fat-tree InfiniBand.

By basing Yellowstone on Intel Xeon E5s instead of the IBM Power6 cores used in its previous Bluefire supercomputer, NCAR will lower its peak power draw from 7.5 to 1.3 watts per gigaFLOPS. The Computational and Information Systems Lab estimates that Yellowstone’s sustained performance will top 43 MFLOPS per watt, compared to 6 for Bluefire, which will continue to operate through November 2012, after which it will be shut down.
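To put the efficiency figures side by side, the Python sketch below computes the improvement ratios and the peak power draw they imply. The implied power figure is simply 1.3 watts per gigaFLOPS multiplied by the 1.5 petaFLOPS peak; it is a derived estimate, not a number NCAR has published in this article, and the names are illustrative.

```python
# Figures quoted in the article.
BLUEFIRE_W_PER_GFLOPS = 7.5
YELLOWSTONE_W_PER_GFLOPS = 1.3
BLUEFIRE_SUSTAINED_MFLOPS_PER_W = 6
YELLOWSTONE_SUSTAINED_MFLOPS_PER_W = 43
YELLOWSTONE_PEAK_PFLOPS = 1.5


def efficiency_summary():
    """Compare Bluefire and Yellowstone power efficiency."""
    # Lower watts per gigaFLOPS is better, so divide old by new.
    peak_gain = BLUEFIRE_W_PER_GFLOPS / YELLOWSTONE_W_PER_GFLOPS
    # Higher MFLOPS per watt is better, so divide new by old.
    sustained_gain = (YELLOWSTONE_SUSTAINED_MFLOPS_PER_W
                      / BLUEFIRE_SUSTAINED_MFLOPS_PER_W)
    # DERIVED ESTIMATE: watts per gigaFLOPS times peak gigaFLOPS, in megawatts.
    peak_gflops = YELLOWSTONE_PEAK_PFLOPS * 1e6
    implied_peak_mw = YELLOWSTONE_W_PER_GFLOPS * peak_gflops / 1e6
    return peak_gain, sustained_gain, implied_peak_mw


if __name__ == "__main__":
    peak_gain, sustained_gain, implied_mw = efficiency_summary()
    print(f"peak efficiency gain: {peak_gain:.1f}x")
    print(f"sustained efficiency gain: {sustained_gain:.1f}x")
    print(f"implied peak power: {implied_mw:.2f} MW")
```

On the quoted numbers, peak efficiency improves by a little under six times and sustained efficiency by about seven times, with an implied peak draw of just under 2 megawatts.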


Colin Johnson is a Geeknet contributing editor and veteran electronics journalist, writing for publications from McGraw-Hill’s Electronics to UBM’s EETimes. Colin has written thousands of technology articles covered by a diverse range of major media outlets, from the ultra-liberal National Public Radio (NPR) to the ultra-conservative Rush Limbaugh Show. A graduate of the University of Michigan’s Computer, Control and Information Engineering (CICE) program, his master’s project was to “solve” the parallel processing problem 20 years ago when engineers thought it would only take a few years. Since then, he has written extensively about the challenges of parallel processors, including emulating those in the human brain in his John Wiley & Sons book Cognizers – Neural Networks and Machines that Think.

Posted by R. Colin Johnson, Geeknet Contributing Editor