While the massively parallel Xeon Phi coprocessor faces competition from supercomputers built on Nvidia’s graphics processing units (GPUs), Intel’s many-integrated-core (MIC) architecture will prevail, according to a senior research scientist at the National Center for Supercomputing Applications (NCSA) Innovative Systems Laboratory at the University of Illinois, Urbana-Champaign.
In a presentation at the Fifth International Workshop on Parallel Programming Models and Systems Software for High-End Computing (P2S2), held in Pittsburgh Sept. 10-12, Volodymyr Kindratenko asserted that GPU accelerators will eventually lose out to Intel’s MIC because the MIC architecture requires only fine-tuning of the parallel x86 code already running on supercomputers today.
“The software development effort for a MIC is more comparable to a performance tuning effort, rather than with a code reimplementation as is needed to use GPUs,” he told attendees at his session entitled Hardware/Software Divergence in Accelerator Computing. “MIC architecture will eventually win the battle.”
“You can just recompile with MIC and start optimizing, but with GPUs you cannot simply recompile–you have to rewrite your algorithm just to get it to run before you can even start optimizing it.”
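As a rough illustration of the kind of code Kindratenko is describing (this sketch is not taken from his talk, and the `saxpy` function is a hypothetical example), consider a simple loop already parallelized for x86 with OpenMP. Under his argument, code shaped like this could be recompiled for a MIC target and then tuned, whereas a GPU port would typically mean rewriting the loop body as a device kernel and adding explicit host-to-device memory transfers before any tuning could begin.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical example kernel: y[i] = a * x[i] + y[i] (SAXPY).
 * The OpenMP pragma spreads loop iterations across the available cores.
 * A compiler without OpenMP enabled ignores the pragma and runs the
 * loop serially, producing the same result. */
void saxpy(int n, float a, const float *x, float *y)
{
    assert(x != NULL && y != NULL);

    #pragma omp parallel for
    for (int i = 0; i < n; ++i) {
        y[i] = a * x[i] + y[i];
    }
}
```

Each iteration is independent of the others, which is what lets the same source be spread across many cores and, with suitable compiler settings, across wide vector units.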
<a href="http://goparallel.sourceforge.net/wp-content/uploads/2012/10/Untitled1.jpg"><img class="size-full wp-image-1927 aligncenter" title="Untitled1" src="http://goparallel.sourceforge.net/wp-content/uploads/2012/10/Untitled1.jpg" alt="" width="300" height="300" /></a>NCSA’s Lincoln cluster used Intel Xeon main processors and Nvidia Tesla graphics processing units (GPUs).
NCSA has built several large-scale GPU-based supercomputers, including “Lincoln”, which houses 1536 cores (384 Xeon quad-core “Harpertown” processors) connected over PCIe to 96 Nvidia Tesla S1070 coprocessors, each with four 512-bit wide GPUs with 240 threads each. NCSA more recently got advance access to Intel’s first MIC processor, the Xeon Phi, which it has been testing by running the same parallel molecular dynamics algorithms and other benchmarks already rewritten to run on the GPU-based Lincoln.
Kindratenko, however, emphasized that his opinions come from practical considerations, not from any tests performed at NCSA. According to Kindratenko, the MIC architecture simply fits into existing supercomputer architectures better than GPUs, which are vector-based.
“To get any advantage with a GPU, your code has to be vectorizable,” said Kindratenko. “The MIC architecture is broader, with many cores, wide vector units and high bandwidth to memory.”
Kindratenko concluded his presentation, made at the P2S2 panel session entitled “Battle of the Accelerator Stars,” by reflecting on how the MIC architecture, considered only an accelerator today, could go mainstream.
“When the ‘war’ is over, what we consider today to be an accelerator, will be in our mainstream processor,” he predicted.
The “Battle of the Accelerator Stars” panel discussion took hardware/software divergence in accelerator computing as its theme, according to the panel moderator, Professor Yong Chen of Texas Tech University.
“The panel discussion generally concluded that accelerators will play a critical role in the future computing systems from high-performance systems, high-end servers, to desktop,” said Chen. “Programmability remains a critical issue for the wide adoption and success of accelerator computing. The hardware and software platforms that ease programmability, while not sacrificing performance, are most likely to win the battle.”
P2S2 was held in conjunction with The 41st International Conference on Parallel Processing (ICPP).
Colin Johnson is a Geeknet contributing editor and veteran electronics journalist, writing for publications from McGraw-Hill’s Electronics to UBM’s EETimes. Colin has written thousands of technology articles covered by a diverse range of major media outlets, from the ultra-liberal National Public Radio (NPR) to the ultra-conservative Rush Limbaugh Show. A graduate of the University of Michigan’s Computer, Control and Information Engineering (CICE) program, his master’s project was to “solve” the parallel processing problem 20 years ago when engineers thought it would only take a few years. Since then, he has written extensively about the challenges of parallel processors, including emulating those in the human brain in his John Wiley & Sons book Cognizers – Neural Networks and Machines that Think.