Intel’s Parallel Programming Tools for 2013

Intel has just released its latest parallel programming tools: Parallel Studio XE for shared memory applications and Cluster Studio XE for Message Passing Interface (MPI) and hybrid architectures. The bundles include support for all 3rd Generation Intel Core processors, the 50+ core Xeon Phi coprocessor due out this fall, and Haswell processors due out in 2013.

“Intel’s Parallel Studio XE and Cluster Studio XE are designed to help the C++ and Fortran programmer squeeze more performance out of their multi-core processors, including the latest Xeon Phi coprocessor and Intel’s next-generation Haswell processors,” explains James Reinders, Director of Software Products and Multi-core Evangelist at Intel. “We include all our optimized compilers, libraries and analysis tools in a single package for shared memory applications with Parallel Studio, with Cluster Studio being a superset which adds distributed memory capabilities including an MPI library, an MPI checker and MPI analysis tools.”

Intel’s parallel processing suites aim to increase performance anywhere C++ and Fortran are running on multi-core processors: desktops, workstations, servers, clusters, even Ultrabooks. Parallel Studio can optimize threaded algorithms all the way down to single-socket Ultrabooks and scale all the way up to multi-socket systems with scores of cores. Cluster Studio adds MPI’s message-passing capability, which can extend parallel algorithms and hybrid architectures across any number of cores on connected servers. A dozen new profiling features, including low-overhead Java support for improved mixed-mode profiling, provide guidance on how to tune parallel applications for optimal performance.

Analysis tools dissect applications before they are fully implemented, allowing programmers to run “what-if” experiments with all popular methods of increasing multi-core performance. Intel Advisor XE, for instance, has check boxes for reducing site, task and lock overhead as well as lock contention, and a task-chunking analyzer helps partition tasks for optimal performance.
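To make the idea of task chunking concrete, here is a minimal C++ sketch of splitting a loop's iterations into fixed-size chunks that could each become one task. The names `Chunk`, `make_chunks` and `chunked_sum` are hypothetical and chosen for illustration; this is not Advisor XE's own analysis, only the kind of partitioning such a tool helps a programmer tune.

```cpp
#include <cassert>
#include <cstddef>
#include <numeric>
#include <utility>
#include <vector>

// Half-open index range [first, second) covered by one task.
using Chunk = std::pair<std::size_t, std::size_t>;

// Split n loop iterations into chunks of at most chunk_size iterations.
// Hypothetical helper for illustration; not part of any Intel tool.
std::vector<Chunk> make_chunks(std::size_t n, std::size_t chunk_size) {
    assert(chunk_size > 0);
    std::vector<Chunk> chunks;
    for (std::size_t begin = 0; begin < n; begin += chunk_size) {
        // The last chunk may be shorter than chunk_size.
        std::size_t end = (n - begin < chunk_size) ? n : begin + chunk_size;
        chunks.emplace_back(begin, end);
    }
    return chunks;
}

// Sum a vector chunk by chunk. Each chunk is independent, so in a real
// program each one could be handed to a separate task or thread; here the
// chunks run one after another to keep the sketch dependency-free.
long long chunked_sum(const std::vector<int>& data, std::size_t chunk_size) {
    long long total = 0;
    for (const Chunk& c : make_chunks(data.size(), chunk_size)) {
        long long partial = std::accumulate(data.begin() + c.first,
                                            data.begin() + c.second, 0LL);
        total += partial;  // combining partial results is the only shared step
    }
    return total;
}
```

Calling `make_chunks(10, 4)` yields three chunks covering indices 0 to 3, 4 to 7 and 8 to 9. The chunk size is the tuning knob: small chunks create many tasks and more scheduling overhead, while large chunks create few tasks and can leave cores idle.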

“Intel Advisor XE is like having a trusted mentor inspecting your parallel code and suggesting time-proven ways to improve it,” said Reinders.

Support for up to 512-bit vectors also harnesses the latest ultra-wide Advanced Vector Extensions (AVX) units for single-instruction multiple-data (SIMD) algorithms. Specialized tools, such as a pointer checker to detect and cure buffer overflows and memory-growth analysis for reclaiming memory that was never freed, make code developed with Parallel Studio XE and Cluster Studio XE more stable, more reliable and less vulnerable to security breaches. Parallel code is more fault tolerant in general, cluster code is more reliable in particular, and improved compilers and libraries boost the performance of even serial algorithms.

Niceties include a conditional numerical reproducibility module that ensures algorithms produce identical results from run to run, even when they use floating-point numbers that are subject to round-off errors.
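Why would results differ from run to run in the first place? Floating-point addition is not associative, so a parallel reduction that combines partial sums in a different order can round differently. The short C++ sketch below shows the effect with two summation orders standing in for two thread schedules. The function names are hypothetical, the code assumes ordinary IEEE 754 double arithmetic without options such as fast-math, and it illustrates only the problem, not how Intel's module solves it.

```cpp
#include <vector>

// Sum left to right, as a single thread walking the array would.
double sum_forward(const std::vector<double>& v) {
    double s = 0.0;
    for (double x : v) s += x;
    return s;
}

// Sum right to left, standing in for a different reduction order
// such as the one another thread schedule might produce.
double sum_backward(const std::vector<double>& v) {
    double s = 0.0;
    for (auto it = v.rbegin(); it != v.rend(); ++it) s += *it;
    return s;
}
```

For the input `{1.0, 1e17, -1e17}`, `sum_forward` returns 0 because the 1.0 is lost to rounding when it is added to 1e17, while `sum_backward` returns 1 because the two large values cancel first. Fixing the order of operations is one general way to get identical results on every run, typically at some cost in scheduling freedom.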

Parallel Studio XE 2013 (available immediately) and Cluster Studio XE 2013 (available later in 2012) support C/C#/C++, Fortran 2008, Linux, Windows and the latest MPI 2.2 standard. Besides supporting all Core and Xeon processors, including the massively parallel Xeon Phi coprocessors, Intel is also including advanced support for the successor to the current Ivy Bridge processors, the high-speed Haswell microarchitecture due next year.

________________________________________________________________

Colin Johnson is a Geeknet contributing editor and veteran electronics journalist, writing for publications from McGraw-Hill’s Electronics to UBM’s EETimes. Colin has written thousands of technology articles covered by a diverse range of major media outlets, from the ultra-liberal National Public Radio (NPR) to the ultra-conservative Rush Limbaugh Show. He is a graduate of the University of Michigan’s Computer, Information and Control Engineering (CICE) program, where his master’s project was to “solve” the parallel processing problem 20 years ago, when engineers thought it would only take a few years. Since then, he has written extensively about the challenges of parallel processors, including emulating those in the human brain in his John Wiley & Sons book Cognizers – Neural Networks and Machines that Think.

Posted by R. Colin Johnson, Geeknet Contributing Editor
6 comments
Richard Rankin

How about easing up on the parallel tools prices? This has got to be a microscopic portion of your total revenue and the sale of these tools will have a direct effect on the applications available to drive sales of the hardware. These tools are an investment for Intel if you truly believe in the direction you are taking and you should be practically giving them away - especially at this stage of the game.

Richard Rankin

To a certain extent I imagine it can. It would depend on the skill of the author of the interpreter. You can use some parallel techniques in Java but you are at the mercy of the interpreter. Who knows what evil lurks in the heart of Java? Perhaps if you compiled natively and your compiler had some parallel options. But I'm not sure why you should "feel the need for speed" and use an interpreted language which is by design slower than a compiled language? To write web apps? How many people will be using a browser on a system with this kind of horsepower? Perhaps it would help dispel the old saying about Java: "Write once, run slow everywhere." Given the possible processor combinations, the interpreter would have to have a huge number of compile modules to respond to processor topology enumeration and other system-specific configurations. I won't go on a Java rant except to say it is an execrable language. There is a reason Intel only provides FORTRAN and C/C++ (and I'm not fond of C++). Java has had its moment of glory and will now fade away. Even with the most recent Oracle patches Java is incredibly insecure. Perhaps you should consider C++, which is closest to Java (or as it is sometimes called, C++--).

Reza

Is there any chance that the Java bytecode can take advantage of these parallel goodies?

Richard Rankin

I must agree. I think that automatic parallelization is decades off but Intel's compiler and toolset are by far the best available on any hardware platform. I can always find something to gripe about but this one is obvious. Gosh I wish I had some hardware to work with.

R. Colin Johnson

When I started doing parallel programming, way back in college, there were only a bunch of scholarly papers to suggest methods for optimizing shared memory and message passing schemes. The field has come a long way, and now there are lots of books available that elucidate proven techniques. Parallel Studio XE and Cluster Studio XE go one step further by giving you tools not only to code those techniques, but to actually analyze your algorithms and suggest techniques that you may not have thought of, plus they predict the speed-up you'll gain before you code them. There is still no tool to automatically parallelize code, but Parallel Studio XE and Cluster Studio XE make the job a whole lot easier.